Density estimation on low-dimensional manifolds

Horvat, Christian (2022). Density estimation on low-dimensional manifolds (Unpublished). (Dissertation)

Text
horvat_christian_PhDThesis.pdf - Other
Restricted to registered users only
Available under License BORIS Standard License.
Download (18MB) | Request a copy

Machine learning models large datasets in potentially high dimensions using the mathematical rigor of probability theory. A fundamental assumption is that there exist a latent variable $Z\in \mathbb{R}^{d}$ with latent density $\pi(z)$ and a generator mapping $f$ such that the data are realizations of the random variable $X=f(Z) \in \mathbb{R}^{D}$ with density $p(x)$. A special case of this setting is where $f$ is an embedding, i.e. a continuously differentiable mapping with a continuously differentiable inverse. If $d<D$, this special case is often referred to as the manifold hypothesis: high-dimensional data populate a low-dimensional manifold in the embedding space. Normalizing flows (NFs) are bijective neural networks that can be used to learn any $p(x)$ with support diffeomorphic to $\mathbb{R}^{D}$, i.e. NFs learn $f$ exactly when $d=D$. However, when $d<D$, standard NFs fail to learn $f$ and therefore $p(x)$. In this thesis, we show how to overcome this topological constraint of standard NFs (first main result). We prove that by adding specific noise in the manifold's normal space, we can still learn $p(x)$ exactly using a standard NF. When a standard Gaussian is used instead of a Gaussian in the manifold's normal space, our method can be used to approximate any density $p(x)$ supported on an unknown low-dimensional manifold. Based on this theoretical foundation, we show that we can learn not only $f$ and $p(x)$ but also the inverse $f^{-1}$, which allows us to compress the data into low dimensions (second main result). The method, coined the denoising normalizing flow (DNF), learns a denoising mapping after inflating the data with standard Gaussian noise and is trained such that the first $d$ latent variables are noise-insensitive and thus encode the manifold. However, this requires knowing $d$ a priori, which limits the applicability of the DNF in real-world scenarios where this number is unknown. Existing methods to estimate $d$ do not scale to high dimensions. We provide a new method that can estimate $d$ in this high-dimensional case as well (third main result).
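
As a toy illustration of the inflation step described in the abstract, the sketch below assumes the manifold is the unit circle in $\mathbb{R}^{2}$ (so $d=1$, $D=2$) and compares Gaussian noise added only along the manifold's normal direction with isotropic Gaussian noise in the ambient space. It is not the thesis implementation: the function names, the noise scale `sigma = 0.05`, and the choice of manifold are assumptions made for this example, and no normalizing flow is trained here.

```python
import math
import random
import statistics


def sample_circle(n, rng):
    """Draw n points uniformly from the unit circle (d = 1, D = 2)."""
    points = []
    for _ in range(n):
        theta = rng.uniform(0.0, 2.0 * math.pi)
        points.append((math.cos(theta), math.sin(theta)))
    return points


def inflate_normal(points, sigma, rng):
    """Add Gaussian noise only along the normal direction.

    For the unit circle the unit normal at a point is the point itself,
    so each sample is moved radially and its angle is left unchanged.
    """
    inflated = []
    for x, y in points:
        eps = rng.gauss(0.0, sigma)
        inflated.append((x + eps * x, y + eps * y))
    return inflated


def inflate_isotropic(points, sigma, rng):
    """Add isotropic Gaussian noise in the ambient space (assumed variant)."""
    return [(x + rng.gauss(0.0, sigma), y + rng.gauss(0.0, sigma))
            for x, y in points]


def radii(points):
    """Distance of each point from the origin."""
    return [math.hypot(x, y) for x, y in points]


rng = random.Random(0)   # fixed seed so the sketch is reproducible
sigma = 0.05             # hypothetical noise scale
clean = sample_circle(2000, rng)
normal_noisy = inflate_normal(clean, sigma, rng)
iso_noisy = inflate_isotropic(clean, sigma, rng)

# Spread of the inflated samples around the circle, measured radially.
radial_spread = statistics.pstdev(radii(normal_noisy))
```

In this sketch both inflated datasets have support with nonzero area in $\mathbb{R}^{2}$, which is the property a standard NF needs. Normal-space noise leaves the position along the circle untouched, while isotropic noise also perturbs the samples tangentially.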

Item Type:	Thesis (Dissertation)
Division/Institute:	04 Faculty of Medicine > Pre-clinic Human Medicine > Institute of Physiology
Graduate School:	Graduate School for Cellular and Biomedical Sciences (GCB)
UniBE Contributor:	Horvat, Christian
Subjects:	600 Technology > 610 Medicine & health
Funders:	[4] Swiss National Science Foundation
Language:	English
Submitter:	Christian Horvat
Date Deposited:	28 Dec 2023 07:12
Last Modified:	28 Dec 2023 07:12
BORIS DOI:	10.48350/190721
URI:	https://boris.unibe.ch/id/eprint/190721
