CAM Colloquium: Ziv Goldfeld (ECE, Cornell) - Gaussian-Smoothed Optimal Transport: Metric Structure and Statistical Efficiency

Location

Frank H. T. Rhodes Hall 655

Description

Abstract: Optimal transport (OT), and in particular the Wasserstein distance, has seen a surge of interest and applications in machine learning (ML). This popularity is driven by many advantageous properties of the Wasserstein distance, such as its robustness to mismatched supports, the fact that it is a metric on the space of probability measures, and that it metrizes weak* convergence. At the heart of ML is the ability to empirically approximate a probability measure using data generated from it. Unfortunately, empirical approximation under Wasserstein distances suffers from a severe 'curse of dimensionality', namely, convergence at rate n^{-1/d}, which drastically deteriorates with dimension. Given the high dimensionality of modern ML tasks, this is highly problematic. As a result, entropically regularized OT has become a common workaround, but while it enjoys fast algorithms and better statistical properties, it loses the Wasserstein metric structure. This talk proposes a novel Gaussian-smoothed OT (GOT) framework that achieves the best of both worlds: preserving the Wasserstein metric structure while alleviating the curse of dimensionality in empirical approximation. GOT is simply the Wasserstein distance between the measures of interest after each is smoothed by (convolved with) an isotropic Gaussian kernel. It inherits all the metric properties of classic Wasserstein distances and is a well-behaved (continuous and monotonic) function of the Gaussian smoothing parameter. Furthermore, as the smoothing parameter shrinks to zero, GOT $\Gamma$-converges to classic OT (with convergence of optimizers), thus serving as a natural extension. From the statistical point of view, empirical approximation under GOT attains an O(n^{-1/2}) convergence rate in all dimensions, thus alleviating the curse of dimensionality. A discussion of how this fast convergence of GOT enables measuring information flows in deep neural networks, as well as applications to generative modeling, will follow.

Bio: Ziv Goldfeld is an assistant professor in the School of Electrical and Computer Engineering at Cornell University. Before that, he was a postdoctoral researcher in the Laboratory for Information and Decision Systems (LIDS) at MIT. Ziv received his B.Sc., M.Sc., and Ph.D. from the Department of Electrical and Computer Engineering at Ben Gurion University of the Negev, Israel. Ziv's research interests include statistical machine learning, information theory, high-dimensional and nonparametric statistics, applied probability, and interacting particle systems. Recently, he has focused on information-theoretic analysis of deep neural networks (DNNs), convergence rates of empirical measures smoothed by Gaussian kernels, and data storage in stochastic Ising models. Other interests include physical-layer security, cooperation in multiuser information-theoretic problems, and multiuser channel and source duality. Ziv is a recipient of the Rothschild postdoctoral fellowship, the Ben Gurion postdoctoral fellowship, the Feder Award, the best student paper award at the IEEE 28th Convention of Electrical and Electronics Engineers in Israel, the Lev-Zion fellowship, and the Minerva Short-Term Research Grant.
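To make the GOT definition from the abstract concrete, here is a minimal illustrative sketch (not code from the talk) in one dimension. It assumes the Gaussian smoothing of each empirical measure can be approximated by adding independent N(0, sigma^2) noise to the samples, and it uses SciPy's wasserstein_distance, which computes the 1-Wasserstein distance between 1D empirical distributions; the function name smoothed_w1 and the toy data are hypothetical choices for illustration only.

```python
# Sketch: Monte Carlo estimate of GOT_sigma(mu, nu) = W_1(mu * N(0, sigma^2), nu * N(0, sigma^2)) in 1D.
# Smoothing an empirical measure by convolution with a Gaussian is approximated here
# by adding independent Gaussian noise to the samples (an illustrative proxy, not the talk's method).
import numpy as np
from scipy.stats import wasserstein_distance

rng = np.random.default_rng(0)

def smoothed_w1(x, y, sigma, n_noise=10):
    """Estimate the Gaussian-smoothed 1-Wasserstein distance between 1D samples x and y."""
    xs = np.repeat(x, n_noise) + sigma * rng.standard_normal(len(x) * n_noise)
    ys = np.repeat(y, n_noise) + sigma * rng.standard_normal(len(y) * n_noise)
    return wasserstein_distance(xs, ys)

# Toy example: empirical measures drawn from two shifted Gaussians.
x = rng.normal(loc=0.0, scale=1.0, size=500)
y = rng.normal(loc=0.5, scale=1.0, size=500)

for sigma in [0.0, 0.5, 1.0, 2.0]:
    print(f"sigma = {sigma}: smoothed W1 estimate ~ {smoothed_w1(x, y, sigma):.3f}")
```

As sigma shrinks to zero the estimate approaches the plain empirical Wasserstein distance, mirroring the convergence to classic OT described in the abstract.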