
The Physics of Data and Tasks: Theories of Locality and Compositionality in Deep Learning

Main: 12 pages, 92 figures, 3 tables; Appendix: 321 pages
Abstract

Deep neural networks have achieved remarkable success, yet our understanding of how they learn remains limited. These models can learn high-dimensional tasks, a feat that is generally statistically intractable due to the curse of dimensionality. This apparent paradox suggests that learnable data must have an underlying latent structure. What is the nature of this structure? How do neural networks encode and exploit it, and how does it quantitatively affect performance (for instance, how does generalization improve with the number of training examples)? This thesis addresses these questions by studying the roles of locality and compositionality in data, tasks, and deep learning representations.
