Learning and Generalization in Overparameterized Neural Networks, Going Beyond Two Layers

12 November 2018

Papers citing "Learning and Generalization in Overparameterized Neural Networks, Going Beyond Two Layers"

50 / 498 papers shown

Wearing a MASK: Compressed Representations of Variable-Length Sequences Using Recurrent Neural Tangent Kernels Sina Alemohammad Hossein Babaei Randall Balestriero Matt Y. Cheung Ahmed Imtiaz Humayun ... Naiming Liu Lorenzo Luzi Jasper Tan Zichao Wang Richard G. Baraniuk 9 4 0 27 Oct 2020
A Dynamical View on Optimization Algorithms of Overparameterized Neural Networks Zhiqi Bu Shiyun Xu Kan Chen 33 17 0 25 Oct 2020
On Convergence and Generalization of Dropout Training Poorya Mianjy R. Arora 37 30 0 23 Oct 2020
Train simultaneously, generalize better: Stability of gradient-based minimax learners Farzan Farnia Asuman Ozdaglar 31 47 0 23 Oct 2020
Global optimality of softmax policy gradient with single hidden layer neural networks in the mean-field regime Andrea Agazzi Jianfeng Lu 13 15 0 22 Oct 2020
Deep Learning is Singular, and That's Good Daniel Murfet Susan Wei Biwei Huang Hui Li Jesse Gell-Redman T. Quella UQCV 24 26 0 22 Oct 2020
MixCon: Adjusting the Separability of Data Representations for Harder Data Recovery Xiaoxiao Li Yangsibo Huang Binghui Peng Zhao Song Keqin Li MIACV 30 1 0 22 Oct 2020
Beyond Lazy Training for Over-parameterized Tensor Decomposition Xiang Wang Chenwei Wu J. Lee Tengyu Ma Rong Ge 16 14 0 22 Oct 2020
The Deep Bootstrap Framework: Good Online Learners are Good Offline Generalizers Preetum Nakkiran Behnam Neyshabur Hanie Sedghi OffRL 29 11 0 16 Oct 2020
Deep Equals Shallow for ReLU Networks in Kernel Regimes A. Bietti Francis R. Bach 30 86 0 30 Sep 2020
Learning Deep ReLU Networks Is Fixed-Parameter Tractable Sitan Chen Adam R. Klivans Raghu Meka 22 36 0 28 Sep 2020
Small Data, Big Decisions: Model Selection in the Small-Data Regime J. Bornschein Francesco Visin Simon Osindero 21 36 0 26 Sep 2020
How Neural Networks Extrapolate: From Feedforward to Graph Neural Networks Keyulu Xu Mozhi Zhang Jingling Li S. Du Ken-ichi Kawarabayashi Stefanie Jegelka MLT 25 306 0 24 Sep 2020
Tensor Programs III: Neural Matrix Laws Greg Yang 19 44 0 22 Sep 2020
Distributional Generalization: A New Kind of Generalization Preetum Nakkiran Yamini Bansal OOD 29 41 0 17 Sep 2020
Deep Networks and the Multiple Manifold Problem Sam Buchanan D. Gilboa John N. Wright 166 39 0 25 Aug 2020
Nonparametric Learning of Two-Layer ReLU Residual Units Zhunxuan Wang Linyun He Chunchuan Lyu Shay B. Cohen MLT OffRL 33 1 0 17 Aug 2020
Analyzing Upper Bounds on Mean Absolute Errors for Deep Neural Network Based Vector-to-Vector Regression Jun Qi Jun Du Sabato Marco Siniscalchi Xiaoli Ma Chin-Hui Lee 30 41 0 04 Aug 2020
Single-Timescale Actor-Critic Provably Finds Globally Optimal Policy Zuyue Fu Zhuoran Yang Zhaoran Wang 21 42 0 02 Aug 2020
Finite Versus Infinite Neural Networks: an Empirical Study Jaehoon Lee S. Schoenholz Jeffrey Pennington Ben Adlam Lechao Xiao Roman Novak Jascha Narain Sohl-Dickstein 28 208 0 31 Jul 2020
The Interpolation Phase Transition in Neural Networks: Memorization and Generalization under Lazy Training Andrea Montanari Yiqiao Zhong 49 95 0 25 Jul 2020
Understanding Implicit Regularization in Over-Parameterized Single Index Model Jianqing Fan Zhuoran Yang Mengxin Yu 24 16 0 16 Jul 2020
From deep to Shallow: Equivalent Forms of Deep Networks in Reproducing Kernel Krein Space and Indefinite Support Vector Machines A. Shilton Sunil Gupta Santu Rana Svetha Venkatesh 19 0 0 15 Jul 2020
Implicit Bias in Deep Linear Classification: Initialization Scale vs Training Accuracy E. Moroshko Suriya Gunasekar Blake E. Woodworth J. Lee Nathan Srebro Daniel Soudry 35 85 0 13 Jul 2020
Maximum-and-Concatenation Networks Xingyu Xie Hao Kong Jianlong Wu Wayne Zhang Guangcan Liu Zhouchen Lin 83 2 0 09 Jul 2020
Learning Over-Parametrized Two-Layer ReLU Neural Networks beyond NTK Yuanzhi Li Tengyu Ma Hongyang R. Zhang MLT 20 28 0 09 Jul 2020
Ridge Regression with Over-Parametrized Two-Layer Networks Converge to Ridgelet Spectrum Sho Sonoda Isao Ishikawa Masahiro Ikeda MLT 14 0 0 07 Jul 2020
RIFLE: Backpropagation in Depth for Deep Transfer Learning through Re-Initializing the Fully-connected LayEr Xingjian Li Haoyi Xiong Haozhe An Chengzhong Xu Dejing Dou ODL 20 39 0 07 Jul 2020
Bespoke vs. Prêt-à-Porter Lottery Tickets: Exploiting Mask Similarity for Trainable Sub-Network Finding Michela Paganini Jessica Zosa Forde UQCV 14 6 0 06 Jul 2020
Modeling from Features: a Mean-field Framework for Over-parameterized Deep Neural Networks Cong Fang J. Lee Pengkun Yang Tong Zhang OOD FedML 9 57 0 03 Jul 2020
The Global Landscape of Neural Networks: An Overview Ruoyu Sun Dawei Li Shiyu Liang Tian Ding R. Srikant 22 84 0 02 Jul 2020
Provably Efficient Neural Estimation of Structural Equation Model: An Adversarial Approach Luofeng Liao You-Lin Chen Zhuoran Yang Bo Dai Zhaoran Wang Mladen Kolar 30 33 0 02 Jul 2020
A Revision of Neural Tangent Kernel-based Approaches for Neural Networks Kyungsu Kim A. Lozano Eunho Yang AAML 40 0 0 02 Jul 2020
Extracurricular Learning: Knowledge Transfer Beyond Empirical Distribution Hadi Pouransari Mojan Javaheripi Vinay Sharma Oncel Tuzel 14 5 0 30 Jun 2020
Associative Memory in Iterated Overparameterized Sigmoid Autoencoders Yibo Jiang Cengiz Pehlevan 19 13 0 30 Jun 2020
Is SGD a Bayesian sampler? Well, almost Chris Mingard Guillermo Valle Pérez Joar Skalse A. Louis BDL 23 51 0 26 Jun 2020
Global Convergence and Generalization Bound of Gradient-Based Meta-Learning with Deep Neural Nets Haoxiang Wang Ruoyu Sun Bo Li MLT AI4CE 30 14 0 25 Jun 2020
Tensor Programs II: Neural Tangent Kernel for Any Architecture Greg Yang 58 135 0 25 Jun 2020
Towards Understanding Hierarchical Learning: Benefits of Neural Representations Minshuo Chen Yu Bai J. Lee T. Zhao Huan Wang Caiming Xiong R. Socher SSL 20 48 0 24 Jun 2020
On the Global Optimality of Model-Agnostic Meta-Learning Lingxiao Wang Qi Cai Zhuoran Yang Zhaoran Wang 22 43 0 23 Jun 2020
Training (Overparametrized) Neural Networks in Near-Linear Time Jan van den Brand Binghui Peng Zhao Song Omri Weinstein ODL 29 82 0 20 Jun 2020
The Recurrent Neural Tangent Kernel Sina Alemohammad Zichao Wang Randall Balestriero Richard Baraniuk AAML 11 77 0 18 Jun 2020
Hausdorff Dimension, Heavy Tails, and Generalization in Neural Networks Umut Simsekli Ozan Sener George Deligiannidis Murat A. Erdogdu 44 55 0 16 Jun 2020
PAC-Bayesian Generalization Bounds for MultiLayer Perceptrons Xinjie Lan Xin Guo Kenneth Barner 19 3 0 16 Jun 2020
CNN Acceleration by Low-rank Approximation with Quantized Factors Nikolay Kozyrskiy Anh-Huy Phan MQ 33 3 0 16 Jun 2020
Minimax Estimation of Conditional Moment Models Nishanth Dikkala Greg Lewis Lester W. Mackey Vasilis Syrgkanis 27 99 0 12 Jun 2020
Optimization Theory for ReLU Neural Networks Trained with Normalization Layers Yonatan Dukler Quanquan Gu Guido Montúfar 14 30 0 11 Jun 2020
Knowledge Distillation: A Survey Jianping Gou B. Yu Stephen J. Maybank Dacheng Tao VLM 21 2,851 0 09 Jun 2020
Can Temporal-Difference and Q-Learning Learn Representation? A Mean-Field Theory Yufeng Zhang Qi Cai Zhuoran Yang Yongxin Chen Zhaoran Wang OOD MLT 144 11 0 08 Jun 2020
Hardness of Learning Neural Networks with Natural Weights Amit Daniely Gal Vardi 6 19 0 05 Jun 2020