A Convergence Theory for Deep Learning via Over-Parameterization [AI4CE, ODL]
Zeyuan Allen-Zhu, Yuanzhi Li, Zhao Song
arXiv 1811.03962 · 9 November 2018
Papers citing "A Convergence Theory for Deep Learning via Over-Parameterization" (showing 50 of 354)

Dash: Semi-Supervised Learning with Dynamic Thresholding
Yi Tian Xu, Lei Shang, Jinxing Ye, Qi Qian, Yu-Feng Li, Baigui Sun, Hao Li, Rong Jin
01 Sep 2021 · 47 · 218 · 0

Understanding the Generalization of Adam in Learning Neural Networks with Proper Regularization [MLT, AI4CE]
Difan Zou, Yuan Cao, Yuanzhi Li, Quanquan Gu
25 Aug 2021 · 47 · 39 · 0

Fast Sketching of Polynomial Kernels of Polynomial Degree
Zhao Song, David P. Woodruff, Zheng Yu, Lichen Zhang
21 Aug 2021 · 21 · 40 · 0

Learning Transferable Parameters for Unsupervised Domain Adaptation [OOD]
Zhongyi Han, Haoliang Sun, Yilong Yin
13 Aug 2021 · 35 · 45 · 0

A proof of convergence for the gradient descent optimization method with random initializations in the training of neural networks with ReLU activation for piecewise linear target functions
Arnulf Jentzen, Adrian Riekert
10 Aug 2021 · 33 · 13 · 0

Mean-Field Multi-Agent Reinforcement Learning: A Decentralized Network Approach [OOD]
Haotian Gu, Xin Guo, Xiaoli Wei, Renyuan Xu
05 Aug 2021 · 40 · 36 · 0

Towards General Function Approximation in Zero-Sum Markov Games
Baihe Huang, Jason D. Lee, Zhaoran Wang, Zhuoran Yang
30 Jul 2021 · 33 · 47 · 0

Rethinking Hard-Parameter Sharing in Multi-Domain Learning [OOD]
Lijun Zhang, Qizheng Yang, Xiao Liu, Hui Guan
23 Jul 2021 · 31 · 14 · 0

AutoFormer: Searching Transformers for Visual Recognition [ViT]
Minghao Chen, Houwen Peng, Jianlong Fu, Haibin Ling
01 Jul 2021 · 38 · 259 · 0

The Values Encoded in Machine Learning Research
Abeba Birhane, Pratyusha Kalluri, Dallas Card, William Agnew, Ravit Dotan, Michelle Bao
29 Jun 2021 · 41 · 274 · 0

Bridging Multi-Task Learning and Meta-Learning: Towards Efficient Training and Effective Adaptation
Haoxiang Wang, Han Zhao, Bo-wen Li
16 Jun 2021 · 37 · 88 · 0

A Neural Tangent Kernel Perspective of GANs
Jean-Yves Franceschi, Emmanuel de Bézenac, Ibrahim Ayed, Mickaël Chen, Sylvain Lamprier, Patrick Gallinari
10 Jun 2021 · 37 · 26 · 0

The Future is Log-Gaussian: ResNets and Their Infinite-Depth-and-Width Limit at Initialization
Mufan Li, Mihai Nica, Daniel M. Roy
07 Jun 2021 · 32 · 33 · 0

Toward Understanding the Feature Learning Process of Self-supervised Contrastive Learning [SSL, MLT]
Zixin Wen, Yuanzhi Li
31 May 2021 · 32 · 131 · 0

Fixed-Dimensional and Permutation Invariant State Representation of Autonomous Driving
Jingliang Duan, Dongjie Yu, Shengbo Eben Li, Wenxuan Wang, Yangang Ren, Ziyu Lin, B. Cheng
24 May 2021 · 30 · 10 · 0

Improved OOD Generalization via Adversarial Training and Pre-training [VLM]
Mingyang Yi, Lu Hou, Jiacheng Sun, Lifeng Shang, Xin Jiang, Qun Liu, Zhi-Ming Ma
24 May 2021 · 31 · 83 · 0

Deep Kronecker neural networks: A general framework for neural networks with adaptive activation functions [ODL]
Ameya Dilip Jagtap, Yeonjong Shin, Kenji Kawaguchi, George Karniadakis
20 May 2021 · 45 · 131 · 0

Global Convergence of Three-layer Neural Networks in the Mean Field Regime [MLT, AI4CE]
H. Pham, Phan-Minh Nguyen
11 May 2021 · 41 · 19 · 0

FL-NTK: A Neural Tangent Kernel-based Framework for Federated Learning Convergence Analysis [FedML]
Baihe Huang, Xiaoxiao Li, Zhao Song, Xin Yang
11 May 2021 · 31 · 16 · 0

A Geometric Analysis of Neural Collapse with Unconstrained Features
Zhihui Zhu, Tianyu Ding, Jinxin Zhou, Xiao Li, Chong You, Jeremias Sulam, Qing Qu
06 May 2021 · 40 · 196 · 0

RATT: Leveraging Unlabeled Data to Guarantee Generalization
Saurabh Garg, Sivaraman Balakrishnan, J. Zico Kolter, Zachary Chase Lipton
01 May 2021 · 30 · 30 · 0

Generalization Guarantees for Neural Architecture Search with Train-Validation Split [AI4CE, OOD]
Samet Oymak, Mingchen Li, Mahdi Soltanolkotabi
29 Apr 2021 · 36 · 13 · 0

Understanding Overparameterization in Generative Adversarial Networks [AI4CE]
Yogesh Balaji, M. Sajedi, Neha Kalibhat, Mucong Ding, Dominik Stöger, Mahdi Soltanolkotabi, S. Feizi
12 Apr 2021 · 22 · 21 · 0

A Neural Pre-Conditioning Active Learning Algorithm to Reduce Label Complexity
Seo Taek Kong, Soomin Jeon, Dongbin Na, Jaewon Lee, Honglak Lee, Kyu-Hwan Jung
08 Apr 2021 · 23 · 6 · 0

A proof of convergence for stochastic gradient descent in the training of artificial neural networks with ReLU activation for constant target functions [MLT]
Arnulf Jentzen, Adrian Riekert
01 Apr 2021 · 34 · 13 · 0

The Discovery of Dynamics via Linear Multistep Methods and Deep Learning: Error Estimation
Q. Du, Yiqi Gu, Haizhao Yang, Chao Zhou
21 Mar 2021 · 26 · 20 · 0

Landscape analysis for shallow neural networks: complete classification of critical points for affine target functions
Patrick Cheridito, Arnulf Jentzen, Florian Rossmannek
19 Mar 2021 · 24 · 10 · 0

GPT Understands, Too [VLM]
Xiao Liu, Yanan Zheng, Zhengxiao Du, Ming Ding, Yujie Qian, Zhilin Yang, Jie Tang
18 Mar 2021 · 87 · 1,146 · 0

Experiments with Rich Regime Training for Deep Learning
Xinyan Li, A. Banerjee
26 Feb 2021 · 32 · 2 · 0

Entanglement Diagnostics for Efficient Quantum Computation
Joonho Kim, Yaron Oz
24 Feb 2021 · 29 · 10 · 0

On the Validity of Modeling SGD with Stochastic Differential Equations (SDEs)
Zhiyuan Li, Sadhika Malladi, Sanjeev Arora
24 Feb 2021 · 44 · 78 · 0

Noisy Gradient Descent Converges to Flat Minima for Nonconvex Matrix Factorization
Tianyi Liu, Yan Li, S. Wei, Enlu Zhou, T. Zhao
24 Feb 2021 · 21 · 13 · 0

Convergence rates for gradient descent in the training of overparameterized artificial neural networks with biases [ODL]
Arnulf Jentzen, T. Kröger
23 Feb 2021 · 28 · 7 · 0

Deep ReLU Networks Preserve Expected Length
Boris Hanin, Ryan Jeong, David Rolnick
21 Feb 2021 · 29 · 14 · 0

A proof of convergence for gradient descent in the training of artificial neural networks for constant target functions
Patrick Cheridito, Arnulf Jentzen, Adrian Riekert, Florian Rossmannek
19 Feb 2021 · 28 · 24 · 0

FedBN: Federated Learning on Non-IID Features via Local Batch Normalization [OOD, FedML]
Xiaoxiao Li, Meirui Jiang, Xiaofei Zhang, Michael Kamp, Qi Dou
15 Feb 2021 · 168 · 790 · 0

A Local Convergence Theory for Mildly Over-Parameterized Two-Layer Neural Network
Mo Zhou, Rong Ge, Chi Jin
04 Feb 2021 · 76 · 45 · 0

Exploring Deep Neural Networks via Layer-Peeled Model: Minority Collapse in Imbalanced Training [FAtt]
Cong Fang, Hangfeng He, Qi Long, Weijie J. Su
29 Jan 2021 · 130 · 167 · 0

On the Proof of Global Convergence of Gradient Descent for Deep ReLU Networks with Linear Widths
Quynh N. Nguyen
24 Jan 2021 · 45 · 48 · 0

Reproducing Activation Function for Deep Learning
Senwei Liang, Liyao Lyu, Chunmei Wang, Haizhao Yang
13 Jan 2021 · 36 · 21 · 0

A Convergence Theory Towards Practical Over-parameterized Deep Neural Networks
Asaf Noy, Yi Tian Xu, Y. Aflalo, Lihi Zelnik-Manor, Rong Jin
12 Jan 2021 · 41 · 3 · 0

Provable Generalization of SGD-trained Neural Networks of Any Width in the Presence of Adversarial Label Noise [FedML, MLT]
Spencer Frei, Yuan Cao, Quanquan Gu
04 Jan 2021 · 64 · 19 · 0

Understanding and Increasing Efficiency of Frank-Wolfe Adversarial Training [AAML]
Theodoros Tsiligkaridis, Jay Roberts
22 Dec 2020 · 22 · 11 · 0

Tight Bounds on the Smallest Eigenvalue of the Neural Tangent Kernel for Deep ReLU Networks
Quynh N. Nguyen, Marco Mondelli, Guido Montúfar
21 Dec 2020 · 25 · 81 · 0

Towards Understanding Ensemble, Knowledge Distillation and Self-Distillation in Deep Learning [FedML]
Zeyuan Allen-Zhu, Yuanzhi Li
17 Dec 2020 · 60 · 356 · 0

Estimation of the Mean Function of Functional Data via Deep Neural Networks
Shuoyang Wang, Guanqun Cao, Zuofeng Shang
08 Dec 2020 · 35 · 20 · 0

Gradient Starvation: A Learning Proclivity in Neural Networks [MLT]
Mohammad Pezeshki, Sekouba Kaba, Yoshua Bengio, Aaron Courville, Doina Precup, Guillaume Lajoie
18 Nov 2020 · 50 · 258 · 0

Artificial Neural Variability for Deep Learning: On Overfitting, Noise Memorization, and Catastrophic Forgetting
Zeke Xie, Fengxiang He, Shaopeng Fu, Issei Sato, Dacheng Tao, Masashi Sugiyama
12 Nov 2020 · 21 · 60 · 0

On Function Approximation in Reinforcement Learning: Optimism in the Face of Large State Spaces
Zhuoran Yang, Chi Jin, Zhaoran Wang, Mengdi Wang, Michael I. Jordan
09 Nov 2020 · 39 · 18 · 0

Are wider nets better given the same number of parameters?
A. Golubeva, Behnam Neyshabur, Guy Gur-Ari
27 Oct 2020 · 27 · 44 · 0