The Two Regimes of Deep Network Training
Guillaume Leclerc, A. Madry
24 February 2020 · arXiv:2002.10376

Papers citing "The Two Regimes of Deep Network Training" (9 of 9 papers shown)

Can Optimization Trajectories Explain Multi-Task Transfer?
David Mueller, Mark Dredze, Nicholas Andrews
26 Aug 2024

High-dimensional Asymptotics of Feature Learning: How One Gradient Step Improves the Representation
Jimmy Ba, Murat A. Erdogdu, Taiji Suzuki, Zhichao Wang, Denny Wu, Greg Yang
MLT · 03 May 2022

Improved architectures and training algorithms for deep operator networks
Sizhuang He, Hanwen Wang, P. Perdikaris
AI4CE · 04 Oct 2021

Learning to Optimize: A Primer and A Benchmark
Tianlong Chen, Xiaohan Chen, Wuyang Chen, Howard Heaton, Jialin Liu, Zhangyang Wang, W. Yin
23 Mar 2021

Provable Super-Convergence with a Large Cyclical Learning Rate
Samet Oymak
22 Feb 2021

Deep Networks and the Multiple Manifold Problem
Sam Buchanan, D. Gilboa, John N. Wright
25 Aug 2020

Adaptive Gradient Methods for Constrained Convex Optimization and Variational Inequalities
Alina Ene, Huy Le Nguyen, Adrian Vladu
ODL · 17 Jul 2020

The large learning rate phase of deep learning: the catapult mechanism
Aitor Lewkowycz, Yasaman Bahri, Ethan Dyer, Jascha Narain Sohl-Dickstein, Guy Gur-Ari
ODL · 04 Mar 2020

On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
N. Keskar, Dheevatsa Mudigere, J. Nocedal, M. Smelyanskiy, P. T. P. Tang
ODL · 15 Sep 2016