On the different regimes of Stochastic Gradient Descent
Antonio Sclocchi, M. Wyart
arXiv:2309.10688 · 19 September 2023

Papers citing "On the different regimes of Stochastic Gradient Descent" (10 of 10 papers shown)

Class Imbalance in Anomaly Detection: Learning from an Exactly Solvable Model
F.S. Pezzicoli, V. Ros, F.P. Landes, M. Baity-Jesi (20 Jan 2025)

Adaptive Batch Size Schedules for Distributed Training of Language Models with Data and Model Parallelism
Tim Tsz-Kit Lau, Weijian Li, Chenwei Xu, Han Liu, Mladen Kolar (30 Dec 2024)

The Optimization Landscape of SGD Across the Feature Learning Strength
Alexander B. Atanasov, Alexandru Meterez, James B. Simon, C. Pehlevan (06 Oct 2024)

A spring-block theory of feature learning in deep neural networks
Chengzhi Shi, Liming Pan, Ivan Dokmanić (28 Jul 2024) · AI4CE

Stochastic weight matrix dynamics during learning and Dyson Brownian motion
Gert Aarts, B. Lucini, Chanju Park (23 Jul 2024)

Online Learning and Information Exponents: On The Importance of Batch size, and Time/Complexity Tradeoffs
Luca Arnaboldi, Yatin Dandi, Florent Krzakala, Bruno Loureiro, Luca Pesce, Ludovic Stephan (04 Jun 2024)

An effective theory of collective deep learning
Lluís Arola-Fernández, Lucas Lacasa (19 Oct 2023) · FedML, AI4CE

Trajectory-dependent Generalization Bounds for Deep Neural Networks via Fractional Brownian Motion
Chengli Tan, Jiang Zhang, Junmin Liu (09 Jun 2022)

The large learning rate phase of deep learning: the catapult mechanism
Aitor Lewkowycz, Yasaman Bahri, Ethan Dyer, Jascha Narain Sohl-Dickstein, Guy Gur-Ari (04 Mar 2020) · ODL

On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
N. Keskar, Dheevatsa Mudigere, J. Nocedal, M. Smelyanskiy, P. T. P. Tang (15 Sep 2016) · ODL