Understanding the Generalization of Adam in Learning Neural Networks with Proper Regularization
Difan Zou, Yuan Cao, Yuanzhi Li, Quanquan Gu
arXiv:2108.11371, 25 August 2021
Topics: MLT, AI4CE
Papers citing "Understanding the Generalization of Adam in Learning Neural Networks with Proper Regularization" (9 of 9 papers shown):
- Gradient Descent Robustly Learns the Intrinsic Dimension of Data in Training Convolutional Neural Networks. Chenyang Zhang, Peifeng Gao, Difan Zou, Yuan Cao. Topics: OOD, MLT. 11 Apr 2025.
- Regularized Gradient Clipping Provably Trains Wide and Deep Neural Networks. Matteo Tucat, Anirbit Mukherjee, Procheta Sen, Mingfei Sun, Omar Rivasplata. Topics: MLT. 12 Apr 2024.
- On Convergence of Adam for Stochastic Optimization under Relaxed Assumptions. Yusu Hong, Junhong Lin. 06 Feb 2024.
- Revisiting Knowledge Distillation under Distribution Shift. Songming Zhang, Ziyu Lyu, Xiaofeng Chen. 25 Dec 2023.
- The Implicit Bias of Batch Normalization in Linear Models and Two-layer Linear Convolutional Neural Networks. Yuan Cao, Difan Zou, Yuanzhi Li, Quanquan Gu. Topics: MLT. 20 Jun 2023.
- Towards Understanding Mixture of Experts in Deep Learning. Zixiang Chen, Yihe Deng, Yue Wu, Quanquan Gu, Yuanzhi Li. Topics: MLT, MoE. 04 Aug 2022.
- The Mechanism of Prediction Head in Non-contrastive Self-supervised Learning. Zixin Wen, Yuanzhi Li. Topics: SSL. 12 May 2022.
- A Simple Convergence Proof of Adam and Adagrad. Alexandre Défossez, Léon Bottou, Francis R. Bach, Nicolas Usunier. 05 Mar 2020.
- Convolutional Neural Networks Analyzed via Convolutional Sparse Coding. Vardan Papyan, Yaniv Romano, Michael Elad. 27 Jul 2016.