Shape Matters: Understanding the Implicit Bias of the Noise Covariance
Jeff Z. HaoChen, Colin Wei, Jason D. Lee, Tengyu Ma
arXiv:2006.08680, 15 June 2020
Papers citing "Shape Matters: Understanding the Implicit Bias of the Noise Covariance" (18 of 68 shown)
1. Self-supervised Learning is More Robust to Dataset Imbalance. Hong Liu, Jeff Z. HaoChen, Adrien Gaidon, Tengyu Ma. 11 Oct 2021. Tags: OOD, SSL.
2. Stochastic Training is Not Necessary for Generalization. Jonas Geiping, Micah Goldblum, Phillip E. Pope, Michael Moeller, Tom Goldstein. 29 Sep 2021.
3. Going Beyond Linear RL: Sample Efficient Neural Function Approximation. Baihe Huang, Kaixuan Huang, Sham Kakade, Jason D. Lee, Qi Lei, Runzhe Wang, Jiaqi Yang. 14 Jul 2021.
4. Optimal Gradient-based Algorithms for Non-concave Bandit Optimization. Baihe Huang, Kaixuan Huang, Sham Kakade, Jason D. Lee, Qi Lei, Runzhe Wang, Jiaqi Yang. 09 Jul 2021.
5. Implicit Bias of SGD for Diagonal Linear Networks: a Provable Benefit of Stochasticity. Scott Pesme, Loucas Pillaud-Vivien, Nicolas Flammarion. 17 Jun 2021.
6. Label Noise SGD Provably Prefers Flat Global Minimizers. Alexandru Damian, Tengyu Ma, Jason D. Lee. 11 Jun 2021. Tags: NoLa.
7. Towards Understanding Generalization via Decomposing Excess Risk Dynamics. Jiaye Teng, Jianhao Ma, Yang Yuan. 11 Jun 2021.
8. Why Do Local Methods Solve Nonconvex Problems? Tengyu Ma. 24 Mar 2021.
9. Stochasticity helps to navigate rough landscapes: comparing gradient-descent-based algorithms in the phase retrieval problem. Francesca Mignacco, Pierfrancesco Urbani, Lenka Zdeborová. 08 Mar 2021.
10. Noisy Gradient Descent Converges to Flat Minima for Nonconvex Matrix Factorization. Tianyi Liu, Yan Li, S. Wei, Enlu Zhou, T. Zhao. 24 Feb 2021.
11. Strength of Minibatch Noise in SGD. Liu Ziyin, Kangqiao Liu, Takashi Mori, Masahito Ueda. 10 Feb 2021. Tags: ODL, MLT.
12. Direction Matters: On the Implicit Bias of Stochastic Gradient Descent with Moderate Learning Rate. Jingfeng Wu, Difan Zou, Vladimir Braverman, Quanquan Gu. 04 Nov 2020.
13. Learning Over-Parametrized Two-Layer ReLU Neural Networks beyond NTK. Yuanzhi Li, Tengyu Ma, Hongyang R. Zhang. 09 Jul 2020. Tags: MLT.
14. Adaptive Inertia: Disentangling the Effects of Adaptive Learning Rate and Momentum. Zeke Xie, Xinrui Wang, Huishuai Zhang, Issei Sato, Masashi Sugiyama. 29 Jun 2020. Tags: ODL.
15. Dynamic of Stochastic Gradient Descent with State-Dependent Noise. Qi Meng, Shiqi Gong, Wei Chen, Zhi-Ming Ma, Tie-Yan Liu. 24 Jun 2020.
16. Dropout: Explicit Forms and Capacity Control. R. Arora, Peter L. Bartlett, Poorya Mianjy, Nathan Srebro. 06 Mar 2020.
17. Information-Theoretic Generalization Bounds for SGLD via Data-Dependent Estimates. Jeffrey Negrea, Mahdi Haghifam, Gintare Karolina Dziugaite, Ashish Khisti, Daniel M. Roy. 06 Nov 2019. Tags: FedML.
18. On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima. N. Keskar, Dheevatsa Mudigere, J. Nocedal, M. Smelyanskiy, P. T. P. Tang. 15 Sep 2016. Tags: ODL.