Shape Matters: Understanding the Implicit Bias of the Noise Covariance

15 June 2020
Jeff Z. HaoChen, Colin Wei, Jason D. Lee, Tengyu Ma
arXiv:2006.08680

Papers citing "Shape Matters: Understanding the Implicit Bias of the Noise Covariance"

Showing 18 of 68 citing papers.

Self-supervised Learning is More Robust to Dataset Imbalance
Hong Liu, Jeff Z. HaoChen, Adrien Gaidon, Tengyu Ma
11 Oct 2021 · OOD, SSL

Stochastic Training is Not Necessary for Generalization
Jonas Geiping, Micah Goldblum, Phillip E. Pope, Michael Moeller, Tom Goldstein
29 Sep 2021

Going Beyond Linear RL: Sample Efficient Neural Function Approximation
Baihe Huang, Kaixuan Huang, Sham Kakade, Jason D. Lee, Qi Lei, Runzhe Wang, Jiaqi Yang
14 Jul 2021

Optimal Gradient-based Algorithms for Non-concave Bandit Optimization
Baihe Huang, Kaixuan Huang, Sham Kakade, Jason D. Lee, Qi Lei, Runzhe Wang, Jiaqi Yang
09 Jul 2021

Implicit Bias of SGD for Diagonal Linear Networks: a Provable Benefit of Stochasticity
Scott Pesme, Loucas Pillaud-Vivien, Nicolas Flammarion
17 Jun 2021

Label Noise SGD Provably Prefers Flat Global Minimizers
Alexandru Damian, Tengyu Ma, Jason D. Lee
11 Jun 2021 · NoLa

Towards Understanding Generalization via Decomposing Excess Risk Dynamics
Jiaye Teng, Jianhao Ma, Yang Yuan
11 Jun 2021

Why Do Local Methods Solve Nonconvex Problems?
Tengyu Ma
24 Mar 2021

Stochasticity helps to navigate rough landscapes: comparing gradient-descent-based algorithms in the phase retrieval problem
Francesca Mignacco, Pierfrancesco Urbani, Lenka Zdeborová
08 Mar 2021

Noisy Gradient Descent Converges to Flat Minima for Nonconvex Matrix Factorization
Tianyi Liu, Yan Li, S. Wei, Enlu Zhou, T. Zhao
24 Feb 2021

Strength of Minibatch Noise in SGD
Liu Ziyin, Kangqiao Liu, Takashi Mori, Masakuni Ueda
10 Feb 2021 · ODL, MLT

Direction Matters: On the Implicit Bias of Stochastic Gradient Descent with Moderate Learning Rate
Jingfeng Wu, Difan Zou, Vladimir Braverman, Quanquan Gu
04 Nov 2020

Learning Over-Parametrized Two-Layer ReLU Neural Networks beyond NTK
Yuanzhi Li, Tengyu Ma, Hongyang R. Zhang
09 Jul 2020 · MLT

Adaptive Inertia: Disentangling the Effects of Adaptive Learning Rate and Momentum
Zeke Xie, Xinrui Wang, Huishuai Zhang, Issei Sato, Masashi Sugiyama
29 Jun 2020 · ODL

Dynamic of Stochastic Gradient Descent with State-Dependent Noise
Qi Meng, Shiqi Gong, Wei Chen, Zhi-Ming Ma, Tie-Yan Liu
24 Jun 2020

Dropout: Explicit Forms and Capacity Control
R. Arora, Peter L. Bartlett, Poorya Mianjy, Nathan Srebro
06 Mar 2020

Information-Theoretic Generalization Bounds for SGLD via Data-Dependent Estimates
Jeffrey Negrea, Mahdi Haghifam, Gintare Karolina Dziugaite, Ashish Khisti, Daniel M. Roy
06 Nov 2019 · FedML

On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
N. Keskar, Dheevatsa Mudigere, J. Nocedal, M. Smelyanskiy, P. T. P. Tang
15 Sep 2016 · ODL