ResearchTrend.AI
Exploring Generalization in Deep Learning

27 June 2017
Behnam Neyshabur
Srinadh Bhojanapalli
David A. McAllester
Nathan Srebro
    FAtt
arXiv:1706.08947

Papers citing "Exploring Generalization in Deep Learning"

50 / 766 papers shown
Title
Lookbehind-SAM: k steps back, 1 step forward
Gonçalo Mordido
Pranshu Malviya
A. Baratin
Sarath Chandar
AAML
45
1
0
31 Jul 2023
High Probability Analysis for Non-Convex Stochastic Optimization with Clipping
Shaojie Li
Yong Liu
35
2
0
25 Jul 2023
Iterative Robust Visual Grounding with Masked Reference based Centerpoint Supervision
Menghao Li
Chunlei Wang
W. Feng
Shuchang Lyu
Guangliang Cheng
Xiangtai Li
Binghao Liu
Qi Zhao
33
5
0
23 Jul 2023
Sharpness Minimization Algorithms Do Not Only Minimize Sharpness To Achieve Better Generalization
Kaiyue Wen
Zhiyuan Li
Tengyu Ma
FAtt
38
26
0
20 Jul 2023
Promoting Exploration in Memory-Augmented Adam using Critical Momenta
Pranshu Malviya
Gonçalo Mordido
A. Baratin
Reza Babanezhad Harikandeh
Jerry Huang
Simon Lacoste-Julien
Razvan Pascanu
Sarath Chandar
ODL
33
1
0
18 Jul 2023
Sharpness-Aware Graph Collaborative Filtering
Huiyuan Chen
Chin-Chia Michael Yeh
Yujie Fan
Yan Zheng
Junpeng Wang
Vivian Lai
Mahashweta Das
Hao Yang
31
5
0
18 Jul 2023
Towards Optimal Neural Networks: the Role of Sample Splitting in Hyperparameter Selection
Shijin Gong
Xinyu Zhang
13
0
0
15 Jul 2023
Sparsity-aware generalization theory for deep neural networks
Ramchandran Muthukumar
Jeremias Sulam
MLT
24
4
0
01 Jul 2023
Systematic Investigation of Sparse Perturbed Sharpness-Aware Minimization Optimizer
Peng Mi
Li Shen
Tianhe Ren
Yiyi Zhou
Tianshuo Xu
Xiaoshuai Sun
Tongliang Liu
Rongrong Ji
Dacheng Tao
AAML
37
2
0
30 Jun 2023
Variance-Covariance Regularization Improves Representation Learning
Jiachen Zhu
Katrina Evtimova
Yubei Chen
Ravid Shwartz-Ziv
Yann LeCun
SSL
25
7
0
23 Jun 2023
Predicting Grokking Long Before it Happens: A look into the loss landscape of models which grok
Pascal Junior Tikeng Notsawo
Hattie Zhou
Mohammad Pezeshki
Irina Rish
G. Dumas
25
23
0
23 Jun 2023
The Inductive Bias of Flatness Regularization for Deep Matrix Factorization
Khashayar Gatmiry
Zhiyuan Li
Ching-Yao Chuang
Sashank J. Reddi
Tengyu Ma
Stefanie Jegelka
ODL
25
11
0
22 Jun 2023
Limits for Learning with Language Models
Nicholas M. Asher
Swarnadeep Bhar
Akshay Chaturvedi
Julie Hunter
Soumya Paul
19
22
0
21 Jun 2023
Practical Sharpness-Aware Minimization Cannot Converge All the Way to Optima
Dongkuk Si
Chulhee Yun
28
15
0
16 Jun 2023
Learning a Neuron by a Shallow ReLU Network: Dynamics and Implicit Bias for Correlated Inputs
D. Chistikov
Matthias Englert
R. Lazic
MLT
36
12
0
10 Jun 2023
Adaptive Contextual Perception: How to Generalize to New Backgrounds and Ambiguous Objects
Zhuofan Ying
Peter Hase
Joey Tianyi Zhou
31
1
0
09 Jun 2023
Gibbs-Based Information Criteria and the Over-Parameterized Regime
Haobo Chen
Yuheng Bu
Greg Wornell
27
1
0
08 Jun 2023
Boosting Adversarial Transferability by Achieving Flat Local Maxima
Zhijin Ge
Hongying Liu
Xiaosen Wang
Fanhua Shang
Yuanyuan Liu
AAML
14
40
0
08 Jun 2023
Catapults in SGD: spikes in the training loss and their impact on generalization through feature learning
Libin Zhu
Chaoyue Liu
Adityanarayanan Radhakrishnan
M. Belkin
32
14
0
07 Jun 2023
Optimal Transport Model Distributional Robustness
Van-Anh Nguyen
Trung Le
Anh Tuan Bui
Thanh-Toan Do
Dinh Q. Phung
OOD
30
3
0
07 Jun 2023
Decentralized SGD and Average-direction SAM are Asymptotically Equivalent
Tongtian Zhu
Fengxiang He
Kaixuan Chen
Mingli Song
Dacheng Tao
34
15
0
05 Jun 2023
On Feature Diversity in Energy-based Models
Firas Laakom
Jenni Raitoharju
Alexandros Iosifidis
Moncef Gabbouj
29
7
0
02 Jun 2023
The Law of Parsimony in Gradient Descent for Learning Deep Linear Networks
Can Yaras
Peng Wang
Wei Hu
Zhihui Zhu
Laura Balzano
Qing Qu
40
17
0
01 Jun 2023
Understanding Augmentation-based Self-Supervised Representation Learning via RKHS Approximation and Regression
Runtian Zhai
Bing Liu
Andrej Risteski
Zico Kolter
Pradeep Ravikumar
SSL
28
9
0
01 Jun 2023
(Almost) Provable Error Bounds Under Distribution Shift via Disagreement Discrepancy
Elan Rosenfeld
Saurabh Garg
UQCV
37
4
0
01 Jun 2023
Online-to-PAC Conversions: Generalization Bounds via Regret Analysis
Gábor Lugosi
Gergely Neu
35
11
0
31 May 2023
Instance-dependent Noisy-label Learning with Graphical Model Based Noise-rate Estimation
Arpit Garg
Cuong C. Nguyen
Rafael Felix
Thanh-Toan Do
G. Carneiro
NoLa
41
1
0
31 May 2023
Quantifying Overfitting: Evaluating Neural Network Performance through Analysis of Null Space
Hossein Rezaei
Mohammad Sabokrou
18
3
0
30 May 2023
Reducing Communication for Split Learning by Randomized Top-k Sparsification
Fei Zheng
Chaochao Chen
Lingjuan Lyu
Binhui Yao
FedML
26
10
0
29 May 2023
The Implicit Regularization of Dynamical Stability in Stochastic Gradient Descent
Lei Wu
Weijie J. Su
MLT
30
21
0
27 May 2023
Stochastic Modified Equations and Dynamics of Dropout Algorithm
Zhongwang Zhang
Yuqing Li
Tao Luo
Z. Xu
19
6
0
25 May 2023
How to escape sharp minima with random perturbations
Kwangjun Ahn
Ali Jadbabaie
S. Sra
ODL
34
6
0
25 May 2023
The Crucial Role of Normalization in Sharpness-Aware Minimization
Yan Dai
Kwangjun Ahn
S. Sra
21
17
0
24 May 2023
Sharpness-Aware Data Poisoning Attack
Pengfei He
Han Xu
J. Ren
Yingqian Cui
Hui Liu
Charu C. Aggarwal
Jiliang Tang
AAML
44
7
0
24 May 2023
On progressive sharpening, flat minima and generalisation
L. MacDonald
Jack Valmadre
Simon Lucey
27
4
0
24 May 2023
Exploring the Complexity of Deep Neural Networks through Functional Equivalence
Guohao Shen
30
2
0
19 May 2023
Generalization Bounds for Neural Belief Propagation Decoders
Sudarshan Adiga
Xin Xiao
Ravi Tandon
Bane V. Vasic
Tamal Bose
BDL
AI4CE
24
4
0
17 May 2023
A Scalable Walsh-Hadamard Regularizer to Overcome the Low-degree Spectral Bias of Neural Networks
Ali Gorji
Andisheh Amrollahi
A. Krause
11
4
0
16 May 2023
Sharpness-Aware Minimization Alone can Improve Adversarial Robustness
Zeming Wei
Jingyu Zhu
Yihao Zhang
AAML
32
10
0
09 May 2023
Maintaining Stability and Plasticity for Predictive Churn Reduction
George Adam
B. Haibe-Kains
Anna Goldenberg
28
1
0
06 May 2023
An Adaptive Policy to Employ Sharpness-Aware Minimization
Weisen Jiang
Hansi Yang
Yu Zhang
James T. Kwok
AAML
83
31
0
28 Apr 2023
Fundamental Tradeoffs in Learning with Prior Information
Anirudha Majumdar
32
0
0
26 Apr 2023
K-means Clustering Based Feature Consistency Alignment for Label-free Model Evaluation
Shuyu Miao
Lin Zheng
Jiaheng Liu
Hong Jin
36
5
0
17 Apr 2023
MLOps Spanning Whole Machine Learning Life Cycle: A Survey
Fang Zhengxin
Yuan Yi
Zhang Jingyu
Liu Yue
Mu Yuechen
...
Xu Xiwei
Wang Jeff
Wang Chen
Zhang Shuai
Chen Shiping
24
4
0
13 Apr 2023
On the Importance of Feature Separability in Predicting Out-Of-Distribution Error
Renchunzi Xie
Hongxin Wei
Lei Feng
Yuzhou Cao
Bo An
OODD
OOD
22
10
0
27 Mar 2023
Generalization Matters: Loss Minima Flattening via Parameter Hybridization for Efficient Online Knowledge Distillation
Tianli Zhang
Mengqi Xue
Jiangtao Zhang
Haofei Zhang
Yu Wang
Lechao Cheng
Mingli Song
Mingli Song
28
5
0
26 Mar 2023
PASS: Peer-Agreement based Sample Selection for training with Noisy Labels
Arpit Garg
Cuong C. Nguyen
Rafael Felix
Thanh-Toan Do
G. Carneiro
22
2
0
20 Mar 2023
Randomized Adversarial Training via Taylor Expansion
Gao Jin
Xinping Yi
Dengyu Wu
Ronghui Mu
Xiaowei Huang
AAML
44
34
0
19 Mar 2023
Bayes Complexity of Learners vs Overfitting
Grzegorz Gluch
R. Urbanke
UQCV
BDL
11
0
0
13 Mar 2023
Generalizing and Decoupling Neural Collapse via Hyperspherical Uniformity Gap
Weiyang Liu
L. Yu
Adrian Weller
Bernhard Schölkopf
37
18
0
11 Mar 2023