On the Maximum Hessian Eigenvalue and Generalization

21 June 2022
Simran Kaur
Jérémy E. Cohen
Zachary Chase Lipton

Papers citing "On the Maximum Hessian Eigenvalue and Generalization"

36 / 36 papers shown
Do We Need All the Synthetic Data? Towards Targeted Synthetic Image Augmentation via Diffusion Models
Dang Nguyen
Jiping Li
Jinghao Zheng
Baharan Mirzasoleiman
DiffM
20
0
0
27 May 2025
Adaptive Retrieval Without Self-Knowledge? Bringing Uncertainty Back Home
Viktor Moskvoretskii
M. Lysyuk
Mikhail Salnikov
Nikolay Ivanov
Sergey Pletenev
Daria Galimzianova
Nikita Krayko
Vasily Konovalov
Irina Nikishina
Alexander Panchenko
RALM
146
7
0
24 Feb 2025
Can Stability be Detrimental? Better Generalization through Gradient Descent Instabilities
Lawrence Wang
Stephen J. Roberts
91
0
0
23 Dec 2024
Seeking Consistent Flat Minima for Better Domain Generalization via Refining Loss Landscapes
Aodi Li
Liansheng Zhuang
Xiao Long
Minghong Yao
Shafei Wang
521
1
0
18 Dec 2024
Meta Curvature-Aware Minimization for Domain Generalization
Zhaoyu Chen
Yiwen Ye
Feilong Tang
Yongsheng Pan
Yong-quan Xia
BDL
449
1
0
16 Dec 2024
Curvature in the Looking-Glass: Optimal Methods to Exploit Curvature of Expectation in the Loss Landscape
Jed A. Duersch
Tommie A. Catanach
Alexander Safonov
Jeremy Wendt
157
0
0
25 Nov 2024
Where Do Large Learning Rates Lead Us?
Ildus Sadrtdinov
M. Kodryan
Eduard Pokonechny
E. Lobacheva
Dmitry Vetrov
AI4CE
104
1
0
29 Oct 2024
Transformer-Based Approaches for Sensor-Based Human Activity Recognition: Opportunities and Challenges
Clayton Frederick Souza Leite
Henry Mauranen
Aziza Zhanabatyrova
Yu Xiao
69
2
0
17 Oct 2024
Building a Multivariate Time Series Benchmarking Datasets Inspired by Natural Language Processing (NLP)
Mohammad Asif Ibna Mustafa
Ferdinand Heinrich
AI4TS
135
0
0
14 Oct 2024
Bilateral Sharpness-Aware Minimization for Flatter Minima
Jiaxin Deng
Junbiao Pang
Baochang Zhang
Qingming Huang
AAML
452
0
0
20 Sep 2024
Can Optimization Trajectories Explain Multi-Task Transfer?
David Mueller
Mark Dredze
Nicholas Andrews
140
1
0
26 Aug 2024
Forget Sharpness: Perturbed Forgetting of Model Biases Within SAM Dynamics
Ankit Vani
Frederick Tung
Gabriel L. Oliveira
Hossein Sharifi-Noghabi
AAML
83
0
0
10 Jun 2024
Sharpness-Aware Minimization Enhances Feature Quality via Balanced Learning
Jacob Mitchell Springer
Vaishnavh Nagarajan
Aditi Raghunathan
120
6
0
30 May 2024
Manifold Metric: A Loss Landscape Approach for Predicting Model Performance
Pranshu Malviya
Jerry Huang
A. Baratin
Quentin Fournier
Sarath Chandar
77
0
0
24 May 2024
Helen: Optimizing CTR Prediction Models with Frequency-wise Hessian Eigenvalue Regularization
Zirui Zhu
Yong Liu
Zangwei Zheng
Huifeng Guo
Yang You
45
0
0
23 Feb 2024
Why are Sensitive Functions Hard for Transformers?
Michael Hahn
Mark Rofin
101
29
0
15 Feb 2024
CR-SAM: Curvature Regularized Sharpness-Aware Minimization
Tao Wu
Tie Luo
D. C. Wunsch
62
3
0
21 Dec 2023
On The Fairness Impacts of Hardware Selection in Machine Learning
Sree Harsha Nelaturu
Nishaanth Kanna Ravichandran
Cuong Tran
Sara Hooker
Ferdinando Fioretto
83
3
0
06 Dec 2023
The instabilities of large learning rate training: a loss landscape view
Lawrence Wang
Stephen J. Roberts
27
2
0
22 Jul 2023
Flatness-Aware Minimization for Domain Generalization
Xingxuan Zhang
Renzhe Xu
Han Yu
Yancheng Dong
Pengfei Tian
Peng Cui
85
22
0
20 Jul 2023
Promoting Exploration in Memory-Augmented Adam using Critical Momenta
Pranshu Malviya
Gonçalo Mordido
A. Baratin
Reza Babanezhad Harikandeh
Jerry Huang
Simon Lacoste-Julien
Razvan Pascanu
Sarath Chandar
ODL
45
1
0
18 Jul 2023
The Interpolating Information Criterion for Overparameterized Models
Liam Hodgkinson
Christopher van der Heide
Roberto Salomone
Fred Roosta
Michael W. Mahoney
75
9
0
15 Jul 2023
Towards a Better Understanding of Learning with Multiagent Teams
David Radke
Kate Larson
Timothy B. Brecht
Kyle Tilbury
LLMAG
79
3
0
28 Jun 2023
PLASTIC: Improving Input and Label Plasticity for Sample Efficient Reinforcement Learning
Hojoon Lee
Hanseul Cho
Hyunseung Kim
Daehoon Gwak
Joonkee Kim
Jaegul Choo
Se-Young Yun
Chulhee Yun
OffRL
157
30
0
19 Jun 2023
Early Weight Averaging meets High Learning Rates for LLM Pre-training
Sunny Sanyal
A. Neerkaje
Jean Kaddour
Abhishek Kumar
Sujay Sanghavi
MoMe
102
19
0
05 Jun 2023
SANE: The phases of gradient descent through Sharpness Adjusted Number of Effective parameters
Lawrence Wang
Stephen J. Roberts
116
0
0
29 May 2023
Neural Sculpting: Uncovering hierarchically modular task structure in neural networks through pruning and network analysis
S. M. Patil
Loizos Michael
C. Dovrolis
70
0
0
28 May 2023
How to escape sharp minima with random perturbations
Kwangjun Ahn
Ali Jadbabaie
S. Sra
ODL
123
8
0
25 May 2023
The Crucial Role of Normalization in Sharpness-Aware Minimization
Yan Dai
Kwangjun Ahn
S. Sra
120
19
0
24 May 2023
Gradient Norm Aware Minimization Seeks First-Order Flatness and Improves Generalization
Xingxuan Zhang
Renzhe Xu
Han Yu
Hao Zou
Peng Cui
77
41
0
03 Mar 2023
mSAM: Micro-Batch-Averaged Sharpness-Aware Minimization
Kayhan Behdin
Qingquan Song
Aman Gupta
S. Keerthi
Ayan Acharya
Borja Ocejo
Gregory Dexter
Rajiv Khanna
D. Durfee
Rahul Mazumder
AAML
68
7
0
19 Feb 2023
A Modern Look at the Relationship between Sharpness and Generalization
Maksym Andriushchenko
Francesco Croce
Maximilian Müller
Matthias Hein
Nicolas Flammarion
3DH
135
63
0
14 Feb 2023
On a continuous time model of gradient descent dynamics and instability in deep learning
Mihaela Rosca
Yan Wu
Chongli Qin
Benoit Dherin
99
10
0
03 Feb 2023
Catapult Dynamics and Phase Transitions in Quadratic Nets
David Meltzer
Junyu Liu
67
9
0
18 Jan 2023
SGD with Large Step Sizes Learns Sparse Features
Maksym Andriushchenko
Aditya Varre
Loucas Pillaud-Vivien
Nicolas Flammarion
143
60
0
11 Oct 2022
Linear Connectivity Reveals Generalization Strategies
Jeevesh Juneja
Rachit Bansal
Kyunghyun Cho
João Sedoc
Naomi Saphra
331
48
0
24 May 2022