ADAHESSIAN: An Adaptive Second Order Optimizer for Machine Learning


1 June 2020
Z. Yao
A. Gholami
Sheng Shen
Mustafa Mustafa
Kurt Keutzer
Michael W. Mahoney
    ODL

Papers citing "ADAHESSIAN: An Adaptive Second Order Optimizer for Machine Learning"

50 / 150 papers shown
Second-Order Fine-Tuning without Pain for LLMs: A Hessian Informed Zeroth-Order Optimizer
Yanjun Zhao
Sizhe Dang
Haishan Ye
Guang Dai
Yi Qian
Ivor W. Tsang
66
8
0
23 Feb 2024
Beyond Uniform Scaling: Exploring Depth Heterogeneity in Neural Architectures
Akash Guna R.T
Arnav Chavan
Deepak Gupta
MDE
32
0
0
19 Feb 2024
Mini-Hes: A Parallelizable Second-order Latent Factor Analysis Model
Jialiang Wang
Weiling Li
Yurong Zhong
Xin Luo
25
0
0
19 Feb 2024
Stochastic Hessian Fittings with Lie Groups
Xi-Lin Li
40
1
0
19 Feb 2024
Preconditioners for the Stochastic Training of Implicit Neural Representations
Shin-Fang Chng
Hemanth Saratchandran
Simon Lucey
26
0
0
13 Feb 2024
Tradeoffs of Diagonal Fisher Information Matrix Estimators
Alexander Soen
Ke Sun
24
1
0
08 Feb 2024
Curvature-Informed SGD via General Purpose Lie-Group Preconditioners
Omead Brandon Pooladzandi
Xi-Lin Li
38
4
0
07 Feb 2024
SANIA: Polyak-type Optimization Framework Leads to Scale Invariant Stochastic Algorithms
Farshed Abdukhakimov
Chulu Xiang
Dmitry Kamzolov
Robert Mansel Gower
Martin Takáč
43
2
0
28 Dec 2023
AGD: an Auto-switchable Optimizer using Stepwise Gradient Difference for Preconditioning Matrix
Yun Yue
Zhiling Ye
Jiadi Jiang
Yongchao Liu
Ke Zhang
ODL
26
1
0
04 Dec 2023
Temperature Balancing, Layer-wise Weight Analysis, and Neural Network Training
Yefan Zhou
Tianyu Pang
Keqin Liu
Charles H. Martin
Michael W. Mahoney
Yaoqing Yang
42
7
0
01 Dec 2023
Data-efficient operator learning for solving high Mach number fluid flow problems
Noah Ford
Victor J. Leon
Honest Mrema
Jeffrey Gilbert
Alexander New
AI4CE
24
0
0
28 Nov 2023
Signal Processing Meets SGD: From Momentum to Filter
Zhipeng Yao
Guisong Chang
Jiaqi Zhang
Qi Zhang
Dazhou Li
Yu Zhang
ODL
34
0
0
06 Nov 2023
Kronecker-Factored Approximate Curvature for Modern Neural Network Architectures
Runa Eschenhagen
Alexander Immer
Richard Turner
Frank Schneider
Philipp Hennig
61
21
0
01 Nov 2023
Information-Theoretic Trust Regions for Stochastic Gradient-Based Optimization
Philipp Dahlinger
P. Becker
Maximilian Hüttenrauch
Gerhard Neumann
15
0
0
31 Oct 2023
AdaSub: Stochastic Optimization Using Second-Order Information in Low-Dimensional Subspaces
João Victor Galvão da Mata
Martin S. Andersen
11
1
0
30 Oct 2023
Studying K-FAC Heuristics by Viewing Adam through a Second-Order Lens
Ross M. Clarke
José Miguel Hernández-Lobato
46
2
0
23 Oct 2023
Towards Hyperparameter-Agnostic DNN Training via Dynamical System Insights
Carmel Fiscko
Aayushya Agarwal
Yihan Ruan
S. Kar
L. Pileggi
Bruno Sinopoli
18
0
0
21 Oct 2023
Stochastic Gradient Descent with Preconditioned Polyak Step-size
Farshed Abdukhakimov
Chulu Xiang
Dmitry Kamzolov
Martin Takáč
31
5
0
03 Oct 2023
Eva: A General Vectorized Approximation Framework for Second-order Optimization
Lin Zhang
S. Shi
Bo-wen Li
28
1
0
04 Aug 2023
Flatness-Aware Minimization for Domain Generalization
Xingxuan Zhang
Renzhe Xu
Han Yu
Yancheng Dong
Pengfei Tian
Peng Cui
32
20
0
20 Jul 2023
Bidirectional Looking with A Novel Double Exponential Moving Average to Adaptive and Non-adaptive Momentum Optimizers
Yineng Chen
Z. Li
Lefei Zhang
Bo Du
Hai Zhao
33
4
0
02 Jul 2023
G-TRACER: Expected Sharpness Optimization
John R. Williams
Stephen J. Roberts
35
0
0
24 Jun 2023
FlakyFix: Using Large Language Models for Predicting Flaky Test Fix Categories and Test Code Repair
Sakina Fatima
Hadi Hemmati
Lionel C. Briand
34
4
0
21 Jun 2023
Understanding Optimization of Deep Learning via Jacobian Matrix and Lipschitz Constant
Xianbiao Qi
Jianan Wang
Lei Zhang
18
0
0
15 Jun 2023
Error Feedback Can Accurately Compress Preconditioners
Ionut-Vlad Modoranu
A. Kalinov
Eldar Kurtic
Elias Frantar
Dan Alistarh
ODL
14
4
0
09 Jun 2023
Searching for Optimal Per-Coordinate Step-sizes with Multidimensional Backtracking
Frederik Kunstner
V. S. Portella
Mark W. Schmidt
Nick Harvey
26
8
0
05 Jun 2023
Towards Sustainable Learning: Coresets for Data-efficient Deep Learning
Yu Yang
Hao Kang
Baharan Mirzasoleiman
36
34
0
02 Jun 2023
KrADagrad: Kronecker Approximation-Domination Gradient Preconditioned Stochastic Optimization
Jonathan Mei
Alexander Moreno
Luke Walters
ODL
29
1
0
30 May 2023
Minibatching Offers Improved Generalization Performance for Second Order Optimizers
Eric Silk
Swarnita Chakraborty
N. Dasgupta
Anand D. Sarwate
A. Lumsdaine
Tony Chiang
ODL
11
0
0
26 May 2023
SING: A Plug-and-Play DNN Learning Technique
Adrien Courtois
Damien Scieur
Jean-Michel Morel
Pablo Arias
Thomas Eboli
36
0
0
25 May 2023
Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training
Hong Liu
Zhiyuan Li
David Leo Wright Hall
Percy Liang
Tengyu Ma
VLM
55
132
0
23 May 2023
Layer-wise Adaptive Step-Sizes for Stochastic First-Order Methods for Deep Learning
Achraf Bahamou
D. Goldfarb
ODL
36
0
0
23 May 2023
ASDL: A Unified Interface for Gradient Preconditioning in PyTorch
Kazuki Osawa
Satoki Ishikawa
Rio Yokota
Shigang Li
Torsten Hoefler
ODL
38
14
0
08 May 2023
ISAAC Newton: Input-based Approximate Curvature for Newton's Method
Felix Petersen
Tobias Sutter
Christian Borgelt
Dongsung Huh
Hilde Kuehne
Yuekai Sun
Oliver Deussen
ODL
31
5
0
01 May 2023
Loss-Curvature Matching for Dataset Selection and Condensation
Seung-Jae Shin
Heesun Bae
DongHyeok Shin
Weonyoung Joo
Il-Chul Moon
DD
49
24
0
08 Mar 2023
FOSI: Hybrid First and Second Order Optimization
Hadar Sivan
Moshe Gabel
Assaf Schuster
ODL
34
2
0
16 Feb 2023
Gradient Shaping: Enhancing Backdoor Attack Against Reverse Engineering
Rui Zhu
Di Tang
Siyuan Tang
Guanhong Tao
Shiqing Ma
Xiaofeng Wang
Haixu Tang
DD
23
3
0
29 Jan 2023
Projective Integral Updates for High-Dimensional Variational Inference
J. Duersch
35
1
0
20 Jan 2023
Task Weighting in Meta-learning with Trajectory Optimisation
Cuong C. Nguyen
Thanh-Toan Do
G. Carneiro
31
3
0
04 Jan 2023
CODEBench: A Neural Architecture and Hardware Accelerator Co-Design Framework
Shikhar Tuli
Chia-Hao Li
Ritvik Sharma
N. Jha
36
13
0
07 Dec 2022
A survey of deep learning optimizers -- first and second order methods
Rohan Kashyap
ODL
37
6
0
28 Nov 2022
On the Effectiveness of Parameter-Efficient Fine-Tuning
Z. Fu
Haoran Yang
Anthony Man-Cho So
Wai Lam
Lidong Bing
Nigel Collier
27
156
0
28 Nov 2022
Black Box Lie Group Preconditioners for SGD
Xi-Lin Li
13
8
0
08 Nov 2022
Adaptive scaling of the learning rate by second order automatic differentiation
F. Gournay
Alban Gossard
ODL
31
1
0
26 Oct 2022
An Efficient Nonlinear Acceleration method that Exploits Symmetry of the Hessian
Huan He
Shifan Zhao
Z. Tang
Joyce C. Ho
Y. Saad
Yuanzhe Xi
26
3
0
22 Oct 2022
HesScale: Scalable Computation of Hessian Diagonals
Mohamed Elsayed
A. R. Mahmood
22
7
0
20 Oct 2022
Tunable Complexity Benchmarks for Evaluating Physics-Informed Neural Networks on Coupled Ordinary Differential Equations
Alexander New
B. Eng
A. Timm
A. Gearhart
20
4
0
14 Oct 2022
Exploring Contextual Representation and Multi-Modality for End-to-End Autonomous Driving
Shoaib Azam
Farzeen Munir
Ville Kyrki
M. Jeon
Witold Pedrycz
56
1
0
13 Oct 2022
Compute-Efficient Deep Learning: Algorithmic Trends and Opportunities
Brian Bartoldson
B. Kailkhura
Davis W. Blalock
31
47
0
13 Oct 2022
Learning to Optimize Quasi-Newton Methods
Isaac Liao
Rumen Dangovski
Jakob N. Foerster
Marin Soljacic
38
4
0
11 Oct 2022