ADAHESSIAN: An Adaptive Second Order Optimizer for Machine Learning


1 June 2020
Z. Yao
A. Gholami
Sheng Shen
Mustafa Mustafa
Kurt Keutzer
Michael W. Mahoney
    ODL

Papers citing "ADAHESSIAN: An Adaptive Second Order Optimizer for Machine Learning"

50 / 150 papers shown
Policy Gradient with Second Order Momentum
Tianyu Sun
2
0
0
16 May 2025
Trial and Trust: Addressing Byzantine Attacks with Comprehensive Defense Strategy
Gleb Molodtsov
Daniil Medyakov
Sergey Skorik
Nikolas Khachaturov
Shahane Tigranyan
Vladimir Aletov
A. Avetisyan
Martin Takáč
Aleksandr Beznosikov
AAML
35
0
0
12 May 2025
The effects of Hessian eigenvalue spectral density type on the applicability of Hessian analysis to generalization capability assessment of neural networks
Nikita Gabdullin
18
0
0
24 Apr 2025
Hessian-aware Training for Enhancing DNNs Resilience to Parameter Corruptions
Tahmid Hasan Prato
Seijoon Kim
Lizhong Chen
Sanghyun Hong
AAML
38
0
0
02 Apr 2025
Advancing MoE Efficiency: A Collaboration-Constrained Routing (C2R) Strategy for Better Expert Parallelism Design
Mohan Zhang
Pingzhi Li
Jie Peng
Mufan Qiu
Tianlong Chen
MoE
50
0
0
02 Apr 2025
Fuzzy Cluster-Aware Contrastive Clustering for Time Series
Congyu Wang
Mingjing Du
Xiang Jiang
Yongquan Dong
AI4TS
42
0
0
28 Mar 2025
Structured Preconditioners in Adaptive Optimization: A Unified Analysis
Shuo Xie
Tianhao Wang
Sashank J. Reddi
Sanjiv Kumar
Zhiyuan Li
45
1
0
13 Mar 2025
FUSE: First-Order and Second-Order Unified SynthEsis in Stochastic Optimization
Zhanhong Jiang
Md Zahid Hasan
Aditya Balu
Joshua R. Waite
Genyi Huang
S. Sarkar
52
0
0
06 Mar 2025
SASSHA: Sharpness-aware Adaptive Second-order Optimization with Stable Hessian Approximation
Dahun Shin
Dongyeop Lee
Jinseok Chung
Namhoon Lee
ODL
AAML
186
0
0
25 Feb 2025
X-Dancer: Expressive Music to Human Dance Video Generation
Zeyuan Chen
Hongyi Xu
Guoxian Song
You Xie
Chenxu Zhang
Xiusi Chen
Chao Wang
Di Chang
Linjie Luo
VGen
43
0
0
24 Feb 2025
The impact of allocation strategies in subset learning on the expressive power of neural networks
Ofir Schlisselberg
Ran Darshan
93
0
0
10 Feb 2025
A Hessian-informed hyperparameter optimization for differential learning rate
Shiyun Xu
Zhiqi Bu
Yiliang Zhang
Ian Barnett
39
1
0
12 Jan 2025
Computational Analysis of Yaredawi YeZema Silt in Ethiopian Orthodox Tewahedo Church Chants
Mequanent Argaw Muluneh
Yan-Tsung Peng
Li Su
49
0
0
25 Dec 2024
LossLens: Diagnostics for Machine Learning through Loss Landscape Visual Analytics
Tiankai Xie
Jiaqing Chen
Yaoqing Yang
Caleb Geniesse
Ge Shi
...
J. Cava
Michael W. Mahoney
Talita Perciano
Gunther H. Weber
Ross Maciejewski
77
0
0
17 Dec 2024
A Method for Enhancing Generalization of Adam by Multiple Integrations
Long Jin
Han Nong
Liangming Chen
Zhenming Su
70
0
0
17 Dec 2024
Meta Curvature-Aware Minimization for Domain Generalization
Zhaoyu Chen
Yiwen Ye
Feilong Tang
Yongsheng Pan
Yong-quan Xia
BDL
209
1
0
16 Dec 2024
Curvature in the Looking-Glass: Optimal Methods to Exploit Curvature of Expectation in the Loss Landscape
Jed A. Duersch
Tommie A. Catanach
Alexander Safonov
Jeremy Wendt
84
0
0
25 Nov 2024
Adaptive Consensus Gradients Aggregation for Scaled Distributed Training
Yoni Choukroun
Shlomi Azoulay
P. Kisilev
39
0
0
06 Nov 2024
Theoretical characterisation of the Gauss-Newton conditioning in Neural Networks
Jim Zhao
Sidak Pal Singh
Aurelien Lucchi
AI4CE
48
0
0
04 Nov 2024
Data movement limits to frontier model training
Ege Erdil
David Schneider-Joseph
41
1
0
02 Nov 2024
CRONOS: Enhancing Deep Learning with Scalable GPU Accelerated Convex Neural Networks
Miria Feng
Zachary Frangella
Mert Pilanci
BDL
48
1
0
02 Nov 2024
Stochastic diagonal estimation with adaptive parameter selection
Zongyuan Han
Wenhao Li
Shengxin Zhu
25
0
0
15 Oct 2024
A second-order-like optimizer with adaptive gradient scaling for deep learning
Jérôme Bolte
Ryan Boustany
Edouard Pauwels
Andrei Purica
ODL
30
0
0
08 Oct 2024
A Dynamic Weighting Strategy to Mitigate Worker Node Failure in Distributed Deep Learning
Yuesheng Xu
Arielle Carr
32
0
0
14 Sep 2024
Second-Order Forward-Mode Automatic Differentiation for Optimization
Adam D. Cobb
Atılım Güneş Baydin
Barak A. Pearlmutter
Susmit Jha
ODL
41
1
0
19 Aug 2024
Sharpness-diversity tradeoff: improving flat ensembles with SharpBalance
Haiquan Lu
Xiaotian Liu
Yefan Zhou
Qunli Li
Kurt Keutzer
Michael W. Mahoney
Yujun Yan
Huanrui Yang
Yaoqing Yang
45
1
0
17 Jul 2024
Weight Block Sparsity: Training, Compilation, and AI Engine Accelerators
P. D'Alberto
Taehee Jeong
Akshai Jain
Shreyas Manjunath
Mrinal Sarmah
Samuel Hsu
Yaswanth Raparti
Nitesh Pipralia
42
2
0
12 Jul 2024
Empirical Tests of Optimization Assumptions in Deep Learning
Hoang Tran
Qinzi Zhang
Ashok Cutkosky
41
1
0
01 Jul 2024
Recent and Upcoming Developments in Randomized Numerical Linear Algebra for Machine Learning
Michał Dereziński
Michael W. Mahoney
28
5
0
17 Jun 2024
Fed-Sophia: A Communication-Efficient Second-Order Federated Learning Algorithm
Ahmed Elbakary
Chaouki Ben Issaid
Mohammad Shehab
Karim G. Seddik
Tamer A. ElBatt
Mehdi Bennis
39
2
0
10 Jun 2024
Revisiting Scalable Hessian Diagonal Approximations for Applications in Reinforcement Learning
Mohamed Elsayed
Homayoon Farrahi
Felix Dangel
A. Rupam Mahmood
32
3
0
05 Jun 2024
Local Methods with Adaptivity via Scaling
Saveliy Chezhegov
Sergey Skorik
Nikolas Khachaturov
Danil Shalagin
A. Avetisyan
Aleksandr Beznosikov
Martin Takáč
Yaroslav Kholodov
Alexander Gasnikov
58
2
0
02 Jun 2024
Bayesian Online Natural Gradient (BONG)
Matt Jones
Peter Chang
Kevin P. Murphy
BDL
45
3
0
30 May 2024
4-bit Shampoo for Memory-Efficient Network Training
Sike Wang
Jia Li
Pan Zhou
Hua Huang
MQ
41
5
0
28 May 2024
AdaFisher: Adaptive Second Order Optimization via Fisher Information
Damien Martins Gomes
Yanlei Zhang
Eugene Belilovsky
Guy Wolf
Mahdi S. Hosseini
ODL
76
2
0
26 May 2024
MicroAdam: Accurate Adaptive Optimization with Low Space Overhead and Provable Convergence
Ionut-Vlad Modoranu
M. Safaryan
Grigory Malinovsky
Eldar Kurtic
Thomas Robert
Peter Richtárik
Dan Alistarh
MQ
42
12
0
24 May 2024
Surge Phenomenon in Optimal Learning Rate and Batch Size Scaling
Shuaipeng Li
Penghao Zhao
Hailin Zhang
Xingwu Sun
Hao Wu
...
Zheng Fang
Jinbao Xue
Yangyu Tao
Bin Cui
Di Wang
38
6
0
23 May 2024
Exact Gauss-Newton Optimization for Training Deep Neural Networks
Mikalai Korbit
Adeyemi Damilare Adeoye
Alberto Bemporad
Mario Zanon
ODL
33
0
0
23 May 2024
Dynamic Anisotropic Smoothing for Noisy Derivative-Free Optimization
S. Reifenstein
T. Leleu
Yoshihisa Yamamoto
48
1
0
02 May 2024
Label-free Anomaly Detection in Aerial Agricultural Images with Masked Image Modeling
Sambal Shikhar
Anupam Sobti
40
1
0
13 Apr 2024
Communication-Efficient Large-Scale Distributed Deep Learning: A Comprehensive Survey
Feng Liang
Zhen Zhang
Haifeng Lu
Victor C. M. Leung
Yanyi Guo
Xiping Hu
GNN
37
6
0
09 Apr 2024
Implicit Bias of AdamW: $\ell_\infty$ Norm Constrained Optimization
Shuo Xie
Zhiyuan Li
OffRL
47
13
0
05 Apr 2024
AI and Memory Wall
A. Gholami
Z. Yao
Sehoon Kim
Coleman Hooper
Michael W. Mahoney
Kurt Keutzer
27
141
0
21 Mar 2024
PETScML: Second-order solvers for training regression problems in Scientific Machine Learning
Stefano Zampini
Umberto Zerbinati
George Turkiyyah
David E. Keyes
43
4
0
18 Mar 2024
Fuzzy hyperparameters update in a second order optimization
Abdelaziz Bensadok
Muhammad Zeeshan Babar
21
0
0
08 Mar 2024
Inverse-Free Fast Natural Gradient Descent Method for Deep Learning
Xinwei Ou
Ce Zhu
Xiaolin Huang
Yipeng Liu
ODL
48
0
0
06 Mar 2024
SOFIM: Stochastic Optimization Using Regularized Fisher Information Matrix
Mrinmay Sen
A. K. Qin
Gayathri C
Raghu Kishore N
Yen-Wei Chen
Balasubramanian Raman
37
1
0
05 Mar 2024
SGD with Partial Hessian for Deep Neural Networks Optimization
Ying Sun
Hongwei Yong
Lei Zhang
ODL
28
0
0
05 Mar 2024
From Zero to Hero: How local curvature at artless initial conditions leads away from bad minima
Tony Bonnaire
Giulio Biroli
C. Cammarota
42
0
0
04 Mar 2024
Variational Learning is Effective for Large Deep Networks
Yuesong Shen
Nico Daheim
Bai Cong
Peter Nickl
Gian Maria Marconi
...
Rio Yokota
Iryna Gurevych
Daniel Cremers
Mohammad Emtiyaz Khan
Thomas Möllenhoff
43
22
0
27 Feb 2024