ADAHESSIAN: An Adaptive Second Order Optimizer for Machine Learning

1 June 2020

Papers citing "ADAHESSIAN: An Adaptive Second Order Optimizer for Machine Learning"


Title
Policy Gradient with Second Order Momentum Tianyu Sun 2 0 0 16 May 2025
Trial and Trust: Addressing Byzantine Attacks with Comprehensive Defense Strategy Gleb Molodtsov Daniil Medyakov Sergey Skorik Nikolas Khachaturov Shahane Tigranyan Vladimir Aletov A. Avetisyan Martin Takáč Aleksandr Beznosikov AAML 35 0 0 12 May 2025
The effects of Hessian eigenvalue spectral density type on the applicability of Hessian analysis to generalization capability assessment of neural networks Nikita Gabdullin 18 0 0 24 Apr 2025
Hessian-aware Training for Enhancing DNNs Resilience to Parameter Corruptions Tahmid Hasan Prato Seijoon Kim Lizhong Chen Sanghyun Hong AAML 38 0 0 02 Apr 2025
Advancing MoE Efficiency: A Collaboration-Constrained Routing (C2R) Strategy for Better Expert Parallelism Design Mohan Zhang Pingzhi Li Jie Peng Mufan Qiu Tianlong Chen MoE 50 0 0 02 Apr 2025
Fuzzy Cluster-Aware Contrastive Clustering for Time Series Congyu Wang Mingjing Du Xiang Jiang Yongquan Dong AI4TS 42 0 0 28 Mar 2025
Structured Preconditioners in Adaptive Optimization: A Unified Analysis Shuo Xie Tianhao Wang Sashank J. Reddi Sanjiv Kumar Zhiyuan Li 45 1 0 13 Mar 2025
FUSE: First-Order and Second-Order Unified SynthEsis in Stochastic Optimization Zhanhong Jiang Md Zahid Hasan Aditya Balu Joshua R. Waite Genyi Huang S. Sarkar 52 0 0 06 Mar 2025
SASSHA: Sharpness-aware Adaptive Second-order Optimization with Stable Hessian Approximation Dahun Shin Dongyeop Lee Jinseok Chung Namhoon Lee ODL AAML 186 0 0 25 Feb 2025
X-Dancer: Expressive Music to Human Dance Video Generation Zeyuan Chen Hongyi Xu Guoxian Song You Xie Chenxu Zhang Xiusi Chen Chao Wang Di Chang Linjie Luo VGen 43 0 0 24 Feb 2025
The impact of allocation strategies in subset learning on the expressive power of neural networks Ofir Schlisselberg Ran Darshan 93 0 0 10 Feb 2025
A Hessian-informed hyperparameter optimization for differential learning rate Shiyun Xu Zhiqi Bu Yiliang Zhang Ian Barnett 39 1 0 12 Jan 2025
Computational Analysis of Yaredawi YeZema Silt in Ethiopian Orthodox Tewahedo Church Chants Mequanent Argaw Muluneh Yan-Tsung Peng Li Su 49 0 0 25 Dec 2024
LossLens: Diagnostics for Machine Learning through Loss Landscape Visual Analytics Tiankai Xie Jiaqing Chen Yaoqing Yang Caleb Geniesse Ge Shi ... J. Cava Michael W. Mahoney Talita Perciano Gunther H. Weber Ross Maciejewski 77 0 0 17 Dec 2024
A Method for Enhancing Generalization of Adam by Multiple Integrations Long Jin Han Nong Liangming Chen Zhenming Su 70 0 0 17 Dec 2024
Meta Curvature-Aware Minimization for Domain Generalization Zhaoyu Chen Yiwen Ye Feilong Tang Yongsheng Pan Yong-quan Xia BDL 209 1 0 16 Dec 2024
Curvature in the Looking-Glass: Optimal Methods to Exploit Curvature of Expectation in the Loss Landscape Jed A. Duersch Tommie A. Catanach Alexander Safonov Jeremy Wendt 84 0 0 25 Nov 2024
Adaptive Consensus Gradients Aggregation for Scaled Distributed Training Yoni Choukroun Shlomi Azoulay P. Kisilev 39 0 0 06 Nov 2024
Theoretical characterisation of the Gauss-Newton conditioning in Neural Networks Jim Zhao Sidak Pal Singh Aurelien Lucchi AI4CE 48 0 0 04 Nov 2024
Data movement limits to frontier model training Ege Erdil David Schneider-Joseph 41 1 0 02 Nov 2024
CRONOS: Enhancing Deep Learning with Scalable GPU Accelerated Convex Neural Networks Miria Feng Zachary Frangella Mert Pilanci BDL 48 1 0 02 Nov 2024
Stochastic diagonal estimation with adaptive parameter selection Zongyuan Han Wenhao Li Shengxin Zhu 25 0 0 15 Oct 2024
A second-order-like optimizer with adaptive gradient scaling for deep learning Jérôme Bolte Ryan Boustany Edouard Pauwels Andrei Purica ODL 30 0 0 08 Oct 2024
A Dynamic Weighting Strategy to Mitigate Worker Node Failure in Distributed Deep Learning Yuesheng Xu Arielle Carr 32 0 0 14 Sep 2024
Second-Order Forward-Mode Automatic Differentiation for Optimization Adam D. Cobb Atılım Güneş Baydin Barak A. Pearlmutter Susmit Jha ODL 41 1 0 19 Aug 2024
Sharpness-diversity tradeoff: improving flat ensembles with SharpBalance Haiquan Lu Xiaotian Liu Yefan Zhou Qunli Li Kurt Keutzer Michael W. Mahoney Yujun Yan Huanrui Yang Yaoqing Yang 45 1 0 17 Jul 2024
Weight Block Sparsity: Training, Compilation, and AI Engine Accelerators P. D'Alberto Taehee Jeong Akshai Jain Shreyas Manjunath Mrinal Sarmah Samuel Hsu Yaswanth Raparti Nitesh Pipralia 42 2 0 12 Jul 2024
Empirical Tests of Optimization Assumptions in Deep Learning Hoang Tran Qinzi Zhang Ashok Cutkosky 41 1 0 01 Jul 2024
Recent and Upcoming Developments in Randomized Numerical Linear Algebra for Machine Learning Michał Dereziński Michael W. Mahoney 28 5 0 17 Jun 2024
Fed-Sophia: A Communication-Efficient Second-Order Federated Learning Algorithm Ahmed Elbakary Chaouki Ben Issaid Mohammad Shehab Karim G. Seddik Tamer A. ElBatt Mehdi Bennis 39 2 0 10 Jun 2024
Revisiting Scalable Hessian Diagonal Approximations for Applications in Reinforcement Learning Mohamed Elsayed Homayoon Farrahi Felix Dangel A. Rupam Mahmood 32 3 0 05 Jun 2024
Local Methods with Adaptivity via Scaling Saveliy Chezhegov Sergey Skorik Nikolas Khachaturov Danil Shalagin A. Avetisyan Aleksandr Beznosikov Martin Takáč Yaroslav Kholodov Alexander Gasnikov 58 2 0 02 Jun 2024
Bayesian Online Natural Gradient (BONG) Matt Jones Peter Chang Kevin P. Murphy BDL 45 3 0 30 May 2024
4-bit Shampoo for Memory-Efficient Network Training Sike Wang Jia Li Pan Zhou Hua Huang MQ 41 5 0 28 May 2024
AdaFisher: Adaptive Second Order Optimization via Fisher Information Damien Martins Gomes Yanlei Zhang Eugene Belilovsky Guy Wolf Mahdi S. Hosseini ODL 76 2 0 26 May 2024
MicroAdam: Accurate Adaptive Optimization with Low Space Overhead and Provable Convergence Ionut-Vlad Modoranu M. Safaryan Grigory Malinovsky Eldar Kurtic Thomas Robert Peter Richtárik Dan Alistarh MQ 42 12 0 24 May 2024
Surge Phenomenon in Optimal Learning Rate and Batch Size Scaling Shuaipeng Li Penghao Zhao Hailin Zhang Xingwu Sun Hao Wu ... Zheng Fang Jinbao Xue Yangyu Tao Bin Cui Di Wang 38 6 0 23 May 2024
Exact Gauss-Newton Optimization for Training Deep Neural Networks Mikalai Korbit Adeyemi Damilare Adeoye Alberto Bemporad Mario Zanon ODL 33 0 0 23 May 2024
Dynamic Anisotropic Smoothing for Noisy Derivative-Free Optimization S. Reifenstein T. Leleu Yoshihisa Yamamoto 48 1 0 02 May 2024
Label-free Anomaly Detection in Aerial Agricultural Images with Masked Image Modeling Sambal Shikhar Anupam Sobti 40 1 0 13 Apr 2024
Communication-Efficient Large-Scale Distributed Deep Learning: A Comprehensive Survey Feng Liang Zhen Zhang Haifeng Lu Victor C. M. Leung Yanyi Guo Xiping Hu GNN 37 6 0 09 Apr 2024
Implicit Bias of AdamW: $\ell_\infty$ Norm Constrained Optimization Shuo Xie Zhiyuan Li OffRL 47 13 0 05 Apr 2024
AI and Memory Wall A. Gholami Z. Yao Sehoon Kim Coleman Hooper Michael W. Mahoney Kurt Keutzer 27 141 0 21 Mar 2024
PETScML: Second-order solvers for training regression problems in Scientific Machine Learning Stefano Zampini Umberto Zerbinati George Turkiyyah David E. Keyes 43 4 0 18 Mar 2024
Fuzzy hyperparameters update in a second order optimization Abdelaziz Bensadok Muhammad Zeeshan Babar 21 0 0 08 Mar 2024
Inverse-Free Fast Natural Gradient Descent Method for Deep Learning Xinwei Ou Ce Zhu Xiaolin Huang Yipeng Liu ODL 48 0 0 06 Mar 2024
SOFIM: Stochastic Optimization Using Regularized Fisher Information Matrix Mrinmay Sen A. K. Qin Gayathri C Raghu Kishore N Yen-Wei Chen Balasubramanian Raman 37 1 0 05 Mar 2024
SGD with Partial Hessian for Deep Neural Networks Optimization Ying Sun Hongwei Yong Lei Zhang ODL 28 0 0 05 Mar 2024
From Zero to Hero: How local curvature at artless initial conditions leads away from bad minima Tony Bonnaire Giulio Biroli C. Cammarota 42 0 0 04 Mar 2024
Variational Learning is Effective for Large Deep Networks Yuesong Shen Nico Daheim Bai Cong Peter Nickl Gian Maria Marconi ... Rio Yokota Iryna Gurevych Daniel Cremers Mohammad Emtiyaz Khan Thomas Möllenhoff 43 22 0 27 Feb 2024