1503.05671
Optimizing Neural Networks with Kronecker-factored Approximate Curvature
19 March 2015
James Martens
Roger B. Grosse
ODL
Papers citing
"Optimizing Neural Networks with Kronecker-factored Approximate Curvature"
50 / 645 papers shown
Title
Addressing the Inconsistency in Bayesian Deep Learning via Generalized Laplace Approximation
Yinsong Chen
Samson S. Yu
Zhong Li
Chee Peng Lim
BDL
80
0
0
01 Jul 2025
Stable Gradients for Stable Learning at Scale in Deep Reinforcement Learning
Roger Creus Castanyer
J. Obando-Ceron
Lu Li
Pierre-Luc Bacon
Glen Berseth
Aaron Courville
Pablo Samuel Castro
24
0
0
18 Jun 2025
Rethinking LLM Training through Information Geometry and Quantum Metrics
Riccardo Di Sipio
34
0
0
18 Jun 2025
ResNets Are Deeper Than You Think
Christian H.X. Ali Mehmeti-Göpel
Michael Wand
23
0
0
17 Jun 2025
Improving LoRA with Variational Learning
Bai Cong
Nico Daheim
Yuesong Shen
Rio Yokota
Mohammad Emtiyaz Khan
Thomas Möllenhoff
33
0
0
17 Jun 2025
Distributional Training Data Attribution
Bruno Mlodozeniec
Isaac Reid
Sam Power
David M. Krueger
Murat Erdogdu
Richard E. Turner
Roger B. Grosse
TDI
OOD
47
0
0
15 Jun 2025
An Adaptive Method Stabilizing Activations for Enhanced Generalization
Hyunseok Seung
Jaewoo Lee
Hyunsuk Ko
ODL
32
0
0
10 Jun 2025
Sharper Convergence Rates for Nonconvex Optimisation via Reduction Mappings
Evan Markou
Thalaiyasingam Ajanthan
Stephen Gould
35
0
0
10 Jun 2025
NysAct: A Scalable Preconditioned Gradient Descent using Nystrom Approximation
Hyunseok Seung
Jaewoo Lee
Hyunsuk Ko
ODL
34
0
0
10 Jun 2025
A Stable Whitening Optimizer for Efficient Neural Network Training
Kevin Frans
Sergey Levine
Pieter Abbeel
37
0
0
08 Jun 2025
pFedSOP : Accelerating Training Of Personalized Federated Learning Using Second-Order Optimization
Mrinmay Sen
C Krishna Mohan
20
0
0
08 Jun 2025
Stacey: Promoting Stochastic Steepest Descent via Accelerated $\ell_p$-Smooth Nonconvex Optimization
Xinyu Luo
Cedar Site Bai
Bolian Li
Petros Drineas
Ruqi Zhang
Brian Bullins
36
0
0
07 Jun 2025
Purifying Shampoo: Investigating Shampoo's Heuristics by Decomposing its Preconditioner
Runa Eschenhagen
Aaron Defazio
Tsung-Hsien Lee
Richard Turner
Hao-Jun Michael Shi
95
0
0
04 Jun 2025
Models of Heavy-Tailed Mechanistic Universality
Liam Hodgkinson
Zhichao Wang
Michael W. Mahoney
80
1
0
04 Jun 2025
IF-GUIDE: Influence Function-Guided Detoxification of LLMs
Zachary Coalson
Juhan Bae
Nicholas Carlini
Sanghyun Hong
TDI
86
0
0
02 Jun 2025
On the Convergence Analysis of Muon
Wei Shen
Ruichuan Huang
Minhui Huang
Cong Shen
Jiawei Zhang
56
0
0
29 May 2025
Matryoshka Model Learning for Improved Elastic Student Models
Chetan Verma
Aditya Srinivas Timmaraju
Cho-Jui Hsieh
Suyash Damle
Ngot Bui
Y. Zhang
Wen Chen
Xin Liu
Prateek Jain
Inderjit S Dhillon
113
0
0
29 May 2025
GraSS: Scalable Influence Function with Sparse Gradient Compression
Pingbang Hu
Joseph Melkonian
Weijing Tang
Han Zhao
Jiaqi W. Ma
TDI
280
0
0
25 May 2025
Deterministic Bounds and Random Estimates of Metric Tensors on Neuromanifolds
Ke Sun
35
0
0
19 May 2025
Learning by solving differential equations
Benoit Dherin
Michael Munn
Hanna Mazzawi
Michael Wunder
Sourabh Medapati
Javier Gonzalvo
60
0
0
19 May 2025
IDEAL: Data Equilibrium Adaptation for Multi-Capability Language Model Alignment
Chenlin Ming
Chendi Qu
Mengzhang Cai
Qizhi Pei
Zhuoshi Pan
Yu Li
Xiaoming Duan
Lijun Wu
Zeang Sheng
69
0
0
19 May 2025
Policy Gradient with Second Order Momentum
Tianyu Sun
61
0
0
16 May 2025
GuidedQuant: Large Language Model Quantization via Exploiting End Loss Guidance
Jinuk Kim
Marwa El Halabi
W. Park
Clemens JS Schaefer
Deokjae Lee
Yeonhong Park
Jae W. Lee
Hyun Oh Song
MQ
148
1
0
11 May 2025
More Optimal Fractional-Order Stochastic Gradient Descent for Non-Convex Optimization Problems
Mohammad Partohaghighi
Roummel Marcia
YangQuan Chen
92
0
0
05 May 2025
Towards Quantifying the Hessian Structure of Neural Networks
Zhaorui Dong
Yushun Zhang
Zhi-Quan Luo
Jianfeng Yao
Ruoyu Sun
77
1
0
05 May 2025
Accelerating Deep Neural Network Training via Distributed Hybrid Order Optimization
Shunxian Gu
Chaoqun You
Bangbang Ren
Lailong Luo
Junxu Xia
Deke Guo
72
0
0
02 May 2025
Wasserstein Policy Optimization
David Pfau
Ian Davies
Diana Borsa
Joao G. M. Araujo
Brendan D. Tracey
H. V. Hasselt
83
1
0
01 May 2025
MAGIC: Near-Optimal Data Attribution for Deep Learning
Andrew Ilyas
Logan Engstrom
TDI
118
1
0
23 Apr 2025
Connecting Parameter Magnitudes and Hessian Eigenspaces at Scale using Sketched Methods
Andres Fernandez
Frank Schneider
Maren Mahsereci
Philipp Hennig
118
0
0
20 Apr 2025
Second-order Optimization of Gaussian Splats with Importance Sampling
Hamza Pehlivan
Andrea Boscolo Camiletto
Lin Geng Foo
Marc Habermann
Christian Theobalt
3DGS
101
0
0
17 Apr 2025
Self-Controlled Dynamic Expansion Model for Continual Learning
RunQing Wu
KaiHui Huang
HanYi Zhang
Fei Ye
CLL
VLM
77
0
0
14 Apr 2025
Sculpting Subspaces: Constrained Full Fine-Tuning in LLMs for Continual Learning
Nikhil Shivakumar Nayak
Krishnateja Killamsetty
Ligong Han
Abhishek Bhandwaldar
Prateek Chanda
...
Hao Wang
Aldo Pareja
Oleg Silkin
Mustafa Eyceoz
Akash Srivastava
CLL
86
0
0
09 Apr 2025
Deliberate Planning of 3D Bin Packing on Packing Configuration Trees
Hang Zhao
Juzhan Xu
Kexiong Yu
Ruizhen Hu
Chenyang Zhu
K. Xu
162
2
0
06 Apr 2025
Image Coding for Machines via Feature-Preserving Rate-Distortion Optimization
Samuel Fernández-Menduiña
Eduardo Pavez
Antonio Ortega
94
0
0
03 Apr 2025
APHQ-ViT: Post-Training Quantization with Average Perturbation Hessian Based Reconstruction for Vision Transformers
Zhuguanyu Wu
Jiayi Zhang
Jiaxin Chen
Jinyang Guo
Di Huang
Yunhong Wang
MQ
123
1
0
03 Apr 2025
ASGO: Adaptive Structured Gradient Optimization
Kang An
Yuxing Liu
Boyao Wang
Shiqian Ma
Tong Zhang
ODL
155
5
0
26 Mar 2025
Continual Learning With Quasi-Newton Methods
Steven Vander Eeckt
Hugo Van hamme
CLL
BDL
133
0
0
25 Mar 2025
QuartDepth: Post-Training Quantization for Real-Time Depth Estimation on the Edge
Xuan Shen
Weize Ma
Jing Liu
Changdi Yang
Rui Ding
...
Wei Niu
Yanzhi Wang
Pu Zhao
Jun Lin
Jiuxiang Gu
MQ
99
0
0
20 Mar 2025
FedBEns: One-Shot Federated Learning based on Bayesian Ensemble
Jacopo Talpini
Marco Savi
Giovanni Neglia
FedML
Presented at ResearchTrend Connect | FedML on 07 May 2025
151
0
0
19 Mar 2025
Striving for Simplicity: Simple Yet Effective Prior-Aware Pseudo-Labeling for Semi-Supervised Ultrasound Image Segmentation
Yaxiong Chen
Yujie Wang
Zixuan Zheng
Jingliang Hu
Yilei Shi
Shengwu Xiong
Xiao Xiang Zhu
Lichao Mou
148
1
0
18 Mar 2025
Text-Guided Image Invariant Feature Learning for Robust Image Watermarking
Muhammad Ahtesham
Xin Zhong
105
1
0
18 Mar 2025
Effective Dimension Aware Fractional-Order Stochastic Gradient Descent for Convex Optimization Problems
Mohammad Partohaghighi
Roummel Marcia
YangQuan Chen
97
0
0
17 Mar 2025
Structured Preconditioners in Adaptive Optimization: A Unified Analysis
Shuo Xie
Tianhao Wang
Sashank J. Reddi
Sanjiv Kumar
Zhiyuan Li
82
4
0
13 Mar 2025
Efficient Membership Inference Attacks by Bayesian Neural Network
Zhenlong Liu
Wenyu Jiang
Feng Zhou
Hongxin Wei
MIALM
102
1
0
10 Mar 2025
CAMEx: Curvature-aware Merging of Experts
Dung V. Nguyen
Minh H. Nguyen
Luc Q. Nguyen
R. Teo
T. Nguyen
Linh Duy Tran
MoMe
177
4
0
26 Feb 2025
The Sharpness Disparity Principle in Transformers for Accelerating Language Model Pre-Training
Jinbo Wang
Mingze Wang
Zhanpeng Zhou
Junchi Yan
Weinan E
Lei Wu
154
2
0
26 Feb 2025
SASSHA: Sharpness-aware Adaptive Second-order Optimization with Stable Hessian Approximation
Dahun Shin
Dongyeop Lee
Jinseok Chung
Namhoon Lee
ODL
AAML
517
0
0
25 Feb 2025
Function-Space Learning Rates
Edward Milsom
Ben Anson
Laurence Aitchison
157
1
0
24 Feb 2025
COSMOS: A Hybrid Adaptive Optimizer for Memory-Efficient Training of LLMs
Liming Liu
Zhenghao Xu
Zixuan Zhang
Hao Kang
Zichong Li
Chen Liang
Weizhu Chen
T. Zhao
411
3
0
24 Feb 2025
Data Attribution for Text-to-Image Models by Unlearning Synthesized Images
Sheng-Yu Wang
Aaron Hertzmann
Alexei A. Efros
Jun-Yan Zhu
Richard Zhang
TDI
209
3
0
21 Feb 2025