1503.05671
Cited By
v1
v2
v3
v4
v5
v6
v7 (latest)
Optimizing Neural Networks with Kronecker-factored Approximate Curvature
19 March 2015
James Martens
Roger B. Grosse
ODL
Papers citing
"Optimizing Neural Networks with Kronecker-factored Approximate Curvature"
50 / 645 papers shown
Title
Data Attribution for Text-to-Image Models by Unlearning Synthesized Images
Sheng-Yu Wang
Aaron Hertzmann
Alexei A. Efros
Jun-Yan Zhu
Richard Zhang
TDI
209
3
0
21 Feb 2025
Spectral-factorized Positive-definite Curvature Learning for NN Training
Wu Lin
Felix Dangel
Runa Eschenhagen
Juhan Bae
Richard E. Turner
Roger B. Grosse
175
0
0
10 Feb 2025
Diagonal Symmetrization of Neural Network Solvers for the Many-Electron Schrödinger Equation
Kevin Han Huang
Ni Zhan
Elif Ertekin
Peter Orbanz
Ryan P. Adams
86
0
0
07 Feb 2025
Is attention all you need to solve the correlated electron problem?
Max Geier
Khachatur Nazaryan
Timothy Zaklama
Liang Fu
117
3
0
07 Feb 2025
Optimal Subspace Inference for the Laplace Approximation of Bayesian Neural Networks
Josua Faller
Jörg Martin
BDL
163
0
0
04 Feb 2025
Online Curvature-Aware Replay: Leveraging 2nd Order Information for Online Continual Learning
Edoardo Urettini
Antonio Carta
102
0
0
03 Feb 2025
Gradient Alignment in Physics-informed Neural Networks: A Second-Order Optimization Perspective
Sizhuang He
Ananyae Kumar Bhartari
Bowen Li
P. Perdikaris
PINN
137
12
0
02 Feb 2025
Position: Curvature Matrices Should Be Democratized via Linear Operators
Felix Dangel
Runa Eschenhagen
Weronika Ormaniec
Andres Fernandez
Lukas Tatzel
Agustinus Kristiadi
135
4
0
31 Jan 2025
Understanding Why Adam Outperforms SGD: Gradient Heterogeneity in Transformers
Akiyoshi Tomihari
Issei Sato
ODL
171
3
0
31 Jan 2025
Towards Scalable and Stable Parallelization of Nonlinear RNNs
Xavier Gonzalez
Andrew Warrington
Jimmy T.H. Smith
Scott W. Linderman
290
11
0
17 Jan 2025
Incrementally Learning Multiple Diverse Data Domains via Multi-Source Dynamic Expansion Model
RunQing Wu
Fei Ye
QiHe Liu
Guoxi Huang
Jinyu Guo
Rongyao Hu
CLL
433
0
0
15 Jan 2025
Knowledge Distillation with Adapted Weight
Sirong Wu
Xi Luo
Junjie Liu
Yuhui Deng
136
0
0
06 Jan 2025
Functional Risk Minimization
Ferran Alet
Clement Gehring
Tomás Lozano-Pérez
Kenji Kawaguchi
Joshua B. Tenenbaum
Leslie Pack Kaelbling
OffRL
130
0
0
31 Dec 2024
Grams: Gradient Descent with Adaptive Momentum Scaling
Yang Cao
Xiaoyu Li
Zhao Song
ODL
213
3
0
22 Dec 2024
Holistic Adversarially Robust Pruning
Qi Zhao
Christian Wressnegger
133
10
0
19 Dec 2024
Krony-PT: GPT2 compressed with Kronecker Products
M. Ayoub Ben Ayad
Jelena Mitrović
Michael Granitzer
102
0
0
16 Dec 2024
ANaGRAM: A Natural Gradient Relative to Adapted Model for efficient PINNs learning
Nilo Schwencke
Cyril Furtlehner
167
1
0
14 Dec 2024
Streamlining Prediction in Bayesian Deep Learning
Marcus Klasson
Talal Alrawajfeh
Mikko Heikkilä
Martin Trapp
UQCV
BDL
234
2
0
27 Nov 2024
Cautious Optimizers: Improving Training with One Line of Code
Kaizhao Liang
Lizhang Chen
B. Liu
Qiang Liu
ODL
261
9
0
25 Nov 2024
Don't Be So Positive: Negative Step Sizes in Second-Order Methods
Betty Shea
Mark Schmidt
ODL
60
1
0
18 Nov 2024
A Natural Primal-Dual Hybrid Gradient Method for Adversarial Neural Network Training on Solving Partial Differential Equations
Shu Liu
Stanley Osher
Wuchen Li
92
1
0
09 Nov 2024
A Bayesian Approach to Data Point Selection
Xinnuo Xu
Minyoung Kim
Royson Lee
Brais Martínez
Timothy M. Hospedales
92
0
0
06 Nov 2024
Stein Variational Newton Neural Network Ensembles
Klemens Flöge
Mohammed Abdul Moeed
Vincent Fortuin
BDL
UQCV
117
0
0
04 Nov 2024
Local Loss Optimization in the Infinite Width: Stable Parameterization of Predictive Coding Networks and Target Propagation
Satoki Ishikawa
Rio Yokota
Ryo Karakida
131
0
0
04 Nov 2024
Theoretical characterisation of the Gauss-Newton conditioning in Neural Networks
Jim Zhao
Sidak Pal Singh
Aurelien Lucchi
AI4CE
150
0
0
04 Nov 2024
Data movement limits to frontier model training
Ege Erdil
David Schneider-Joseph
92
1
0
02 Nov 2024
TrAct: Making First-layer Pre-Activations Trainable
Felix Petersen
Christian Borgelt
Stefano Ermon
68
0
0
31 Oct 2024
Fast Deep Hedging with Second-Order Optimization
Konrad Mueller
Amira Akkari
Lukas Gonon
Ben Wood
ODL
115
0
0
29 Oct 2024
LoRA Done RITE: Robust Invariant Transformation Equilibration for LoRA Optimization
Jui-Nan Yen
Si Si
Zhao Meng
Felix X. Yu
Sai Surya Duvvuri
Inderjit Dhillon
Cho-Jui Hsieh
Sanjiv Kumar
75
5
0
27 Oct 2024
Bayes without Underfitting: Fully Correlated Deep Learning Posteriors via Alternating Projections
M. Miani
Hrittik Roy
Søren Hauberg
UQCV
BDL
114
0
0
22 Oct 2024
Natural GaLore: Accelerating GaLore for memory-efficient LLM Training and Fine-tuning
Arijit Das
36
2
0
21 Oct 2024
Streaming Deep Reinforcement Learning Finally Works
Mohamed Elsayed
Gautham Vasan
A. R. Mahmood
OffRL
116
6
0
18 Oct 2024
Debiasing Mini-Batch Quadratics for Applications in Deep Learning
Lukas Tatzel
Bálint Mucsányi
Osane Hackel
Philipp Hennig
115
0
0
18 Oct 2024
Influence Functions for Scalable Data Attribution in Diffusion Models
Bruno Mlodozeniec
Runa Eschenhagen
Juhan Bae
Alexander Immer
David Krueger
Richard E. Turner
DiffM
TDI
164
7
0
17 Oct 2024
Second-Order Min-Max Optimization with Lazy Hessians
Lesi Chen
Chengchang Liu
Jingzhao Zhang
120
3
0
12 Oct 2024
Adversarial Vulnerability as a Consequence of On-Manifold Inseparibility
Rajdeep Haldar
Yue Xing
Qifan Song
Guang Lin
56
0
0
09 Oct 2024
Efficient Weight-Space Laplace-Gaussian Filtering and Smoothing for Sequential Deep Learning
Joanna Sliwa
Frank Schneider
Nathanael Bosch
Agustinus Kristiadi
Philipp Hennig
BDL
CLL
80
2
0
09 Oct 2024
A second-order-like optimizer with adaptive gradient scaling for deep learning
Jérôme Bolte
Ryan Boustany
Edouard Pauwels
Andrei Purica
ODL
67
0
0
08 Oct 2024
Reducing Variance in Meta-Learning via Laplace Approximation for Regression Tasks
Alfredo Reichlin
Gustaf Tegnér
Miguel Vasco
Hang Yin
Mårten Björkman
Danica Kragic
82
0
0
02 Oct 2024
Fisher Information-based Efficient Curriculum Federated Learning with Large Language Models
Ji Liu
Jiaxiang Ren
Ruoming Jin
Zijie Zhang
Yang Zhou
P. Valduriez
Dejing Dou
FedML
91
6
0
30 Sep 2024
Unified Gradient-Based Machine Unlearning with Remain Geometry Enhancement
Zhehao Huang
Xinwen Cheng
Jinghao Zheng
Haoran Wang
Zhengbao He
Tao Li
Xiaolin Huang
MU
110
9
0
29 Sep 2024
Harnessing Diversity for Important Data Selection in Pretraining Large Language Models
Chi Zhang
Huaping Zhong
Kuan Zhang
Chengliang Chai
Rui Wang
...
Lei Cao
Ju Fan
Ye Yuan
Guoren Wang
Conghui He
TDI
105
10
0
25 Sep 2024
Is All Learning (Natural) Gradient Descent?
Lucas Shoji
Kenta Suzuki
Leo Kozachkov
52
1
0
24 Sep 2024
Approximated Orthogonal Projection Unit: Stabilizing Regression Network Training Using Natural Gradient
Shaoqi Wang
Chunjie Yang
Siwei Lou
47
1
0
23 Sep 2024
SOAP: Improving and Stabilizing Shampoo using Adam
Nikhil Vyas
Depen Morwani
Rosie Zhao
Itai Shapira
David Brandfonbrener
Lucas Janson
Sham Kakade
167
38
0
17 Sep 2024
Highly Accurate Real-space Electron Densities with Neural Networks
Lixue Cheng
P. Szabó
Zeno Schätzle
Derk Kooi
Jonas Köhler
Klaas J. H. Giesbertz
Frank Noé
J. Hermann
Paola Gori-Giorgi
Adam Foster
86
7
0
02 Sep 2024
Simultaneous Training of First- and Second-Order Optimizers in Population-Based Reinforcement Learning
Felix Pfeiffer
Shahram Eivazi
62
0
0
27 Aug 2024
Memory-Efficient LLM Training with Online Subspace Descent
Kaizhao Liang
Bo Liu
Lizhang Chen
Qiang Liu
75
15
0
23 Aug 2024
Tracing Privacy Leakage of Language Models to Training Data via Adjusted Influence Functions
Jinxin Liu
Zao Yang
53
1
0
20 Aug 2024
Narrowing the Focus: Learned Optimizers for Pretrained Models
Gus Kristiansen
Mark Sandler
A. Zhmoginov
Nolan Miller
Anirudh Goyal
Jihwan Lee
Max Vladymyrov
90
1
0
17 Aug 2024