1503.05671
v7 (latest)
Optimizing Neural Networks with Kronecker-factored Approximate Curvature
19 March 2015
James Martens
Roger C. Grosse
ODL
Papers citing "Optimizing Neural Networks with Kronecker-factored Approximate Curvature"
50 / 645 papers shown
Title
Tradeoffs of Diagonal Fisher Information Matrix Estimators
Alexander Soen
Ke Sun
85
3
0
08 Feb 2024
Curvature-Informed SGD via General Purpose Lie-Group Preconditioners
Omead Brandon Pooladzandi
Xi-Lin Li
88
8
0
07 Feb 2024
Can We Remove the Square-Root in Adaptive Gradient Methods? A Second-Order Perspective
Wu Lin
Felix Dangel
Runa Eschenhagen
Juhan Bae
Richard Turner
Alireza Makhzani
ODL
163
13
0
05 Feb 2024
Ginger: An Efficient Curvature Approximation with Linear Complexity for General Neural Networks
Yongchang Hao
Yanshuai Cao
Lili Mou
ODL
57
1
0
05 Feb 2024
Neglected Hessian component explains mysteries in Sharpness regularization
Yann N. Dauphin
Atish Agarwala
Hossein Mobahi
FAtt
124
7
0
19 Jan 2024
A Kaczmarz-inspired approach to accelerate the optimization of neural network wavefunctions
Gil Goldshlager
Nilin Abrahamsen
Lin Lin
106
14
0
18 Jan 2024
The LLM Surgeon
Tycho F. A. van der Ouderaa
Markus Nagel
M. V. Baalen
Yuki Markus Asano
Tijmen Blankevoort
114
18
0
28 Dec 2023
On the Parameterization of Second-Order Optimization Effective Towards the Infinite Width
Satoki Ishikawa
Ryo Karakida
110
2
0
19 Dec 2023
Unveiling Empirical Pathologies of Laplace Approximation for Uncertainty Estimation
Maksim Zhdanov
Stanislav Dereka
Sergey Kolesnikov
21
0
0
16 Dec 2023
Structured Inverse-Free Natural Gradient: Memory-Efficient & Numerically-Stable KFAC
Wu Lin
Felix Dangel
Runa Eschenhagen
Kirill Neklyudov
Agustinus Kristiadi
Richard Turner
Alireza Makhzani
65
4
0
09 Dec 2023
Merging by Matching Models in Task Parameter Subspaces
Derek Tam
Mohit Bansal
Colin Raffel
MoMe
109
12
0
07 Dec 2023
Adapting Newton's Method to Neural Networks through a Summary of Higher-Order Derivatives
Pierre Wolinski
ODL
161
0
0
06 Dec 2023
Adaptive Step Sizes for Preconditioned Stochastic Gradient Descent
Frederik Köhne
Leonie Kreis
Anton Schiela
Roland A. Herzog
89
2
0
28 Nov 2023
Frobenius-Type Norms and Inner Products of Matrices and Linear Maps with Applications to Neural Network Training
Roland A. Herzog
Frederik Köhne
Leonie Kreis
Anton Schiela
18
4
0
26 Nov 2023
Leveraging Function Space Aggregation for Federated Learning at Scale
Nikita Dhawan
Nicole Mitchell
Zachary B. Charles
Zachary Garrett
Gintare Karolina Dziugaite
FedML
84
3
0
17 Nov 2023
A Computationally Efficient Sparsified Online Newton Method
Fnu Devvrit
Sai Surya Duvvuri
Rohan Anil
Vineet Gupta
Cho-Jui Hsieh
Inderjit Dhillon
53
0
0
16 Nov 2023
Riemannian Laplace Approximation with the Fisher Metric
Hanlin Yu
Marcelo Hartmann
Bernardo Williams
Mark Girolami
Arto Klami
109
3
0
05 Nov 2023
Simplifying Transformer Blocks
Bobby He
Thomas Hofmann
109
36
0
03 Nov 2023
Kronecker-Factored Approximate Curvature for Modern Neural Network Architectures
Runa Eschenhagen
Alexander Immer
Richard Turner
Frank Schneider
Philipp Hennig
135
24
0
01 Nov 2023
Efficient Numerical Algorithm for Large-Scale Damped Natural Gradient Descent
Yixiao Chen
Hao Xie
Han Wang
13
2
0
26 Oct 2023
Studying K-FAC Heuristics by Viewing Adam through a Second-Order Lens
Ross M. Clarke
José Miguel Hernández-Lobato
123
2
0
23 Oct 2023
Series of Hessian-Vector Products for Tractable Saddle-Free Newton Optimisation of Neural Networks
E. T. Oldewage
Ross M. Clarke
José Miguel Hernández-Lobato
ODL
54
1
0
23 Oct 2023
Jorge: Approximate Preconditioning for GPU-efficient Second-order Optimization
Siddharth Singh
Zack Sating
A. Bhatele
ODL
72
0
0
18 Oct 2023
Optimising Distributions with Natural Gradient Surrogates
Jonathan So
Richard Turner
43
1
0
18 Oct 2023
Neural Harmonium: An Interpretable Deep Structure for Nonlinear Dynamic System Identification with Application to Audio Processing
Karim Helwani
Erfan Soltanmohammadi
Michael M. Goodwin
55
0
0
10 Oct 2023
Learning Layer-wise Equivariances Automatically using Gradients
Tycho F. A. van der Ouderaa
Alexander Immer
Mark van der Wilk
MLT
108
14
0
09 Oct 2023
A Meta-Learning Perspective on Transformers for Causal Language Modeling
Xinbo Wu
Lav Varshney
80
7
0
09 Oct 2023
FedLPA: One-shot Federated Learning with Layer-Wise Posterior Aggregation
Xiang Liu
Liangxi Liu
Feiyang Ye
Yunheng Shen
Xia Li
Linshan Jiang
Jialin Li
106
6
0
30 Sep 2023
On the Disconnect Between Theory and Practice of Neural Networks: Limits of the NTK Perspective
Jonathan Wenger
Felix Dangel
Agustinus Kristiadi
99
0
0
29 Sep 2023
Bringing the Discussion of Minima Sharpness to the Audio Domain: a Filter-Normalised Evaluation for Acoustic Scene Classification
M. Milling
Andreas Triantafyllopoulos
Iosif Tsangko
Simon Rampp
F. Schlüter
118
3
0
28 Sep 2023
A Primer on Bayesian Neural Networks: Review and Debates
Federico Danieli
Konstantinos Pitas
M. Vladimirova
Vincent Fortuin
BDL
AAML
105
20
0
28 Sep 2023
A Theoretical and Empirical Study on the Convergence of Adam with an "Exact" Constant Step Size in Non-Convex Settings
Alokendu Mazumder
Rishabh Sabharwal
Manan Tayal
Bhartendu Kumar
Punit Rathore
46
0
0
15 Sep 2023
A Distributed Data-Parallel PyTorch Implementation of the Distributed Shampoo Optimizer for Training Neural Networks At-Scale
Hao-Jun Michael Shi
Tsung-Hsien Lee
Shintaro Iwasaki
Jose Gallego-Posada
Zhijing Li
Kaushik Rangadurai
Dheevatsa Mudigere
Michael Rabbat
ODL
98
27
0
12 Sep 2023
The fine print on tempered posteriors
Konstantinos Pitas
Julyan Arbel
72
1
0
11 Sep 2023
CoLA: Exploiting Compositional Structure for Automatic and Efficient Numerical Linear Algebra
Andres Potapczynski
Marc Finzi
Geoff Pleiss
Andrew Gordon Wilson
57
9
0
06 Sep 2023
Incorporating Neuro-Inspired Adaptability for Continual Learning in Artificial Intelligence
Liyuan Wang
Xingxing Zhang
Qian Li
Mingtian Zhang
Hang Su
Jun Zhu
Yi Zhong
95
57
0
29 Aug 2023
Towards Accelerated Model Training via Bayesian Data Selection
Zhijie Deng
Peng Cui
Jun Zhu
89
5
0
21 Aug 2023
Dual Gauss-Newton Directions for Deep Learning
Vincent Roulet
Mathieu Blondel
ODL
54
0
0
17 Aug 2023
Eva: A General Vectorized Approximation Framework for Second-order Optimization
Lin Zhang
Shaoshuai Shi
Yue Liu
79
1
0
04 Aug 2023
mL-BFGS: A Momentum-based L-BFGS for Distributed Large-Scale Neural Network Optimization
Yue Niu
Zalan Fabian
Sunwoo Lee
Mahdi Soltanolkotabi
Salman Avestimehr
ODL
34
2
0
25 Jul 2023
Modify Training Directions in Function Space to Reduce Generalization Error
Yi Yu
Wenlian Lu
Boyu Chen
78
0
0
25 Jul 2023
Variational Monte Carlo on a Budget -- Fine-tuning pre-trained Neural Wavefunctions
Michael Scherbela
Leon Gerard
Philipp Grohs
68
7
0
15 Jul 2023
Learning Expressive Priors for Generalization and Uncertainty Estimation in Neural Networks
Dominik Schnaus
Jongseok Lee
Zorah Lähner
Rudolph Triebel
UQCV
BDL
78
1
0
15 Jul 2023
Robust scalable initialization for Bayesian variational inference with multi-modal Laplace approximations
Wyatt Bridgman
Reese E. Jones
Mohammad Khalil
66
1
0
12 Jul 2023
Self-Expanding Neural Networks
Rupert Mitchell
Robin Menzenbach
Kristian Kersting
Martin Mundt
110
9
0
10 Jul 2023
Wasserstein Quantum Monte Carlo: A Novel Approach for Solving the Quantum Many-Body Schrödinger Equation
Kirill Neklyudov
J. Nys
Luca Thiede
Juan Carrasquilla
Qiang Liu
Max Welling
Alireza Makhzani
44
13
0
06 Jul 2023
Systematic Investigation of Sparse Perturbed Sharpness-Aware Minimization Optimizer
Peng Mi
Li Shen
Tianhe Ren
Yiyi Zhou
Tianshuo Xu
Xiaoshuai Sun
Tongliang Liu
Rongrong Ji
Dacheng Tao
AAML
73
2
0
30 Jun 2023
Efficient Backdoor Removal Through Natural Gradient Fine-tuning
Nazmul Karim
Abdullah Al Arafat
Umar Khalid
Zhishan Guo
Naznin Rahnavard
AAML
68
1
0
30 Jun 2023
Riemannian Laplace approximations for Bayesian neural networks
Federico Bergamin
Pablo Moreno-Muñoz
Søren Hauberg
Georgios Arvanitidis
BDL
81
7
0
12 Jun 2023
Error Feedback Can Accurately Compress Preconditioners
Ionut-Vlad Modoranu
A. Kalinov
Eldar Kurtic
Elias Frantar
Dan Alistarh
ODL
107
5
0
09 Jun 2023