1602.01407
Cited By
A Kronecker-factored approximate Fisher matrix for convolution layers
3 February 2016
Roger C. Grosse
James Martens
ODL
Papers citing
"A Kronecker-factored approximate Fisher matrix for convolution layers"
47 / 47 papers shown
FedBEns: One-Shot Federated Learning based on Bayesian Ensemble
Jacopo Talpini
Marco Savi
Giovanni Neglia
FedML
Presented at ResearchTrend Connect | FedML on 07 May 2025
76
0
0
19 Mar 2025
Position: Curvature Matrices Should Be Democratized via Linear Operators
Felix Dangel
Runa Eschenhagen
Weronika Ormaniec
Andres Fernandez
Lukas Tatzel
Agustinus Kristiadi
58
3
0
31 Jan 2025
ANaGRAM: A Natural Gradient Relative to Adapted Model for efficient PINNs learning
Nilo Schwencke
Cyril Furtlehner
64
1
0
14 Dec 2024
Debiasing Mini-Batch Quadratics for Applications in Deep Learning
Lukas Tatzel
Bálint Mucsányi
Osane Hackel
Philipp Hennig
43
0
0
18 Oct 2024
Influence Functions for Scalable Data Attribution in Diffusion Models
Bruno Mlodozeniec
Runa Eschenhagen
Juhan Bae
Alexander Immer
David Krueger
Richard E. Turner
TDI
DiffM
75
4
0
17 Oct 2024
Second-Order Min-Max Optimization with Lazy Hessians
Lesi Chen
Chengchang Liu
Jingzhao Zhang
43
1
0
12 Oct 2024
Scalable Bayesian Learning with posteriors
Samuel Duffield
Kaelan Donatella
Johnathan Chiu
Phoebe Klett
Daniel Simpson
BDL
UQCV
62
1
0
31 May 2024
AdaFisher: Adaptive Second Order Optimization via Fisher Information
Damien Martins Gomes
Yanlei Zhang
Eugene Belilovsky
Guy Wolf
Mahdi S. Hosseini
ODL
76
2
0
26 May 2024
Structured Inverse-Free Natural Gradient: Memory-Efficient & Numerically-Stable KFAC
Wu Lin
Felix Dangel
Runa Eschenhagen
Kirill Neklyudov
Agustinus Kristiadi
Richard E. Turner
Alireza Makhzani
22
3
0
09 Dec 2023
Eva: A General Vectorized Approximation Framework for Second-order Optimization
Lin Zhang
S. Shi
Bo-wen Li
15
1
0
04 Aug 2023
Modify Training Directions in Function Space to Reduce Generalization Error
Yi Yu
Wenlian Lu
Boyu Chen
21
0
0
25 Jul 2023
Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training
Hong Liu
Zhiyuan Li
David Leo Wright Hall
Percy Liang
Tengyu Ma
VLM
32
128
0
23 May 2023
ISAAC Newton: Input-based Approximate Curvature for Newton's Method
Felix Petersen
Tobias Sutter
Christian Borgelt
Dongsung Huh
Hilde Kuehne
Yuekai Sun
Oliver Deussen
ODL
31
5
0
01 May 2023
Efficient Activation Function Optimization through Surrogate Modeling
G. Bingham
Risto Miikkulainen
16
2
0
13 Jan 2023
Brand New K-FACs: Speeding up K-FAC with Online Decomposition Updates
C. Puiu
14
2
0
16 Oct 2022
Component-Wise Natural Gradient Descent -- An Efficient Neural Network Optimization
Tran van Sang
Mhd Irvan
R. Yamaguchi
Toshiyuki Nakata
11
1
0
11 Oct 2022
Make Sharpness-Aware Minimization Stronger: A Sparsified Perturbation Approach
Peng Mi
Li Shen
Tianhe Ren
Yiyi Zhou
Xiaoshuai Sun
Rongrong Ji
Dacheng Tao
AAML
27
69
0
11 Oct 2022
Scalable K-FAC Training for Deep Neural Networks with Distributed Preconditioning
Lin Zhang
S. Shi
Wei Wang
Bo-wen Li
28
10
0
30 Jun 2022
Information Geometry of Dropout Training
Masanari Kimura
H. Hino
11
2
0
22 Jun 2022
Debugging using Orthogonal Gradient Descent
Narsimha Chilkuri
C. Eliasmith
21
1
0
17 Jun 2022
Amortized Proximal Optimization
Juhan Bae
Paul Vicol
Jeff Z. HaoChen
Roger C. Grosse
ODL
25
14
0
28 Feb 2022
Invariance Learning in Deep Neural Networks with Differentiable Laplace Approximations
Alexander Immer
Tycho F. A. van der Ouderaa
Gunnar Rätsch
Vincent Fortuin
Mark van der Wilk
BDL
31
44
0
22 Feb 2022
A Geometric Understanding of Natural Gradient
Qinxun Bai
S. Rosenberg
Wei Xu
19
2
0
13 Feb 2022
Gradient Descent on Neurons and its Link to Approximate Second-Order Optimization
Frederik Benzing
ODL
37
23
0
28 Jan 2022
Merging Models with Fisher-Weighted Averaging
Michael Matena
Colin Raffel
FedML
MoMe
27
351
0
18 Nov 2021
Kronecker Factorization for Preventing Catastrophic Forgetting in Large-scale Medical Entity Linking
Denis Jered McInerney
Luyang Kong
Kristjan Arumae
Byron C. Wallace
Parminder Bhatia
CLL
24
1
0
11 Nov 2021
Large-Scale Deep Learning Optimizations: A Comprehensive Survey
Xiaoxin He
Fuzhao Xue
Xiaozhe Ren
Yang You
24
14
0
01 Nov 2021
Nys-Newton: Nyström-Approximated Curvature for Stochastic Optimization
Dinesh Singh
Hardik Tankaria
M. Yamada
ODL
32
2
0
16 Oct 2021
Accelerating Distributed K-FAC with Smart Parallelism of Computing and Communication Tasks
S. Shi
Lin Zhang
Bo-wen Li
26
9
0
14 Jul 2021
M-FAC: Efficient Matrix-Free Approximations of Second-Order Information
Elias Frantar
Eldar Kurtic
Dan Alistarh
13
57
0
07 Jul 2021
A Survey of Uncertainty in Deep Neural Networks
J. Gawlikowski
Cedrique Rovile Njieutcheu Tassi
Mohsin Ali
Jongseo Lee
Matthias Humt
...
R. Roscher
Muhammad Shahzad
Wen Yang
R. Bamler
Xiaoxiang Zhu
BDL
UQCV
OOD
32
1,109
0
07 Jul 2021
Robust Out-of-Distribution Detection on Deep Probabilistic Generative Models
Jaemoo Choi
Changyeon Yoon
Jeongwoo Bae
Myung-joo Kang
OODD
27
4
0
15 Jun 2021
TENGraD: Time-Efficient Natural Gradient Descent with Exact Fisher-Block Inversion
Saeed Soori
Bugra Can
Baourun Mu
Mert Gurbuzbalaban
M. Dehnavi
21
10
0
07 Jun 2021
LQF: Linear Quadratic Fine-Tuning
Alessandro Achille
Aditya Golatkar
Avinash Ravichandran
M. Polito
Stefano Soatto
19
27
0
21 Dec 2020
A Trace-restricted Kronecker-Factored Approximation to Natural Gradient
Kai-Xin Gao
Xiaolei Liu
Zheng-Hai Huang
Min Wang
Zidong Wang
Dachuan Xu
F. Yu
24
11
0
21 Nov 2020
Transform Quantization for CNN (Convolutional Neural Network) Compression
Sean I. Young
Wang Zhe
David S. Taubman
B. Girod
MQ
29
69
0
02 Sep 2020
Whitening and second order optimization both make information in the dataset unusable during training, and can reduce or prevent generalization
Neha S. Wadia
Daniel Duckworth
S. Schoenholz
Ethan Dyer
Jascha Narain Sohl-Dickstein
21
13
0
17 Aug 2020
A Differential Game Theoretic Neural Optimizer for Training Residual Networks
Guan-Horng Liu
T. Chen
Evangelos A. Theodorou
21
2
0
17 Jul 2020
When Does Preconditioning Help or Hurt Generalization?
S. Amari
Jimmy Ba
Roger C. Grosse
Xuechen Li
Atsushi Nitanda
Taiji Suzuki
Denny Wu
Ji Xu
34
32
0
18 Jun 2020
Continual Learning with Extended Kronecker-factored Approximate Curvature
Janghyeon Lee
H. Hong
Donggyu Joo
Junmo Kim
CLL
20
52
0
16 Apr 2020
Gram-Gauss-Newton Method: Learning Overparameterized Neural Networks for Regression Problems
Tianle Cai
Ruiqi Gao
Jikai Hou
Siyu Chen
Dong Wang
Di He
Zhihua Zhang
Liwei Wang
ODL
21
57
0
28 May 2019
An Empirical Study of Large-Batch Stochastic Gradient Descent with Structured Covariance Noise
Yeming Wen
Kevin Luk
Maxime Gazeau
Guodong Zhang
Harris Chan
Jimmy Ba
ODL
20
22
0
21 Feb 2019
Fisher Information and Natural Gradient Learning of Random Deep Networks
S. Amari
Ryo Karakida
Masafumi Oizumi
14
34
0
22 Aug 2018
Fast Approximate Natural Gradient Descent in a Kronecker-factored Eigenbasis
Thomas George
César Laurent
Xavier Bouthillier
Nicolas Ballas
Pascal Vincent
ODL
26
150
0
11 Jun 2018
Meta-Learning with Hessian-Free Approach in Deep Neural Nets Training
Boyu Chen
Wenlian Lu
Ernest Fokoue
21
1
0
22 May 2018
Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation
Yuhuai Wu
Elman Mansimov
Shun Liao
Roger C. Grosse
Jimmy Ba
OffRL
20
622
0
17 Aug 2017
Kronecker Recurrent Units
C. Jose
Moustapha Cissé
F. Fleuret
ODL
24
45
0
29 May 2017