ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1602.01407
  4. Cited By
A Kronecker-factored approximate Fisher matrix for convolution layers

A Kronecker-factored approximate Fisher matrix for convolution layers

3 February 2016
Roger C. Grosse
James Martens
    ODL
ArXivPDFHTML

Papers citing "A Kronecker-factored approximate Fisher matrix for convolution layers"

47 / 47 papers shown
Title
FedBEns: One-Shot Federated Learning based on Bayesian Ensemble
FedBEns: One-Shot Federated Learning based on Bayesian Ensemble
Jacopo Talpini
Marco Savi
Giovanni Neglia
FedML
Presented at ResearchTrend Connect | FedML on 07 May 2025
76
0
0
19 Mar 2025
Position: Curvature Matrices Should Be Democratized via Linear Operators
Position: Curvature Matrices Should Be Democratized via Linear Operators
Felix Dangel
Runa Eschenhagen
Weronika Ormaniec
Andres Fernandez
Lukas Tatzel
Agustinus Kristiadi
58
3
0
31 Jan 2025
ANaGRAM: A Natural Gradient Relative to Adapted Model for efficient PINNs learning
ANaGRAM: A Natural Gradient Relative to Adapted Model for efficient PINNs learning
Nilo Schwencke
Cyril Furtlehner
64
1
0
14 Dec 2024
Debiasing Mini-Batch Quadratics for Applications in Deep Learning
Debiasing Mini-Batch Quadratics for Applications in Deep Learning
Lukas Tatzel
Bálint Mucsányi
Osane Hackel
Philipp Hennig
43
0
0
18 Oct 2024
Influence Functions for Scalable Data Attribution in Diffusion Models
Influence Functions for Scalable Data Attribution in Diffusion Models
Bruno Mlodozeniec
Runa Eschenhagen
Juhan Bae
Alexander Immer
David Krueger
Richard E. Turner
TDI
DiffM
75
4
0
17 Oct 2024
Second-Order Min-Max Optimization with Lazy Hessians
Second-Order Min-Max Optimization with Lazy Hessians
Lesi Chen
Chengchang Liu
Jingzhao Zhang
43
1
0
12 Oct 2024
Scalable Bayesian Learning with posteriors
Scalable Bayesian Learning with posteriors
Samuel Duffield
Kaelan Donatella
Johnathan Chiu
Phoebe Klett
Daniel Simpson
BDL
UQCV
62
1
0
31 May 2024
AdaFisher: Adaptive Second Order Optimization via Fisher Information
AdaFisher: Adaptive Second Order Optimization via Fisher Information
Damien Martins Gomes
Yanlei Zhang
Eugene Belilovsky
Guy Wolf
Mahdi S. Hosseini
ODL
76
2
0
26 May 2024
Structured Inverse-Free Natural Gradient: Memory-Efficient &
  Numerically-Stable KFAC
Structured Inverse-Free Natural Gradient: Memory-Efficient & Numerically-Stable KFAC
Wu Lin
Felix Dangel
Runa Eschenhagen
Kirill Neklyudov
Agustinus Kristiadi
Richard Turner
Alireza Makhzani
22
3
0
09 Dec 2023
Eva: A General Vectorized Approximation Framework for Second-order
  Optimization
Eva: A General Vectorized Approximation Framework for Second-order Optimization
Lin Zhang
S. Shi
Bo-wen Li
28
1
0
04 Aug 2023
Modify Training Directions in Function Space to Reduce Generalization
  Error
Modify Training Directions in Function Space to Reduce Generalization Error
Yi Yu
Wenlian Lu
Boyu Chen
24
0
0
25 Jul 2023
Sophia: A Scalable Stochastic Second-order Optimizer for Language Model
  Pre-training
Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training
Hong Liu
Zhiyuan Li
David Leo Wright Hall
Percy Liang
Tengyu Ma
VLM
46
128
0
23 May 2023
ISAAC Newton: Input-based Approximate Curvature for Newton's Method
ISAAC Newton: Input-based Approximate Curvature for Newton's Method
Felix Petersen
Tobias Sutter
Christian Borgelt
Dongsung Huh
Hilde Kuehne
Yuekai Sun
Oliver Deussen
ODL
31
5
0
01 May 2023
Efficient Activation Function Optimization through Surrogate Modeling
Efficient Activation Function Optimization through Surrogate Modeling
G. Bingham
Risto Miikkulainen
16
2
0
13 Jan 2023
Brand New K-FACs: Speeding up K-FAC with Online Decomposition Updates
Brand New K-FACs: Speeding up K-FAC with Online Decomposition Updates
C. Puiu
14
2
0
16 Oct 2022
Component-Wise Natural Gradient Descent -- An Efficient Neural Network
  Optimization
Component-Wise Natural Gradient Descent -- An Efficient Neural Network Optimization
Tran van Sang
Mhd Irvan
R. Yamaguchi
Toshiyuki Nakata
13
1
0
11 Oct 2022
Make Sharpness-Aware Minimization Stronger: A Sparsified Perturbation
  Approach
Make Sharpness-Aware Minimization Stronger: A Sparsified Perturbation Approach
Peng Mi
Li Shen
Tianhe Ren
Yiyi Zhou
Xiaoshuai Sun
Rongrong Ji
Dacheng Tao
AAML
27
69
0
11 Oct 2022
Scalable K-FAC Training for Deep Neural Networks with Distributed
  Preconditioning
Scalable K-FAC Training for Deep Neural Networks with Distributed Preconditioning
Lin Zhang
S. Shi
Wei Wang
Bo-wen Li
36
10
0
30 Jun 2022
Information Geometry of Dropout Training
Information Geometry of Dropout Training
Masanari Kimura
H. Hino
14
2
0
22 Jun 2022
Debugging using Orthogonal Gradient Descent
Debugging using Orthogonal Gradient Descent
Narsimha Chilkuri
C. Eliasmith
21
1
0
17 Jun 2022
Amortized Proximal Optimization
Amortized Proximal Optimization
Juhan Bae
Paul Vicol
Jeff Z. HaoChen
Roger C. Grosse
ODL
25
14
0
28 Feb 2022
Invariance Learning in Deep Neural Networks with Differentiable Laplace
  Approximations
Invariance Learning in Deep Neural Networks with Differentiable Laplace Approximations
Alexander Immer
Tycho F. A. van der Ouderaa
Gunnar Rätsch
Vincent Fortuin
Mark van der Wilk
BDL
36
44
0
22 Feb 2022
A Geometric Understanding of Natural Gradient
A Geometric Understanding of Natural Gradient
Qinxun Bai
S. Rosenberg
Wei Xu
21
2
0
13 Feb 2022
Gradient Descent on Neurons and its Link to Approximate Second-Order
  Optimization
Gradient Descent on Neurons and its Link to Approximate Second-Order Optimization
Frederik Benzing
ODL
40
23
0
28 Jan 2022
Merging Models with Fisher-Weighted Averaging
Merging Models with Fisher-Weighted Averaging
Michael Matena
Colin Raffel
FedML
MoMe
29
351
0
18 Nov 2021
Kronecker Factorization for Preventing Catastrophic Forgetting in
  Large-scale Medical Entity Linking
Kronecker Factorization for Preventing Catastrophic Forgetting in Large-scale Medical Entity Linking
Denis Jered McInerney
Luyang Kong
Kristjan Arumae
Byron C. Wallace
Parminder Bhatia
CLL
24
1
0
11 Nov 2021
Large-Scale Deep Learning Optimizations: A Comprehensive Survey
Large-Scale Deep Learning Optimizations: A Comprehensive Survey
Xiaoxin He
Fuzhao Xue
Xiaozhe Ren
Yang You
24
14
0
01 Nov 2021
Nys-Newton: Nyström-Approximated Curvature for Stochastic Optimization
Nys-Newton: Nyström-Approximated Curvature for Stochastic Optimization
Dinesh Singh
Hardik Tankaria
M. Yamada
ODL
42
2
0
16 Oct 2021
Accelerating Distributed K-FAC with Smart Parallelism of Computing and
  Communication Tasks
Accelerating Distributed K-FAC with Smart Parallelism of Computing and Communication Tasks
S. Shi
Lin Zhang
Bo-wen Li
40
9
0
14 Jul 2021
M-FAC: Efficient Matrix-Free Approximations of Second-Order Information
M-FAC: Efficient Matrix-Free Approximations of Second-Order Information
Elias Frantar
Eldar Kurtic
Dan Alistarh
13
57
0
07 Jul 2021
A Survey of Uncertainty in Deep Neural Networks
A Survey of Uncertainty in Deep Neural Networks
J. Gawlikowski
Cedrique Rovile Njieutcheu Tassi
Mohsin Ali
Jongseo Lee
Matthias Humt
...
R. Roscher
Muhammad Shahzad
Wen Yang
R. Bamler
Xiaoxiang Zhu
BDL
UQCV
OOD
35
1,109
0
07 Jul 2021
Robust Out-of-Distribution Detection on Deep Probabilistic Generative
  Models
Robust Out-of-Distribution Detection on Deep Probabilistic Generative Models
Jaemoo Choi
Changyeon Yoon
Jeongwoo Bae
Myung-joo Kang
OODD
30
4
0
15 Jun 2021
TENGraD: Time-Efficient Natural Gradient Descent with Exact Fisher-Block
  Inversion
TENGraD: Time-Efficient Natural Gradient Descent with Exact Fisher-Block Inversion
Saeed Soori
Bugra Can
Baourun Mu
Mert Gurbuzbalaban
M. Dehnavi
24
10
0
07 Jun 2021
LQF: Linear Quadratic Fine-Tuning
LQF: Linear Quadratic Fine-Tuning
Alessandro Achille
Aditya Golatkar
Avinash Ravichandran
M. Polito
Stefano Soatto
26
27
0
21 Dec 2020
A Trace-restricted Kronecker-Factored Approximation to Natural Gradient
A Trace-restricted Kronecker-Factored Approximation to Natural Gradient
Kai-Xin Gao
Xiaolei Liu
Zheng-Hai Huang
Min Wang
Zidong Wang
Dachuan Xu
F. Yu
24
11
0
21 Nov 2020
Transform Quantization for CNN (Convolutional Neural Network)
  Compression
Transform Quantization for CNN (Convolutional Neural Network) Compression
Sean I. Young
Wang Zhe
David S. Taubman
B. Girod
MQ
29
69
0
02 Sep 2020
Whitening and second order optimization both make information in the
  dataset unusable during training, and can reduce or prevent generalization
Whitening and second order optimization both make information in the dataset unusable during training, and can reduce or prevent generalization
Neha S. Wadia
Daniel Duckworth
S. Schoenholz
Ethan Dyer
Jascha Narain Sohl-Dickstein
24
13
0
17 Aug 2020
A Differential Game Theoretic Neural Optimizer for Training Residual
  Networks
A Differential Game Theoretic Neural Optimizer for Training Residual Networks
Guan-Horng Liu
T. Chen
Evangelos A. Theodorou
24
2
0
17 Jul 2020
When Does Preconditioning Help or Hurt Generalization?
When Does Preconditioning Help or Hurt Generalization?
S. Amari
Jimmy Ba
Roger C. Grosse
Xuechen Li
Atsushi Nitanda
Taiji Suzuki
Denny Wu
Ji Xu
34
32
0
18 Jun 2020
Continual Learning with Extended Kronecker-factored Approximate
  Curvature
Continual Learning with Extended Kronecker-factored Approximate Curvature
Janghyeon Lee
H. Hong
Donggyu Joo
Junmo Kim
CLL
20
52
0
16 Apr 2020
Gram-Gauss-Newton Method: Learning Overparameterized Neural Networks for
  Regression Problems
Gram-Gauss-Newton Method: Learning Overparameterized Neural Networks for Regression Problems
Tianle Cai
Ruiqi Gao
Jikai Hou
Siyu Chen
Dong Wang
Di He
Zhihua Zhang
Liwei Wang
ODL
21
57
0
28 May 2019
An Empirical Study of Large-Batch Stochastic Gradient Descent with
  Structured Covariance Noise
An Empirical Study of Large-Batch Stochastic Gradient Descent with Structured Covariance Noise
Yeming Wen
Kevin Luk
Maxime Gazeau
Guodong Zhang
Harris Chan
Jimmy Ba
ODL
20
22
0
21 Feb 2019
Fisher Information and Natural Gradient Learning of Random Deep Networks
Fisher Information and Natural Gradient Learning of Random Deep Networks
S. Amari
Ryo Karakida
Masafumi Oizumi
14
34
0
22 Aug 2018
Fast Approximate Natural Gradient Descent in a Kronecker-factored
  Eigenbasis
Fast Approximate Natural Gradient Descent in a Kronecker-factored Eigenbasis
Thomas George
César Laurent
Xavier Bouthillier
Nicolas Ballas
Pascal Vincent
ODL
29
150
0
11 Jun 2018
Meta-Learning with Hessian-Free Approach in Deep Neural Nets Training
Meta-Learning with Hessian-Free Approach in Deep Neural Nets Training
Boyu Chen
Wenlian Lu
Ernest Fokoue
21
1
0
22 May 2018
Scalable trust-region method for deep reinforcement learning using
  Kronecker-factored approximation
Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation
Yuhuai Wu
Elman Mansimov
Shun Liao
Roger C. Grosse
Jimmy Ba
OffRL
22
622
0
17 Aug 2017
Kronecker Recurrent Units
Kronecker Recurrent Units
C. Jose
Moustapha Cissé
F. Fleuret
ODL
24
45
0
29 May 2017
1