1503.05671
Cited By
Optimizing Neural Networks with Kronecker-factored Approximate Curvature
19 March 2015
James Martens
Roger C. Grosse
ODL
Papers citing
"Optimizing Neural Networks with Kronecker-factored Approximate Curvature"
50 / 228 papers shown
Title
Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training
Hong Liu
Zhiyuan Li
David Leo Wright Hall
Percy Liang
Tengyu Ma
VLM
55
132
0
23 May 2023
Layer-wise Adaptive Step-Sizes for Stochastic First-Order Methods for Deep Learning
Achraf Bahamou
D. Goldfarb
ODL
36
0
0
23 May 2023
What Matters in Reinforcement Learning for Tractography
Antoine Théberge
Christian Desrosiers
Maxime Descoteaux
Pierre-Marc Jodoin
OffRL
29
2
0
15 May 2023
ASDL: A Unified Interface for Gradient Preconditioning in PyTorch
Kazuki Osawa
Satoki Ishikawa
Rio Yokota
Shigang Li
Torsten Hoefler
ODL
38
14
0
08 May 2023
ISAAC Newton: Input-based Approximate Curvature for Newton's Method
Felix Petersen
Tobias Sutter
Christian Borgelt
Dongsung Huh
Hilde Kuehne
Yuekai Sun
Oliver Deussen
ODL
31
5
0
01 May 2023
Sparsified Model Zoo Twins: Investigating Populations of Sparsified Neural Network Models
D. Honegger
Konstantin Schurholt
Damian Borth
31
4
0
26 Apr 2023
On Efficient Training of Large-Scale Deep Learning Models: A Literature Review
Li Shen
Yan Sun
Zhiyuan Yu
Liang Ding
Xinmei Tian
Dacheng Tao
VLM
30
41
0
07 Apr 2023
Scalable Bayesian Meta-Learning through Generalized Implicit Gradients
Yilang Zhang
Bingcong Li
Shi-Ji Gao
G. Giannakis
BDL
21
9
0
31 Mar 2023
Artificial intelligence for artificial materials: moiré atom
Di Luo
Aidan P. Reddy
T. Devakul
L. Fu
29
3
0
14 Mar 2023
Gauss-Newton Temporal Difference Learning with Nonlinear Function Approximation
Zhifa Ke
Junyu Zhang
Zaiwen Wen
24
0
0
25 Feb 2023
Structural Neural Additive Models: Enhanced Interpretable Machine Learning
Mattias Luber
Anton Thielmann
Benjamin Säfken
31
7
0
18 Feb 2023
FOSI: Hybrid First and Second Order Optimization
Hadar Sivan
Moshe Gabel
Assaf Schuster
ODL
34
2
0
16 Feb 2023
The Geometry of Neural Nets' Parameter Spaces Under Reparametrization
Agustinus Kristiadi
Felix Dangel
Philipp Hennig
32
11
0
14 Feb 2023
Symbolic Discovery of Optimization Algorithms
Xiangning Chen
Chen Liang
Da Huang
Esteban Real
Kaiyuan Wang
...
Xuanyi Dong
Thang Luong
Cho-Jui Hsieh
Yifeng Lu
Quoc V. Le
67
353
0
13 Feb 2023
Fixing Overconfidence in Dynamic Neural Networks
Lassi Meronen
Martin Trapp
Andrea Pilzer
Le Yang
Arno Solin
BDL
37
16
0
13 Feb 2023
Sketchy: Memory-efficient Adaptive Regularization with Frequent Directions
Vladimir Feinberg
Xinyi Chen
Y. Jennifer Sun
Rohan Anil
Elad Hazan
29
12
0
07 Feb 2023
Efficient Parametric Approximations of Neural Network Function Space Distance
Nikita Dhawan
Sicong Huang
Juhan Bae
Roger C. Grosse
16
5
0
07 Feb 2023
Learning Discretized Neural Networks under Ricci Flow
Jun Chen
Han Chen
Mengmeng Wang
Guang Dai
Ivor W. Tsang
Yong-Jin Liu
25
2
0
07 Feb 2023
Dropout Injection at Test Time for Post Hoc Uncertainty Quantification in Neural Networks
Emanuele Ledda
Giorgio Fumera
Fabio Roli
BDL
UQCV
38
14
0
06 Feb 2023
On a continuous time model of gradient descent dynamics and instability in deep learning
Mihaela Rosca
Yan Wu
Chongli Qin
Benoit Dherin
18
7
0
03 Feb 2023
Efficient Activation Function Optimization through Surrogate Modeling
G. Bingham
Risto Miikkulainen
18
2
0
13 Jan 2023
Improving Levenberg-Marquardt Algorithm for Neural Networks
Omead Brandon Pooladzandi
Yiming Zhou
ODL
28
2
0
17 Dec 2022
Mirror descent of Hopfield model
Hyungjoon Soh
D. Kim
Juno Hwang
Junghyo Jo
25
0
0
29 Nov 2022
Exploring Temporal Information Dynamics in Spiking Neural Networks
Youngeun Kim
Yuhang Li
Hyoungseob Park
Yeshwanth Venkatesha
Anna Hambitzer
Priyadarshini Panda
19
32
0
26 Nov 2022
PipeFisher: Efficient Training of Large Language Models Using Pipelining and Fisher Information Matrices
Kazuki Osawa
Shigang Li
Torsten Hoefler
AI4CE
35
24
0
25 Nov 2022
VeLO: Training Versatile Learned Optimizers by Scaling Up
Luke Metz
James Harrison
C. Freeman
Amil Merchant
Lucas Beyer
...
Naman Agrawal
Ben Poole
Igor Mordatch
Adam Roberts
Jascha Narain Sohl-Dickstein
35
60
0
17 Nov 2022
Black Box Lie Group Preconditioners for SGD
Xi-Lin Li
13
8
0
08 Nov 2022
Adaptive scaling of the learning rate by second order automatic differentiation
F. Gournay
Alban Gossard
ODL
31
1
0
26 Oct 2022
Accelerated Linearized Laplace Approximation for Bayesian Deep Learning
Zhijie Deng
Feng Zhou
Jun Zhu
BDL
50
19
0
23 Oct 2022
HesScale: Scalable Computation of Hessian Diagonals
Mohamed Elsayed
A. R. Mahmood
22
7
0
20 Oct 2022
Brand New K-FACs: Speeding up K-FAC with Online Decomposition Updates
C. Puiu
14
2
0
16 Oct 2022
Component-Wise Natural Gradient Descent -- An Efficient Neural Network Optimization
Tran van Sang
Mhd Irvan
R. Yamaguchi
Toshiyuki Nakata
15
1
0
11 Oct 2022
Make Sharpness-Aware Minimization Stronger: A Sparsified Perturbation Approach
Peng Mi
Li Shen
Tianhe Ren
Yiyi Zhou
Xiaoshuai Sun
Rongrong Ji
Dacheng Tao
AAML
32
69
0
11 Oct 2022
Rethinking Normalization Methods in Federated Learning
Zhixu Du
Jingwei Sun
Ang Li
Pin-Yu Chen
Jianyi Zhang
H. Li
Yiran Chen
FedML
29
28
0
07 Oct 2022
Reinforcement Learning Algorithms: An Overview and Classification
Fadi AlMahamid
Katarina Grolinger
16
40
0
29 Sep 2022
Random initialisations performing above chance and how to find them
Frederik Benzing
Simon Schug
Robert Meier
J. Oswald
Yassir Akram
Nicolas Zucchet
Laurence Aitchison
Angelika Steger
ODL
35
24
0
15 Sep 2022
Efficient first-order predictor-corrector multiple objective optimization for fair misinformation detection
Eric Enouen
Katja Mathesius
Sean Wang
Arielle K. Carr
Sihong Xie
20
2
0
15 Sep 2022
Deep Variational Free Energy Approach to Dense Hydrogen
H.-j. Xie
Ziqun Li
Han Wang
Linfeng Zhang
Lei Wang
35
9
0
13 Sep 2022
Scalable K-FAC Training for Deep Neural Networks with Distributed Preconditioning
Lin Zhang
S. Shi
Wei Wang
Bo-wen Li
36
10
0
30 Jun 2022
Laplacian Autoencoders for Learning Stochastic Representations
M. Miani
Frederik Warburg
Pablo Moreno-Muñoz
Nicke Skafte Detlefsen
Søren Hauberg
UQCV
BDL
SSL
35
10
0
30 Jun 2022
Cold Posteriors through PAC-Bayes
Konstantinos Pitas
Julyan Arbel
26
5
0
22 Jun 2022
Information Geometry of Dropout Training
Masanari Kimura
H. Hino
21
2
0
22 Jun 2022
Adapting the Linearised Laplace Model Evidence for Modern Deep Learning
Javier Antorán
David Janz
J. Allingham
Erik A. Daxberger
Riccardo Barbano
Eric T. Nalisnick
José Miguel Hernández-Lobato
UQCV
BDL
30
28
0
17 Jun 2022
Fast Finite Width Neural Tangent Kernel
Roman Novak
Jascha Narain Sohl-Dickstein
S. Schoenholz
AAML
25
53
0
17 Jun 2022
O(N^2) Universal Antisymmetry in Fermionic Neural Networks
Tianyu Pang
Shuicheng Yan
Min-Bin Lin
21
3
0
26 May 2022
Symmetry Teleportation for Accelerated Optimization
B. Zhao
Nima Dehmamy
Robin Walters
Rose Yu
ODL
23
20
0
21 May 2022
Deep Unlearning via Randomized Conditionally Independent Hessians
Ronak R. Mehta
Sourav Pal
Vikas Singh
Sathya Ravi
MU
27
81
0
15 Apr 2022
Rethinking Exponential Averaging of the Fisher
C. Puiu
23
1
0
10 Apr 2022
Practical tradeoffs between memory, compute, and performance in learned optimizers
Luke Metz
C. Freeman
James Harrison
Niru Maheswaranathan
Jascha Narain Sohl-Dickstein
38
32
0
22 Mar 2022
Deep Learning without Shortcuts: Shaping the Kernel with Tailored Rectifiers
Guodong Zhang
Aleksandar Botev
James Martens
OffRL
34
26
0
15 Mar 2022