Optimizing Neural Networks with Kronecker-factored Approximate Curvature

19 March 2015
James Martens
Roger C. Grosse
    ODL

Papers citing "Optimizing Neural Networks with Kronecker-factored Approximate Curvature"

50 / 645 papers shown
Randomized Schur Complement Views for Graph Contrastive Learning
Vignesh Kothapalli
122
2
0
06 Jun 2023
Stochastic Marginal Likelihood Gradients using Neural Tangent Kernels
Alexander Immer
Tycho F. A. van der Ouderaa
Mark van der Wilk
Gunnar Rätsch
Bernhard Schölkopf
BDL
68
13
0
06 Jun 2023
MKOR: Momentum-Enabled Kronecker-Factor-Based Optimizer Using Rank-1 Updates
Mohammad Mozaffari
Sikan Li
Zhao Zhang
M. Dehnavi
74
4
0
02 Jun 2023
Understanding MLP-Mixer as a Wide and Sparse MLP
Tomohiro Hayase
Ryo Karakida
MoE
75
6
0
02 Jun 2023
Towards Sustainable Learning: Coresets for Data-efficient Deep Learning
Yu Yang
Hao Kang
Baharan Mirzasoleiman
86
35
0
02 Jun 2023
Low-rank extended Kalman filtering for online learning of neural networks from streaming data
Peter Chang
Gerardo Duran-Martín
Alexander Y. Shestopaloff
Matt Jones
Kevin P. Murphy
BDL
116
20
0
31 May 2023
KrADagrad: Kronecker Approximation-Domination Gradient Preconditioned Stochastic Optimization
Jonathan Mei
Alexander Moreno
Luke Walters
ODL
55
1
0
30 May 2023
Improving Neural Additive Models with Bayesian Principles
Kouroche Bouchiat
Alexander Immer
Hugo Yèche
Gunnar Rätsch
Vincent Fortuin
BDL
MedIm
105
6
0
26 May 2023
A Score-Based Model for Learning Neural Wavefunctions
Xuan Zhang
Shenglong Xu
Shuiwang Ji
DiffM
61
1
0
25 May 2023
Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training
Hong Liu
Zhiyuan Li
David Leo Wright Hall
Percy Liang
Tengyu Ma
VLM
142
149
0
23 May 2023
Layer-wise Adaptive Step-Sizes for Stochastic First-Order Methods for Deep Learning
Achraf Bahamou
Shiqian Ma
ODL
108
0
0
23 May 2023
The Hessian perspective into the Nature of Convolutional Neural Networks
Sidak Pal Singh
Thomas Hofmann
Bernhard Schölkopf
98
11
0
16 May 2023
What Matters in Reinforcement Learning for Tractography
Antoine Théberge
Christian Desrosiers
Maxime Descoteaux
Pierre-Marc Jodoin
OffRL
46
2
0
15 May 2023
Curvature-Aware Training for Coordinate Networks
Hemanth Saratchandran
Shin-Fang Chng
Sameera Ramasinghe
L. MacDonald
Simon Lucey
135
5
0
15 May 2023
ASDL: A Unified Interface for Gradient Preconditioning in PyTorch
Kazuki Osawa
Satoki Ishikawa
Rio Yokota
Shigang Li
Torsten Hoefler
ODL
92
15
0
08 May 2023
ISAAC Newton: Input-based Approximate Curvature for Newton's Method
Felix Petersen
Tobias Sutter
Christian Borgelt
Dongsung Huh
Hilde Kuehne
Yuekai Sun
Oliver Deussen
ODL
91
5
0
01 May 2023
Noise Is Not the Main Factor Behind the Gap Between SGD and Adam on Transformers, but Sign Descent Might Be
Frederik Kunstner
Jacques Chen
J. Lavington
Mark Schmidt
100
75
0
27 Apr 2023
Sparsified Model Zoo Twins: Investigating Populations of Sparsified Neural Network Models
D. Honegger
Konstantin Schurholt
Damian Borth
81
4
0
26 Apr 2023
On Efficient Training of Large-Scale Deep Learning Models: A Literature Review
Li Shen
Yan Sun
Zhiyuan Yu
Liang Ding
Xinmei Tian
Dacheng Tao
VLM
105
43
0
07 Apr 2023
Analysis and Comparison of Two-Level KFAC Methods for Training Deep Neural Networks
Abdoulaye Koroko
A. Anciaux-Sedrakian
I. B. Gharbia
Valérie Garès
M. Haddou
Quang-Huy Tran
60
0
0
31 Mar 2023
Scalable Bayesian Meta-Learning through Generalized Implicit Gradients
Yilang Zhang
Bingcong Li
Shi-Ji Gao
G. Giannakis
BDL
82
11
0
31 Mar 2023
Towards a Foundation Model for Neural Network Wavefunctions
Michael Scherbela
Leon Gerard
Philipp Grohs
107
10
0
17 Mar 2023
Decentralized Riemannian natural gradient methods with Kronecker-product approximations
Jiang Hu
Kangkang Deng
Na Li
Quanzheng Li
63
8
0
16 Mar 2023
Artificial intelligence for artificial materials: moiré atom
Di Luo
Aidan P. Reddy
T. Devakul
L. Fu
74
3
0
14 Mar 2023
Uncertainty quantification in neural network classifiers -- a local linear approach
Magnus Malmström
Isaac Skog
Daniel Axehill
Fredrik K. Gustafsson
UQCV
60
1
0
10 Mar 2023
Scalable Stochastic Gradient Riemannian Langevin Dynamics in Non-Diagonal Metrics
Hanlin Yu
M. Hartmann
Bernardo Williams
Arto Klami
BDL
88
6
0
09 Mar 2023
Natural Gradient Methods: Perspectives, Efficient-Scalable Approximations, and Analysis
Rajesh Shrestha
ODL
50
5
0
06 Mar 2023
Structured Pruning for Deep Convolutional Neural Networks: A survey
Yang He
Lingao Xiao
3DPC
120
146
0
01 Mar 2023
Natural Gradient Hybrid Variational Inference with Application to Deep Mixed Models
Weiben Zhang
M. Smith
Worapree Maneesoonthorn
Rubén Loaiza-Maya
31
1
0
27 Feb 2023
Gauss-Newton Temporal Difference Learning with Nonlinear Function Approximation
Zhifa Ke
Junyu Zhang
Zaiwen Wen
72
0
0
25 Feb 2023
Variational Linearized Laplace Approximation for Bayesian Deep Learning
Luis A. Ortega
Simón Rodríguez Santana
Daniel Hernández-Lobato
BDL
UQCV
135
4
0
24 Feb 2023
Deep Transformers without Shortcuts: Modifying Self-attention for Faithful Signal Propagation
Bobby He
James Martens
Guodong Zhang
Aleksandar Botev
Andy Brock
Samuel L. Smith
Yee Whye Teh
85
30
0
20 Feb 2023
Image Reconstruction via Deep Image Prior Subspaces
Riccardo Barbano
Javier Antorán
Johannes Leuschner
José Miguel Hernández-Lobato
Bangti Jin
Željko Kereta
108
1
0
20 Feb 2023
Simplifying Momentum-based Positive-definite Submanifold Optimization with Applications to Deep Learning
Wu Lin
Valentin Duruisseaux
Melvin Leok
Frank Nielsen
Mohammad Emtiyaz Khan
Mark Schmidt
104
10
0
20 Feb 2023
Nystrom Method for Accurate and Scalable Implicit Differentiation
Ryuichiro Hataya
M. Yamada
ODL
85
9
0
20 Feb 2023
Structural Neural Additive Models: Enhanced Interpretable Machine Learning
Mattias Luber
Anton Thielmann
Benjamin Säfken
76
8
0
18 Feb 2023
FOSI: Hybrid First and Second Order Optimization
Hadar Sivan
Moshe Gabel
Assaf Schuster
ODL
74
2
0
16 Feb 2023
The Geometry of Neural Nets' Parameter Spaces Under Reparametrization
Agustinus Kristiadi
Felix Dangel
Philipp Hennig
73
14
0
14 Feb 2023
Symbolic Discovery of Optimization Algorithms
Xiangning Chen
Chen Liang
Da Huang
Esteban Real
Kaiyuan Wang
...
Xuanyi Dong
Thang Luong
Cho-Jui Hsieh
Yifeng Lu
Quoc V. Le
178
383
0
13 Feb 2023
Fixing Overconfidence in Dynamic Neural Networks
Lassi Meronen
Martin Trapp
Andrea Pilzer
Le Yang
Arno Solin
BDL
127
16
0
13 Feb 2023
Generalizing Neural Wave Functions
Nicholas Gao
Stephan Günnemann
69
24
0
08 Feb 2023
Sketchy: Memory-efficient Adaptive Regularization with Frequent Directions
Vladimir Feinberg
Xinyi Chen
Y. Jennifer Sun
Rohan Anil
Elad Hazan
103
13
0
07 Feb 2023
Efficient Parametric Approximations of Neural Network Function Space Distance
Nikita Dhawan
Sicong Huang
Juhan Bae
Roger C. Grosse
83
5
0
07 Feb 2023
Learning Discretized Neural Networks under Ricci Flow
Jun Chen
Han Chen
Mengmeng Wang
Guang Dai
Ivor W. Tsang
Yang Liu
85
4
0
07 Feb 2023
Dropout Injection at Test Time for Post Hoc Uncertainty Quantification in Neural Networks
Emanuele Ledda
Giorgio Fumera
Fabio Roli
BDL
UQCV
111
18
0
06 Feb 2023
Rethinking Gauss-Newton for learning over-parameterized models
Michael Arbel
Romain Menegaux
Pierre Wolinski
AI4CE
100
6
0
06 Feb 2023
On a continuous time model of gradient descent dynamics and instability in deep learning
Mihaela Rosca
Yan Wu
Chongli Qin
Benoit Dherin
105
10
0
03 Feb 2023
A Comprehensive Survey of Continual Learning: Theory, Method and Application
Liyuan Wang
Xingxing Zhang
Hang Su
Jun Zhu
KELM
CLL
238
716
0
31 Jan 2023
Efficient Activation Function Optimization through Surrogate Modeling
G. Bingham
Risto Miikkulainen
92
2
0
13 Jan 2023
Improving Levenberg-Marquardt Algorithm for Neural Networks
Omead Brandon Pooladzandi
Yiming Zhou
ODL
57
2
0
17 Dec 2022