Optimizing Neural Networks with Kronecker-factored Approximate Curvature

19 March 2015
James Martens
Roger B. Grosse
    ODL

Papers citing "Optimizing Neural Networks with Kronecker-factored Approximate Curvature"

50 / 211 papers shown
Policy Gradient with Second Order Momentum
Tianyu Sun
0
0
0
16 May 2025
GuidedQuant: Large Language Model Quantization via Exploiting End Loss Guidance
Jinuk Kim
Marwa El Halabi
W. Park
Clemens JS Schaefer
Deokjae Lee
Yeonhong Park
Jae W. Lee
Hyun Oh Song
MQ
34
0
0
11 May 2025
More Optimal Fractional-Order Stochastic Gradient Descent for Non-Convex Optimization Problems
Mohammad Partohaghighi
Roummel Marcia
YangQuan Chen
19
0
0
05 May 2025
Accelerating Deep Neural Network Training via Distributed Hybrid Order Optimization
Shunxian Gu
Chaoqun You
Bangbang Ren
Lailong Luo
Junxu Xia
Deke Guo
44
0
0
02 May 2025
MAGIC: Near-Optimal Data Attribution for Deep Learning
Andrew Ilyas
Logan Engstrom
TDI
39
0
0
23 Apr 2025
Self-Controlled Dynamic Expansion Model for Continual Learning
RunQing Wu
KaiHui Huang
HanYi Zhang
Fei Ye
CLL
VLM
50
0
0
14 Apr 2025
Deliberate Planning of 3D Bin Packing on Packing Configuration Trees
Hang Zhao
Juzhan Xu
Kexiong Yu
Ruizhen Hu
Chenyang Zhu
K. Xu
72
1
0
06 Apr 2025
Continual Learning With Quasi-Newton Methods
Steven Vander Eeckt
Hugo Van hamme
CLL
BDL
69
0
0
25 Mar 2025
FedBEns: One-Shot Federated Learning based on Bayesian Ensemble
Jacopo Talpini
Marco Savi
Giovanni Neglia
FedML
Presented at ResearchTrend Connect | FedML on 07 May 2025
79
0
0
19 Mar 2025
Striving for Simplicity: Simple Yet Effective Prior-Aware Pseudo-Labeling for Semi-Supervised Ultrasound Image Segmentation
Yaxiong Chen
Yujie Wang
Zixuan Zheng
Jingliang Hu
Yilei Shi
Shengwu Xiong
Xiao Xiang Zhu
Lichao Mou
54
1
0
18 Mar 2025
Effective Dimension Aware Fractional-Order Stochastic Gradient Descent for Convex Optimization Problems
Mohammad Partohaghighi
Roummel Marcia
YangQuan Chen
46
0
0
17 Mar 2025
CAMEx: Curvature-aware Merging of Experts
Dung V. Nguyen
Minh H. Nguyen
Luc Q. Nguyen
R. Teo
T. Nguyen
Linh Duy Tran
MoMe
104
2
0
26 Feb 2025
COSMOS: A Hybrid Adaptive Optimizer for Memory-Efficient Training of LLMs
Liming Liu
Zhenghao Xu
Zixuan Zhang
Hao Kang
Zichong Li
Chen Liang
Weizhu Chen
T. Zhao
125
1
0
24 Feb 2025
Data Attribution for Text-to-Image Models by Unlearning Synthesized Images
Sheng-Yu Wang
Aaron Hertzmann
Alexei A. Efros
Jun-Yan Zhu
Richard Zhang
TDI
128
2
0
21 Feb 2025
Spectral-factorized Positive-definite Curvature Learning for NN Training
Wu Lin
Felix Dangel
Runa Eschenhagen
Juhan Bae
Richard E. Turner
Roger B. Grosse
47
0
0
10 Feb 2025
Is attention all you need to solve the correlated electron problem?
Max Geier
Khachatur Nazaryan
Timothy Zaklama
Liang Fu
48
3
0
07 Feb 2025
Position: Curvature Matrices Should Be Democratized via Linear Operators
Felix Dangel
Runa Eschenhagen
Weronika Ormaniec
Andres Fernandez
Lukas Tatzel
Agustinus Kristiadi
58
3
0
31 Jan 2025
Understanding Why Adam Outperforms SGD: Gradient Heterogeneity in Transformers
Akiyoshi Tomihari
Issei Sato
ODL
61
1
0
31 Jan 2025
Towards Scalable and Stable Parallelization of Nonlinear RNNs
Xavier Gonzalez
Andrew Warrington
Jimmy T.H. Smith
Scott W. Linderman
93
8
0
17 Jan 2025
Incrementally Learning Multiple Diverse Data Domains via Multi-Source Dynamic Expansion Model
RunQing Wu
Fei Ye
QiHe Liu
Guoxi Huang
Jinyu Guo
Rongyao Hu
CLL
175
0
0
15 Jan 2025
Knowledge Distillation with Adapted Weight
Sirong Wu
Xi Luo
Junjie Liu
Yuhui Deng
40
0
0
06 Jan 2025
Functional Risk Minimization
Ferran Alet
Clement Gehring
Tomás Lozano-Pérez
Kenji Kawaguchi
Joshua B. Tenenbaum
Leslie Pack Kaelbling
OffRL
60
0
0
31 Dec 2024
Grams: Gradient Descent with Adaptive Momentum Scaling
Yang Cao
Xiaoyu Li
Zhao-quan Song
ODL
98
2
0
22 Dec 2024
ANaGRAM: A Natural Gradient Relative to Adapted Model for efficient PINNs learning
Nilo Schwencke
Cyril Furtlehner
69
1
0
14 Dec 2024
Streamlining Prediction in Bayesian Deep Learning
Rui Li
Marcus Klasson
Arno Solin
Martin Trapp
UQCV
BDL
97
2
0
27 Nov 2024
Cautious Optimizers: Improving Training with One Line of Code
Kaizhao Liang
Lizhang Chen
B. Liu
Qiang Liu
ODL
108
5
0
25 Nov 2024
Theoretical characterisation of the Gauss-Newton conditioning in Neural Networks
Jim Zhao
Sidak Pal Singh
Aurelien Lucchi
AI4CE
45
0
0
04 Nov 2024
Debiasing Mini-Batch Quadratics for Applications in Deep Learning
Lukas Tatzel
Bálint Mucsányi
Osane Hackel
Philipp Hennig
43
0
0
18 Oct 2024
Influence Functions for Scalable Data Attribution in Diffusion Models
Bruno Mlodozeniec
Runa Eschenhagen
Juhan Bae
Alexander Immer
David Krueger
Richard E. Turner
TDI
DiffM
75
4
0
17 Oct 2024
Second-Order Min-Max Optimization with Lazy Hessians
Lesi Chen
Chengchang Liu
Jingzhao Zhang
46
1
0
12 Oct 2024
SOAP: Improving and Stabilizing Shampoo using Adam
Nikhil Vyas
Depen Morwani
Rosie Zhao
Itai Shapira
David Brandfonbrener
Lucas Janson
Sham Kakade
75
23
0
17 Sep 2024
An Improved Empirical Fisher Approximation for Natural Gradient Descent
Xiaodong Wu
Wenyi Yu
Chao Zhang
Philip Woodland
29
3
0
10 Jun 2024
Scalable Bayesian Learning with posteriors
Samuel Duffield
Kaelan Donatella
Johnathan Chiu
Phoebe Klett
Daniel Simpson
BDL
UQCV
62
1
0
31 May 2024
4-bit Shampoo for Memory-Efficient Network Training
Sike Wang
Jia Li
Pan Zhou
Hua Huang
MQ
41
5
0
28 May 2024
From Learning to Optimize to Learning Optimization Algorithms
Camille Castera
Peter Ochs
65
1
0
28 May 2024
AdaFisher: Adaptive Second Order Optimization via Fisher Information
Damien Martins Gomes
Yanlei Zhang
Eugene Belilovsky
Guy Wolf
Mahdi S. Hosseini
ODL
76
2
0
26 May 2024
Q-Newton: Hybrid Quantum-Classical Scheduling for Accelerating Neural Network Training with Newton's Gradient Descent
Pingzhi Li
Junyu Liu
Hanrui Wang
Tianlong Chen
84
1
0
30 Apr 2024
Variational Stochastic Gradient Descent for Deep Neural Networks
Haotian Chen
Anna Kuzina
Babak Esmaeili
Jakub M. Tomczak
52
0
0
09 Apr 2024
Stochastic Online Optimization for Cyber-Physical and Robotic Systems
Hao Ma
M. Zeilinger
Michael Muehlebach
44
0
0
08 Apr 2024
Active Few-Shot Fine-Tuning
Jonas Hübotter
Bhavya Sukhija
Lenart Treven
Yarden As
Andreas Krause
45
1
0
13 Feb 2024
The LLM Surgeon
Tycho F. A. van der Ouderaa
Markus Nagel
M. V. Baalen
Yuki Markus Asano
Tijmen Blankevoort
39
14
0
28 Dec 2023
Structured Inverse-Free Natural Gradient: Memory-Efficient & Numerically-Stable KFAC
Wu Lin
Felix Dangel
Runa Eschenhagen
Kirill Neklyudov
Agustinus Kristiadi
Richard Turner
Alireza Makhzani
22
3
0
09 Dec 2023
Adapting Newton's Method to Neural Networks through a Summary of Higher-Order Derivatives
Pierre Wolinski
ODL
29
0
0
06 Dec 2023
Eva: A General Vectorized Approximation Framework for Second-order Optimization
Lin Zhang
S. Shi
Bo-wen Li
28
1
0
04 Aug 2023
Modify Training Directions in Function Space to Reduce Generalization Error
Yi Yu
Wenlian Lu
Boyu Chen
27
0
0
25 Jul 2023
Variational Monte Carlo on a Budget -- Fine-tuning pre-trained Neural Wavefunctions
Michael Scherbela
Leon Gerard
Philipp Grohs
35
5
0
15 Jul 2023
KrADagrad: Kronecker Approximation-Domination Gradient Preconditioned Stochastic Optimization
Jonathan Mei
Alexander Moreno
Luke Walters
ODL
29
1
0
30 May 2023
A Score-Based Model for Learning Neural Wavefunctions
Xuan Zhang
Shenglong Xu
Shuiwang Ji
DiffM
28
1
0
25 May 2023
Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training
Hong Liu
Zhiyuan Li
David Leo Wright Hall
Percy Liang
Tengyu Ma
VLM
55
130
0
23 May 2023
What Matters in Reinforcement Learning for Tractography
Antoine Théberge
Christian Desrosiers
Maxime Descoteaux
Pierre-Marc Jodoin
OffRL
26
2
0
15 May 2023