ResearchTrend.AI
Revisiting Natural Gradient for Deep Networks

16 January 2013
Razvan Pascanu
Yoshua Bengio
    ODL
arXiv:1301.3584

Papers citing "Revisiting Natural Gradient for Deep Networks"

50 / 229 papers shown
Wasserstein Policy Optimization
David Pfau
Ian Davies
Diana Borsa
Joao G. M. Araujo
Brendan D. Tracey
H. V. Hasselt
29
0
0
01 May 2025
FISH-Tuning: Enhancing PEFT Methods with Fisher Information
Kang Xue
Ming Dong
Xinhui Tu
Tingting He
28
0
0
05 Apr 2025
Efficient Model Editing with Task-Localized Sparse Fine-tuning
Leonardo Iurada
Marco Ciccone
Tatiana Tommasi
KELM
MoMe
40
0
0
03 Apr 2025
FIESTA: Fisher Information-based Efficient Selective Test-time Adaptation
Mohammadmahdi Honarmand
O. Mutlu
Parnian Azizian
Saimourya Surabhi
Dennis Paul Wall
TTA
75
0
0
29 Mar 2025
Structured Preconditioners in Adaptive Optimization: A Unified Analysis
Shuo Xie
Tianhao Wang
Sashank J. Reddi
Sanjiv Kumar
Zhiyuan Li
45
1
0
13 Mar 2025
Keeping Yourself is Important in Downstream Tuning Multimodal Large Language Model
Wenke Huang
Jian Liang
Xianda Guo
Yiyang Fang
Guancheng Wan
...
Bin Yang
He Li
Jiawei Shao
Mang Ye
Bo Du
OffRL
LRM
MLLM
KELM
VLM
65
1
0
06 Mar 2025
Data Attribution for Text-to-Image Models by Unlearning Synthesized Images
Sheng-Yu Wang
Aaron Hertzmann
Alexei A. Efros
Jun-Yan Zhu
Richard Zhang
TDI
126
2
0
21 Feb 2025
Efficient Learning Under Density Shift in Incremental Settings Using Cramér-Rao-Based Regularization
Behraj Khan
Behroz Mirza
Nouman Durrani
T. Syed
60
0
0
18 Feb 2025
Optimal Subspace Inference for the Laplace Approximation of Bayesian Neural Networks
Josua Faller
Jörg Martin
BDL
73
0
0
04 Feb 2025
Gauss-Newton Dynamics for Neural Networks: A Riemannian Optimization Perspective
Semih Cayci
69
0
0
18 Dec 2024
Mitigating covariate shift in non-colocated data with learned parameter priors
Behraj Khan
Behroz Mirza
Nouman Durrani
T. Syed
22
0
0
10 Nov 2024
Adaptive Consensus Gradients Aggregation for Scaled Distributed Training
Yoni Choukroun
Shlomi Azoulay
P. Kisilev
29
0
0
06 Nov 2024
Constrained Diffusion Implicit Models
V. Jayaram
Ira Kemelmacher-Shlizerman
Steven M. Seitz
John Thickstun
DiffM
48
0
0
01 Nov 2024
Fisher Information-based Efficient Curriculum Federated Learning with Large Language Models
Ji Liu
Jiaxiang Ren
Ruoming Jin
Zijie Zhang
Yang Zhou
P. Valduriez
Dejing Dou
FedML
31
1
0
30 Sep 2024
Is All Learning (Natural) Gradient Descent?
Lucas Shoji
Kenta Suzuki
Leo Kozachkov
23
1
0
24 Sep 2024
Sequential Learning in the Dense Associative Memory
Hayden McAlister
Anthony Robins
Lech Szymanski
CLL
134
1
0
24 Sep 2024
Root Cause Analysis Of Productivity Losses In Manufacturing Systems Utilizing Ensemble Machine Learning
Stanislau Stankevich
Brandon K. Sai
Wojciech Dudek
19
1
0
31 Jul 2024
Cs2K: Class-specific and Class-shared Knowledge Guidance for Incremental Semantic Segmentation
Wei Cong
Yang Cong
Yuyang Liu
Gan Sun
VLM
CLL
34
2
0
12 Jul 2024
Improving Knowledge Distillation in Transfer Learning with Layer-wise Learning Rates
Shirley Kokane
M. R. Uddin
Min Xu
24
1
0
05 Jul 2024
A New Perspective on Shampoo's Preconditioner
Depen Morwani
Itai Shapira
Nikhil Vyas
Eran Malach
Sham Kakade
Lucas Janson
27
7
0
25 Jun 2024
Hybrid Alignment Training for Large Language Models
Chenglong Wang
Hang Zhou
Kaiyan Chang
Bei Li
Yongyu Mu
Tong Xiao
Tongran Liu
Jingbo Zhu
35
4
0
21 Jun 2024
Data-Free Generative Replay for Class-Incremental Learning on Imbalanced Data
Sohaib Younis
Bernhard Seeger
28
0
0
07 Jun 2024
Fisher Flow Matching for Generative Modeling over Discrete Data
Oscar Davis
Samuel Kessler
Mircea Petrache
İsmail İlkan Ceylan
Michael M. Bronstein
A. Bose
40
16
0
23 May 2024
FAdam: Adam is a natural gradient optimizer using diagonal empirical Fisher information
Dongseong Hwang
ODL
37
4
0
21 May 2024
Regularized Gauss-Newton for Optimizing Overparameterized Neural Networks
Adeyemi Damilare Adeoye
Philipp Christian Petersen
Alberto Bemporad
24
1
0
23 Apr 2024
TENG: Time-Evolving Natural Gradient for Solving PDEs With Deep Neural Nets Toward Machine Precision
Zhuo Chen
Jacob McCarran
Esteban Vizcaino
Marin Soljacic
Di Luo
AI4CE
14
3
0
16 Apr 2024
Fast Fishing: Approximating BAIT for Efficient and Scalable Deep Active Image Classification
Denis Huseljic
Paul Hahn
M. Herde
Lukas Rauch
Bernhard Sick
25
1
0
13 Apr 2024
A Gauss-Newton Approach for Min-Max Optimization in Generative Adversarial Networks
Neel Mishra
Bamdev Mishra
Pratik Jawanpuria
Pawan Kumar
GAN
22
1
0
10 Apr 2024
Fisher Mask Nodes for Language Model Merging
Thennal D K
Ganesh Nathan
Suchithra M S
MoMe
AI4CE
47
3
0
14 Mar 2024
Bias Mitigation in Fine-tuning Pre-trained Models for Enhanced Fairness and Efficiency
Yixuan Zhang
Feng Zhou
21
3
0
01 Mar 2024
Second-Order Fine-Tuning without Pain for LLMs: A Hessian Informed Zeroth-Order Optimizer
Yanjun Zhao
Sizhe Dang
Haishan Ye
Guang Dai
Yi Qian
Ivor W. Tsang
66
8
0
23 Feb 2024
LiRank: Industrial Large Scale Ranking Models at LinkedIn
Fedor Borisyuk
Mingzhou Zhou
Qingquan Song
Siyu Zhu
B. Tiwana
...
Chen-Chen Jiang
Haichao Wei
Maneesh Varshney
Amol Ghoting
Souvik Ghosh
24
1
0
10 Feb 2024
Tradeoffs of Diagonal Fisher Information Matrix Estimators
Alexander Soen
Ke Sun
14
1
0
08 Feb 2024
Two Trades is not Baffled: Condensing Graph via Crafting Rational Gradient Matching
Tianle Zhang
Yuchen Zhang
Kun Wang
Kai Wang
Beining Yang
Kaipeng Zhang
Wenqi Shao
Ping Liu
Joey Tianyi Zhou
Yang You
DD
63
13
0
07 Feb 2024
Can We Remove the Square-Root in Adaptive Gradient Methods? A Second-Order Perspective
Wu Lin
Felix Dangel
Runa Eschenhagen
Juhan Bae
Richard E. Turner
Alireza Makhzani
ODL
54
12
0
05 Feb 2024
Neglected Hessian component explains mysteries in Sharpness regularization
Yann N. Dauphin
Atish Agarwala
Hossein Mobahi
FAtt
32
7
0
19 Jan 2024
On the Parameterization of Second-Order Optimization Effective Towards the Infinite Width
Satoki Ishikawa
Ryo Karakida
24
2
0
19 Dec 2023
Weighted Ensemble Models Are Strong Continual Learners
Imad Eddine Marouf
Subhankar Roy
Enzo Tartaglione
Stéphane Lathuilière
CLL
27
16
0
14 Dec 2023
Unnatural Algorithms in Machine Learning
Christian Goodbrake
10
0
0
07 Dec 2023
Studying K-FAC Heuristics by Viewing Adam through a Second-Order Lens
Ross M. Clarke
José Miguel Hernández-Lobato
38
2
0
23 Oct 2023
Model Merging by Uncertainty-Based Gradient Matching
Nico Daheim
Thomas Möllenhoff
E. Ponti
Iryna Gurevych
Mohammad Emtiyaz Khan
MoMe
FedML
32
43
0
19 Oct 2023
Improving Diversity in Zero-Shot GAN Adaptation with Semantic Variations
Seogkyu Jeon
Bei Liu
Pilhyeon Lee
Kibeom Hong
Jianlong Fu
H. Byun
40
1
0
21 Aug 2023
Bidirectional Looking with A Novel Double Exponential Moving Average to Adaptive and Non-adaptive Momentum Optimizers
Yineng Chen
Z. Li
Lefei Zhang
Bo Du
Hai Zhao
25
4
0
02 Jul 2023
Complementary Learning Subnetworks for Parameter-Efficient Class-Incremental Learning
Depeng Li
Zhigang Zeng
CLL
25
1
0
21 Jun 2023
Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training
Hong Liu
Zhiyuan Li
David Leo Wright Hall
Percy Liang
Tengyu Ma
VLM
27
128
0
23 May 2023
The Interpreter Understands Your Meaning: End-to-end Spoken Language Understanding Aided by Speech Translation
Mutian He
Philip N. Garner
36
4
0
16 May 2023
ASDL: A Unified Interface for Gradient Preconditioning in PyTorch
Kazuki Osawa
Satoki Ishikawa
Rio Yokota
Shigang Li
Torsten Hoefler
ODL
28
14
0
08 May 2023
Diffusion Theory as a Scalpel: Detecting and Purifying Poisonous Dimensions in Pre-trained Language Models Caused by Backdoor or Bias
Zhiyuan Zhang
Deli Chen
Hao Zhou
Fandong Meng
Jie Zhou
Xu Sun
28
5
0
08 May 2023
Taxonomic Class Incremental Learning
Yuzhao Chen
Zonghua Li
Zhiyuan Hu
Nuno Vasconcelos
CLL
31
3
0
12 Apr 2023
Automatic Gradient Descent: Deep Learning without Hyperparameters
Jeremy Bernstein
Chris Mingard
Kevin Huang
Navid Azizan
Yisong Yue
ODL
16
17
0
11 Apr 2023