ResearchTrend.AI
Revisiting Natural Gradient for Deep Networks


16 January 2013
Razvan Pascanu
Yoshua Bengio
    ODL

Papers citing "Revisiting Natural Gradient for Deep Networks"

50 / 229 papers shown
Title
Analysis and Comparison of Two-Level KFAC Methods for Training Deep Neural Networks
Abdoulaye Koroko
A. Anciaux-Sedrakian
I. B. Gharbia
Valérie Garès
M. Haddou
Quang-Huy Tran
17
0
0
31 Mar 2023
Elastic Weight Removal for Faithful and Abstractive Dialogue Generation
Nico Daheim
Nouha Dziri
Mrinmaya Sachan
Iryna Gurevych
E. Ponti
MoMe
28
30
0
30 Mar 2023
Achieving High Accuracy with PINNs via Energy Natural Gradients
Johannes Müller
Marius Zeinhofer
13
4
0
25 Feb 2023
The Geometry of Neural Nets' Parameter Spaces Under Reparametrization
Agustinus Kristiadi
Felix Dangel
Philipp Hennig
26
11
0
14 Feb 2023
Revisiting Offline Compression: Going Beyond Factorization-based Methods for Transformer Language Models
Mohammadreza Banaei
Klaudia Bałazy
Artur Kasymov
R. Lebret
Jacek Tabor
Karl Aberer
OffRL
19
0
0
08 Feb 2023
Efficient Parametric Approximations of Neural Network Function Space Distance
Nikita Dhawan
Sicong Huang
Juhan Bae
Roger C. Grosse
14
5
0
07 Feb 2023
On-the-fly Denoising for Data Augmentation in Natural Language Understanding
Tianqing Fang
Wenxuan Zhou
Fangyu Liu
Hongming Zhang
Yangqiu Song
Muhao Chen
36
1
0
20 Dec 2022
PipeFisher: Efficient Training of Large Language Models Using Pipelining and Fisher Information Matrices
Kazuki Osawa
Shigang Li
Torsten Hoefler
AI4CE
33
24
0
25 Nov 2022
Fine-Tuning Pre-Trained Language Models Effectively by Optimizing Subnetworks Adaptively
Haojie Zhang
Ge Li
Jia Li
Zhongjin Zhang
Yuqi Zhu
Zhi Jin
AI4CE
8
26
0
03 Nov 2022
Numerical Optimizations for Weighted Low-rank Estimation on Language Model
Ting Hua
Yen-Chang Hsu
Felicity Wang
Qiang Lou
Yilin Shen
Hongxia Jin
13
13
0
02 Nov 2022
Correlation of the importances of neural network weights calculated by modern methods of overcoming catastrophic forgetting
Alexey Kutalev
6
0
0
24 Oct 2022
Exclusive Supermask Subnetwork Training for Continual Learning
Prateek Yadav
Mohit Bansal
CLL
22
5
0
18 Oct 2022
Optimisation & Generalisation in Networks of Neurons
Jeremy Bernstein
AI4CE
16
2
0
18 Oct 2022
GA-SAM: Gradient-Strength based Adaptive Sharpness-Aware Minimization for Improved Generalization
Zhiyuan Zhang
Ruixuan Luo
Qi Su
Xueting Sun
24
11
0
13 Oct 2022
Component-Wise Natural Gradient Descent -- An Efficient Neural Network Optimization
Tran van Sang
Mhd Irvan
R. Yamaguchi
Toshiyuki Nakata
11
1
0
11 Oct 2022
Continuous Diagnosis and Prognosis by Controlling the Update Process of Deep Neural Networks
Chenxi Sun
Hongyan Li
Moxian Song
D. Cai
B. Zhang
Linda Qiao
21
8
0
06 Oct 2022
DLCFT: Deep Linear Continual Fine-Tuning for General Incremental Learning
Hyounguk Shon
Janghyeon Lee
Seungwook Kim
Junmo Kim
CLL
21
11
0
17 Aug 2022
Empirical investigations on WVA structural issues
Alexey Kutalev
A. Lapina
CLL
13
1
0
11 Aug 2022
Language model compression with weighted low-rank factorization
Yen-Chang Hsu
Ting Hua
Sung-En Chang
Qiang Lou
Yilin Shen
Hongxia Jin
14
92
0
30 Jun 2022
Constrained Imitation Learning for a Flapping Wing Unmanned Aerial Vehicle
T. K C
Taeyoung Lee
16
2
0
08 Jun 2022
Few-Shot Learning by Dimensionality Reduction in Gradient Space
M. Gauch
M. Beck
Thomas Adler
D. Kotsur
Stefan Fiel
...
Markus Holzleitner
Werner Zellinger
D. Klotz
Sepp Hochreiter
Sebastian Lehner
35
9
0
07 Jun 2022
Mitigating Dataset Bias by Using Per-sample Gradient
Sumyeong Ahn
Seongyoon Kim
Se-Young Yun
43
20
0
31 May 2022
Learning to Accelerate by the Methods of Step-size Planning
Hengshuai Yao
21
0
0
01 Apr 2022
Towards Exemplar-Free Continual Learning in Vision Transformers: an Account of Attention, Functional and Weight Regularization
Francesco Pelosin
Saurav Jha
A. Torsello
Bogdan Raducanu
Joost van de Weijer
CLL
21
28
0
24 Mar 2022
Half-Inverse Gradients for Physical Deep Learning
Patrick Schnell
Philipp Holl
Nils Thuerey
8
7
0
18 Mar 2022
Hierarchical Memory Learning for Fine-Grained Scene Graph Generation
Youming Deng
Yansheng Li
Yongjun Zhang
Xiang Xiang
Jian Wang
Jingdong Chen
Jiayi Ma
31
20
0
14 Mar 2022
Efficient Natural Gradient Descent Methods for Large-Scale PDE-Based Optimization Problems
L. Nurbekyan
Wanzhou Lei
Yunbo Yang
15
12
0
13 Feb 2022
A Geometric Understanding of Natural Gradient
Qinxun Bai
S. Rosenberg
Wei Xu
19
2
0
13 Feb 2022
DeepStability: A Study of Unstable Numerical Methods and Their Solutions in Deep Learning
Eliska Kloberdanz
Kyle G. Kloberdanz
Wei Le
14
15
0
07 Feb 2022
Gradient Descent on Neurons and its Link to Approximate Second-Order Optimization
Frederik Benzing
ODL
37
23
0
28 Jan 2022
Fast Moving Natural Evolution Strategy for High-Dimensional Problems
Masahiro Nomura
I. Ono
9
6
0
27 Jan 2022
Efficient Approximations of the Fisher Matrix in Neural Networks using Kronecker Product Singular Value Decomposition
Abdoulaye Koroko
A. Anciaux-Sedrakian
I. B. Gharbia
Valérie Garès
M. Haddou
Quang-Huy Tran
14
5
0
25 Jan 2022
FedLGA: Towards System-Heterogeneity of Federated Learning via Local Gradient Approximation
Xingyu Li
Zhe Qu
Bo Tang
Zhuo Lu
FedML
19
25
0
22 Dec 2021
SCORE: Approximating Curvature Information under Self-Concordant Regularization
Adeyemi Damilare Adeoye
Alberto Bemporad
10
4
0
14 Dec 2021
Depth Without the Magic: Inductive Bias of Natural Gradient Descent
A. Kerekes
Anna Mészáros
Ferenc Huszár
ODL
21
4
0
22 Nov 2021
Training Neural Networks with Fixed Sparse Masks
Yi-Lin Sung
Varun Nair
Colin Raffel
FedML
18
196
0
18 Nov 2021
Merging Models with Fisher-Weighted Averaging
Michael Matena
Colin Raffel
FedML
MoMe
27
348
0
18 Nov 2021
Neuron-based Pruning of Deep Neural Networks with Better Generalization using Kronecker Factored Curvature Approximation
Abdolghani Ebrahimi
Diego Klabjan
12
4
0
16 Nov 2021
On the Importance of Firth Bias Reduction in Few-Shot Classification
Saba Ghaffari
Ehsan Saleh
David A. Forsyth
Yu-xiong Wang
30
13
0
06 Oct 2021
Bootstrapped Meta-Learning
Sebastian Flennerhag
Yannick Schroecker
Tom Zahavy
Hado van Hasselt
David Silver
Satinder Singh
28
58
0
09 Sep 2021
Incremental Learning for Personalized Recommender Systems
Yunbo Ouyang
Jun Shi
Haichao Wei
Huiji Gao
BDL
CLL
OffRL
16
3
0
13 Aug 2021
Towards Zero-shot Language Modeling
E. Ponti
Ivan Vulić
Ryan Cotterell
Roi Reichart
Anna Korhonen
22
19
0
06 Aug 2021
Entropic alternatives to initialization
Daniele Musso
37
1
0
16 Jul 2021
The Bayesian Learning Rule
Mohammad Emtiyaz Khan
Håvard Rue
BDL
55
73
0
09 Jul 2021
On the Variance of the Fisher Information for Deep Learning
Alexander Soen
Ke Sun
FedML
FAtt
6
15
0
09 Jul 2021
Task-agnostic Continual Learning with Hybrid Probabilistic Models
Polina Kirichenko
Mehrdad Farajtabar
Dushyant Rao
Balaji Lakshminarayanan
Nir Levine
Ang Li
Huiyi Hu
A. Wilson
Razvan Pascanu
VLM
BDL
CLL
14
19
0
24 Jun 2021
Spectral Normalisation for Deep Reinforcement Learning: an Optimisation Perspective
Florin Gogianu
Tudor Berariu
Mihaela Rosca
Claudia Clopath
L. Buşoniu
Razvan Pascanu
16
52
0
11 May 2021
Noether's Learning Dynamics: Role of Symmetry Breaking in Neural Networks
Hidenori Tanaka
D. Kunin
19
26
0
06 May 2021
Asymptotic Freeness of Layerwise Jacobians Caused by Invariance of Multilayer Perceptron: The Haar Orthogonal Case
B. Collins
Tomohiro Hayase
22
7
0
24 Mar 2021
A Distributed Optimisation Framework Combining Natural Gradient with Hessian-Free for Discriminative Sequence Training
Adnan Haider
Chao Zhang
Florian Kreyssig
P. Woodland
6
7
0
12 Mar 2021