1301.3584
Cited By
Revisiting Natural Gradient for Deep Networks
16 January 2013
Razvan Pascanu
Yoshua Bengio
ODL
Papers citing "Revisiting Natural Gradient for Deep Networks"
50 / 229 papers shown
Title
Analysis and Comparison of Two-Level KFAC Methods for Training Deep Neural Networks
Abdoulaye Koroko
A. Anciaux-Sedrakian
I. B. Gharbia
Valérie Garès
M. Haddou
Quang-Huy Tran
17
0
0
31 Mar 2023
Elastic Weight Removal for Faithful and Abstractive Dialogue Generation
Nico Daheim
Nouha Dziri
Mrinmaya Sachan
Iryna Gurevych
E. Ponti
MoMe
28
30
0
30 Mar 2023
Achieving High Accuracy with PINNs via Energy Natural Gradients
Johannes Müller
Marius Zeinhofer
13
4
0
25 Feb 2023
The Geometry of Neural Nets' Parameter Spaces Under Reparametrization
Agustinus Kristiadi
Felix Dangel
Philipp Hennig
26
11
0
14 Feb 2023
Revisiting Offline Compression: Going Beyond Factorization-based Methods for Transformer Language Models
Mohammadreza Banaei
Klaudia Bałazy
Artur Kasymov
R. Lebret
Jacek Tabor
Karl Aberer
OffRL
19
0
0
08 Feb 2023
Efficient Parametric Approximations of Neural Network Function Space Distance
Nikita Dhawan
Sicong Huang
Juhan Bae
Roger C. Grosse
14
5
0
07 Feb 2023
On-the-fly Denoising for Data Augmentation in Natural Language Understanding
Tianqing Fang
Wenxuan Zhou
Fangyu Liu
Hongming Zhang
Yangqiu Song
Muhao Chen
36
1
0
20 Dec 2022
PipeFisher: Efficient Training of Large Language Models Using Pipelining and Fisher Information Matrices
Kazuki Osawa
Shigang Li
Torsten Hoefler
AI4CE
33
24
0
25 Nov 2022
Fine-Tuning Pre-Trained Language Models Effectively by Optimizing Subnetworks Adaptively
Haojie Zhang
Ge Li
Jia Li
Zhongjin Zhang
Yuqi Zhu
Zhi Jin
AI4CE
8
26
0
03 Nov 2022
Numerical Optimizations for Weighted Low-rank Estimation on Language Model
Ting Hua
Yen-Chang Hsu
Felicity Wang
Qiang Lou
Yilin Shen
Hongxia Jin
13
13
0
02 Nov 2022
Correlation of the importances of neural network weights calculated by modern methods of overcoming catastrophic forgetting
Alexey Kutalev
6
0
0
24 Oct 2022
Exclusive Supermask Subnetwork Training for Continual Learning
Prateek Yadav
Mohit Bansal
CLL
22
5
0
18 Oct 2022
Optimisation & Generalisation in Networks of Neurons
Jeremy Bernstein
AI4CE
16
2
0
18 Oct 2022
GA-SAM: Gradient-Strength based Adaptive Sharpness-Aware Minimization for Improved Generalization
Zhiyuan Zhang
Ruixuan Luo
Qi Su
Xueting Sun
24
11
0
13 Oct 2022
Component-Wise Natural Gradient Descent -- An Efficient Neural Network Optimization
Tran van Sang
Mhd Irvan
R. Yamaguchi
Toshiyuki Nakata
11
1
0
11 Oct 2022
Continuous Diagnosis and Prognosis by Controlling the Update Process of Deep Neural Networks
Chenxi Sun
Hongyan Li
Moxian Song
D. Cai
B. Zhang
Linda Qiao
21
8
0
06 Oct 2022
DLCFT: Deep Linear Continual Fine-Tuning for General Incremental Learning
Hyounguk Shon
Janghyeon Lee
Seungwook Kim
Junmo Kim
CLL
21
11
0
17 Aug 2022
Empirical investigations on WVA structural issues
Alexey Kutalev
A. Lapina
CLL
13
1
0
11 Aug 2022
Language model compression with weighted low-rank factorization
Yen-Chang Hsu
Ting Hua
Sung-En Chang
Qiang Lou
Yilin Shen
Hongxia Jin
14
92
0
30 Jun 2022
Constrained Imitation Learning for a Flapping Wing Unmanned Aerial Vehicle
T. K C
Taeyoung Lee
16
2
0
08 Jun 2022
Few-Shot Learning by Dimensionality Reduction in Gradient Space
M. Gauch
M. Beck
Thomas Adler
D. Kotsur
Stefan Fiel
...
Markus Holzleitner
Werner Zellinger
D. Klotz
Sepp Hochreiter
Sebastian Lehner
35
9
0
07 Jun 2022
Mitigating Dataset Bias by Using Per-sample Gradient
Sumyeong Ahn
Seongyoon Kim
Se-Young Yun
43
20
0
31 May 2022
Learning to Accelerate by the Methods of Step-size Planning
Hengshuai Yao
21
0
0
01 Apr 2022
Towards Exemplar-Free Continual Learning in Vision Transformers: an Account of Attention, Functional and Weight Regularization
Francesco Pelosin
Saurav Jha
A. Torsello
Bogdan Raducanu
Joost van de Weijer
CLL
21
28
0
24 Mar 2022
Half-Inverse Gradients for Physical Deep Learning
Patrick Schnell
Philipp Holl
Nils Thuerey
8
7
0
18 Mar 2022
Hierarchical Memory Learning for Fine-Grained Scene Graph Generation
Youming Deng
Yansheng Li
Yongjun Zhang
Xiang Xiang
Jian Wang
Jingdong Chen
Jiayi Ma
31
20
0
14 Mar 2022
Efficient Natural Gradient Descent Methods for Large-Scale PDE-Based Optimization Problems
L. Nurbekyan
Wanzhou Lei
Yunbo Yang
15
12
0
13 Feb 2022
A Geometric Understanding of Natural Gradient
Qinxun Bai
S. Rosenberg
Wei Xu
19
2
0
13 Feb 2022
DeepStability: A Study of Unstable Numerical Methods and Their Solutions in Deep Learning
Eliska Kloberdanz
Kyle G. Kloberdanz
Wei Le
14
15
0
07 Feb 2022
Gradient Descent on Neurons and its Link to Approximate Second-Order Optimization
Frederik Benzing
ODL
37
23
0
28 Jan 2022
Fast Moving Natural Evolution Strategy for High-Dimensional Problems
Masahiro Nomura
I. Ono
9
6
0
27 Jan 2022
Efficient Approximations of the Fisher Matrix in Neural Networks using Kronecker Product Singular Value Decomposition
Abdoulaye Koroko
A. Anciaux-Sedrakian
I. B. Gharbia
Valérie Garès
M. Haddou
Quang-Huy Tran
14
5
0
25 Jan 2022
FedLGA: Towards System-Heterogeneity of Federated Learning via Local Gradient Approximation
Xingyu Li
Zhe Qu
Bo Tang
Zhuo Lu
FedML
19
25
0
22 Dec 2021
SCORE: Approximating Curvature Information under Self-Concordant Regularization
Adeyemi Damilare Adeoye
Alberto Bemporad
10
4
0
14 Dec 2021
Depth Without the Magic: Inductive Bias of Natural Gradient Descent
A. Kerekes
Anna Mészáros
Ferenc Huszár
ODL
21
4
0
22 Nov 2021
Training Neural Networks with Fixed Sparse Masks
Yi-Lin Sung
Varun Nair
Colin Raffel
FedML
18
196
0
18 Nov 2021
Merging Models with Fisher-Weighted Averaging
Michael Matena
Colin Raffel
FedML
MoMe
27
348
0
18 Nov 2021
Neuron-based Pruning of Deep Neural Networks with Better Generalization using Kronecker Factored Curvature Approximation
Abdolghani Ebrahimi
Diego Klabjan
12
4
0
16 Nov 2021
On the Importance of Firth Bias Reduction in Few-Shot Classification
Saba Ghaffari
Ehsan Saleh
David A. Forsyth
Yu-xiong Wang
30
13
0
06 Oct 2021
Bootstrapped Meta-Learning
Sebastian Flennerhag
Yannick Schroecker
Tom Zahavy
Hado van Hasselt
David Silver
Satinder Singh
28
58
0
09 Sep 2021
Incremental Learning for Personalized Recommender Systems
Yunbo Ouyang
Jun Shi
Haichao Wei
Huiji Gao
BDL
CLL
OffRL
16
3
0
13 Aug 2021
Towards Zero-shot Language Modeling
E. Ponti
Ivan Vulić
Ryan Cotterell
Roi Reichart
Anna Korhonen
22
19
0
06 Aug 2021
Entropic alternatives to initialization
Daniele Musso
37
1
0
16 Jul 2021
The Bayesian Learning Rule
Mohammad Emtiyaz Khan
Håvard Rue
BDL
55
73
0
09 Jul 2021
On the Variance of the Fisher Information for Deep Learning
Alexander Soen
Ke Sun
FedML
FAtt
6
15
0
09 Jul 2021
Task-agnostic Continual Learning with Hybrid Probabilistic Models
Polina Kirichenko
Mehrdad Farajtabar
Dushyant Rao
Balaji Lakshminarayanan
Nir Levine
Ang Li
Huiyi Hu
A. Wilson
Razvan Pascanu
VLM
BDL
CLL
14
19
0
24 Jun 2021
Spectral Normalisation for Deep Reinforcement Learning: an Optimisation Perspective
Florin Gogianu
Tudor Berariu
Mihaela Rosca
Claudia Clopath
L. Buşoniu
Razvan Pascanu
16
52
0
11 May 2021
Noether's Learning Dynamics: Role of Symmetry Breaking in Neural Networks
Hidenori Tanaka
D. Kunin
19
26
0
06 May 2021
Asymptotic Freeness of Layerwise Jacobians Caused by Invariance of Multilayer Perceptron: The Haar Orthogonal Case
B. Collins
Tomohiro Hayase
22
7
0
24 Mar 2021
A Distributed Optimisation Framework Combining Natural Gradient with Hessian-Free for Discriminative Sequence Training
Adnan Haider
Chao Zhang
Florian Kreyssig
P. Woodland
6
7
0
12 Mar 2021