Which Neural Net Architectures Give Rise To Exploding and Vanishing Gradients?
Boris Hanin. 11 January 2018. arXiv:1801.03744.
Papers citing "Which Neural Net Architectures Give Rise To Exploding and Vanishing Gradients?" (50 of 50 papers shown)
1. HSplitLoRA: A Heterogeneous Split Parameter-Efficient Fine-Tuning Framework for Large Language Models. Zheng Lin, Yuxin Zhang, Zhe Chen, Zihan Fang, Xianhao Chen, Praneeth Vepakomma, Wei Ni, Jun Luo, Yue Gao. 05 May 2025. [MoE]
2. Low-Loss Space in Neural Networks is Continuous and Fully Connected. Yongding Tian, Zaid Al-Ars, Maksim Kitsak, P. Hofstee. 05 May 2025. [3DPC]
3. Don't be lazy: CompleteP enables compute-efficient deep transformers. Nolan Dey, Bin Claire Zhang, Lorenzo Noci, Mufan Li, Blake Bordelon, Shane Bergsma, Cengiz Pehlevan, Boris Hanin, Joel Hestness. 02 May 2025.
4. Deep Neural Nets as Hamiltonians. Mike Winer, Boris Hanin. 31 Mar 2025.
5. FAIR: Facilitating Artificial Intelligence Resilience in Manufacturing Industrial Internet. Yingyan Zeng, Ismini Lourentzou, Xinwei Deng, R. Jin. 03 Mar 2025. [AI4CE]
6. Federated Learning with Flexible Architectures. Jong-Ik Park, Carlee Joe-Wong. 14 Jun 2024. [FedML]
7. Communication-Efficient Large-Scale Distributed Deep Learning: A Comprehensive Survey. Feng Liang, Zhen Zhang, Haifeng Lu, Victor C. M. Leung, Yanyi Guo, Xiping Hu. 09 Apr 2024. [GNN]
8. Quantitative CLTs in Deep Neural Networks. Stefano Favaro, Boris Hanin, Domenico Marinucci, I. Nourdin, G. Peccati. 12 Jul 2023. [BDL]
9. Intelligent gradient amplification for deep neural networks. S. Basodi, K. Pusuluri, Xueli Xiao, Yi Pan. 29 May 2023. [ODL]
10. Depth Dependence of μP Learning Rates in ReLU MLPs. Samy Jelassi, Boris Hanin, Ziwei Ji, Sashank J. Reddi, Srinadh Bhojanapalli, Sanjiv Kumar. 13 May 2023.
11. A Neural Emulator for Uncertainty Estimation of Fire Propagation. Andrew Bolt, Conrad Sanderson, J. Dabrowski, C. Huston, Petra Kuhnert. 10 May 2023.
12. On Model Compression for Neural Networks: Framework, Algorithm, and Convergence Guarantee. Chenyang Li, Jihoon Chung, Mengnan Du, Haimin Wang, Xianlian Zhou, Bohao Shen. 13 Mar 2023.
13. Error convergence and engineering-guided hyperparameter search of PINNs: towards optimized I-FENN performance. Panos Pantidis, Habiba Eldababy, Christopher Miguel Tagle, M. Mobasher. 03 Mar 2023.
14. Expected Gradients of Maxout Networks and Consequences to Parameter Initialization. Hanna Tseran, Guido Montúfar. 17 Jan 2023. [ODL]
15. Langevin algorithms for very deep Neural Networks with application to image classification. Pierre Bras. 27 Dec 2022.
16. Langevin algorithms for Markovian Neural Networks and Deep Stochastic control. Pierre Bras, Gilles Pagès. 22 Dec 2022.
17. Accelerating Dataset Distillation via Model Augmentation. Lei Zhang, Jie M. Zhang, Bowen Lei, Subhabrata Mukherjee, Xiang Pan, Bo Zhao, Caiwen Ding, Heng Chang, Dongkuan Xu. 12 Dec 2022. [DD]
18. Dynamical Isometry for Residual Networks. Advait Gadhikar, R. Burkholz. 05 Oct 2022. [ODL, AI4CE]
19. Graph Neural Networks Extract High-Resolution Cultivated Land Maps from Sentinel-2 Image Series. Lukasz Tulczyjew, M. Kawulok, Nicolas Longépé, Bertrand Le Saux, J. Nalepa. 03 Aug 2022.
20. PSO-Convolutional Neural Networks with Heterogeneous Learning Rate. N. H. Phong, A. Santos, B. Ribeiro. 20 May 2022.
21. Regularization by Misclassification in ReLU Neural Networks. Elisabetta Cornacchia, Jan Hązła, Ido Nachum, Amir Yehudayoff. 03 Nov 2021. [NoLa]
22. AutoInit: Analytic Signal-Preserving Weight Initialization for Neural Networks. G. Bingham, Risto Miikkulainen. 18 Sep 2021. [ODL]
23. Clipped Hyperbolic Classifiers Are Super-Hyperbolic Classifiers. Yunhui Guo, Xudong Wang, Yubei Chen, Stella X. Yu. 23 Jul 2021.
24. Random Neural Networks in the Infinite Width Limit as Gaussian Processes. Boris Hanin. 04 Jul 2021. [BDL]
25. The Future is Log-Gaussian: ResNets and Their Infinite-Depth-and-Width Limit at Initialization. Mufan Li, Mihai Nica, Daniel M. Roy. 07 Jun 2021.
26. Deep Kronecker neural networks: A general framework for neural networks with adaptive activation functions. Ameya Dilip Jagtap, Yeonjong Shin, Kenji Kawaguchi, George Karniadakis. 20 May 2021. [ODL]
27. Activation function design for deep networks: linearity and effective initialisation. Michael Murray, V. Abrol, Jared Tanner. 17 May 2021. [ODL, LLMSV]
28. Deep limits and cut-off phenomena for neural networks. B. Avelin, A. Karlsson. 21 Apr 2021. [AI4CE]
29. A proof of convergence for stochastic gradient descent in the training of artificial neural networks with ReLU activation for constant target functions. Arnulf Jentzen, Adrian Riekert. 01 Apr 2021. [MLT]
30. Convergence rates for gradient descent in the training of overparameterized artificial neural networks with biases. Arnulf Jentzen, T. Kröger. 23 Feb 2021. [ODL]
31. Deep ReLU Networks Preserve Expected Length. Boris Hanin, Ryan Jeong, David Rolnick. 21 Feb 2021.
32. A proof of convergence for gradient descent in the training of artificial neural networks for constant target functions. Patrick Cheridito, Arnulf Jentzen, Adrian Riekert, Florian Rossmannek. 19 Feb 2021.
33. A Convergence Theory Towards Practical Over-parameterized Deep Neural Networks. Asaf Noy, Yi Tian Xu, Y. Aflalo, Lihi Zelnik-Manor, Rong Jin. 12 Jan 2021.
34. Advances in Electron Microscopy with Deep Learning. Jeffrey M. Ede. 04 Jan 2021.
35. Towards a Mathematical Understanding of Neural Network-Based Machine Learning: what we know and what we don't. Weinan E, Chao Ma, Stephan Wojtowytsch, Lei Wu. 22 Sep 2020. [AI4CE]
36. Tensor Programs III: Neural Matrix Laws. Greg Yang. 22 Sep 2020.
37. Review: Deep Learning in Electron Microscopy. Jeffrey M. Ede. 17 Sep 2020.
38. Tensor Programs II: Neural Tangent Kernel for Any Architecture. Greg Yang. 25 Jun 2020.
39. Non-convergence of stochastic gradient descent in the training of deep neural networks. Patrick Cheridito, Arnulf Jentzen, Florian Rossmannek. 12 Jun 2020.
40. Composite Travel Generative Adversarial Networks for Tabular and Sequential Population Synthesis. Godwin Badu-Marfo, Bilal Farooq, Zachary Patterson. 15 Apr 2020.
41. A Survey of Deep Learning for Scientific Discovery. M. Raghu, Erica Schmidt. 26 Mar 2020. [OOD, AI4CE]
42. Machine Learning from a Continuous Viewpoint. Weinan E, Chao Ma, Lei Wu. 30 Dec 2019.
43. Optimization for deep learning: theory and algorithms. Ruoyu Sun. 19 Dec 2019. [ODL]
44. Finite Depth and Width Corrections to the Neural Tangent Kernel. Boris Hanin, Mihai Nica. 13 Sep 2019. [MDE]
45. Infinitely deep neural networks as diffusion processes. Stefano Peluchetti, Stefano Favaro. 27 May 2019. [ODL]
46. Data driven approximation of parametrized PDEs by Reduced Basis and Neural Networks. N. D. Santo, S. Deparis, Luca Pegolotti. 02 Apr 2019.
47. Interpreting Neural Networks Using Flip Points. Roozbeh Yousefzadeh, D. O'Leary. 21 Mar 2019. [AAML, FAtt]
48. On the security relevance of weights in deep learning. Kathrin Grosse, T. A. Trost, Marius Mosbach, Michael Backes, Dietrich Klakow. 08 Feb 2019. [AAML]
49. How to Start Training: The Effect of Initialization and Architecture. Boris Hanin, David Rolnick. 05 Mar 2018.
50. The Loss Surfaces of Multilayer Networks. A. Choromańska, Mikael Henaff, Michaël Mathieu, Gerard Ben Arous, Yann LeCun. 30 Nov 2014. [ODL]