2110.12661
Cited By
ZerO Initialization: Initializing Neural Networks with only Zeros and Ones
25 October 2021
Jiawei Zhao
Florian Schäfer
Anima Anandkumar
Papers citing
"ZerO Initialization: Initializing Neural Networks with only Zeros and Ones"
14 / 14 papers shown
Title
ASGO: Adaptive Structured Gradient Optimization
Kang An
Yuxing Liu
Rui Pan
Shiqian Ma
D. Goldfarb
Tong Zhang
ODL
97
2
0
26 Mar 2025
A Good Start Matters: Enhancing Continual Learning with Data-Driven Weight Initialization
Md Yousuf Harun
Christopher Kanan
AI4CE
55
0
0
09 Mar 2025
LORENZA: Enhancing Generalization in Low-Rank Gradient LLM Training via Efficient Zeroth-Order Adaptive SAM
Yehonathan Refael
Iftach Arbel
Ofir Lindenbaum
Tom Tirer
71
0
0
26 Feb 2025
AdaRankGrad: Adaptive Gradient-Rank and Moments for Memory-Efficient LLMs Training and Fine-Tuning
Yehonathan Refael
Jonathan Svirsky
Boris Shustin
Wasim Huleihel
Ofir Lindenbaum
47
3
0
31 Dec 2024
Robust Weight Initialization for Tanh Neural Networks with Fixed Point Analysis
Hyunwoo Lee
Hayoung Choi
Hyunju Kim
39
1
0
03 Oct 2024
An Effective Weight Initialization Method for Deep Learning: Application to Satellite Image Classification
W. Boulila
Eman Alshanqiti
Ayyub Alzahem
Anis Koubaa
Nabil Mlaiki
36
2
0
01 Jun 2024
Leveraging Sub-Optimal Data for Human-in-the-Loop Reinforcement Learning
Calarina Muslimani
Matthew E. Taylor
OffRL
46
2
0
30 Apr 2024
GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection
Jiawei Zhao
Zhenyu Zhang
Beidi Chen
Zhangyang Wang
A. Anandkumar
Yuandong Tian
43
179
0
06 Mar 2024
Improved weight initialization for deep and narrow feedforward neural network
Hyunwoo Lee
Yunho Kim
Seungyeop Yang
Hayoung Choi
ODL
30
3
0
07 Nov 2023
On Efficient Training of Large-Scale Deep Learning Models: A Literature Review
Li Shen
Yan Sun
Zhiyuan Yu
Liang Ding
Xinmei Tian
Dacheng Tao
VLM
30
41
0
07 Apr 2023
QuickSRNet: Plain Single-Image Super-Resolution Architecture for Faster Inference on Mobile Platforms
Guillaume Berger
Manik Dhingra
Antoine Mercier
Yash Savani
Sunny Panchal
Fatih Porikli
SupR
20
5
0
08 Mar 2023
Hypercomplex Image-to-Image Translation
Eleonora Grassucci
Luigi Sigillo
A. Uncini
Danilo Comminiello
29
7
0
04 May 2022
Nonparametric Learning of Two-Layer ReLU Residual Units
Zhunxuan Wang
Linyun He
Chunchuan Lyu
Shay B. Cohen
MLT
OffRL
33
1
0
17 Aug 2020
Dynamical Isometry and a Mean Field Theory of CNNs: How to Train 10,000-Layer Vanilla Convolutional Neural Networks
Lechao Xiao
Yasaman Bahri
Jascha Narain Sohl-Dickstein
S. Schoenholz
Jeffrey Pennington
244
349
0
14 Jun 2018