arXiv:1802.03713
$\mathcal{G}$-SGD: Optimizing ReLU Neural Networks in its Positively Scale-Invariant Space
11 February 2018
Qi Meng
Shuxin Zheng
Huishuai Zhang
Wei Chen
Zhi-Ming Ma
Tie-Yan Liu
Papers citing "$\mathcal{G}$-SGD: Optimizing ReLU Neural Networks in its Positively Scale-Invariant Space"
7 / 7 papers shown
Title
Beyond the Permutation Symmetry of Transformers: The Role of Rotation for Model Fusion
Binchi Zhang
Zaiyi Zheng
Zhengzhang Chen
Jundong Li
99
0
0
01 Feb 2025
Fashion-MNIST: a Novel Image Dataset for Benchmarking Machine Learning Algorithms
Han Xiao
Kashif Rasul
Roland Vollgraf
86
8,807
0
25 Aug 2017
The Shattered Gradients Problem: If resnets are the answer, then what is the question?
David Balduzzi
Marcus Frean
Lennox Leary
J. P. Lewis
Kurt Wan-Duo Ma
Brian McWilliams
ODL
44
399
0
28 Feb 2017
Recurrent Neural Networks With Limited Numerical Precision
Joachim Ott
Zhouhan Lin
Ying Zhang
Shih-Chii Liu
Yoshua Bengio
MQ
51
77
0
24 Aug 2016
Identity Mappings in Deep Residual Networks
Kaiming He
Xiangyu Zhang
Shaoqing Ren
Jian Sun
240
10,149
0
16 Mar 2016
ADADELTA: An Adaptive Learning Rate Method
Matthew D. Zeiler
ODL
78
6,619
0
22 Dec 2012
Improving neural networks by preventing co-adaptation of feature detectors
Geoffrey E. Hinton
Nitish Srivastava
Alex Krizhevsky
Ilya Sutskever
Ruslan Salakhutdinov
VLM
338
7,650
0
03 Jul 2012