ResearchTrend.AI
arXiv: 1802.03713
$\mathcal{G}$-SGD: Optimizing ReLU Neural Networks in its Positively Scale-Invariant Space

11 February 2018
Qi Meng
Shuxin Zheng
Huishuai Zhang
Wei Chen
Zhi-Ming Ma
Tie-Yan Liu

Papers citing "$\mathcal{G}$-SGD: Optimizing ReLU Neural Networks in its Positively Scale-Invariant Space"

7 / 7 papers shown
Title
Beyond the Permutation Symmetry of Transformers: The Role of Rotation for Model Fusion
Binchi Zhang
Zaiyi Zheng
Zhengzhang Chen
Wenlin Yao
99
0
0
01 Feb 2025
Fashion-MNIST: a Novel Image Dataset for Benchmarking Machine Learning Algorithms
Han Xiao
Kashif Rasul
Roland Vollgraf
86
8,807
0
25 Aug 2017
The Shattered Gradients Problem: If resnets are the answer, then what is the question?
David Balduzzi
Marcus Frean
Lennox Leary
J. P. Lewis
Kurt Wan-Duo Ma
Brian McWilliams
ODL
44
399
0
28 Feb 2017
Recurrent Neural Networks With Limited Numerical Precision
Joachim Ott
Zhouhan Lin
Ying Zhang
Shih-Chii Liu
Yoshua Bengio
MQ
51
77
0
24 Aug 2016
Identity Mappings in Deep Residual Networks
Kaiming He
Xiangyu Zhang
Shaoqing Ren
Jian Sun
240
10,149
0
16 Mar 2016
ADADELTA: An Adaptive Learning Rate Method
Matthew D. Zeiler
ODL
78
6,619
0
22 Dec 2012
Improving neural networks by preventing co-adaptation of feature detectors
Geoffrey E. Hinton
Nitish Srivastava
Alex Krizhevsky
Ilya Sutskever
Ruslan Salakhutdinov
VLM
338
7,650
0
03 Jul 2012