No More Pesky Learning Rates

6 June 2012
Tom Schaul
Sixin Zhang
Yann LeCun

Papers citing "No More Pesky Learning Rates"

9 / 9 papers shown
Title
Accelerating Learned Image Compression Through Modeling Neural Training Dynamics
Yichi Zhang
Zhihao Duan
Yuning Huang
Fengqing Zhu
90
0
0
23 May 2025
Reinforcement Teaching
Alex Lewandowski
Calarina Muslimani
Dale Schuurmans
Matthew E. Taylor
Jun Luo
112
1
0
28 Jan 2025
What Does It Mean to Be a Transformer? Insights from a Theoretical Hessian Analysis
Weronika Ormaniec
Felix Dangel
Sidak Pal Singh
77
7
0
14 Oct 2024
AdaFisher: Adaptive Second Order Optimization via Fisher Information
Damien Martins Gomes
Yanlei Zhang
Eugene Belilovsky
Guy Wolf
Mahdi S. Hosseini
ODL
103
2
0
26 May 2024
Second-Order Fine-Tuning without Pain for LLMs: A Hessian Informed Zeroth-Order Optimizer
Yanjun Zhao
Sizhe Dang
Haishan Ye
Guang Dai
Yi Qian
Ivor W. Tsang
81
9
0
23 Feb 2024
Stochastic Optimization for Performative Prediction
Celestine Mendler-Dünner
Juan C. Perdomo
Tijana Zrnic
Moritz Hardt
24
114
0
12 Jun 2020
Adaptive learning rates and parallelization for stochastic, sparse, non-smooth gradients
Tom Schaul
Yann LeCun
ODL
33
28
0
16 Jan 2013
Estimating the Hessian by Back-propagating Curvature
James Martens
Ilya Sutskever
Kevin Swersky
42
80
0
27 Jun 2012
Towards Optimal One Pass Large Scale Learning with Averaged Stochastic Gradient Descent
Wenyuan Xu
39
157
0
13 Jul 2011