ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2310.17813
  4. Cited By
A Spectral Condition for Feature Learning

A Spectral Condition for Feature Learning

26 October 2023
Greg Yang
James B. Simon
Jeremy Bernstein
ArXivPDFHTML

Papers citing "A Spectral Condition for Feature Learning"

10 / 10 papers shown
Title
Time Transfer: On Optimal Learning Rate and Batch Size In The Infinite Data Limit
Time Transfer: On Optimal Learning Rate and Batch Size In The Infinite Data Limit
Oleg Filatov
Jan Ebert
Jiangtao Wang
Stefan Kesselheim
36
3
0
10 Jan 2025
AdaRankGrad: Adaptive Gradient-Rank and Moments for Memory-Efficient LLMs Training and Fine-Tuning
AdaRankGrad: Adaptive Gradient-Rank and Moments for Memory-Efficient LLMs Training and Fine-Tuning
Yehonathan Refael
Jonathan Svirsky
Boris Shustin
Wasim Huleihel
Ofir Lindenbaum
41
3
0
31 Dec 2024
The Optimization Landscape of SGD Across the Feature Learning Strength
The Optimization Landscape of SGD Across the Feature Learning Strength
Alexander B. Atanasov
Alexandru Meterez
James B. Simon
C. Pehlevan
43
2
0
06 Oct 2024
u-$\mu$P: The Unit-Scaled Maximal Update Parametrization
u-μ\muμP: The Unit-Scaled Maximal Update Parametrization
Charlie Blake
C. Eichenberg
Josef Dean
Lukas Balles
Luke Y. Prince
Bjorn Deiseroth
Andres Felipe Cruz Salinas
Carlo Luschi
Samuel Weinbach
Douglas Orr
53
9
0
24 Jul 2024
Infinite Limits of Multi-head Transformer Dynamics
Infinite Limits of Multi-head Transformer Dynamics
Blake Bordelon
Hamza Tahir Chaudhry
C. Pehlevan
AI4CE
47
9
0
24 May 2024
GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection
GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection
Jiawei Zhao
Zhenyu (Allen) Zhang
Beidi Chen
Zhangyang Wang
A. Anandkumar
Yuandong Tian
43
175
0
06 Mar 2024
Meta-Principled Family of Hyperparameter Scaling Strategies
Meta-Principled Family of Hyperparameter Scaling Strategies
Sho Yaida
50
16
0
10 Oct 2022
Learning by Turning: Neural Architecture Aware Optimisation
Learning by Turning: Neural Architecture Aware Optimisation
Yang Liu
Jeremy Bernstein
M. Meister
Yisong Yue
ODL
41
26
0
14 Feb 2021
On the distance between two neural networks and the stability of
  learning
On the distance between two neural networks and the stability of learning
Jeremy Bernstein
Arash Vahdat
Yisong Yue
Xuan Li
ODL
200
57
0
09 Feb 2020
Trainability and Accuracy of Neural Networks: An Interacting Particle
  System Approach
Trainability and Accuracy of Neural Networks: An Interacting Particle System Approach
Grant M. Rotskoff
Eric Vanden-Eijnden
59
118
0
02 May 2018
1