ResearchTrend.AI

Multi-scale Feature Learning Dynamics: Insights for Double Descent
arXiv:2112.03215 · 6 December 2021
Mohammad Pezeshki · Amartya Mitra · Yoshua Bengio · Guillaume Lajoie

Papers citing "Multi-scale Feature Learning Dynamics: Insights for Double Descent" (19 papers shown)

A dynamic view of the double descent
Vivek Shripad Borkar · 03 May 2025

Grokking in the Wild: Data Augmentation for Real-World Multi-Hop Reasoning with Transformers
Roman Abramov, Felix Steinbauer, Gjergji Kasneci · 29 Apr 2025

On the Relationship Between Double Descent of CNNs and Shape/Texture Bias Under Learning Process
Shun Iwase, Shuya Takahashi, Nakamasa Inoue, Rio Yokota, Ryo Nakamura, Hirokatsu Kataoka · 04 Mar 2025

The Fair Language Model Paradox
Andrea Pinto, Tomer Galanti, Randall Balestriero · 15 Oct 2024

Unified Neural Network Scaling Laws and Scale-time Equivalence
Akhilan Boopathy, Ila Fiete · 09 Sep 2024

Towards understanding epoch-wise double descent in two-layer linear neural networks
Amanda Olmin, Fredrik Lindsten · 13 Jul 2024 · Tags: MLT

Grokfast: Accelerated Grokking by Amplifying Slow Gradients
Jaerin Lee, Bong Gyun Kang, Kihoon Kim, Kyoung Mu Lee · 30 May 2024

No Data Augmentation? Alternative Regularizations for Effective Training on Small Datasets
Lorenzo Brigato, S. Mougiakakou · 04 Sep 2023

Don't blame Dataset Shift! Shortcut Learning due to Gradients and Cross Entropy
A. Puli, Lily H. Zhang, Yoav Wald, Rajesh Ranganath · 24 Aug 2023

Predicting Grokking Long Before it Happens: A look into the loss landscape of models which grok
Pascal Junior Tikeng Notsawo, Hattie Zhou, Mohammad Pezeshki, Irina Rish, G. Dumas · 23 Jun 2023

Deep incremental learning models for financial temporal tabular datasets with distribution shifts
Thomas Wong, Mauricio Barahona · 14 Mar 2023 · Tags: OOD, AIFin, AI4TS

Unifying Grokking and Double Descent
Peter W. Battaglia, David Raposo, Kelsey · 10 Mar 2023

Over-training with Mixup May Hurt Generalization
Zixuan Liu, Ziqiao Wang, Hongyu Guo, Yongyi Mao · 02 Mar 2023 · Tags: NoLa

Grokking phase transitions in learning local rules with gradient descent
Bojan Žunkovič, E. Ilievski · 26 Oct 2022

The BUTTER Zone: An Empirical Study of Training Dynamics in Fully Connected Neural Networks
Charles Edison Tripp, J. Perr-Sauer, L. Hayne, M. Lunacek, Jamil Gafur · 25 Jul 2022 · Tags: AI4CE

Towards Understanding Grokking: An Effective Theory of Representation Learning
Ziming Liu, O. Kitouni, Niklas Nolte, Eric J. Michaud, Max Tegmark, Mike Williams · 20 May 2022 · Tags: AI4CE

Generalizing similarity in noisy setups: the DIBS phenomenon
Nayara Fonseca, V. Guidetti · 30 Jan 2022

Neural Mechanics: Symmetry and Broken Conservation Laws in Deep Learning Dynamics
D. Kunin, Javier Sagastuy-Breña, Surya Ganguli, Daniel L. K. Yamins, Hidenori Tanaka · 08 Dec 2020

Double Trouble in Double Descent: Bias and Variance(s) in the Lazy Regime
Stéphane d'Ascoli, Maria Refinetti, Giulio Biroli, Florent Krzakala · 02 Mar 2020