Deep Grokking: Would Deep Neural Networks Generalize Better?

29 May 2024

Papers citing "Deep Grokking: Would Deep Neural Networks Generalize Better?"


Title
Generalization or Memorization: Dynamic Decoding for Mode Steering Xuanming Zhang 24 0 0 25 Oct 2025
Investigating the Impact of Rational Dilated Wavelet Transform on Motor Imagery EEG Decoding with Deep Learning Models Marco Siino Giuseppe Bonomo Rosario Sorbello Ilenia Tinnirello 4 2 0 10 Oct 2025
Learning words in groups: fusion algebras, tensor ranks and grokking Maor Shutman Oren Louidor Ran Tessler 56 1 0 08 Sep 2025
Reinforcement Learning for Reasoning in Large Language Models with One Training Example Yiping Wang Qing Yang Zhiyuan Zeng Liliang Ren Liu Liu ... Jianfeng Gao Weizhu Chen Shuaiqiang Wang Simon Shaolei Du Haoran Pan OffRL ReLM LRM 512 137 0 29 Apr 2025
NeuralGrok: Accelerate Grokking by Neural Gradient Transformation Xinyu Zhou Simin Fan Martin Jaggi Jie Fu 149 0 0 24 Apr 2025
Language Models "Grok" to Copy North American Chapter of the Association for Computational Linguistics (NAACL), 2024 Ang Lv Ruobing Xie Xingwu Sun Zhanhui Kang Rui Yan LLMAG 222 2 0 14 Sep 2024