ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2402.15175
  4. Cited By
Unified View of Grokking, Double Descent and Emergent Abilities: A
  Perspective from Circuits Competition

Unified View of Grokking, Double Descent and Emergent Abilities: A Perspective from Circuits Competition

23 February 2024
Yufei Huang
Shengding Hu
Xu Han
Zhiyuan Liu
Maosong Sun
ArXivPDFHTML

Papers citing "Unified View of Grokking, Double Descent and Emergent Abilities: A Perspective from Circuits Competition"

4 / 4 papers shown
Title
Beyond the Next Token: Towards Prompt-Robust Zero-Shot Classification via Efficient Multi-Token Prediction
Beyond the Next Token: Towards Prompt-Robust Zero-Shot Classification via Efficient Multi-Token Prediction
Junlang Qian
Zixiao Zhu
Hanzhang Zhou
Zijian Feng
Zepeng Zhai
K. Mao
AAML
VLM
40
0
0
04 Apr 2025
Information-Theoretic Progress Measures reveal Grokking is an Emergent
  Phase Transition
Information-Theoretic Progress Measures reveal Grokking is an Emergent Phase Transition
Kenzo Clauw
S. Stramaglia
Daniele Marinazzo
50
3
0
16 Aug 2024
A Practical Review of Mechanistic Interpretability for Transformer-Based Language Models
A Practical Review of Mechanistic Interpretability for Transformer-Based Language Models
Daking Rai
Yilun Zhou
Shi Feng
Abulhair Saparov
Ziyu Yao
82
19
0
02 Jul 2024
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro
LRM
AI4CE
ReLM
389
8,495
0
28 Jan 2022
1