2402.15175
Cited By
Unified View of Grokking, Double Descent and Emergent Abilities: A Perspective from Circuits Competition
23 February 2024
Yufei Huang
Shengding Hu
Xu Han
Zhiyuan Liu
Maosong Sun
Papers citing
"Unified View of Grokking, Double Descent and Emergent Abilities: A Perspective from Circuits Competition"
4 / 4 papers shown
Title
Beyond the Next Token: Towards Prompt-Robust Zero-Shot Classification via Efficient Multi-Token Prediction
Junlang Qian
Zixiao Zhu
Hanzhang Zhou
Zijian Feng
Zepeng Zhai
K. Mao
AAML
VLM
40
0
0
04 Apr 2025
Information-Theoretic Progress Measures reveal Grokking is an Emergent Phase Transition
Kenzo Clauw
S. Stramaglia
Daniele Marinazzo
50
3
0
16 Aug 2024
A Practical Review of Mechanistic Interpretability for Transformer-Based Language Models
Daking Rai
Yilun Zhou
Shi Feng
Abulhair Saparov
Ziyu Yao
82
19
0
02 Jul 2024
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro
LRM
AI4CE
ReLM
389
8,495
0
28 Jan 2022