The Transient Nature of Emergent In-Context Learning in Transformers
14 November 2023
Aaditya K. Singh, Stephanie C. Y. Chan, Ted Moskovitz, Erin Grant, Andrew M. Saxe, Felix Hill

Papers citing "The Transient Nature of Emergent In-Context Learning in Transformers" (30 papers shown)

In-Context Learning can distort the relationship between sequence likelihoods and biological fitness (23 Apr 2025)
Pranav Kantroo, Günter P. Wagner, Benjamin B. Machta

How do language models learn facts? Dynamics, curricula and hallucinations (27 Mar 2025)
Nicolas Zucchet, J. Bornschein, Stephanie C. Y. Chan, Andrew Kyle Lampinen, Razvan Pascanu, Soham De
Tags: KELM, HILM, LRM

Strategy Coopetition Explains the Emergence and Transience of In-Context Learning (07 Mar 2025)
Aaditya K. Singh, Ted Moskovitz, Sara Dragutinovic, Felix Hill, Stephanie C. Y. Chan, Andrew Saxe

Training Dynamics of In-Context Learning in Linear Attention (28 Jan 2025)
Yedi Zhang, Aaditya K. Singh, Peter E. Latham, Andrew Saxe
Tags: MLT

What Matters for In-Context Learning: A Balancing Act of Look-up and In-Weight Learning (09 Jan 2025)
Jelena Bratulić, Sudhanshu Mittal, Christian Rupprecht, Thomas Brox

Can Custom Models Learn In-Context? An Exploration of Hybrid Architecture Performance on In-Context Learning Tasks (06 Nov 2024)
Ryan Campbell, Nelson Lojo, Kesava Viswanadha, Christoffer Grondal Tryggestad, Derrick Han Sun, Sriteja Vijapurapu, August Rolfsen, Anant Sahai

N-Gram Induction Heads for In-Context RL: Improving Stability and Reducing Data Needs (04 Nov 2024)
Ilya Zisman, Alexander Nikulin, Andrei Polubarov, Nikita Lyubaykin, Vladislav Kurenkov, Igor Kiselev
Tags: OffRL

Weight decay induces low-rank attention layers (31 Oct 2024)
Seijin Kobayashi, Yassir Akram, J. Oswald

Toward Understanding In-context vs. In-weight Learning (30 Oct 2024)
Bryan Chan, Xinyi Chen, András Gyorgy, Dale Schuurmans

Wrong-of-Thought: An Integrated Reasoning Framework with Multi-Perspective Verification and Wrong Information (06 Oct 2024)
Yongheng Zhang, Qiguang Chen, Jingxuan Zhou, Peng Wang, Jiasheng Si, Jin Wang, Wenpeng Lu, Libo Qin
Tags: LRM

In-context Learning in Presence of Spurious Correlations (04 Oct 2024)
Hrayr Harutyunyan, R. Darbinyan, Samvel Karapetyan, Hrant Khachatrian
Tags: LRM

Geometric Signatures of Compositionality Across a Language Model's Lifetime (02 Oct 2024)
Jin Hwa Lee, Thomas Jiralerspong, Lei Yu, Yoshua Bengio, Emily Cheng
Tags: CoGe

Transformer Normalisation Layers and the Independence of Semantic Subspaces (25 Jun 2024)
S. Menary, Samuel Kaski, Andre Freitas

What Do Language Models Learn in Context? The Structured Task Hypothesis (06 Jun 2024)
Jiaoda Li, Yifan Hou, Mrinmaya Sachan, Ryan Cotterell
Tags: LRM

Is In-Context Learning in Large Language Models Bayesian? A Martingale Perspective (02 Jun 2024)
Fabian Falck, Ziyu Wang, Chris Holmes

Dual Process Learning: Controlling Use of In-Context vs. In-Weights Strategies with Weight Forgetting (28 May 2024)
Suraj Anand, Michael A. Lepori, Jack Merullo, Ellie Pavlick
Tags: CLL

Towards Better Understanding of In-Context Learning Ability from In-Context Uncertainty Quantification (24 May 2024)
Shang Liu, Zhongze Cai, Guanting Chen, Xiaocheng Li
Tags: UQCV

Asymptotic theory of in-context learning by linear attention (20 May 2024)
Yue M. Lu, Mary I. Letey, Jacob A. Zavatone-Veth, Anindita Maiti, C. Pehlevan

Better & Faster Large Language Models via Multi-token Prediction (30 Apr 2024)
Fabian Gloeckle, Badr Youbi Idrissi, Baptiste Rozière, David Lopez-Paz, Gabriele Synnaeve

A phase transition between positional and semantic learning in a solvable model of dot-product attention (06 Feb 2024)
Hugo Cui, Freya Behrens, Florent Krzakala, Lenka Zdeborová
Tags: MLT

The mechanistic basis of data dependence and abrupt learning in an in-context classification task (03 Dec 2023)
Gautam Reddy

How are Prompts Different in Terms of Sensitivity? (13 Nov 2023)
Sheng Lu, Hendrik Schuff, Iryna Gurevych

The Mystery of In-Context Learning: A Comprehensive Survey on Interpretation and Analysis (01 Nov 2023)
Yuxiang Zhou, Jiazheng Li, Yanzheng Xiang, Hanqi Yan, Lin Gui, Yulan He

Breaking through the learning plateaus of in-context learning in Transformer (12 Sep 2023)
Jingwen Fu, Tao Yang, Yuwang Wang, Yan Lu, Nanning Zheng

Uncovering mesa-optimization algorithms in Transformers (11 Sep 2023)
J. Oswald, Eyvind Niklasson, Maximilian Schlegel, Seijin Kobayashi, Nicolas Zucchet, ..., Mark Sandler, Blaise Agüera y Arcas, Max Vladymyrov, Razvan Pascanu, João Sacramento

Explaining grokking through circuit efficiency (05 Sep 2023)
Vikrant Varma, Rohin Shah, Zachary Kenton, János Kramár, Ramana Kumar

The Learnability of In-Context Learning (14 Mar 2023)
Noam Wies, Yoav Levine, Amnon Shashua

In-context Learning and Induction Heads (24 Sep 2022)
Catherine Olsson, Nelson Elhage, Neel Nanda, Nicholas Joseph, Nova Dassarma, ..., Tom B. Brown, Jack Clark, Jared Kaplan, Sam McCandlish, C. Olah

Scaling Laws for Neural Language Models (23 Jan 2020)
Jared Kaplan, Sam McCandlish, T. Henighan, Tom B. Brown, B. Chess, R. Child, Scott Gray, Alec Radford, Jeff Wu, Dario Amodei

Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks (09 Mar 2017)
Chelsea Finn, Pieter Abbeel, Sergey Levine
Tags: OOD