From Compression to Expansion: A Layerwise Analysis of In-Context Learning
arXiv:2505.17322 · 22 May 2025
Jiachen Jiang, Yuxin Dong, Jinxin Zhou, Zhihui Zhu

Papers citing "From Compression to Expansion: A Layerwise Analysis of In-Context Learning"

36 / 36 papers shown

Adaptive Layer-skipping in Pre-trained LLMs
Xuan Luo, Weizhi Wang, Xifeng Yan
298 · 1 · 0 · 31 Mar 2025

Layer by Layer: Uncovering Hidden Representations in Language Models
Oscar Skean, Md Rifat Arefin, Dan Zhao, Niket Patel, Jalal Naghiyev, Yann LeCun, Ravid Shwartz-Ziv
MILM, AIFin
121 · 14 · 0 · 04 Feb 2025

A Law of Next-Token Prediction in Large Language Models
Hangfeng He, Weijie J. Su
47 · 7 · 0 · 24 Aug 2024

DETAIL: Task DEmonsTration Attribution for Interpretable In-context Learning
Zijian Zhou, Xiaoqiang Lin, Xinyi Xu, Alok Prakash, Daniela Rus, K. H. Low
43 · 4 · 0 · 22 May 2024

In-Context Learning State Vector with Inner and Momentum Optimization
Dongfang Li, Zhenyu Liu, Xinshuo Hu, Zetian Sun, Baotian Hu, Min Zhang
61 · 7 · 0 · 17 Apr 2024

ShortGPT: Layers in Large Language Models are More Redundant Than You Expect
Xin Men, Mingyu Xu, Qingyu Zhang, Bingning Wang, Hongyu Lin, Yaojie Lu, Xianpei Han, Weipeng Chen
60 · 122 · 0 · 06 Mar 2024

How do Large Language Models Learn In-Context? Query and Key Matrices of In-Context Heads are Two Towers for Metric Learning
Zeping Yu, Sophia Ananiadou
61 · 11 · 0 · 05 Feb 2024

DeepSeek-Coder: When the Large Language Model Meets Programming -- The Rise of Code Intelligence
Daya Guo, Qihao Zhu, Dejian Yang, Zhenda Xie, Kai Dong, ..., Yu-Huan Wu, Yiming Li, Fuli Luo, Yingfei Xiong, W. Liang
ELM
67 · 716 · 0 · 25 Jan 2024

LLM360: Towards Fully Transparent Open-Source LLMs
Zhengzhong Liu, Aurick Qiao, Willie Neiswanger, Hongyi Wang, Bowen Tan, ..., Zhiting Hu, Mark Schulze, Preslav Nakov, Timothy Baldwin, Eric Xing
65 · 72 · 0 · 11 Dec 2023

Mamba: Linear-Time Sequence Modeling with Selective State Spaces
Albert Gu, Tri Dao
Mamba
43 · 2,552 · 0 · 01 Dec 2023

In-context Vectors: Making In Context Learning More Effective and Controllable Through Latent Space Steering
Sheng Liu, Haotian Ye, Lei Xing, James Y. Zou
39 · 97 · 0 · 11 Nov 2023

In-Context Learning Creates Task Vectors
Roee Hendel, Mor Geva, Amir Globerson
53 · 146 · 0 · 24 Oct 2023

Function Vectors in Large Language Models
Eric Todd, Millicent Li, Arnab Sen Sharma, Aaron Mueller, Byron C. Wallace, David Bau
25 · 111 · 0 · 23 Oct 2023

Generalized Neural Collapse for a Large Number of Classes
Jiachen Jiang, Jinxin Zhou, Peng Wang, Qing Qu, Dustin Mixon, Chong You, Zhihui Zhu
AI4CE
43 · 27 · 0 · 09 Oct 2023

In-Context Learning Learns Label Relationships but Is Not Conventional Learning
Jannik Kossen, Y. Gal, Tom Rainforth
80 · 34 · 0 · 23 Jul 2023

Lost in the Middle: How Language Models Use Long Contexts
Nelson F. Liu, Kevin Lin, John Hewitt, Ashwin Paranjape, Michele Bevilacqua, Fabio Petroni, Percy Liang
RALM
62 · 1,521 · 0 · 06 Jul 2023

Transformers learn to implement preconditioned gradient descent for in-context learning
Kwangjun Ahn, Xiang Cheng, Hadi Daneshmand, S. Sra
ODL
53 · 159 · 0 · 01 Jun 2023

How Does Information Bottleneck Help Deep Learning?
Kenji Kawaguchi, Zhun Deng, Xu Ji, Jiaoyang Huang
54 · 58 · 0 · 30 May 2023

Label Words are Anchors: An Information Flow Perspective for Understanding In-Context Learning
Lean Wang, Lei Li, Damai Dai, Deli Chen, Hao Zhou, Fandong Meng, Jie Zhou, Xu Sun
86 · 183 · 0 · 23 May 2023

Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling
Stella Biderman, Hailey Schoelkopf, Quentin G. Anthony, Herbie Bradley, Kyle O'Brien, ..., USVSN Sai Prashanth, Edward Raff, Aviya Skowron, Lintang Sutawika, Oskar van der Wal
40 · 1,231 · 0 · 03 Apr 2023

Why Can GPT Learn In-Context? Language Models Implicitly Perform Gradient Descent as Meta-Optimizers
Damai Dai, Yutao Sun, Li Dong, Y. Hao, Shuming Ma, Zhifang Sui, Furu Wei
LRM
45 · 159 · 0 · 20 Dec 2022

Transformers learn in-context by gradient descent
J. Oswald, Eyvind Niklasson, E. Randazzo, João Sacramento, A. Mordvintsev, A. Zhmoginov, Max Vladymyrov
MLT
65 · 463 · 0 · 15 Dec 2022

Are All Losses Created Equal: A Neural Collapse Perspective
Jinxin Zhou, Chong You, Xiao Li, Kangning Liu, Sheng Liu, Qing Qu, Zhihui Zhu
48 · 63 · 0 · 04 Oct 2022

Nearest Class-Center Simplification through Intermediate Layers
Ido Ben-Shaul, S. Dekel
60 · 27 · 0 · 21 Jan 2022

On the Role of Neural Collapse in Transfer Learning
Tomer Galanti, András György, Marcus Hutter
SSL
37 · 90 · 0 · 30 Dec 2021

An Explanation of In-context Learning as Implicit Bayesian Inference
Sang Michael Xie, Aditi Raghunathan, Percy Liang, Tengyu Ma
ReLM, BDL, VPVLM, LRM
119 · 728 · 0 · 03 Nov 2021

Meta-learning via Language Model In-context Tuning
Yanda Chen, Ruiqi Zhong, Sheng Zha, George Karypis, He He
270 · 160 · 0 · 15 Oct 2021

A Geometric Analysis of Neural Collapse with Unconstrained Features
Zhihui Zhu, Tianyu Ding, Jinxin Zhou, Xiao Li, Chong You, Jeremias Sulam, Qing Qu
51 · 200 · 0 · 06 May 2021

Fantastically Ordered Prompts and Where to Find Them: Overcoming Few-Shot Prompt Order Sensitivity
Yao Lu, Max Bartolo, Alastair Moore, Sebastian Riedel, Pontus Stenetorp
AILaw, LRM
302 · 1,152 · 0 · 18 Apr 2021

Exploring Deep Neural Networks via Layer-Peeled Model: Minority Collapse in Imbalanced Training
Cong Fang, Hangfeng He, Qi Long, Weijie J. Su
FAtt
138 · 170 · 0 · 29 Jan 2021

Prevalence of Neural Collapse during the terminal phase of deep learning training
Vardan Papyan, Xuemei Han, D. Donoho
48 · 563 · 0 · 18 Aug 2020

Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention
Angelos Katharopoulos, Apoorv Vyas, Nikolaos Pappas, François Fleuret
84 · 1,716 · 0 · 29 Jun 2020

Learning Diverse and Discriminative Representations via the Principle of Maximal Coding Rate Reduction
Yaodong Yu, Kwan Ho Ryan Chan, Chong You, Chaobing Song, Yi Ma
SSL
55 · 194 · 0 · 15 Jun 2020

Language Models are Few-Shot Learners
Tom B. Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, ..., Christopher Berner, Sam McCandlish, Alec Radford, Ilya Sutskever, Dario Amodei
BDL
306 · 41,106 · 0 · 28 May 2020

DeeBERT: Dynamic Early Exiting for Accelerating BERT Inference
Ji Xin, Raphael Tang, Jaejun Lee, Yaoliang Yu, Jimmy J. Lin
22 · 370 · 0 · 27 Apr 2020

Efficient Attention: Attention with Linear Complexities
Zhuoran Shen, Mingyuan Zhang, Haiyu Zhao, Shuai Yi, Hongsheng Li
72 · 519 · 0 · 04 Dec 2018