Neurons in Large Language Models: Dead, N-gram, Positional

Neurons in Large Language Models: Dead, N-gram, Positional

9 September 2023

Javier Ferrando

Christoforos Nalmpantis

Papers citing "Neurons in Large Language Models: Dead, N-gram, Positional"

18 / 18 papers shown

Title
Revisiting Transformers through the Lens of Low Entropy and Dynamic Sparsity Ruifeng Ren Yong Liu 156 0 0 26 Apr 2025
Bigram Subnetworks: Mapping to Next Tokens in Transformer Language Models Tyler A. Chang Benjamin Bergen 50 0 0 21 Apr 2025
Repetition Neurons: How Do Language Models Produce Repetitions? Tatsuya Hiraoka Kentaro Inui MILM 75 6 0 21 Feb 2025
Scaling Embedding Layers in Language Models Da Yu Edith Cohen Badih Ghazi Yangsibo Huang Pritish Kamath Ravi Kumar Daogao Liu Chiyuan Zhang 82 0 0 03 Feb 2025
Understanding Layer Significance in LLM Alignment Guangyuan Shi Zexin Lu Xiaoyu Dong Wenlong Zhang Xuanyu Zhang Yujie Feng Xiao-Ming Wu 58 2 0 23 Oct 2024
Can Transformers Learn $n$ -gram Language Models? Anej Svete Nadav Borenstein M. Zhou Isabelle Augenstein Ryan Cotterell 47 7 0 03 Oct 2024
Mitigating Copy Bias in In-Context Learning through Neuron Pruning Ameen Ali Lior Wolf Ivan Titov 44 2 0 02 Oct 2024
A Practical Review of Mechanistic Interpretability for Transformer-Based Language Models Daking Rai Yilun Zhou Shi Feng Abulhair Saparov Ziyu Yao 82 19 0 02 Jul 2024
The Unreasonable Ineffectiveness of the Deeper Layers Andrey Gromov Kushal Tirumala Hassan Shapourian Paolo Glorioso Daniel A. Roberts 52 81 0 26 Mar 2024
The Garden of Forking Paths: Observing Dynamic Parameters Distribution in Large Language Models Carlo Nicolini Jacopo Staiano Bruno Lepri Raffaele Marino MoE 34 1 0 13 Mar 2024
A Simple and Effective Pruning Approach for Large Language Models Mingjie Sun Zhuang Liu Anna Bair J. Zico Kolter 62 359 0 20 Jun 2023
How does GPT-2 compute greater-than?: Interpreting mathematical abilities in a pre-trained language model Michael Hanna Ollie Liu Alexandre Variengien LRM 193 121 0 30 Apr 2023
Dissecting Recall of Factual Associations in Auto-Regressive Language Models Mor Geva Jasmijn Bastings Katja Filippova Amir Globerson KELM 191 266 0 28 Apr 2023
Interpretability in the Wild: a Circuit for Indirect Object Identification in GPT-2 small Kevin Wang Alexandre Variengien Arthur Conmy Buck Shlegeris Jacob Steinhardt 212 497 0 01 Nov 2022
Training language models to follow instructions with human feedback Long Ouyang Jeff Wu Xu Jiang Diogo Almeida Carroll L. Wainwright ... Amanda Askell Peter Welinder Paul Christiano Jan Leike Ryan J. Lowe OSLM ALM 339 12,003 0 04 Mar 2022
The Pile: An 800GB Dataset of Diverse Text for Language Modeling Leo Gao Stella Biderman Sid Black Laurence Golding Travis Hoppe ... Horace He Anish Thite Noa Nabeshima Shawn Presser Connor Leahy AIMat 282 1,996 0 31 Dec 2020
Scaling Laws for Neural Language Models Jared Kaplan Sam McCandlish T. Henighan Tom B. Brown B. Chess R. Child Scott Gray Alec Radford Jeff Wu Dario Amodei 264 4,489 0 23 Jan 2020
The Bottom-up Evolution of Representations in the Transformer: A Study with Machine Translation and Language Modeling Objectives Elena Voita Rico Sennrich Ivan Titov 201 181 0 03 Sep 2019