From Tokens to Words: On the Inner Lexicon of LLMs

8 October 2024

Roy Schwartz

Papers citing "From Tokens to Words: On the Inner Lexicon of LLMs"

| Title | Authors | Tags | | Date |
|---|---|---|---|---|
| Layers at Similar Depths Generate Similar Activations Across LLM Architectures | Christopher Wolfram, Aaron Schein | | 37 1 0 | 03 Apr 2025 |
| Page Classification for Print Imaging Pipeline | Shaoyuan Xu, Cheng Lu, Mark Shaw, Peter Bauer, J. Allebach | VLM | 43 1 0 | 03 Apr 2025 |
| Is the Reversal Curse a Binding Problem? Uncovering Limitations of Transformers from a Basic Generalization Failure | Boshi Wang, Huan Sun | | 44 2 0 | 02 Apr 2025 |
| Follow the Flow: On Information Flow Across Textual Tokens in Text-to-Image Models | Guy Kaplan, Michael Toker, Yuval Reif, Yonatan Belinkov, Roy Schwartz | DiffM | 50 0 0 | 01 Apr 2025 |
| SuperBPE: Space Travel for Language Models | Alisa Liu, J. Hayase, Valentin Hofmann, Sewoong Oh, Noah A. Smith, Yejin Choi | | 53 3 0 | 17 Mar 2025 |
| Adversarial Tokenization | Renato Lui Geh, Zilei Shao, Mathias Niepert | SILM, AAML | 87 0 0 | 04 Mar 2025 |
| Investigating Neurons and Heads in Transformer-based LLMs for Typographical Errors | Kohei Tsuji, Tatsuya Hiraoka, Yuchang Cheng, Eiji Aramaki, Tomoya Iwakura | | 79 0 0 | 27 Feb 2025 |
| Probing Semantic Routing in Large Mixture-of-Expert Models | M. L. Olson, Neale Ratzlaff, Musashi Hinck, Man Luo, Sungduk Yu, Chendi Xue, Vasudev Lal | MoE, LRM | 57 2 0 | 15 Feb 2025 |
| Enhancing LLM Character-Level Manipulation via Divide and Conquer | Zhen Xiong, Yujun Cai, Bryan Hooi, Nanyun Peng, Kai-Wei Chang, Zhecheng Li | | 70 0 0 | 12 Feb 2025 |
| Weight-based Analysis of Detokenization in Language Models: Understanding the First Stage of Inference Without Inference | Go Kamoda, Benjamin Heinzerling, Tatsuro Inaba, Keito Kudo, Keisuke Sakaguchi, Kentaro Inui | MILM | 38 2 0 | 27 Jan 2025 |