2503.21073
Cited By
v1
v2 (latest)
Shared Global and Local Geometry of Language Model Embeddings
27 March 2025
Andrew Lee
Melanie Weber
Fernanda Viégas
Martin Wattenberg
FedML
Papers citing
"Shared Global and Local Geometry of Language Model Embeddings"
33 / 33 papers shown
Title
Jailbreak Strength and Model Similarity Predict Transferability
Rico Angell
Jannik Brinkmann
He He
24
0
0
15 Jun 2025
Training-Free Tokenizer Transplantation via Orthogonal Matching Pursuit
Charles Goddard
Fernando Fernandes Neto
30
0
0
07 Jun 2025
Transferring Features Across Language Models With Model Stitching
Alan Chen
Jack Merullo
Alessandro Stolfo
Ellie Pavlick
35
0
0
07 Jun 2025
Do different prompting methods yield a common task representation in language models?
Guy Davidson
Todd M. Gureckis
Brenden M. Lake
Adina Williams
58
2
0
17 May 2025
Probing the Vulnerability of Large Language Models to Polysemantic Interventions
Bofan Gong
Shiyang Lai
Dawn Song
AAML
MILM
72
1
0
16 May 2025
The Geometry of Self-Verification in a Task-Specific Reasoning Model
Andrew Lee
Lihao Sun
Chris Wendler
Fernanda Viégas
Martin Wattenberg
LRM
172
1
0
19 Apr 2025
RespDiff: An End-to-End Multi-scale RNN Diffusion Model for Respiratory Waveform Estimation from PPG Signals
Yuyang Miao
Zehua Chen
Chong Li
Danilo Mandic
DiffM
MedIm
79
9
0
06 Oct 2024
Gemma 2: Improving Open Language Models at a Practical Size
Gemma Team
Morgane Riviere
Shreya Pathak
Pier Giuseppe Sessa
Cassidy Hardin
...
Noah Fiedel
Armand Joulin
Kathleen Kenealy
Robert Dadashi
Alek Andreev
VLM
MoE
OSLM
149
922
0
31 Jul 2024
The Geometry of Categorical and Hierarchical Concepts in Large Language Models
Kiho Park
Yo Joong Choe
Yibo Jiang
Victor Veitch
133
41
0
03 Jun 2024
The Platonic Representation Hypothesis
Minyoung Huh
Brian Cheung
Tongzhou Wang
Phillip Isola
138
142
0
13 May 2024
Fishing for Magikarp: Automatically Detecting Under-trained Tokens in Large Language Models
Sander Land
Max Bartolo
116
25
0
08 May 2024
Universal Neurons in GPT2 Language Models
Wes Gurnee
Theo Horsley
Zifan Carl Guo
Tara Rezaei Kheirkhah
Qinyi Sun
Will Hathaway
Neel Nanda
Dimitris Bertsimas
MILM
158
47
0
22 Jan 2024
A Mechanistic Understanding of Alignment Algorithms: A Case Study on DPO and Toxicity
Andrew Lee
Xiaoyan Bai
Itamar Pres
Martin Wattenberg
Jonathan K. Kummerfeld
Rada Mihalcea
147
121
0
03 Jan 2024
Steering Llama 2 via Contrastive Activation Addition
Nina Rimsky
Nick Gabrieli
Julian Schulz
Meg Tong
Evan Hubinger
Alexander Matt Turner
LLMSV
61
226
0
09 Dec 2023
The Linear Representation Hypothesis and the Geometry of Large Language Models
Kiho Park
Yo Joong Choe
Victor Veitch
LLMSV
MILM
176
190
0
07 Nov 2023
Circuit Component Reuse Across Tasks in Transformer Language Models
Jack Merullo
Carsten Eickhoff
Ellie Pavlick
84
71
0
12 Oct 2023
Emergent Linear Representations in World Models of Self-Supervised Sequence Models
Neel Nanda
Andrew Lee
Martin Wattenberg
FAtt
MILM
122
186
0
02 Sep 2023
Inference-Time Intervention: Eliciting Truthful Answers from a Language Model
Kenneth Li
Oam Patel
Fernanda Viégas
Hanspeter Pfister
Martin Wattenberg
KELM
HILM
160
584
0
06 Jun 2023
A Toy Model of Universality: Reverse Engineering How Networks Learn Group Operations
Bilal Chughtai
Lawrence Chan
Neel Nanda
116
103
0
06 Feb 2023
Discovering Language Model Behaviors with Model-Written Evaluations
Ethan Perez
Sam Ringer
Kamilė Lukošiūtė
Karina Nguyen
Edwin Chen
...
Danny Hernandez
Deep Ganguli
Evan Hubinger
Nicholas Schiefer
Jared Kaplan
ALM
97
407
0
19 Dec 2022
Linearly Mapping from Image to Text Space
Jack Merullo
Louis Castricato
Carsten Eickhoff
Ellie Pavlick
VLM
248
118
0
30 Sep 2022
Transformer Feed-Forward Layers Build Predictions by Promoting Concepts in the Vocabulary Space
Mor Geva
Avi Caciularu
Ke Wang
Yoav Goldberg
KELM
144
389
0
28 Mar 2022
Revisiting Model Stitching to Compare Neural Representations
Yamini Bansal
Preetum Nakkiran
Boaz Barak
FedML
117
121
0
14 Jun 2021
Contrastive Learning Inverts the Data Generating Process
Roland S. Zimmermann
Yash Sharma
Steffen Schneider
Matthias Bethge
Wieland Brendel
SSL
376
223
0
17 Feb 2021
RealToxicityPrompts: Evaluating Neural Toxic Degeneration in Language Models
Samuel Gehman
Suchin Gururangan
Maarten Sap
Yejin Choi
Noah A. Smith
228
1,224
0
24 Sep 2020
Embedding Comparator: Visualizing Differences in Global Structure and Local Neighborhoods via Small Multiples
Angie Boggust
Brandon Carter
Arvind Satyanarayan
102
65
0
10 Dec 2019
Gromov-Wasserstein Alignment of Word Embedding Spaces
David Alvarez-Melis
Tommi Jaakkola
OT
60
328
0
31 Aug 2018
Adversarial Reprogramming of Neural Networks
Gamaleldin F. Elsayed
Ian Goodfellow
Jascha Narain Sohl-Dickstein
OOD
AAML
55
183
0
28 Jun 2018
Residual Connections Encourage Iterative Inference
Stanislaw Jastrzebski
Devansh Arpit
Nicolas Ballas
Vikas Verma
Tong Che
Yoshua Bengio
95
156
0
13 Oct 2017
Deep Residual Learning for Image Recognition
Kaiming He
Xiangyu Zhang
Shaoqing Ren
Jian Sun
MedIm
2.5K
195,053
0
10 Dec 2015
Understanding image representations by measuring their equivariance and equivalence
Karel Lenc
Andrea Vedaldi
SSL
FAtt
155
538
0
21 Nov 2014
Distributed Representations of Words and Phrases and their Compositionality
Tomas Mikolov
Ilya Sutskever
Kai Chen
G. Corrado
J. Dean
NAI
OCL
429
33,605
0
16 Oct 2013
Exploiting Similarities among Languages for Machine Translation
Tomas Mikolov
Quoc V. Le
Ilya Sutskever
111
1,597
0
17 Sep 2013