Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2204.12130
Cited By
LM-Debugger: An Interactive Tool for Inspection and Intervention in Transformer-Based Language Models
26 April 2022
Mor Geva
Avi Caciularu
Guy Dar
Paul Roit
Shoval Sadde
Micah Shlain
Bar Tamir
Yoav Goldberg
KELM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"LM-Debugger: An Interactive Tool for Inspection and Intervention in Transformer-Based Language Models"
8 / 8 papers shown
Title
Safe LoRA: the Silver Lining of Reducing Safety Risks when Fine-tuning Large Language Models
Chia-Yi Hsu
Yu-Lin Tsai
Chih-Hsun Lin
Pin-Yu Chen
Chia-Mu Yu
Chun-ying Huang
49
34
0
27 May 2024
Patchscopes: A Unifying Framework for Inspecting Hidden Representations of Language Models
Asma Ghandeharioun
Avi Caciularu
Adam Pearce
Lucas Dixon
Mor Geva
34
88
0
11 Jan 2024
Compositional Capabilities of Autoregressive Transformers: A Study on Synthetic, Interpretable Tasks
Rahul Ramesh
Ekdeep Singh Lubana
Mikail Khona
Robert P. Dick
Hidenori Tanaka
CoGe
39
7
0
21 Nov 2023
Towards Learning and Explaining Indirect Causal Effects in Neural Networks
Abbaavaram Gowtham Reddy
Saketh Bachu
Harsh Nilesh Pathak
Ben Godfrey
V. Balasubramanian
V. Varshaneya
Satya Narayanan Kar
CML
31
0
0
24 Mar 2023
Editing Models with Task Arithmetic
Gabriel Ilharco
Marco Tulio Ribeiro
Mitchell Wortsman
Suchin Gururangan
Ludwig Schmidt
Hannaneh Hajishirzi
Ali Farhadi
KELM
MoMe
MU
72
439
0
08 Dec 2022
Analyzing Transformers in Embedding Space
Guy Dar
Mor Geva
Ankit Gupta
Jonathan Berant
29
83
0
06 Sep 2022
An Interpretability Evaluation Benchmark for Pre-trained Language Models
Ya-Ming Shen
Lijie Wang
Ying-Cong Chen
Xinyan Xiao
Jing Liu
Hua Wu
37
4
0
28 Jul 2022
Putting Words in BERT's Mouth: Navigating Contextualized Vector Spaces with Pseudowords
Taelin Karidi
Yichu Zhou
Nathan Schneider
Omri Abend
Vivek Srikumar
86
13
0
23 Sep 2021
1