ResearchTrend.AI

LM-Debugger: An Interactive Tool for Inspection and Intervention in Transformer-Based Language Models

26 April 2022
Mor Geva
Avi Caciularu
Guy Dar
Paul Roit
Shoval Sadde
Micah Shlain
Bar Tamir
Yoav Goldberg
    KELM

Papers citing "LM-Debugger: An Interactive Tool for Inspection and Intervention in Transformer-Based Language Models"

8 / 8 papers shown
Safe LoRA: the Silver Lining of Reducing Safety Risks when Fine-tuning Large Language Models
Chia-Yi Hsu
Yu-Lin Tsai
Chih-Hsun Lin
Pin-Yu Chen
Chia-Mu Yu
Chun-ying Huang
49
34
0
27 May 2024
Patchscopes: A Unifying Framework for Inspecting Hidden Representations of Language Models
Asma Ghandeharioun
Avi Caciularu
Adam Pearce
Lucas Dixon
Mor Geva
34
88
0
11 Jan 2024
Compositional Capabilities of Autoregressive Transformers: A Study on Synthetic, Interpretable Tasks
Rahul Ramesh
Ekdeep Singh Lubana
Mikail Khona
Robert P. Dick
Hidenori Tanaka
CoGe
39
7
0
21 Nov 2023
Towards Learning and Explaining Indirect Causal Effects in Neural Networks
Abbaavaram Gowtham Reddy
Saketh Bachu
Harsh Nilesh Pathak
Ben Godfrey
V. Balasubramanian
V. Varshaneya
Satya Narayanan Kar
CML
31
0
0
24 Mar 2023
Editing Models with Task Arithmetic
Gabriel Ilharco
Marco Tulio Ribeiro
Mitchell Wortsman
Suchin Gururangan
Ludwig Schmidt
Hannaneh Hajishirzi
Ali Farhadi
KELM
MoMe
MU
72
439
0
08 Dec 2022
Analyzing Transformers in Embedding Space
Guy Dar
Mor Geva
Ankit Gupta
Jonathan Berant
29
83
0
06 Sep 2022
An Interpretability Evaluation Benchmark for Pre-trained Language Models
Ya-Ming Shen
Lijie Wang
Ying-Cong Chen
Xinyan Xiao
Jing Liu
Hua Wu
37
4
0
28 Jul 2022
Putting Words in BERT's Mouth: Navigating Contextualized Vector Spaces with Pseudowords
Taelin Karidi
Yichu Zhou
Nathan Schneider
Omri Abend
Vivek Srikumar
86
13
0
23 Sep 2021