Rethinking Interpretability in the Era of Large Language Models


30 January 2024
Chandan Singh
J. Inala
Michel Galley
Rich Caruana
Jianfeng Gao
    LRM
    AI4CE

Papers citing "Rethinking Interpretability in the Era of Large Language Models"

50 / 61 papers shown
From Tokens to Thoughts: How LLMs and Humans Trade Compression for Meaning
Chen Shani
Dan Jurafsky
Yann LeCun
Ravid Shwartz-Ziv
143
0
0
21 May 2025
Visual Chronicles: Using Multimodal LLMs to Analyze Massive Collections of Images
Boyang Deng
Songyou Peng
Kyle Genova
Gordon Wetzstein
Noah Snavely
Leonidas Guibas
Thomas Funkhouser
HAI
363
0
0
11 Apr 2025
Linguistic Interpretability of Transformer-based Language Models: a systematic review
Miguel López-Otal
Jorge Gracia
Jordi Bernad
Carlos Bobed
Lucía Pitarch-Ballesteros
Emma Anglés-Herrero
VLM
86
1
0
09 Apr 2025
Dataset Featurization: Uncovering Natural Language Features through Unsupervised Data Reconstruction
Michal Bravansky
Vaclav Kubon
Suhas Hariharan
Robert Kirk
95
1
0
24 Feb 2025
TinyEmo: Scaling down Emotional Reasoning via Metric Projection
Cristian Gutierrez
LRM
167
0
0
17 Feb 2025
Deciphering Functions of Neurons in Vision-Language Models
Jiaqi Xu
Cuiling Lan
Xuejin Chen
Yan Lu
VLM
190
0
0
10 Feb 2025
Making Sense Of Distributed Representations With Activation Spectroscopy
Kyle Reing
Greg Ver Steeg
Aram Galstyan
61
0
0
28 Jan 2025
Interacting Large Language Model Agents. Interpretable Models and Social Learning
Adit Jain
Vikram Krishnamurthy
LLMAG
78
0
0
02 Nov 2024
On Behalf of the Stakeholders: Trends in NLP Model Interpretability in the Era of LLMs
Nitay Calderon
Roi Reichart
67
15
0
27 Jul 2024
Unveiling LLM Mechanisms Through Neural ODEs and Control Theory
Yukun Zhang
Qi Dong
67
0
0
23 Jun 2024
Next-Generation Database Interfaces: A Survey of LLM-based Text-to-SQL
Zijin Hong
Zheng Yuan
Qinggang Zhang
Hao Chen
Junnan Dong
Feiran Huang
Xiao Huang
123
70
0
12 Jun 2024
Metaheuristics and Large Language Models Join Forces: Toward an Integrated Optimization Approach
Camilo Chacón Sartori
Christian Blum
Filippo Bistaffa
Guillem Rodríguez Corominas
AIFin
86
4
0
28 May 2024
Perturbation-Restrained Sequential Model Editing
Junjie Ma
Hong Wang
Haoyang Xu
Zhen-Hua Ling
Jia-Chen Gu
KELM
118
10
0
27 May 2024
Genetic Programming for Explainable Manifold Learning
Ben Cravens
Andrew Lensen
Paula Maddigan
Bing Xue
74
1
0
21 Mar 2024
A Comprehensive Survey of Hallucination Mitigation Techniques in Large Language Models
S.M. Towhidul Islam Tonmoy
S. M. M. Zaman
Vinija Jain
Anku Rani
Vipula Rawte
Aman Chadha
Amitava Das
HILM
87
198
0
02 Jan 2024
How do Language Models Bind Entities in Context?
Jiahai Feng
Jacob Steinhardt
70
39
0
26 Oct 2023
Bridging the Human-AI Knowledge Gap: Concept Discovery and Transfer in AlphaZero
Lisa Schut
Nenad Tomašev
Tom McGrath
Demis Hassabis
Ulrich Paquet
Been Kim
46
35
0
25 Oct 2023
What Algorithms can Transformers Learn? A Study in Length Generalization
Hattie Zhou
Arwen Bradley
Etai Littwin
Noam Razin
Omid Saremi
Josh Susskind
Samy Bengio
Preetum Nakkiran
63
121
0
24 Oct 2023
Tree Prompting: Efficient Task Adaptation without Fine-Tuning
John X. Morris
Chandan Singh
Alexander M. Rush
Jianfeng Gao
Yuntian Deng
VLM
LRM
66
19
0
21 Oct 2023
Eliciting Human Preferences with Language Models
Belinda Z. Li
Alex Tamkin
Noah D. Goodman
Jacob Andreas
RALM
68
51
0
17 Oct 2023
Circuit Component Reuse Across Tasks in Transformer Language Models
Jack Merullo
Carsten Eickhoff
Ellie Pavlick
70
70
0
12 Oct 2023
Impact of Co-occurrence on Factual Knowledge of Large Language Models
Cheongwoong Kang
Jaesik Choi
KELM
59
17
0
12 Oct 2023
From Supervised to Generative: A Novel Paradigm for Tabular Deep Learning with Large Language Models
Xumeng Wen
Han Zhang
Shun Zheng
Wei Xu
Jiang Bian
LMTD
ALM
97
20
0
11 Oct 2023
Benchmarking and Improving Generator-Validator Consistency of Language Models
Xiang Lisa Li
Vaishnavi Shrivastava
Siyan Li
Tatsunori Hashimoto
Percy Liang
63
30
0
03 Oct 2023
Large Language Models for Automated Open-domain Scientific Hypotheses Discovery
Zonglin Yang
Xinya Du
Junxian Li
Jie Zheng
Soujanya Poria
Min Zhang
LRM
48
54
0
06 Sep 2023
Graph of Thoughts: Solving Elaborate Problems with Large Language Models
Maciej Besta
Nils Blach
Aleš Kubíček
Robert Gerstenberger
Michal Podstawski
...
Joanna Gajda
Tomasz Lehmann
H. Niewiadomski
Piotr Nyczyk
Torsten Hoefler
LRM
AI4CE
LM&Ro
104
662
0
18 Aug 2023
Do Models Explain Themselves? Counterfactual Simulatability of Natural Language Explanations
Yanda Chen
Ruiqi Zhong
Narutatsu Ri
Chen Zhao
He He
Jacob Steinhardt
Zhou Yu
Kathleen McKeown
LRM
52
51
0
17 Jul 2023
Measuring Faithfulness in Chain-of-Thought Reasoning
Tamera Lanham
Anna Chen
Ansh Radhakrishnan
Benoit Steiner
Carson E. Denison
...
Zac Hatfield-Dodds
Jared Kaplan
J. Brauner
Sam Bowman
Ethan Perez
ReLM
LRM
61
184
0
17 Jul 2023
Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena
Lianmin Zheng
Wei-Lin Chiang
Ying Sheng
Siyuan Zhuang
Zhanghao Wu
...
Dacheng Li
Eric Xing
Haotong Zhang
Joseph E. Gonzalez
Ion Stoica
ALM
OSLM
ELM
312
4,253
0
09 Jun 2023
Faithfulness Tests for Natural Language Explanations
Pepa Atanasova
Oana-Maria Camburu
Christina Lioma
Thomas Lukasiewicz
J. Simonsen
Isabelle Augenstein
FAtt
63
64
0
29 May 2023
MaNtLE: Model-agnostic Natural Language Explainer
Rakesh R Menon
Kerem Zaman
Shashank Srivastava
FAtt
LRM
51
2
0
22 May 2023
Explaining black box text modules in natural language with language models
Chandan Singh
Aliyah R. Hsu
Richard Antonello
Shailee Jain
Alexander G. Huth
Bin Yu
Jianfeng Gao
MILM
50
54
0
17 May 2023
Interpretability at Scale: Identifying Causal Mechanisms in Alpaca
Zhengxuan Wu
Atticus Geiger
Thomas Icard
Christopher Potts
Noah D. Goodman
MILM
70
92
0
15 May 2023
Goal Driven Discovery of Distributional Differences via Language Descriptions
Ruiqi Zhong
Peter Zhang
Steve Li
Jinwoo Ahn
Dan Klein
Jacob Steinhardt
77
51
0
28 Feb 2023
Large Language Models Struggle to Learn Long-Tail Knowledge
Nikhil Kandpal
H. Deng
Adam Roberts
Eric Wallace
Colin Raffel
RALM
KELM
112
414
0
15 Nov 2022
Measuring and Narrowing the Compositionality Gap in Language Models
Ofir Press
Muru Zhang
Sewon Min
Ludwig Schmidt
Noah A. Smith
M. Lewis
ReLM
KELM
LRM
143
617
0
07 Oct 2022
In-context Learning and Induction Heads
Catherine Olsson
Nelson Elhage
Neel Nanda
Nicholas Joseph
Nova Dassarma
...
Tom B. Brown
Jack Clark
Jared Kaplan
Sam McCandlish
C. Olah
305
510
0
24 Sep 2022
What Can Transformers Learn In-Context? A Case Study of Simple Function Classes
Shivam Garg
Dimitris Tsipras
Percy Liang
Gregory Valiant
116
504
0
01 Aug 2022
Scaling Laws and Interpretability of Learning from Repeated Data
Danny Hernandez
Tom B. Brown
Tom Conerly
Nova Dassarma
Dawn Drain
...
Catherine Olsson
Dario Amodei
Nicholas Joseph
Jared Kaplan
Sam McCandlish
58
114
0
21 May 2022
PaLM: Scaling Language Modeling with Pathways
Aakanksha Chowdhery
Sharan Narang
Jacob Devlin
Maarten Bosma
Gaurav Mishra
...
Kathy Meier-Hellstern
Douglas Eck
J. Dean
Slav Petrov
Noah Fiedel
PILM
LRM
416
6,202
0
05 Apr 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro
LRM
AI4CE
ReLM
738
9,267
0
28 Jan 2022
Natural Language Descriptions of Deep Visual Features
Evan Hernandez
Sarah Schwettmann
David Bau
Teona Bagashvili
Antonio Torralba
Jacob Andreas
MILM
288
120
0
26 Jan 2022
Knowledge Neurons in Pretrained Transformers
Damai Dai
Li Dong
Y. Hao
Zhifang Sui
Baobao Chang
Furu Wei
KELM
MU
79
449
0
18 Apr 2021
Interpretable Machine Learning: Fundamental Principles and 10 Grand Challenges
Cynthia Rudin
Chaofan Chen
Zhi Chen
Haiyang Huang
Lesia Semenova
Chudi Zhong
FaML
AI4CE
LRM
156
668
0
20 Mar 2021
Interpretation of NLP models through input marginalization
Siwon Kim
Jihun Yi
Eunji Kim
Sungroh Yoon
MILM
FAtt
72
60
0
27 Oct 2020
Does the Whole Exceed its Parts? The Effect of AI Explanations on Complementary Team Performance
Gagan Bansal
Tongshuang Wu
Joyce Zhou
Raymond Fok
Besmira Nushi
Ece Kamar
Marco Tulio Ribeiro
Daniel S. Weld
70
591
0
26 Jun 2020
REALM: Retrieval-Augmented Language Model Pre-Training
Kelvin Guu
Kenton Lee
Zora Tung
Panupong Pasupat
Ming-Wei Chang
RALM
103
2,090
0
10 Feb 2020
Attention is not not Explanation
Sarah Wiegreffe
Yuval Pinter
XAI
AAML
FAtt
92
908
0
13 Aug 2019
Incorporating Priors with Feature Attribution on Text Classification
Frederick Liu
Besim Avci
FAtt
FaML
74
120
0
19 Jun 2019
What Does BERT Look At? An Analysis of BERT's Attention
Kevin Clark
Urvashi Khandelwal
Omer Levy
Christopher D. Manning
MILM
209
1,592
0
11 Jun 2019