ResearchTrend.AI
Implicit Representations of Meaning in Neural Language Models

1 June 2021
Belinda Z. Li
Maxwell Nye
Jacob Andreas
NAI MILM
ArXiv (abs) | PDF | HTML | GitHub (54★)

Papers citing "Implicit Representations of Meaning in Neural Language Models"

50 / 122 papers shown
Don't throw the baby out with the bathwater: How and why deep learning for ARC
Jack Cole
Mohamed Osman
LRM
45
0
0
17 Jun 2025
Large Language Models Do Multi-Label Classification Differently
Marcus Ma
Georgios Chochlakis
Niyantha Maruthu Pandiyan
Jesse Thomason
Shrikanth Narayanan
108
1
0
23 May 2025
Language Models use Lookbacks to Track Beliefs
Nikhil Prakash
Natalie Shapira
Arnab Sen Sharma
Christoph Riedl
Yonatan Belinkov
Tamar Rott Shaham
David Bau
Atticus Geiger
KELM
82
1
0
20 May 2025
Exploring How LLMs Capture and Represent Domain-Specific Knowledge
Mirian Hipolito Garcia
Camille Couturier
Daniel Madrigal Diaz
Ankur Mallick
Anastasios Kyrillidis
Robert Sim
Victor Rühle
Saravan Rajmohan
75
1
0
23 Apr 2025
Revisiting the Othello World Model Hypothesis
Yifei Yuan
Anders Søgaard
LRM
97
0
0
06 Mar 2025
Towards Understanding Distilled Reasoning Models: A Representational Approach
David D. Baek
Max Tegmark
LRM
116
6
0
05 Mar 2025
(How) Do Language Models Track State?
Belinda Z. Li
Zifan Carl Guo
Jacob Andreas
LRM
115
3
0
04 Mar 2025
Grandes modelos de lenguaje: de la predicción de palabras a la comprensión? [Large language models: from word prediction to understanding?]
Carlos Gómez-Rodríguez
SyDa AILaw ELM VLM
267
0
0
25 Feb 2025
From Text to Space: Mapping Abstract Spatial Models in LLMs during a Grid-World Navigation Task
Nicolas Martorell
LLMAG
136
2
0
23 Feb 2025
Abstraction Alignment: Comparing Model-Learned and Human-Encoded Conceptual Relationships
Angie Boggust
Hyemin Bang
Hendrik Strobelt
Arvindmani Satyanarayan
108
1
0
17 Feb 2025
MET-Bench: Multimodal Entity Tracking for Evaluating the Limitations of Vision-Language and Reasoning Models
Vanya Cohen
Raymond J. Mooney
114
0
0
15 Feb 2025
Mechanistic Interpretability of Emotion Inference in Large Language Models
Ala Nekouvaght Tak
Amin Banayeeanzade
Anahita Bolourani
Mina Kian
Robin Jia
Jonathan Gratch
110
0
0
08 Feb 2025
Harmonic Loss Trains Interpretable AI Models
David D. Baek
Ziming Liu
Riya Tyagi
Max Tegmark
159
2
0
03 Feb 2025
Emergent Stack Representations in Modeling Counter Languages Using Transformers
Utkarsh Tiwari
Aviral Gupta
Michael Hahn
502
0
0
03 Feb 2025
ICLR: In-Context Learning of Representations
Core Francisco Park
Andrew Lee
Ekdeep Singh Lubana
Yongyi Yang
Maya Okawa
Kento Nishi
Martin Wattenberg
Hidenori Tanaka
AIFin
252
6
0
29 Dec 2024
The Semantic Hub Hypothesis: Language Models Share Semantic Representations Across Languages and Modalities
Zhaofeng Wu
Xinyan Velocity Yu
Dani Yogatama
Jiasen Lu
Yoon Kim
AIFin
163
22
0
07 Nov 2024
Generative linguistics contribution to artificial intelligence: Where this contribution lies?
Mohammed Q. Shormani
AI4CE
63
1
0
26 Oct 2024
Automatic Mapping of Anatomical Landmarks from Free-Text Using Large Language Models: Insights from Llama-2
Mohamad Abdi
Gerardo Hermosillo Valadez
H. Yerebakan
MedIm
61
0
0
16 Oct 2024
Systems with Switching Causal Relations: A Meta-Causal Perspective
Moritz Willig
Tim Nelson Tobiasch
Florian Peter Busch
Jonas Seng
Devendra Singh Dhami
Kristian Kersting
CML
152
0
0
16 Oct 2024
Exploring Natural Language-Based Strategies for Efficient Number Learning in Children through Reinforcement Learning
Tirthankar Mittra
48
0
0
10 Oct 2024
Generalization from Starvation: Hints of Universality in LLM Knowledge Graph Learning
David D. Baek
Yuxiao Li
Max Tegmark
76
2
0
10 Oct 2024
Chip-Tuning: Classify Before Language Models Say
Fangwei Zhu
Dian Li
Jiajun Huang
Gang Liu
Hui Wang
Zhifang Sui
62
0
0
09 Oct 2024
Chain and Causal Attention for Efficient Entity Tracking
Erwan Fagnou
Paul Caillon
Blaise Delattre
Alexandre Allauzen
95
5
0
07 Oct 2024
Counterfactual Token Generation in Large Language Models
Ivi Chatzi
N. C. Benz
Eleni Straitouri
Stratis Tsirtsis
Manuel Gomez Rodriguez
LRM
119
5
0
25 Sep 2024
Perception-guided Jailbreak against Text-to-Image Models
Yihao Huang
Le Liang
Tianlin Li
Xiaojun Jia
Run Wang
Weikai Miao
G. Pu
Yang Liu
127
11
0
20 Aug 2024
Understanding Generative AI Content with Embedding Models
Max Vargas
Reilly Cannon
A. Engel
Anand D. Sarwate
Tony Chiang
224
3
0
19 Aug 2024
Latent Causal Probing: A Formal Perspective on Probing with Causal Models of Data
Charles Jin
Martin Rinard
88
1
0
18 Jul 2024
States Hidden in Hidden States: LLMs Emerge Discrete State Representations Implicitly
Junhao Chen
Shengding Hu
Zhiyuan Liu
Maosong Sun
LRM
84
5
0
16 Jul 2024
Monitoring Latent World States in Language Models with Propositional Probes
Jiahai Feng
Stuart Russell
Jacob Steinhardt
HILM
89
14
0
27 Jun 2024
Does GPT Really Get It? A Hierarchical Scale to Quantify Human vs AI's Understanding of Algorithms
Mirabel Reid
Santosh Vempala
ELM
93
0
0
20 Jun 2024
Estimating Knowledge in Large Language Models Without Generating a Single Token
Daniela Gottesman
Mor Geva
97
14
0
18 Jun 2024
Refusal in Language Models Is Mediated by a Single Direction
Andy Arditi
Oscar Obeso
Aaquib Syed
Daniel Paleka
Nina Panickssery
Wes Gurnee
Neel Nanda
171
218
0
17 Jun 2024
A Notion of Complexity for Theory of Mind via Discrete World Models
X. A. Huang
Emanuele La Malfa
Samuele Marro
Andrea Asperti
Anthony Cohn
Michael Wooldridge
95
8
0
16 Jun 2024
What Should Embeddings Embed? Autoregressive Models Represent Latent Generating Distributions
Liyi Zhang
Michael Y. Li
Thomas Griffiths
77
3
0
06 Jun 2024
Evaluating the World Model Implicit in a Generative Model
Keyon Vafa
Justin Y. Chen
Jon M. Kleinberg
S. Mullainathan
Ashesh Rambachan
166
41
0
06 Jun 2024
InversionView: A General-Purpose Method for Reading Information from Neural Activations
Xinting Huang
Madhur Panwar
Navin Goyal
Michael Hahn
101
5
0
27 May 2024
Implicit In-context Learning
Zhuowei Li
Zihao Xu
Ligong Han
Yunhe Gao
Song Wen
Di Liu
Hao Wang
Dimitris N. Metaxas
149
3
0
23 May 2024
Simulating Policy Impacts: Developing a Generative Scenario Writing Method to Evaluate the Perceived Effects of Regulation
Julia Barnett
Kimon Kieslich
Nicholas Diakopoulos
57
5
0
15 May 2024
Towards Principled Evaluations of Sparse Autoencoders for Interpretability and Control
Aleksandar Makelov
Georg Lange
Neel Nanda
79
41
0
14 May 2024
A Philosophical Introduction to Language Models - Part II: The Way Forward
Raphael Milliere
Cameron Buckner
LRM
124
15
0
06 May 2024
Mechanistic Interpretability for AI Safety -- A Review
Leonard Bereska
E. Gavves
AI4CE
139
158
0
22 Apr 2024
SelfIE: Self-Interpretation of Large Language Model Embeddings
Haozhe Chen
Carl Vondrick
Chengzhi Mao
67
27
0
16 Mar 2024
Towards a theory of model distillation
Enric Boix-Adserà
FedML VLM
80
8
0
14 Mar 2024
Towards Tracing Trustworthiness Dynamics: Revisiting Pre-training Period of Large Language Models
Chao Qian
Jie Zhang
Wei Yao
Dongrui Liu
Zhen-fei Yin
Yu Qiao
Yong Liu
Jing Shao
LLMSV LRM
98
12
0
29 Feb 2024
RAVEL: Evaluating Interpretability Methods on Disentangling Language Model Representations
Jing-ling Huang
Zhengxuan Wu
Christopher Potts
Mor Geva
Atticus Geiger
130
35
0
27 Feb 2024
What Do Language Models Hear? Probing for Auditory Representations in Language Models
Jerry Ngo
Yoon Kim
AuLLM MILM
66
8
0
26 Feb 2024
Fine-Tuning Enhances Existing Mechanisms: A Case Study on Entity Tracking
Nikhil Prakash
Tamar Rott Shaham
Tal Haklay
Yonatan Belinkov
David Bau
99
67
0
22 Feb 2024
On the Tip of the Tongue: Analyzing Conceptual Representation in Large Language Models with Reverse-Dictionary Probe
Ningyu Xu
Qi Zhang
Menghan Zhang
Peng Qian
Xuanjing Huang
LRM
124
3
0
22 Feb 2024
Strong hallucinations from negation and how to fix them
Nicholas Asher
Swarnadeep Bhar
ReLM LRM
54
5
0
16 Feb 2024
LAVE: LLM-Powered Agent Assistance and Language Augmentation for Video Editing
Bryan Wang
Yuliang Li
Zhaoyang Lv
Haijun Xia
Yan Xu
Raj Sodhi
92
53
0
15 Feb 2024