ResearchTrend.AI
FANToM: A Benchmark for Stress-testing Machine Theory of Mind in Interactions

24 October 2023
Hyunwoo J. Kim
Melanie Sclar
Xuhui Zhou
Ronan Le Bras
Gunhee Kim
Yejin Choi
Maarten Sap
    LLMAG

Papers citing "FANToM: A Benchmark for Stress-testing Machine Theory of Mind in Interactions"

50 / 63 papers shown
DIAMONDs: A Dataset for Dynamic Information And Mental modeling Of Numeric Discussions
Sayontan Ghosh
Mahnaz Koupaee
Yash Kumar Lal
Pegah Alipoormolabashi
Mohammad Saqib Hasan
Jun Seok Kang
Niranjan Balasubramanian
9
0
0
19 May 2025
R^3-VQA: "Read the Room" by Video Social Reasoning
Lixing Niu
Jiapeng Li
Xingping Yu
Shu Wang
Ruining Feng
Bo Wu
Ping Wei
Yansen Wang
Lifeng Fan
51
0
0
07 May 2025
Sentient Agent as a Judge: Evaluating Higher-Order Social Cognition in Large Language Models
Bang Zhang
Ruotian Ma
Qingxuan Jiang
Peisong Wang
Jiaqi Chen
...
Fanghua Ye
Jian Li
Yifan Yang
Zhaopeng Tu
Xiaolong Li
LLMAG
ELM
ALM
111
0
1
01 May 2025
Rethinking Theory of Mind Benchmarks for LLMs: Towards A User-Centered Perspective
Qiaosi Wang
Xuhui Zhou
Maarten Sap
Jodi Forlizzi
Hong Shen
38
0
0
15 Apr 2025
Measurement of LLM's Philosophies of Human Nature
Minheng Ni
Ennan Wu
Zidong Gong
Zhengyuan Yang
Linjie Li
Chung-Ching Lin
Kevin Qinghong Lin
Lijuan Wang
Wangmeng Zuo
37
0
0
03 Apr 2025
The Mind in the Machine: A Survey of Incorporating Psychological Theories in LLMs
Zizhou Liu
Ziwei Gong
Lin Ai
Zheng Hui
Run Chen
Colin Wayne Leach
Michelle R. Greene
Julia Hirschberg
LLMAG
177
0
0
28 Mar 2025
Towards properly implementing Theory of Mind in AI systems: An account of four misconceptions
Ramira van der Meulen
Rineke Verbrugge
Max van Duijn
46
0
0
28 Feb 2025
Persuasion Should be Double-Blind: A Multi-Domain Dialogue Dataset With Faithfulness Based on Causal Theory of Mind
Dingyi Zhang
Deyu Zhou
69
1
0
28 Feb 2025
Re-evaluating Theory of Mind evaluation in large language models
Jennifer Hu
Felix Sosa
T. Ullman
45
0
0
28 Feb 2025
PersuasiveToM: A Benchmark for Evaluating Machine Theory of Mind in Persuasive Dialogues
Fangxu Yu
Lai Jiang
Shenyi Huang
Zhen Wu
Xinyu Dai
LLMAG
87
0
0
28 Feb 2025
Social Genome: Grounded Social Reasoning Abilities of Multimodal Models
Leena Mathur
Marian Qian
Paul Pu Liang
Louis-Philippe Morency
LRM
187
1
0
21 Feb 2025
Hypothesis-Driven Theory-of-Mind Reasoning for Large Language Models
Hyunwoo Kim
Melanie Sclar
Tan Zhi-Xuan
Lance Ying
Sydney Levine
Yang Liu
Joshua B. Tenenbaum
Yejin Choi
LRM
LLMAG
61
0
0
17 Feb 2025
A Survey of Theory of Mind in Large Language Models: Evaluations, Representations, and Safety Risks
Hieu Minh "Jord" Nguyen
LM&MA
LRM
56
0
0
10 Feb 2025
Large Language Models as Theory of Mind Aware Generative Agents with Counterfactual Reflection
Bo Yang
Jiaxian Guo
Yusuke Iwasawa
Y. Matsuo
AI4CE
41
1
0
28 Jan 2025
Lies, Damned Lies, and Distributional Language Statistics: Persuasion and Deception with Large Language Models
Cameron R. Jones
Benjamin Bergen
67
5
0
22 Dec 2024
Mind Your Theory: Theory of Mind Goes Deeper Than Reasoning
Eitan Wagner
Nitay Alon
J. Barnby
Omri Abend
LRM
85
2
0
18 Dec 2024
Multi-ToM: Evaluating Multilingual Theory of Mind Capabilities in Large Language Models
Jayanta Sadhu
Ayan Antik Khan
Noshin Nawal
Sanju Basak
Abhik Bhattacharjee
Rifat Shahriyar
74
0
0
24 Nov 2024
From Imitation to Introspection: Probing Self-Consciousness in Language Models
Sirui Chen
Shu Yu
Shengjie Zhao
Chaochao Lu
MILM
LRM
30
1
0
24 Oct 2024
BIG5-CHAT: Shaping LLM Personalities Through Training on Human-Grounded Data
Wenkai Li
Jiarui Liu
Andy Liu
Xuhui Zhou
Mona Diab
Maarten Sap
59
6
0
21 Oct 2024
SimpleToM: Exposing the Gap between Explicit ToM Inference and Implicit ToM Application in LLMs
Yuling Gu
Oyvind Tafjord
Hyunwoo Kim
Jared Moore
Ronan Le Bras
Peter Clark
Yejin Choi
33
8
0
17 Oct 2024
EgoSocialArena: Benchmarking the Social Intelligence of Large Language Models from a First-person Perspective
Guiyang Hou
Wenqi Zhang
Yongliang Shen
Zeqi Tan
Sihao Shen
Weiming Lu
31
0
0
08 Oct 2024
Constrained Reasoning Chains for Enhancing Theory-of-Mind in Large Language Models
Zizheng Lin
Chunkit Chan
Yangqiu Song
Xin Liu
LRM
26
1
0
20 Sep 2024
Multimodal Fusion with LLMs for Engagement Prediction in Natural Conversation
Cheng Charles Ma
Kevin Hyekang Joo
Alexandria K. Vail
Sunreeta Bhattacharya
Álvaro Fernández García
Kailana Baker-Matsuoka
Sheryl Mathew
Lori L. Holt
Fernando De la Torre
49
3
0
13 Sep 2024
Beyond Preferences in AI Alignment
Tan Zhi-Xuan
Micah Carroll
Matija Franklin
Hal Ashton
41
16
0
30 Aug 2024
PrivacyLens: Evaluating Privacy Norm Awareness of Language Models in Action
Yijia Shao
Tianshi Li
Weiyan Shi
Yanchen Liu
Diyi Yang
PILM
58
15
0
29 Aug 2024
CHARTOM: A Visual Theory-of-Mind Benchmark for Multimodal Large Language Models
S. Bharti
Shiyun Cheng
Jihyun Rho
Martina Rao
Mu Cai
Yong Jae Lee
Martina Rau
Xiaojin Zhu
45
1
0
26 Aug 2024
MuMA-ToM: Multi-modal Multi-Agent Theory of Mind
Haojun Shi
Suyu Ye
Xinyu Fang
Chuanyang Jin
Leyla Isik
Yen-Ling Kuo
Tianmin Shu
LLMAG
75
7
0
22 Aug 2024
Understanding Epistemic Language with a Language-augmented Bayesian Theory of Mind
Lance Ying
Tan Zhi-Xuan
Lionel Wong
Vikash K. Mansinghka
J. Tenenbaum
61
1
0
21 Aug 2024
Large Model Strategic Thinking, Small Model Efficiency: Transferring Theory of Mind in Large Language Models
Nunzio Lorè
Alireza Ilami
Babak Heydari
LRM
43
1
0
05 Aug 2024
Perceptions to Beliefs: Exploring Precursory Inferences for Theory of Mind in Large Language Models
Chani Jung
Dongkwan Kim
Jiho Jin
Jiseon Kim
Yeon Seonwoo
Yejin Choi
Alice H. Oh
Hyunwoo Kim
LRM
58
7
0
08 Jul 2024
Stark: Social Long-Term Multi-Modal Conversation with Persona Commonsense Knowledge
Young-Jun Lee
Dokyong Lee
Junyoung Youn
Kyeongjin Oh
ByungSoo Ko
Jonghwan Hyeon
Ho-Jin Choi
36
2
0
04 Jul 2024
TimeToM: Temporal Space is the Key to Unlocking the Door of Large Language Models' Theory-of-Mind
Guiyang Hou
Wenqi Zhang
Yongliang Shen
Linjuan Wu
Weiming Lu
LRM
AI4CE
38
7
0
01 Jul 2024
Brittle Minds, Fixable Activations: Understanding Belief Representations in Language Models
Matteo Bortoletto
Constantin Ruhdorfer
Lei Shi
Andreas Bulling
AI4MH
LRM
48
4
0
25 Jun 2024
Dissecting the Ullman Variations with a SCALPEL: Why do LLMs fail at Trivial Alterations to the False Belief Task?
Zhiqiang Pi
Annapurna Vadaparty
Benjamin K. Bergen
Cameron R. Jones
26
2
0
20 Jun 2024
A Notion of Complexity for Theory of Mind via Discrete World Models
X. A. Huang
Emanuele La Malfa
Samuele Marro
Andrea Asperti
Anthony Cohn
Michael Wooldridge
45
6
0
16 Jun 2024
Do LLMs Exhibit Human-Like Reasoning? Evaluating Theory of Mind in LLMs for Open-Ended Responses
Maryam Amirizaniani
Elias Martin
Maryna Sivachenko
A. Mashhadi
Chirag Shah
LRM
45
12
0
09 Jun 2024
The BiGGen Bench: A Principled Benchmark for Fine-grained Evaluation of Language Models with Language Models
Seungone Kim
Juyoung Suk
Ji Yong Cho
Shayne Longpre
Chaeeun Kim
...
Sean Welleck
Graham Neubig
Moontae Lee
Kyungjae Lee
Minjoon Seo
ELM
ALM
LM&MA
105
31
0
09 Jun 2024
Zero, Finite, and Infinite Belief History of Theory of Mind Reasoning in Large Language Models
Weizhi Tang
Vaishak Belle
LLMAG
LRM
27
1
0
07 Jun 2024
TimeChara: Evaluating Point-in-Time Character Hallucination of Role-Playing Large Language Models
Jaewoo Ahn
Taehyun Lee
Junyoung Lim
Jin-Hwa Kim
Sangdoo Yun
Hwaran Lee
Gunhee Kim
LLMAG
HILM
37
12
0
28 May 2024
Is the Pope Catholic? Yes, the Pope is Catholic. Generative Evaluation of Non-Literal Intent Resolution in LLMs
Akhila Yerukola
Saujas Vaduguru
Daniel Fried
Maarten Sap
37
1
0
14 May 2024
Poser: Unmasking Alignment Faking LLMs by Manipulating Their Internals
Joshua Clymer
Caden Juang
Severin Field
CVBM
34
2
0
08 May 2024
From Persona to Personalization: A Survey on Role-Playing Language Agents
Jiangjie Chen
Xintao Wang
Rui Xu
Siyu Yuan
Yikai Zhang
...
Caiyu Hu
Siye Wu
Scott Ren
Ziquan Fu
Yanghua Xiao
62
79
0
28 Apr 2024
Text-Tuple-Table: Towards Information Integration in Text-to-Table Generation via Global Tuple Extraction
Zheye Deng
Chunkit Chan
Weiqi Wang
Yuxi Sun
Wei Fan
Tianshi Zheng
Yauwai Yim
Yangqiu Song
LMTD
RALM
43
10
0
22 Apr 2024
NegotiationToM: A Benchmark for Stress-testing Machine Theory of Mind on Negotiation Surrounding
Chunkit Chan
Cheng Jiayang
Yauwai Yim
Zheye Deng
Wei Fan
Haoran Li
Xin Liu
Hongming Zhang
Weiqi Wang
Yangqiu Song
LLMAG
35
22
0
21 Apr 2024
Advancing Social Intelligence in AI Agents: Technical Challenges and Open Questions
Leena Mathur
Paul Pu Liang
Louis-Philippe Morency
LLMAG
38
7
0
17 Apr 2024
PRobELM: Plausibility Ranking Evaluation for Language Models
Moy Yuan
Chenxi Whitehouse
Eric Chamoun
Rami Aly
Andreas Vlachos
91
4
0
04 Apr 2024
Is this the real life? Is this just fantasy? The Misleading Success of Simulating Social Interactions With LLMs
Xuhui Zhou
Zhe Su
Tiwalayo Eisape
Hyunwoo J. Kim
Maarten Sap
34
38
0
08 Mar 2024
Views Are My Own, but Also Yours: Benchmarking Theory of Mind Using Common Ground
Adil Soubki
John Murzaku
Arash Yousefi Jordehi
Peter Zeng
Magdalena Markowska
Seyed Abolghasem Mirroshandel
Owen Rambow
VLM
23
7
0
04 Mar 2024
PHAnToM: Personality Has An Effect on Theory-of-Mind Reasoning in Large Language Models
Fiona Anting Tan
G. Yeo
Fanyou Wu
Weijie Xu
Vinija Jain
Aman Chadha
Kokil Jaidka
Yang Liu
See-Kiong Ng
LRM
41
5
0
04 Mar 2024
ToMBench: Benchmarking Theory of Mind in Large Language Models
Zhuang Chen
Jincenzi Wu
Jinfeng Zhou
Bosi Wen
Guanqun Bi
...
Yaru Cao
Mengting Hu
Yunghwei Lai
Zexuan Xiong
Minlie Huang
43
12
0
23 Feb 2024