ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2310.16755
  4. Cited By
HI-TOM: A Benchmark for Evaluating Higher-Order Theory of Mind Reasoning
  in Large Language Models

HI-TOM: A Benchmark for Evaluating Higher-Order Theory of Mind Reasoning in Large Language Models

25 October 2023
Yinghui He
Yufan Wu
Yilin Jia
Rada Mihalcea
Yulong Chen
Naihao Deng
    LRM
    LLMAG
ArXivPDFHTML

Papers citing "HI-TOM: A Benchmark for Evaluating Higher-Order Theory of Mind Reasoning in Large Language Models"

22 / 22 papers shown
Title
$\texttt{DIAMONDs}$: A Dataset for $\mathbb{D}$ynamic $\mathbb{I}$nformation $\mathbb{A}$nd $\mathbb{M}$ental modeling $\mathbb{O}$f $\mathbb{N}$umeric $\mathbb{D}$iscussions
DIAMONDs\texttt{DIAMONDs}DIAMONDs: A Dataset for D\mathbb{D}Dynamic I\mathbb{I}Information A\mathbb{A}And M\mathbb{M}Mental modeling O\mathbb{O}Of N\mathbb{N}Numeric D\mathbb{D}Discussions
Sayontan Ghosh
Mahnaz Koupaee
Yash Kumar Lal
Pegah Alipoormolabashi
Mohammad Saqib Hasan
Jun Seok Kang
Niranjan Balasubramanian
5
0
0
19 May 2025
R^3-VQA: "Read the Room" by Video Social Reasoning
R^3-VQA: "Read the Room" by Video Social Reasoning
Lixing Niu
Jiapeng Li
Xingping Yu
Shu Wang
Ruining Feng
Bo Wu
Ping Wei
Yue Wang
Lifeng Fan
51
0
0
07 May 2025
Sentient Agent as a Judge: Evaluating Higher-Order Social Cognition in Large Language Models
Sentient Agent as a Judge: Evaluating Higher-Order Social Cognition in Large Language Models
Bang Zhang
Ruotian Ma
Qingxuan Jiang
Peisong Wang
Jiaqi Chen
...
Fanghua Ye
Jian Li
Yifan Yang
Zhaopeng Tu
Xiaolong Li
LLMAG
ELM
ALM
109
0
1
01 May 2025
AI Awareness
AI Awareness
Xianrui Li
Haoyuan Shi
Rongwu Xu
Wei Xu
59
0
0
25 Apr 2025
Assesing LLMs in Art Contexts: Critique Generation and Theory of Mind Evaluation
Assesing LLMs in Art Contexts: Critique Generation and Theory of Mind Evaluation
Takaya Arita
Wenxian Zheng
Reiji Suzuki
Fuminori Akiba
22
0
0
17 Apr 2025
EmoAgent: Assessing and Safeguarding Human-AI Interaction for Mental Health Safety
EmoAgent: Assessing and Safeguarding Human-AI Interaction for Mental Health Safety
Jiahao Qiu
Yinghui He
Xinzhe Juan
Yitong Wang
Yong Liu
Zixin Yao
Yue Wu
Xun Jiang
L. Yang
Mengdi Wang
AI4MH
73
0
0
13 Apr 2025
Sensitivity Meets Sparsity: The Impact of Extremely Sparse Parameter Patterns on Theory-of-Mind of Large Language Models
Sensitivity Meets Sparsity: The Impact of Extremely Sparse Parameter Patterns on Theory-of-Mind of Large Language Models
Yuheng Wu
Wentao Guo
Zirui Liu
Heng Ji
Zhaozhuo Xu
Denghui Zhang
33
0
0
05 Apr 2025
Do Theory of Mind Benchmarks Need Explicit Human-like Reasoning in Language Models?
Do Theory of Mind Benchmarks Need Explicit Human-like Reasoning in Language Models?
Yi-Long Lu
Chunhui Zhang
Jiajun Song
Lifeng Fan
Wei Wang
OffRL
53
0
0
02 Apr 2025
The Mind in the Machine: A Survey of Incorporating Psychological Theories in LLMs
The Mind in the Machine: A Survey of Incorporating Psychological Theories in LLMs
Zizhou Liu
Ziwei Gong
Lin Ai
Zheng Hui
Run Chen
Colin Wayne Leach
Michelle R. Greene
Julia Hirschberg
LLMAG
150
0
0
28 Mar 2025
Persuasion Should be Double-Blind: A Multi-Domain Dialogue Dataset With Faithfulness Based on Causal Theory of Mind
Persuasion Should be Double-Blind: A Multi-Domain Dialogue Dataset With Faithfulness Based on Causal Theory of Mind
Dingyi Zhang
Deyu Zhou
66
1
0
28 Feb 2025
Re-evaluating Theory of Mind evaluation in large language models
Re-evaluating Theory of Mind evaluation in large language models
Jennifer Hu
Felix Sosa
T. Ullman
45
0
0
28 Feb 2025
Hypothesis-Driven Theory-of-Mind Reasoning for Large Language Models
Hypothesis-Driven Theory-of-Mind Reasoning for Large Language Models
Hyunwoo Kim
Melanie Sclar
Tan Zhi-Xuan
Lance Ying
Sydney Levine
Yang Liu
Joshua B. Tenenbaum
Yejin Choi
LRM
LLMAG
56
0
0
17 Feb 2025
Large Language Models as Theory of Mind Aware Generative Agents with Counterfactual Reflection
Bo Yang
Jiaxian Guo
Yusuke Iwasawa
Y. Matsuo
AI4CE
41
1
0
28 Jan 2025
Belief in the Machine: Investigating Epistemological Blind Spots of
  Language Models
Belief in the Machine: Investigating Epistemological Blind Spots of Language Models
Mirac Suzgun
Tayfun Gur
Federico Bianchi
Daniel E. Ho
Thomas F. Icard
Dan Jurafsky
James Zou
31
1
0
28 Oct 2024
SimpleToM: Exposing the Gap between Explicit ToM Inference and Implicit
  ToM Application in LLMs
SimpleToM: Exposing the Gap between Explicit ToM Inference and Implicit ToM Application in LLMs
Yuling Gu
Oyvind Tafjord
Hyunwoo Kim
Jared Moore
Ronan Le Bras
Peter Clark
Yejin Choi
33
8
0
17 Oct 2024
EgoSocialArena: Benchmarking the Social Intelligence of Large Language Models from a First-person Perspective
EgoSocialArena: Benchmarking the Social Intelligence of Large Language Models from a First-person Perspective
Guiyang Hou
Wenqi Zhang
Yongliang Shen
Zeqi Tan
Sihao Shen
Weiming Lu
31
0
0
08 Oct 2024
LogicGame: Benchmarking Rule-Based Reasoning Abilities of Large Language
  Models
LogicGame: Benchmarking Rule-Based Reasoning Abilities of Large Language Models
Jiayi Gui
Yiming Liu
Jiale Cheng
Xiaotao Gu
Xiao-Yang Liu
Hongning Wang
Yuxiao Dong
Jie Tang
Minlie Huang
ELM
LLMAG
LRM
37
2
0
28 Aug 2024
Brittle Minds, Fixable Activations: Understanding Belief Representations in Language Models
Brittle Minds, Fixable Activations: Understanding Belief Representations in Language Models
Matteo Bortoletto
Constantin Ruhdorfer
Lei Shi
Andreas Bulling
AI4MH
LRM
46
5
0
25 Jun 2024
LLMs achieve adult human performance on higher-order theory of mind
  tasks
LLMs achieve adult human performance on higher-order theory of mind tasks
Winnie Street
John Oliver Siy
Geoff Keeling
Adrien Baranes
Benjamin Barnett
Michael McKibben
Tatenda Kanyere
Alison Lentz
Blaise Agüera y Arcas
Robin I. M. Dunbar
LRM
51
32
0
29 May 2024
Cognitive Insights and Stable Coalition Matching for Fostering Multi-Agent Cooperation
Cognitive Insights and Stable Coalition Matching for Fostering Multi-Agent Cooperation
Jiaqi Shao
Tianjun Yuan
Tao Lin
Xuanyu Cao
50
0
0
28 May 2024
Generative Agents: Interactive Simulacra of Human Behavior
Generative Agents: Interactive Simulacra of Human Behavior
J. Park
Joseph C. O'Brien
Carrie J. Cai
Meredith Ringel Morris
Percy Liang
Michael S. Bernstein
LM&Ro
AI4CE
232
1,742
0
07 Apr 2023
Sparks of Artificial General Intelligence: Early experiments with GPT-4
Sparks of Artificial General Intelligence: Early experiments with GPT-4
Sébastien Bubeck
Varun Chandrasekaran
Ronen Eldan
J. Gehrke
Eric Horvitz
...
Scott M. Lundberg
Harsha Nori
Hamid Palangi
Marco Tulio Ribeiro
Yi Zhang
ELM
AI4MH
AI4CE
ALM
301
2,232
0
22 Mar 2023
1