UniToMBench: Integrating Perspective-Taking to Improve Theory of Mind in LLMs

11 June 2025
Prameshwar Thiyagarajan
Vaishnavi Parimi
Shamant Sai
Soumil Garg
Zhangir Meirbek
Nitin Yarlagadda
Kevin Zhu
Chris Kim
Main: 6 pages · Bibliography: 1 page · Appendix: 6 pages · 6 tables
Abstract

Theory of Mind (ToM), the ability to understand the mental states of oneself and others, remains a challenging area for large language models (LLMs), which often fail to predict human mental states accurately. In this paper, we introduce UniToMBench, a unified benchmark that integrates the strengths of SimToM and ToMBench to systematically improve and assess ToM capabilities in LLMs by combining multi-interaction task designs with evolving story scenarios. Supported by a custom dataset of over 1,000 hand-written scenarios, UniToMBench combines perspective-taking techniques with diverse evaluation metrics to better stimulate social cognition in LLMs. Through evaluation, we observe that while models like GPT-4o and GPT-4o Mini show consistently high accuracy in tasks involving emotional and belief-related scenarios, with results usually above 80%, their performance varies significantly across knowledge-based tasks. These results highlight both the strengths and limitations of current LLMs in ToM-related tasks, underscoring the value of UniToMBench as a comprehensive tool for future development. Our code is publicly available here: this https URL.
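
The abstract describes combining perspective-taking prompting (in the spirit of SimToM) with ToM question answering. The sketch below illustrates the general two-stage idea, filtering a story to what a character observed before answering from that character's viewpoint. The model name, helper functions, and example scenario are illustrative assumptions, not the authors' released code or dataset.

```python
# Minimal sketch of SimToM-style perspective-taking prompting for a ToM question.
# All names below (ask, perspective_taking_answer, the example story) are
# hypothetical; they are not taken from the UniToMBench codebase.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment


def ask(prompt: str, model: str = "gpt-4o-mini") -> str:
    """Single-turn chat completion; returns the model's text reply."""
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    return resp.choices[0].message.content.strip()


def perspective_taking_answer(story: str, character: str, question: str) -> str:
    # Step 1: restrict the story to events the character actually observed.
    filtered = ask(
        f"Story:\n{story}\n\n"
        f"Rewrite the story, keeping only the events that {character} "
        f"directly witnessed or was told about."
    )
    # Step 2: answer the ToM question from the character's point of view.
    return ask(
        f"{filtered}\n\n"
        f"Answer from {character}'s perspective: {question}"
    )


if __name__ == "__main__":
    story = (
        "Sally puts her marble in the basket and leaves the room. "
        "While she is away, Anne moves the marble to the box."
    )
    print(perspective_taking_answer(
        story, "Sally", "Where will Sally look for her marble?"
    ))
```

Accuracy over a set of such scenarios (emotion-, belief-, or knowledge-based) can then be aggregated per task category, which is how the variability reported above would surface.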

@article{thiyagarajan2025_2506.09450,
  title={UniToMBench: Integrating Perspective-Taking to Improve Theory of Mind in LLMs},
  author={Prameshwar Thiyagarajan and Vaishnavi Parimi and Shamant Sai and Soumil Garg and Zhangir Meirbek and Nitin Yarlagadda and Kevin Zhu and Chris Kim},
  journal={arXiv preprint arXiv:2506.09450},
  year={2025}
}