Theory of Mind May Have Spontaneously Emerged in Large Language Models

Proceedings of the National Academy of Sciences of the United States of America (PNAS), 2023

4 February 2023

Michal Kosinski



Main: 11 Pages

3 Figures

Abstract

Theory of mind (ToM), or the ability to impute unobservable mental states to others, is central to human social interactions, communication, empathy, self-consciousness, and morality. We administer classic false-belief tasks, widely used to test ToM in humans, to several language models, without any examples or pre-training. Models published before 2022 show virtually no ability to solve ToM tasks. Yet the January 2022 version of GPT-3 (davinci-002) solved 70% of ToM tasks, a performance comparable with that of seven-year-old children. Moreover, its November 2022 version (davinci-003) solved 93% of ToM tasks, a performance comparable with that of nine-year-old children. These findings suggest that ToM-like ability (thus far considered to be uniquely human) may have spontaneously emerged as a byproduct of language models' improving language skills.
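
As a rough illustration of what administering false-belief tasks without examples and reporting a percentage solved could look like, here is a minimal Python sketch. It is not the paper's code: the `FalseBeliefTask` structure, the `solves` and `percent_solved` helpers, the substring-based scoring rule, and the example story are all assumptions made for illustration, and `complete` stands in for whatever function returns a model's completion of a prompt.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class FalseBeliefTask:
    """Hypothetical container for one zero-shot false-belief item."""
    prompt: str          # story ending in an unfinished sentence about a belief
    belief_answer: str   # completion matching the protagonist's (false) belief
    reality_answer: str  # completion matching the true state of the world


def solves(task: FalseBeliefTask, complete: Callable[[str], str]) -> bool:
    """Assumed scoring rule: the completion names the believed state only."""
    text = complete(task.prompt).lower()
    return (task.belief_answer.lower() in text
            and task.reality_answer.lower() not in text)


def percent_solved(tasks: Sequence[FalseBeliefTask],
                   complete: Callable[[str], str]) -> float:
    """Share of tasks solved, as a percentage (0.0 for an empty task list)."""
    if not tasks:
        return 0.0
    return 100.0 * sum(solves(t, complete) for t in tasks) / len(tasks)


# Illustrative unexpected-contents item; the wording is an assumption.
EXAMPLE_TASK = FalseBeliefTask(
    prompt=("Here is a bag filled with popcorn. The label on the bag says "
            "'chocolate'. Sam finds the bag and reads the label. She has "
            "never seen the bag before and cannot see inside it. "
            "Sam believes that the bag is full of"),
    belief_answer="chocolate",
    reality_answer="popcorn",
)
```

Passing a real model's completion function as `complete` would give a per-model score; substring matching is a deliberately crude stand-in for whatever response coding an actual study would use.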

