
The Stochastic Parrot on LLM's Shoulder: A Summative Assessment of Physical Concept Understanding

13 February 2025
Mo Yu
Lemao Liu
J. Wu
Tsz Ting Chung
Shunchi Zhang
JiangNan Li
Dit-Yan Yeung
Jie Zhou
Abstract

We systematically investigate a widely asked question: do LLMs really understand what they say? This question relates to the more familiar term "stochastic parrot". To this end, we propose a summative assessment over a carefully designed physical concept understanding task, PhysiCo. Our task alleviates the memorization issue by using grid-format inputs that abstractly describe physical phenomena. The grids represent varying levels of understanding, from the core phenomenon and application examples to analogies with other abstract patterns in the grid world. A comprehensive study on our task demonstrates that: (1) state-of-the-art LLMs, including GPT-4o, o1, and Gemini 2.0 Flash Thinking, lag behind humans by ~40%; (2) the stochastic parrot phenomenon is present in LLMs, as they fail on our grid task but can describe and recognize the same concepts well in natural language; (3) our task challenges LLMs because of its intrinsic difficulty rather than the unfamiliar grid format, as in-context learning and fine-tuning on data in the same format added little to their performance.

@article{yu2025_2502.08946,
  title={The Stochastic Parrot on LLM's Shoulder: A Summative Assessment of Physical Concept Understanding},
  author={Mo Yu and Lemao Liu and Junjie Wu and Tsz Ting Chung and Shunchi Zhang and Jiangnan Li and Dit-Yan Yeung and Jie Zhou},
  journal={arXiv preprint arXiv:2502.08946},
  year={2025}
}