Whether large language models (LLMs) process language similarly to humans has been the subject of much theoretical and practical debate. We examine this question through the lens of the production-interpretation distinction found in human sentence processing and evaluate the extent to which instruction-tuned LLMs replicate this distinction. Using an empirically documented asymmetry between production and interpretation in humans for implicit causality verbs as a testbed, we find that some LLMs do quantitatively and qualitatively reflect human-like asymmetries between production and interpretation. We demonstrate that whether this behavior holds depends upon both model size, with larger models more likely to reflect human-like patterns, and the choice of meta-linguistic prompts used to elicit the behavior.
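To make the production-interpretation contrast concrete, the sketch below shows one way such prompts could be constructed around an implicit causality verb. This is a minimal illustration only, not the paper's actual stimuli or prompt wording; the verb choice, prompt phrasing, and the query_llm helper are hypothetical placeholders.

# Illustrative sketch (not the paper's exact stimuli or prompts): probing an
# instruction-tuned LLM for a production-interpretation asymmetry with an
# implicit causality (IC) verb. All names and prompt wording are hypothetical.

def query_llm(prompt: str) -> str:
    """Placeholder for a call to an instruction-tuned LLM (e.g., an API client
    or a local text-generation pipeline); returns the model's text response."""
    raise NotImplementedError("Plug in your own LLM call here.")

# IC verbs such as "fascinated" typically bias human continuations of
# "... because ..." fragments toward one referent (here, the subject, Mary).
context = "Mary fascinated John because"

# Production-style prompt: the model generates a continuation, and we check
# which referent its pronoun choice picks out.
production_prompt = f"Complete the following sentence naturally:\n{context}"

# Interpretation-style (meta-linguistic) prompt: the model is asked directly
# who an ambiguous pronoun refers to in a given continuation.
interpretation_prompt = (
    f"{context} she was so charming.\n"
    "Question: Who does 'she' refer to, Mary or John? Answer with one name."
)

if __name__ == "__main__":
    for label, prompt in [("production", production_prompt),
                          ("interpretation", interpretation_prompt)]:
        print(f"--- {label} prompt ---\n{prompt}\n")
        # response = query_llm(prompt)  # uncomment once query_llm is implemented

Comparing the referents favored under the two prompt styles, across many verbs and prompt variants, is the kind of contrast the abstract describes; the specific analysis pipeline used in the paper is not shown here.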
@article{lam2025_2503.17579,
  title   = {Leveraging Human Production-Interpretation Asymmetries to Test LLM Cognitive Plausibility},
  author  = {Suet-Ying Lam and Qingcheng Zeng and Jingyi Wu and Rob Voigt},
  journal = {arXiv preprint arXiv:2503.17579},
  year    = {2025}
}