Structured, flexible, and robust: benchmarking and improving large
language models towards more human-like behavior in out-of-distribution
reasoning tasks

11 May 2022

Papers citing "Structured, flexible, and robust: benchmarking and improving large language models towards more human-like behavior in out-of-distribution reasoning tasks"

| Title | Authors | Tags | | | | Date |
|---|---|---|---|---|---|---|
| The Influence of Human-inspired Agentic Sophistication in LLM-driven Strategic Reasoners | Vince Trencsenyi, Agnieszka Mensfelt, Kostas Stathis | LRM | 26 | 0 | 0 | 14 May 2025 |
| Distinguishing AI-Generated and Human-Written Text Through Psycholinguistic Analysis | Chidimma Opara | DeLMO | 42 | 0 | 0 | 03 May 2025 |
| A Survey of AI Agent Protocols | Yuqing Yang, Huacan Chai, Yangqiu Song, S. Qi, Muning Wen, ..., Gaowei Chang, Wei Liu, Ying Wen, Yong Yu, Wenbo Zhang | LLMAG | 69 | 1 | 0 | 23 Apr 2025 |
| Testing the limits of fine-tuning to improve reasoning in vision language models | Luca M. Schulze Buschoff, Konstantinos Voudouris, Elif Akata, Matthias Bethge, Joshua B. Tenenbaum, Eric Schulz | LRM, VLM (Presented at ResearchTrend Connect \| VLM on 14 Mar 2025) | 126 | 0 | 1 | 24 Feb 2025 |
| Large Language Models and Cognitive Science: A Comprehensive Review of Similarities, Differences, and Challenges | Qian Niu, Junyu Liu, Ziqian Bi, Pohsun Feng, Benji Peng, ..., Ming Li, Lawrence KQ Yan, Yichao Zhang, Caitlyn Heqi Yin, Cheng Fei | | 42 | 15 | 0 | 04 Sep 2024 |
| People use fast, goal-directed simulation to reason about novel games | Cedegao E. Zhang, Katherine M. Collins, L. Wong, Adrian Weller, Joshua B. Tenenbaum | LRM | 37 | 0 | 0 | 19 Jul 2024 |
| PDDLEGO: Iterative Planning in Textual Environments | Li Zhang, Peter Alexander Jansen, Tianyi Zhang, Peter Clark, Chris Callison-Burch, Niket Tandon | LM&Ro | 39 | 6 | 0 | 30 May 2024 |
| Self-Demos: Eliciting Out-of-Demonstration Generalizability in Large Language Models | Wei He, Shichun Liu, Jun Zhao, Yiwen Ding, Yi Lu, Zhiheng Xi, Tao Gui, Qi Zhang, Xuanjing Huang | | 42 | 1 | 0 | 01 Apr 2024 |
| MacGyver: Are Large Language Models Creative Problem Solvers? | Yufei Tian, Abhilasha Ravichander, Lianhui Qin, Ronan Le Bras, Raja Marjieh, Nanyun Peng, Yejin Choi, Thomas L. Griffiths, Faeze Brahman | AI4CE, LLMAG | 15 | 11 | 0 | 16 Nov 2023 |
| Human-Centered Planning | Yuliang Li, Nitin Kamra, Ruta Desai, A. Halevy | | 23 | 1 | 0 | 08 Nov 2023 |
| Distilling Script Knowledge from Large Language Models for Constrained Language Planning | Siyu Yuan, Jiangjie Chen, Ziquan Fu, Xuyang Ge, Soham Shah, C. R. Jankowski, Yanghua Xiao, Deqing Yang | | 43 | 47 | 0 | 09 May 2023 |
| Dissociating language and thought in large language models | Kyle Mahowald, Anna A. Ivanova, I. Blank, Nancy Kanwisher, J. Tenenbaum, Evelina Fedorenko | ELM, ReLM | 29 | 209 | 0 | 16 Jan 2023 |
| Using cognitive psychology to understand GPT-3 | Marcel Binz, Eric Schulz | ELM, LLMAG | 250 | 440 | 0 | 21 Jun 2022 |
| Skill Induction and Planning with Latent Language | Pratyusha Sharma, Antonio Torralba, Jacob Andreas | LM&Ro | 202 | 108 | 0 | 04 Oct 2021 |