2304.11164
Cited By
Dialectical language model evaluation: An initial appraisal of the commonsense spatial reasoning abilities of LLMs
22 April 2023
Anthony G. Cohn
Jose Hernandez-Orallo
ELM
ReLM
LRM
Papers citing
"Dialectical language model evaluation: An initial appraisal of the commonsense spatial reasoning abilities of LLMs"
14 / 14 papers shown
Title
Geospatial Mechanistic Interpretability of Large Language Models
Stef De Sabbata
Stefano Mizzaro
Kevin Roitero
AI4CE
37
0
0
06 May 2025
Can Large Language Models Reason about the Region Connection Calculus?
Anthony G Cohn
Robert E Blackwell
LRM
71
2
0
29 Nov 2024
NewTerm: Benchmarking Real-Time New Terms for Large Language Models with Annual Updates
Hexuan Deng
Wenxiang Jiao
Xuebo Liu
Min Zhang
Zhaopeng Tu
47
3
0
28 Oct 2024
Reasoning Elicitation in Language Models via Counterfactual Feedback
Alihan Hüyük
Xinnuo Xu
Jacqueline Maasch
Aditya V. Nori
Javier González
ReLM
LRM
235
1
0
02 Oct 2024
Reframing Spatial Reasoning Evaluation in Language Models: A Real-World Simulation Benchmark for Qualitative Reasoning
Fangjun Li
David C. Hogg
Anthony G. Cohn
LRM
40
6
0
23 May 2024
Temporal Blind Spots in Large Language Models
Jonas Wallat
Adam Jatowt
Avishek Anand
43
3
0
22 Jan 2024
Advancing Spatial Reasoning in Large Language Models: An In-Depth Evaluation and Enhancement Using the StepGame Benchmark
Fangjun Li
David C. Hogg
Anthony G. Cohn
LRM
43
26
0
08 Jan 2024
The Earth is Flat because...: Investigating LLMs' Belief towards Misinformation via Persuasive Conversation
Rongwu Xu
Brian S. Lin
Shujian Yang
Tianqi Zhang
Weiyan Shi
Tianwei Zhang
Zhixuan Fang
Wei Xu
Han Qiu
52
51
0
14 Dec 2023
Exploring and Improving the Spatial Reasoning Abilities of Large Language Models
Manasi Sharma
LRM
22
8
0
02 Dec 2023
Language Models as a Service: Overview of a New Paradigm and its Challenges
Emanuele La Malfa
Aleksandar Petrov
Simon Frieder
Christoph Weinhuber
Ryan Burnell
Raza Nazar
Anthony Cohn
Nigel Shadbolt
Michael Wooldridge
ALM
ELM
35
3
0
28 Sep 2023
An Evaluation of ChatGPT-4's Qualitative Spatial Reasoning Capabilities in RCC-8
Anthony G Cohn
LRM
27
9
0
27 Sep 2023
The Neuro-Symbolic Inverse Planning Engine (NIPE): Modeling Probabilistic Social Inferences from Linguistic Inputs
Lance Ying
Katherine M. Collins
Megan Wei
Cedegao E. Zhang
Tan Zhi-Xuan
Adrian Weller
J. Tenenbaum
L. Wong
45
14
0
25 Jun 2023
Can ChatGPT Defend its Belief in Truth? Evaluating LLM Reasoning via Debate
Boshi Wang
Xiang Yue
Huan Sun
ELM
LRM
46
60
0
22 May 2023
ChatGPT is a Knowledgeable but Inexperienced Solver: An Investigation of Commonsense Problem in Large Language Models
Ning Bian
Xianpei Han
Le Sun
Hongyu Lin
Yaojie Lu
Ben He
Shanshan Jiang
Bin Dong
KELM
ELM
AI4MH
LRM
32
76
0
29 Mar 2023