Dialectical language model evaluation: An initial appraisal of the commonsense spatial reasoning abilities of LLMs

22 April 2023
Anthony G. Cohn
Jose Hernandez-Orallo
    ELM
    ReLM
    LRM

Papers citing "Dialectical language model evaluation: An initial appraisal of the commonsense spatial reasoning abilities of LLMs"

14 / 14 papers shown
Geospatial Mechanistic Interpretability of Large Language Models
Stef De Sabbata
Stefano Mizzaro
Kevin Roitero
AI4CE
37
0
0
06 May 2025
Can Large Language Models Reason about the Region Connection Calculus?
Anthony G Cohn
Robert E Blackwell
LRM
71
2
0
29 Nov 2024
NewTerm: Benchmarking Real-Time New Terms for Large Language Models with Annual Updates
Hexuan Deng
Wenxiang Jiao
Xuebo Liu
Min Zhang
Zhaopeng Tu
47
3
0
28 Oct 2024
Reasoning Elicitation in Language Models via Counterfactual Feedback
Alihan Hüyük
Xinnuo Xu
Jacqueline Maasch
Aditya V. Nori
Javier González
ReLM
LRM
235
1
0
02 Oct 2024
Reframing Spatial Reasoning Evaluation in Language Models: A Real-World Simulation Benchmark for Qualitative Reasoning
Fangjun Li
David C. Hogg
Anthony G. Cohn
LRM
40
6
0
23 May 2024
Temporal Blind Spots in Large Language Models
Jonas Wallat
Adam Jatowt
Avishek Anand
43
3
0
22 Jan 2024
Advancing Spatial Reasoning in Large Language Models: An In-Depth Evaluation and Enhancement Using the StepGame Benchmark
Fangjun Li
David C. Hogg
Anthony G. Cohn
LRM
43
26
0
08 Jan 2024
The Earth is Flat because...: Investigating LLMs' Belief towards Misinformation via Persuasive Conversation
Rongwu Xu
Brian S. Lin
Shujian Yang
Tianqi Zhang
Weiyan Shi
Tianwei Zhang
Zhixuan Fang
Wei Xu
Han Qiu
52
51
0
14 Dec 2023
Exploring and Improving the Spatial Reasoning Abilities of Large Language Models
Manasi Sharma
LRM
22
8
0
02 Dec 2023
Language Models as a Service: Overview of a New Paradigm and its Challenges
Emanuele La Malfa
Aleksandar Petrov
Simon Frieder
Christoph Weinhuber
Ryan Burnell
Raza Nazar
Anthony Cohn
Nigel Shadbolt
Michael Wooldridge
ALM
ELM
35
3
0
28 Sep 2023
An Evaluation of ChatGPT-4's Qualitative Spatial Reasoning Capabilities in RCC-8
Anthony G Cohn
LRM
27
9
0
27 Sep 2023
The Neuro-Symbolic Inverse Planning Engine (NIPE): Modeling Probabilistic Social Inferences from Linguistic Inputs
Lance Ying
Katherine M. Collins
Megan Wei
Cedegao E. Zhang
Tan Zhi-Xuan
Adrian Weller
J. Tenenbaum
L. Wong
45
14
0
25 Jun 2023
Can ChatGPT Defend its Belief in Truth? Evaluating LLM Reasoning via Debate
Boshi Wang
Xiang Yue
Huan Sun
ELM
LRM
46
60
0
22 May 2023
ChatGPT is a Knowledgeable but Inexperienced Solver: An Investigation of Commonsense Problem in Large Language Models
Ning Bian
Xianpei Han
Le Sun
Hongyu Lin
Yaojie Lu
Ben He
Shanshan Jiang
Bin Dong
KELM
ELM
AI4MH
LRM
32
76
0
29 Mar 2023