2306.09841
Are Large Language Models Really Good Logical Reasoners? A Comprehensive Evaluation and Beyond
16 June 2023
Fangzhi Xu
Qika Lin
Jiawei Han
Tianzhe Zhao
Jun Liu
Erik Cambria
ELM
LRM
Papers citing
"Are Large Language Models Really Good Logical Reasoners? A Comprehensive Evaluation and Beyond"
24 / 24 papers shown
Title
Enigme: Generative Text Puzzles for Evaluating Reasoning in Language Models
John Hawkins
ReLM
LRM
57
0
0
08 May 2025
LogicTree: Structured Proof Exploration for Coherent and Rigorous Logical Reasoning with Large Language Models
Kang He
Kaushik Roy
LRM
29
0
0
18 Apr 2025
Reasoning Towards Fairness: Mitigating Bias in Language Models through Reasoning-Guided Fine-Tuning
Sanchit Kabra
Akshita Jha
Chandan K. Reddy
LRM
28
0
0
08 Apr 2025
MARS: A Multi-Agent Framework Incorporating Socratic Guidance for Automated Prompt Optimization
Jian Zhang
Zhilin Wang
Haiping Zhu
Jun Liu
Qika Lin
Erik Cambria
LLMAG
81
1
0
21 Mar 2025
MAPS: A Multi-Agent Framework Based on Big Seven Personality and Socratic Guidance for Multimodal Scientific Problem Solving
Jian Zhang
Zhiyuan Wang
Zhilin Wang
Xinyu Zhang
Fangzhi Xu
Qika Lin
Rui Mao
Erik Cambria
Jun Liu
LLMAG
56
1
0
21 Mar 2025
GKG-LLM: A Unified Framework for Generalized Knowledge Graph Construction
Jian Zhang
Bifan Wei
Shihao Qi
Haiping Zhu
Jun Liu
Qika Lin
47
0
0
14 Mar 2025
Uncertainty in Action: Confidence Elicitation in Embodied Agents
Tianjiao Yu
Vedant Shah
Muntasir Wahed
Kiet A. Nguyen
Adheesh Sunil Juvekar
Tal August
Ismini Lourentzou
51
0
0
13 Mar 2025
PhysReason: A Comprehensive Benchmark towards Physics-Based Reasoning
X. Zhang
Yuxuan Dong
Yongpeng Wu
Jiaxing Huang
Chengyou Jia
Basura Fernando
Mike Zheng Shou
L. Zhang
Jun Liu
AIMat
ReLM
LRM
53
2
0
17 Feb 2025
A Survey of Large Language Models for Healthcare: from Data, Technology, and Applications to Accountability and Ethics
Kai He
Rui Mao
Qika Lin
Yucheng Ruan
Xiang Lan
Mengling Feng
Erik Cambria
LM&MA
AILaw
93
154
0
28 Jan 2025
Watson: A Cognitive Observability Framework for the Reasoning of LLM-Powered Agents
Benjamin Rombaut
Sogol Masoumzadeh
Kirill Vasilevski
Dayi Lin
Ahmed E. Hassan
LRM
33
0
0
05 Nov 2024
Rulebreakers Challenge: Revealing a Blind Spot in Large Language Models' Reasoning with Formal Logic
Jason Chan
Robert Gaizauskas
Zhixue Zhao
ELM
AAML
LRM
35
0
0
21 Oct 2024
Critical Questions Generation: Motivation and Challenges
Blanca Calvo Figueras
Rodrigo Agerri
18
1
0
18 Oct 2024
Boosting Logical Fallacy Reasoning in LLMs via Logical Structure Tree
Yuanyuan Lei
Ruihong Huang
21
1
0
15 Oct 2024
Automated Theorem Provers Help Improve Large Language Model Reasoning
Lachlan McGinness
Peter Baumgartner
LRM
38
4
0
07 Aug 2024
Steamroller Problems: An Evaluation of LLM Reasoning Capability with Automated Theorem Prover Strategies
Lachlan McGinness
Peter Baumgartner
LRM
23
0
0
17 Jul 2024
Accuracy of a Large Language Model in Distinguishing Anti- And Pro-vaccination Messages on Social Media: The Case of Human Papillomavirus Vaccination
Soojong Kim
Kwanho Kim
Claire Wonjeong Jo
LM&MA
19
6
0
10 Apr 2024
Conditional and Modal Reasoning in Large Language Models
Wesley H. Holliday
M. Mandelkern
Cedegao E. Zhang
LRM
29
5
0
30 Jan 2024
BHASA: A Holistic Southeast Asian Linguistic and Cultural Evaluation Suite for Large Language Models
Wei Qi Leong
Jian Gang Ngui
Yosephine Susanto
Hamsawardhini Rengarajan
Kengatharaiyer Sarveswaran
William-Chandra Tjhi
26
9
0
12 Sep 2023
How susceptible are LLMs to Logical Fallacies?
Amirreza Payandeh
Dan Pluth
Jordan Hosier
Xuesu Xiao
V. Gurbani
LLMAG
LRM
ELM
38
17
0
18 Aug 2023
ChatLog: Carefully Evaluating the Evolution of ChatGPT Across Time
Shangqing Tu
Chunyang Li
Jifan Yu
Xiaozhi Wang
Lei Hou
Juanzi Li
LLMAG
AI4MH
75
10
0
27 Apr 2023
Instruction Tuning with GPT-4
Baolin Peng
Chunyuan Li
Pengcheng He
Michel Galley
Jianfeng Gao
SyDa
ALM
LM&MA
159
579
0
06 Apr 2023
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
322
11,953
0
04 Mar 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro
LRM
AI4CE
ReLM
389
8,495
0
28 Jan 2022
Explaining Answers with Entailment Trees
Bhavana Dalvi
Peter Alexander Jansen
Oyvind Tafjord
Zhengnan Xie
Hannah Smith
Leighanna Pipatanangkura
Peter Clark
ReLM
FAtt
LRM
239
184
0
17 Apr 2021