Don't Take the Premise for Granted: Evaluating the Premise Critique Ability of Large Language Models

29 May 2025

Papers citing "Don't Take the Premise for Granted: Evaluating the Premise Critique Ability of Large Language Models"


A Survey of Frontiers in LLM Reasoning: Inference Scaling, Learning to Reason, and Agentic Systems
  Authors: Zixuan Ke, Fangkai Jiao, Yifei Ming, Xuan-Phi Nguyen, Austin Xu, ..., Chengwei Qin, Peifeng Wang, Siyang Song, Caiming Xiong, Shafiq Joty
  Tags: LRM; 88 15 0; 12 Apr 2025

Missing Premise exacerbates Overthinking: Are Reasoning Models losing Critical Thinking Skill?
  Authors: Chenrui Fan, Ming Li, Lichao Sun, Tianyi Zhou
  Tags: LRM; 90 10 0; 09 Apr 2025

Don't Let It Hallucinate: Premise Verification via Retrieval-Augmented Logical Reasoning
  Authors: Yuehan Qin, Shawn Li, Yi Nian, Xinyan Velocity Yu, Yue Zhao, Xuezhe Ma
  Tags: HILM, LRM; 95 1 0; 08 Apr 2025

How to Protect Yourself from 5G Radiation? Investigating LLM Responses to Implicit Misinformation
  Authors: Ruohao Guo, Wei Xu, Alan Ritter
  68 3 0; 12 Mar 2025

LIMO: Less is More for Reasoning
  Authors: Yixin Ye, Zhen Huang, Yang Xiao, Ethan Chern, Shijie Xia, Pengfei Liu
  Tags: LRM; 146 140 0; 05 Feb 2025

Investigating the Robustness of Deductive Reasoning with Large Language Models
  Authors: Fabian Hoppe, Filip Ilievski, Jan-Christoph Kalo
  Tags: LRM; 71 1 0; 04 Feb 2025

ArithmAttack: Evaluating Robustness of LLMs to Noisy Context in Math Problem Solving
  Authors: Zain Ul Abedin, Shahzeb Qamar, Lucie Flek, Akbar Karimi
  Tags: AAML; 76 1 0; 14 Jan 2025

JudgeBench: A Benchmark for Evaluating LLM-based Judges
  Authors: Sijun Tan, Siyuan Zhuang, Kyle Montgomery, William Y. Tang, Alejandro Cuadron, Chenguang Wang, Raluca A. Popa, Ion Stoica
  Tags: ELM, ALM; 97 48 0; 16 Oct 2024

Measuring Multimodal Mathematical Reasoning with MATH-Vision Dataset
  Authors: Ke Wang, Junting Pan, Weikang Shi, Zimu Lu, Mingjie Zhan, Hongsheng Li
  65 159 0; 22 Feb 2024

Which Linguist Invented the Lightbulb? Presupposition Verification for Question-Answering
  Authors: Najoung Kim, Ellie Pavlick, Burcu Karagol Ayan, Deepak Ramachandran
  110 47 0; 02 Jan 2021

Scaling Laws for Neural Language Models
  Authors: Jared Kaplan, Sam McCandlish, T. Henighan, Tom B. Brown, B. Chess, R. Child, Scott Gray, Alec Radford, Jeff Wu, Dario Amodei
  539 4,773 0; 23 Jan 2020