Don't Take the Premise for Granted: Evaluating the Premise Critique Ability of Large Language Models


29 May 2025
Jinzhe Li
Gengxu Li
Yi-Ju Chang
Yuan Wu
    AAML
    ELM
    LRM
arXiv: 2505.23715

Papers citing "Don't Take the Premise for Granted: Evaluating the Premise Critique Ability of Large Language Models"

11 / 11 papers shown
A Survey of Frontiers in LLM Reasoning: Inference Scaling, Learning to Reason, and Agentic Systems
Zixuan Ke
Fangkai Jiao
Yifei Ming
Xuan-Phi Nguyen
Austin Xu
...
Chengwei Qin
Peifeng Wang
Siyang Song
Caiming Xiong
Shafiq Joty
LRM
88
15
0
12 Apr 2025
Missing Premise exacerbates Overthinking: Are Reasoning Models losing Critical Thinking Skill?
Chenrui Fan
Ming Li
Lichao Sun
Tianyi Zhou
LRM
90
10
0
09 Apr 2025
Don't Let It Hallucinate: Premise Verification via Retrieval-Augmented Logical Reasoning
Yuehan Qin
Shawn Li
Yi Nian
Xinyan Velocity Yu
Yue Zhao
Xuezhe Ma
HILM
LRM
95
1
0
08 Apr 2025
How to Protect Yourself from 5G Radiation? Investigating LLM Responses to Implicit Misinformation
Ruohao Guo
Wei Xu
Alan Ritter
68
3
0
12 Mar 2025
LIMO: Less is More for Reasoning
Yixin Ye
Zhen Huang
Yang Xiao
Ethan Chern
Shijie Xia
Pengfei Liu
LRM
146
140
0
05 Feb 2025
Investigating the Robustness of Deductive Reasoning with Large Language Models
Fabian Hoppe
Filip Ilievski
Jan-Christoph Kalo
LRM
71
1
0
04 Feb 2025
ArithmAttack: Evaluating Robustness of LLMs to Noisy Context in Math Problem Solving
Zain Ul Abedin
Shahzeb Qamar
Lucie Flek
Akbar Karimi
AAML
76
1
0
14 Jan 2025
JudgeBench: A Benchmark for Evaluating LLM-based Judges
Sijun Tan
Siyuan Zhuang
Kyle Montgomery
William Y. Tang
Alejandro Cuadron
Chenguang Wang
Raluca A. Popa
Ion Stoica
ELM
ALM
97
48
0
16 Oct 2024
Measuring Multimodal Mathematical Reasoning with MATH-Vision Dataset
Ke Wang
Junting Pan
Weikang Shi
Zimu Lu
Mingjie Zhan
Hongsheng Li
65
159
0
22 Feb 2024
Which Linguist Invented the Lightbulb? Presupposition Verification for Question-Answering
Najoung Kim
Ellie Pavlick
Burcu Karagol Ayan
Deepak Ramachandran
110
47
0
02 Jan 2021
Scaling Laws for Neural Language Models
Jared Kaplan
Sam McCandlish
T. Henighan
Tom B. Brown
B. Chess
R. Child
Scott Gray
Alec Radford
Jeff Wu
Dario Amodei
539
4,773
0
23 Jan 2020