2101.02235
Cited By
Did Aristotle Use a Laptop? A Question Answering Benchmark with Implicit Reasoning Strategies
6 January 2021
Mor Geva
Daniel Khashabi
Elad Segal
Tushar Khot
Dan Roth
Jonathan Berant
RALM
Papers citing
"Did Aristotle Use a Laptop? A Question Answering Benchmark with Implicit Reasoning Strategies"
50 / 565 papers shown
Title
EvoLM: In Search of Lost Language Model Training Dynamics
Zhenting Qi
Fan Nie
Alexandre Alahi
James Zou
Himabindu Lakkaraju
Yilun Du
Eric P. Xing
Sham Kakade
Hanlin Zhang
49
1
0
19 Jun 2025
RiOT: Efficient Prompt Refinement with Residual Optimization Tree
Chenyi Zhou
Zhengyan Shi
Yuan Yao
Lei Liang
H. Chen
Qiang Zhang
17
0
0
19 Jun 2025
CC-LEARN: Cohort-based Consistency Learning
Xiao Ye
Shaswat Shrivastava
Zhaonan Li
Jacob Dineen
Shijie Lu
Avneet Ahuja
Ming Shen
Zhikun Xu
Ben Zhou
OffRL
LRM
43
0
0
18 Jun 2025
SimpleDoc: Multi-Modal Document Understanding with Dual-Cue Page Retrieval and Iterative Refinement
Chelsi Jain
Yiran Wu
Yifan Zeng
Jiale Liu
Shengyu Dai
Zhenwen Shao
Qingyun Wu
Huazheng Wang
21
0
0
16 Jun 2025
Understand the Implication: Learning to Think for Pragmatic Understanding
S. Sravanthi
Kishan Maharaj
Sravani Gunnu
Abhijit Mishra
Pushpak Bhattacharyya
ReLM
LRM
27
0
0
16 Jun 2025
BOW: Bottlenecked Next Word Exploration
Ming Shen
Zhikun Xu
Xiao Ye
Jacob Dineen
Ben Zhou
OffRL
LRM
30
0
0
16 Jun 2025
Unveiling Confirmation Bias in Chain-of-Thought Reasoning
Yue Wan
Xiaowei Jia
Xiang Li
LRM
25
0
0
14 Jun 2025
KG-Infused RAG: Augmenting Corpus-Based RAG with External Knowledge Graphs
Dingjun Wu
Y. Yan
Zhenghao Liu
Zhiyuan Liu
Maosong Sun
63
0
0
11 Jun 2025
Efficient Post-Training Refinement of Latent Reasoning in Large Language Models
Xinyuan Wang
Dongjie Wang
Wangyang Ying
Haoyue Bai
Nanxu Gong
Sixun Dong
Kunpeng Liu
Yanjie Fu
ReLM
LRM
51
0
0
10 Jun 2025
Sample Efficient Demonstration Selection for In-Context Learning
Kiran Purohit
Venktesh V
Sourangshu Bhattacharya
Avishek Anand
45
0
0
10 Jun 2025
From Calibration to Collaboration: LLM Uncertainty Quantification Should Be More Human-Centered
Siddartha Devic
Tejas Srinivasan
Jesse Thomason
Willie Neiswanger
Vatsal Sharan
26
0
0
09 Jun 2025
From Debate to Equilibrium: Belief-Driven Multi-Agent LLM Reasoning via Bayesian Nash Equilibrium
Xie Yi
Zhanke Zhou
Chentao Cao
Qiyu Niu
Tongliang Liu
Bo Han
25
0
0
09 Jun 2025
Evolutionary Perspectives on the Evaluation of LLM-Based AI Agents: A Comprehensive Survey
Jiachen Zhu
Menghui Zhu
Renting Rui
Rong Shan
Congmin Zheng
...
Jianghao Lin
Weiwen Liu
Ruiming Tang
Yong Yu
Weinan Zhang
LLMAG
ELM
49
0
0
06 Jun 2025
Token Signature: Predicting Chain-of-Thought Gains with Token Decoding Feature in Large Language Models
Peijie Liu
Fengli Xu
Yong Li
LRM
58
0
0
06 Jun 2025
Revisiting Test-Time Scaling: A Survey and a Diversity-Aware Method for Efficient Reasoning
Ho-Lam Chung
Teng-Yun Hsiao
Hsiao-Ying Huang
Chunerh Cho
Jian-Ren Lin
Zhang Ziwei
Yun-Nung Chen
LRM
116
0
0
05 Jun 2025
A Statistical Physics of Language Model Reasoning
Jack David Carson
Amir Reisizadeh
LRM
AI4CE
80
0
0
04 Jun 2025
Robustness of Prompting: Enhancing Robustness of Large Language Models Against Prompting Attacks
Lin Mu
Guowei Chu
Li Ni
Lei Sang
Zhize Wu
Peiquan Jin
Yiwen Zhang
97
0
0
04 Jun 2025
Expanding before Inferring: Enhancing Factuality in Large Language Models through Premature Layers Interpolation
Dingwei Chen
Ziqiang Liu
Feiteng Fang
Chak Tou Leong
Shiwen Ni
A. Argha
Hamid Alinejad-Rokny
Min Yang
Chengming Li
KELM
HILM
61
0
0
03 Jun 2025
Representations of Fact, Fiction and Forecast in Large Language Models: Epistemics and Attitudes
Meng Li
Michael Vrazitulis
David Schlangen
63
0
0
02 Jun 2025
Generalizable LLM Learning of Graph Synthetic Data with Reinforcement Learning
Yizhuo Zhang
Heng Wang
Shangbin Feng
Zhaoxuan Tan
Xinyun Liu
Yulia Tsvetkov
OffRL
65
0
0
01 Jun 2025
RARE: Retrieval-Aware Robustness Evaluation for Retrieval-Augmented Generation Systems
Yixiao Zeng
Tianyu Cao
Danqing Wang
Xinran Zhao
Zimeng Qiu
Morteza Ziyadi
Tongshuang Wu
Lei Li
RALM
51
0
0
01 Jun 2025
Revisiting Epistemic Markers in Confidence Estimation: Can Markers Accurately Reflect Large Language Models' Uncertainty?
Jiayu Liu
Qing Zong
Weiqi Wang
Yangqiu Song
40
0
0
30 May 2025
Soft Reasoning: Navigating Solution Spaces in Large Language Models through Controlled Embedding Exploration
Qinglin Zhu
Runcong Zhao
Hanqi Yan
Yulan He
Yudong Chen
Lin Gui
LRM
33
0
0
30 May 2025
Pretrained LLMs Learn Multiple Types of Uncertainty
Roi Cohen
Omri Fahn
Gerard de Melo
41
0
0
27 May 2025
Grammars of Formal Uncertainty: When to Trust LLMs in Automated Reasoning Tasks
Debargha Ganguly
Vikash Singh
Sreehari Sankar
Biyao Zhang
Xuecen Zhang
Srinivasan Iyengar
Xiaotian Han
Amit Sharma
Shivkumar Kalyanaraman
Vipin Chaudhary
66
0
0
26 May 2025
POQD: Performance-Oriented Query Decomposer for Multi-vector retrieval
Yaoyang Liu
Junlin Li
Yinjun Wu
Zhen Chen
67
0
0
25 May 2025
System-1.5 Reasoning: Traversal in Language and Latent Spaces with Dynamic Shortcuts
Xiaoqiang Wang
Suyuchen Wang
Yun Zhu
Bang Liu
ReLM
LRM
123
0
0
25 May 2025
Removal of Hallucination on Hallucination: Debate-Augmented RAG
Wentao Hu
Wengyu Zhang
Yiyang Jiang
C. Zhang
Xiaoyong Wei
Qing Li
61
0
0
24 May 2025
Skip-Thinking: Chunk-wise Chain-of-Thought Distillation Enable Smaller Language Models to Reason Better and Faster
Xiao Chen
Sihang Zhou
K. Liang
Xiaoyu Sun
Xinwang Liu
LRM
38
1
0
24 May 2025
The Quest for Efficient Reasoning: A Data-Centric Benchmark to CoT Distillation
Ruichen Zhang
Rana Muhammad Shahroz Khan
Zhen Tan
Dawei Li
Song Wang
Tianlong Chen
LRM
63
0
0
24 May 2025
Reasoning Beyond Language: A Comprehensive Survey on Latent Chain-of-Thought Reasoning
Xinghao Chen
Anhao Zhao
Heming Xia
Xuan Lu
Hanlin Wang
Yanjun Chen
Wei Zhang
Jian Wang
W. Li
Xiaoyu Shen
ReLM
LRM
89
0
0
22 May 2025
Social Bias in Popular Question-Answering Benchmarks
Angelie Kraft
Judith Simon
Sonja Schimmler
120
0
0
21 May 2025
Multilingual Test-Time Scaling via Initial Thought Transfer
Prasoon Bajpai
Tanmoy Chakraborty
LRM
66
0
0
21 May 2025
Prolonged Reasoning Is Not All You Need: Certainty-Based Adaptive Routing for Efficient LLM/MLLM Reasoning
Jinghui Lu
Haiyang Yu
Siliang Xu
Shiwei Ran
Guozhi Tang
...
Teng Fu
Hao Feng
Jingqun Tang
Hongru Wang
Can Huang
LRM
111
3
0
21 May 2025
The Effects of Data Augmentation on Confidence Estimation for LLMs
Rui Wang
Renyu Zhu
Minmin Lin
R. Wu
Tangjie Lv
Changjie Fan
Haobo Wang
21
0
0
21 May 2025
Accelerating Adaptive Retrieval Augmented Generation via Instruction-Driven Representation Reduction of Retrieval Overlaps
Jie Ou
Jinyu Guo
Shuaihong Jiang
Zhaokun Wang
Libo Qin
Shunyu Yao
Wenhong Tian
3DV
163
0
0
19 May 2025
J4R: Learning to Judge with Equivalent Initial State Group Relative Policy Optimization
Austin Xu
Yilun Zhou
Xuan-Phi Nguyen
Caiming Xiong
Shafiq Joty
ELM
LRM
146
0
0
19 May 2025
SSR: Enhancing Depth Perception in Vision-Language Models via Rationale-Guided Spatial Reasoning
Yang Liu
Ming Ma
Xiaomin Yu
Pengxiang Ding
Han Zhao
Mingyang Sun
Siteng Huang
Donglin Wang
LRM
207
0
0
18 May 2025
Relation Extraction or Pattern Matching? Unravelling the Generalisation Limits of Language Models for Biographical RE
Varvara Arzt
Allan Hanbury
Michael Wiegand
Gábor Recski
Terra Blevins
70
0
0
18 May 2025
SoftCoT++: Test-Time Scaling with Soft Chain-of-Thought Reasoning
Yige Xu
Xu Guo
Zhiwei Zeng
Chunyan Miao
BDL
LRM
146
1
0
16 May 2025
Practical Reasoning Interruption Attacks on Reasoning Large Language Models
Yu Cui
Cong Zuo
SILM
AAML
LRM
96
0
0
10 May 2025
DeepCritic: Deliberate Critique with Large Language Models
Wenkai Yang
Jingwen Chen
Yankai Lin
Ji-Rong Wen
ALM
LRM
102
1
0
01 May 2025
Computational Reasoning of Large Language Models
Haitao Wu
Zongbo Han
Joey Tianyi Zhou
Huaxi Huang
Changqing Zhang
ELM
LRM
102
0
0
29 Apr 2025
Toward Generalizable Evaluation in the LLM Era: A Survey Beyond Benchmarks
Yixin Cao
Shibo Hong
Xuzhao Li
Jiahao Ying
Yubo Ma
...
Juanzi Li
Aixin Sun
Xuanjing Huang
Tat-Seng Chua
Tianwei Zhang
ALM
ELM
253
7
0
26 Apr 2025
Evaluating Multi-Hop Reasoning in Large Language Models: A Chemistry-Centric Case Study
Mohammad Khodadad
Ali Shiraee Kasmaee
Mahdi Astaraki
Nicholas Sherck
H. Mahyar
Soheila Samiee
LRM
431
0
0
23 Apr 2025
Exploiting Contextual Knowledge in LLMs through V-usable Information based Layer Enhancement
Xiaowei Yuan
Zhao Yang
Ziyang Huang
Yucheng Wang
Siqi Fan
Yiming Ju
Jun Zhao
Kang Liu
81
0
0
22 Apr 2025
CoLoTa: A Dataset for Entity-based Commonsense Reasoning over Long-Tail Knowledge
Armin Toroghi
Willis Guo
Scott Sanner
RALM
LRM
70
0
0
20 Apr 2025
Meta-Thinking in LLMs via Multi-Agent Reinforcement Learning: A Survey
Ahsan Bilal
Muhammad Ahmed Mohsin
Muhammad Umer
Muhammad Awais Khan Bangash
Muhammad Ali Jamshed
LLMAG
LRM
AI4CE
164
1
0
20 Apr 2025
LLM-as-a-Judge: Reassessing the Performance of LLMs in Extractive QA
Xanh Ho
Jiahao Huang
Florian Boudin
Akiko Aizawa
ELM
141
0
0
16 Apr 2025
Efficient Reasoning Models: A Survey
Sicheng Feng
Gongfan Fang
Xinyin Ma
Xinchao Wang
ReLM
LRM
422
13
0
15 Apr 2025