2101.02235
Cited By
Did Aristotle Use a Laptop? A Question Answering Benchmark with Implicit Reasoning Strategies
6 January 2021
Mor Geva
Daniel Khashabi
Elad Segal
Tushar Khot
Dan Roth
Jonathan Berant
RALM
Papers citing
"Did Aristotle Use a Laptop? A Question Answering Benchmark with Implicit Reasoning Strategies"
50 / 137 papers shown
Title
Practical Reasoning Interruption Attacks on Reasoning Large Language Models
Yu Cui
Cong Zuo
SILM
AAML
LRM
29
0
0
10 May 2025
Turing Machine Evaluation for Large Language Model
Haitao Wu
Zongbo Han
Huaxi Huang
Changqing Zhang
ELM
LRM
62
0
0
29 Apr 2025
Toward Generalizable Evaluation in the LLM Era: A Survey Beyond Benchmarks
Yixin Cao
Shibo Hong
X. Li
Jiahao Ying
Yubo Ma
...
Juanzi Li
Aixin Sun
Xuanjing Huang
Tat-Seng Chua
Yu Jiang
ALM
ELM
84
1
0
26 Apr 2025
LLM-as-a-Judge: Reassessing the Performance of LLMs in Extractive QA
Xanh Ho
Jiahao Huang
Florian Boudin
Akiko Aizawa
ELM
31
0
0
16 Apr 2025
Efficient Reasoning Models: A Survey
Sicheng Feng
Gongfan Fang
Xinyin Ma
Xinchao Wang
ReLM
LRM
142
0
0
15 Apr 2025
ShadowCoT: Cognitive Hijacking for Stealthy Reasoning Backdoors in LLMs
Gejian Zhao
Hanzhou Wu
Xinpeng Zhang
Athanasios V. Vasilakos
LRM
38
1
0
08 Apr 2025
Debate Only When Necessary: Adaptive Multiagent Collaboration for Efficient LLM Reasoning
Sugyeong Eo
Hyeonseok Moon
Evelyn Hayoon Zi
Chanjun Park
Heuiseok Lim
LLMAG
44
1
0
07 Apr 2025
Aurelia: Test-time Reasoning Distillation in Audio-Visual LLMs
Sanjoy Chowdhury
Hanan Gani
Nishit Anand
Sayan Nag
Ruohan Gao
Mohamed Elhoseiny
Salman Khan
Dinesh Manocha
LRM
49
0
0
29 Mar 2025
Landscape of Thoughts: Visualizing the Reasoning Process of Large Language Models
Zhanke Zhou
Zhaocheng Zhu
Xuan Li
Mikhail Galkin
Xiao Feng
Sanmi Koyejo
Jian Tang
Bo Han
LRM
56
0
0
28 Mar 2025
"Well, Keep Thinking": Enhancing LLM Reasoning with Adaptive Injection Decoding
Hyunbin Jin
Je Won Yeom
Seunghyun Bae
Taesup Kim
LRM
ReLM
40
1
0
13 Mar 2025
VisualWebInstruct: Scaling up Multimodal Instruction Data through Web Search
Yiming Jia
J. Li
Xiang Yue
Bo Li
Ping Nie
Kai Zou
Wenhu Chen
LRM
79
2
0
13 Mar 2025
A Survey on Knowledge-Oriented Retrieval-Augmented Generation
Mingyue Cheng
Yucong Luo
Jie Ouyang
Q. Liu
Huijie Liu
...
Bohou Zhang
Jiawei Cao
Jie Ma
Daoyu Wang
Enhong Chen
3DV
70
3
0
11 Mar 2025
MastermindEval: A Simple But Scalable Reasoning Benchmark
Jonas Golde
Patrick Haller
Fabio Barth
Alan Akbik
LRM
ReLM
ELM
51
2
0
07 Mar 2025
Voting or Consensus? Decision-Making in Multi-Agent Debate
Lars Benedikt Kaesberg
Jonas Becker
Jan Philip Wahle
Terry Ruas
Bela Gipp
66
1
0
26 Feb 2025
Unveiling the Key Factors for Distilling Chain-of-Thought Reasoning
Xinghao Chen
Zhijing Sun
Wenjin Guo
Miaoran Zhang
Yanjun Chen
...
Hui Su
Yijie Pan
Dietrich Klakow
Wenjie Li
Xiaoyu Shen
LRM
51
4
0
25 Feb 2025
Chain-of-Action: Faithful and Multimodal Question Answering through Large Language Models
Zhenyu Pan
Haozheng Luo
Manling Li
Han Liu
LRM
48
13
0
24 Feb 2025
Evaluating the Meta- and Object-Level Reasoning of Large Language Models for Question Answering
Nick Ferguson
Liane Guillou
A. Bundy
Kwabena Nuamah
LRM
ELM
68
1
0
17 Feb 2025
Smoothing Out Hallucinations: Mitigating LLM Hallucination with Smoothed Knowledge Distillation
Hieu Nguyen
Zihao He
Shoumik Atul Gandre
Ujjwal Pasupulety
Sharanya Kumari Shivakumar
Kristina Lerman
HILM
54
1
0
16 Feb 2025
Confidence Elicitation: A New Attack Vector for Large Language Models
Brian Formento
Chuan-Sheng Foo
See-Kiong Ng
AAML
96
0
0
07 Feb 2025
IAO Prompting: Making Knowledge Flow Explicit in LLMs through Structured Reasoning Templates
Aissatou Diallo
Antonis Bikakis
Luke Dickens
Anthony Hunter
Rob Miller
LRM
34
0
0
05 Feb 2025
Policy Guided Tree Search for Enhanced LLM Reasoning
Yang Li
LRM
53
0
0
04 Feb 2025
SimRAG: Self-Improving Retrieval-Augmented Generation for Adapting Large Language Models to Specialized Domains
Ran Xu
Hui Liu
Sreyashi Nag
Zhenwei Dai
Yaochen Xie
...
Chen Luo
Yang Li
Joyce C. Ho
Carl Yang
Qi He
RALM
68
8
0
28 Jan 2025
Verbosity-Aware Rationale Reduction: Effective Reduction of Redundant Rationale via Principled Criteria
Joonwon Jang
Jaehee Kim
Wonbin Kweon
Hwanjo Yu
LRM
45
1
0
03 Jan 2025
Synergistic Multi-Agent Framework with Trajectory Learning for Knowledge-Intensive Tasks
Shengbin Yue
Siyuan Wang
Wei Chen
Xuanjing Huang
Zhongyu Wei
LLMAG
72
9
0
03 Jan 2025
WizardMath: Empowering Mathematical Reasoning for Large Language Models via Reinforced Evol-Instruct
Haipeng Luo
Qingfeng Sun
Can Xu
Pu Zhao
Jian-Guang Lou
...
Xiubo Geng
Qingwei Lin
Shifeng Chen
Yansong Tang
Dongmei Zhang
OSLM
LRM
108
408
0
03 Jan 2025
ComparisonQA: Evaluating Factuality Robustness of LLMs Through Knowledge Frequency Control and Uncertainty
Qing Zong
Z. Wang
Tianshi Zheng
Xiyu Ren
Y. Song
62
1
0
31 Dec 2024
ToW: Thoughts of Words Improve Reasoning in Large Language Models
Zhikun Xu
Ming Shen
Jacob Dineen
Zhaonan Li
Xiao Ye
Shijie Lu
Aswin Rrv
Chitta Baral
Ben Zhou
LRM
127
1
0
21 Oct 2024
RAG-DDR: Optimizing Retrieval-Augmented Generation Using Differentiable Data Rewards
Xinze Li
Sen Mei
Zhenghao Liu
Yukun Yan
Shuo Wang
...
H. Chen
Ge Yu
Zhiyuan Liu
Maosong Sun
Chenyan Xiong
42
6
0
17 Oct 2024
FLARE: Faithful Logic-Aided Reasoning and Exploration
Erik Arakelyan
Pasquale Minervini
Pat Verga
Patrick Lewis
Isabelle Augenstein
ReLM
LRM
61
2
0
14 Oct 2024
Effective Self-Mining of In-Context Examples for Unsupervised Machine Translation with LLMs
Abdellah El Mekki
Muhammad Abdul-Mageed
LRM
31
0
0
14 Oct 2024
Mentor-KD: Making Small Language Models Better Multi-step Reasoners
Hojae Lee
Junho Kim
SangKeun Lee
LRM
32
1
0
11 Oct 2024
Understanding the Interplay between Parametric and Contextual Knowledge for Large Language Models
Sitao Cheng
Liangming Pan
Xunjian Yin
Xinyi Wang
William Yang Wang
KELM
37
4
0
10 Oct 2024
AgentBank: Towards Generalized LLM Agents via Fine-Tuning on 50000+ Interaction Trajectories
Yifan Song
Weimin Xiong
Xiutian Zhao
Dawei Zhu
Wenhao Wu
Ke Wang
Cheng Li
Wei Peng
Sujian Li
LLMAG
26
9
0
10 Oct 2024
ReGenesis: LLMs can Grow into Reasoning Generalists via Self-Improvement
Xiangyu Peng
Congying Xia
Xinyi Yang
Caiming Xiong
Chien-Sheng Wu
Chen Xing
LRM
43
2
0
03 Oct 2024
Retrieval Augmented Generation (RAG) and Beyond: A Comprehensive Survey on How to Make your LLMs use External Data More Wisely
Siyun Zhao
Yuqing Yang
Zilong Wang
Zhiyuan He
Luna Qiu
Lili Qiu
SyDa
RALM
3DV
41
33
0
23 Sep 2024
Textualized Agent-Style Reasoning for Complex Tasks by Multiple Round LLM Generation
Chen Liang
Zhifan Feng
Zihe Liu
Wenbin Jiang
Jinan Xu
Yufeng Chen
Yong Wang
LLMAG
LRM
25
1
0
19 Sep 2024
To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning
Zayne Sprague
Fangcong Yin
Juan Diego Rodriguez
Dongwei Jiang
Manya Wadhwa
Prasann Singhal
Xinyu Zhao
Xi Ye
Kyle Mahowald
Greg Durrett
ReLM
LRM
114
82
0
18 Sep 2024
Benchmarking Large Language Model Uncertainty for Prompt Optimization
Pei-Fu Guo
Yun-Da Tsai
Shou-De Lin
ELM
LRM
20
1
0
16 Sep 2024
Self-Harmonized Chain of Thought
Ziqi Jin
Wei Lu
LRM
19
2
0
06 Sep 2024
Path-Consistency: Prefix Enhancement for Efficient Inference in LLM
Jiace Zhu
Yingtao Shen
Jie Zhao
An Zou
LLMAG
LRM
27
4
0
25 Aug 2024
Make Every Penny Count: Difficulty-Adaptive Self-Consistency for Cost-Efficient Reasoning
Xinglin Wang
Shaoxiong Feng
Yiwei Li
Peiwen Yuan
Y. Zhang
Boyuan Pan
Heda Wang
Yao Hu
Kan Li
LRM
40
17
0
24 Aug 2024
Bridging and Modeling Correlations in Pairwise Data for Direct Preference Optimization
Yuxin Jiang
Bo Huang
Yufei Wang
Xingshan Zeng
Liangyou Li
Yasheng Wang
Xin Jiang
Lifeng Shang
Ruiming Tang
Wei Wang
42
5
0
14 Aug 2024
MAQA: Evaluating Uncertainty Quantification in LLMs Regarding Data Uncertainty
Yongjin Yang
Haneul Yoo
Hwaran Lee
60
1
0
13 Aug 2024
CoverBench: A Challenging Benchmark for Complex Claim Verification
Alon Jacovi
Moran Ambar
Eyal Ben-David
Uri Shaham
Amir Feder
Mor Geva
Dror Marcus
Avi Caciularu
LMTD
49
3
0
06 Aug 2024
Correcting Negative Bias in Large Language Models through Negative Attention Score Alignment
Sangwon Yu
Jongyoon Song
Bongkyu Hwang
Hoyoung Kang
Sooah Cho
Junhwa Choi
Seongho Joe
Taehee Lee
Youngjune Gwon
Sungroh Yoon
114
4
0
31 Jul 2024
Prompting Techniques for Secure Code Generation: A Systematic Investigation
Catherine Tony
Nicolás E. Díaz Ferreyra
Markus Mutas
Salem Dhiff
Riccardo Scandariato
SILM
71
9
0
09 Jul 2024
Retrieved In-Context Principles from Previous Mistakes
Hao-Lun Sun
Yong-jia Jiang
Bo Wang
Yingyan Hou
Yan Zhang
Pengjun Xie
Fei Huang
52
1
0
08 Jul 2024
DynaThink: Fast or Slow? A Dynamic Decision-Making Framework for Large Language Models
Jiabao Pan
Yan Zhang
Chen Zhang
Zuozhu Liu
Hongwei Wang
Haizhou Li
LRM
29
3
0
01 Jul 2024
PORT: Preference Optimization on Reasoning Traces
Salem Lahlou
Abdalgader Abubaker
Hakim Hacid
LRM
33
1
0
23 Jun 2024
Demonstration Notebook: Finding the Most Suited In-Context Learning Example from Interactions
Yiming Tang
Bin Dong
34
0
0
16 Jun 2024