2203.11171
Cited By
v4 (latest)
Self-Consistency Improves Chain of Thought Reasoning in Language Models
21 March 2022
Xuezhi Wang
Jason W. Wei
Dale Schuurmans
Quoc Le
Ed H. Chi
Sharan Narang
Aakanksha Chowdhery
Denny Zhou
ReLM
BDL
LRM
AI4CE
Papers citing "Self-Consistency Improves Chain of Thought Reasoning in Language Models"
50 / 909 papers shown
Title
Conformal Linguistic Calibration: Trading-off between Factuality and Specificity
Zhengping Jiang
Anqi Liu
Benjamin Van Durme
174
3
0
26 Feb 2025
Data-Efficient Multi-Agent Spatial Planning with LLMs
Huangyuan Su
Aaron Walsman
Daniel Garces
Sham Kakade
Stephanie Gil
LLMAG
Presented at ResearchTrend Connect | LLMAG on 28 Mar 2025
218
0
0
26 Feb 2025
General Intelligence Requires Reward-based Pretraining
Seungwook Han
Jyothish Pari
Samuel J. Gershman
Pulkit Agrawal
LRM
370
2
0
26 Feb 2025
CryptoPulse: Short-Term Cryptocurrency Forecasting with Dual-Prediction and Cross-Correlated Market Indicators
Amit Kumar
Taoran Ji
168
0
0
26 Feb 2025
Stay Focused: Problem Drift in Multi-Agent Debate
Jonas Becker
Lars Benedikt Kaesberg
Andreas Stephan
Jan Philip Wahle
Terry Ruas
Bela Gipp
145
2
0
26 Feb 2025
TestNUC: Enhancing Test-Time Computing Approaches and Scaling through Neighboring Unlabeled Data Consistency
Henry Peng Zou
Zhengyao Gu
Yue Zhou
Yankai Chen
Weizhi Zhang
Liancheng Fang
Yibo Wang
Yangning Li
Kay Liu
Philip S. Yu
150
1
0
26 Feb 2025
Voting or Consensus? Decision-Making in Multi-Agent Debate
Lars Benedikt Kaesberg
Jonas Becker
Jan Philip Wahle
Terry Ruas
Bela Gipp
148
7
0
26 Feb 2025
Towards Better Understanding of Program-of-Thought Reasoning in Cross-Lingual and Multilingual Environments
Patomporn Payoungkhamdee
Pume Tuchinda
Jinheon Baek
Samuel Cahyawijaya
Can Udomcharoenchaikit
Potsawee Manakul
Peerat Limkonchotiwat
Ekapol Chuangsuwanich
Sarana Nutanong
LRM
91
2
0
25 Feb 2025
PII-Bench: Evaluating Query-Aware Privacy Protection Systems
Hao Shen
Zhouhong Gu
Haokai Hong
Weili Han
99
0
0
25 Feb 2025
Chain of Draft: Thinking Faster by Writing Less
Silei Xu
Wenhao Xie
Lingxiao Zhao
Pengcheng He
AI4TS
LRM
194
85
0
25 Feb 2025
How Far are LLMs from Real Search? A Comprehensive Study on Efficiency, Completeness, and Inherent Capabilities
Minhua Lin
Hui Liu
Xianfeng Tang
Jingying Zeng
Zhenwei Dai
Chen Luo
Zheng Li
Xiang Zhang
Qi He
Suhang Wang
OffRL
LRM
99
1
0
25 Feb 2025
REGen: A Reliable Evaluation Framework for Generative Event Argument Extraction
Omar Sharif
Joseph Gatto
Madhusudan Basak
S. Preum
92
0
0
24 Feb 2025
The Relationship Between Reasoning and Performance in Large Language Models -- o3 (mini) Thinks Harder, Not Longer
Marthe Ballon
Andres Algaba
Vincent Ginis
LRM
ReLM
104
17
0
24 Feb 2025
From Perceptions to Decisions: Wildfire Evacuation Decision Prediction with Behavioral Theory-informed LLMs
Ruxiao Chen
Chenguang Wang
Yuran Sun
Xilei Zhao
Susu Xu
159
3
0
24 Feb 2025
Chain-of-Action: Faithful and Multimodal Question Answering through Large Language Models
Zhenyu Pan
Haozheng Luo
Manling Li
Han Liu
LRM
123
17
0
24 Feb 2025
CoT-UQ: Improving Response-wise Uncertainty Quantification in LLMs with Chain-of-Thought
Boxuan Zhang
Ruqi Zhang
LRM
76
3
0
24 Feb 2025
DISC: Dynamic Decomposition Improves LLM Inference Scaling
Jonathan Light
Wei Cheng
Benjamin Rivière
Wu Yue
Masafumi Oyamada
Mengdi Wang
Yisong Yue
Santiago Paternain
Haifeng Chen
ReLM
LRM
129
4
0
23 Feb 2025
The Hidden Strength of Disagreement: Unraveling the Consensus-Diversity Tradeoff in Adaptive Multi-Agent Systems
Zengqing Wu
Takayuki Ito
106
2
0
23 Feb 2025
BioMaze: Benchmarking and Enhancing Large Language Models for Biological Pathway Reasoning
Haiteng Zhao
Chang Ma
FangZhi Xu
Lingpeng Kong
Zhi-Hong Deng
LRM
139
3
0
23 Feb 2025
Reflective Planning: Vision-Language Models for Multi-Stage Long-Horizon Robotic Manipulation
Yunhai Feng
Jiaming Han
Zhiyong Yang
Xiangyu Yue
Sergey Levine
Jianlan Luo
LM&Ro
125
7
0
23 Feb 2025
Patterns Over Principles: The Fragility of Inductive Reasoning in LLMs under Noisy Observations
Chunyang Li
Weiqi Wang
Tianshi Zheng
Yangqiu Song
LRM
134
6
0
22 Feb 2025
BiDeV: Bilateral Defusing Verification for Complex Claim Fact-Checking
Yuxuan Liu
Hongda Sun
Wenya Guo
Xinyan Xiao
Cunli Mao
Zhengtao Yu
Rui Yan
150
3
0
22 Feb 2025
Dynamic Parallel Tree Search for Efficient LLM Reasoning
Yifu Ding
Wentao Jiang
Shunyu Liu
Yongcheng Jing
Jinpei Guo
...
Zengmao Wang
Ziqiang Liu
Di Lin
Xianglong Liu
Dacheng Tao
LRM
122
11
0
22 Feb 2025
Improving Value-based Process Verifier via Structural Prior Injection
Zetian Sun
Dongfang Li
Baotian Hu
Jun Yu
Min Zhang
97
0
0
21 Feb 2025
Rankify: A Comprehensive Python Toolkit for Retrieval, Re-Ranking, and Retrieval-Augmented Generation
Abdelrahman Abdallah
Bhawna Piryani
Jamshid Mozafari
Mohammed Ali
Adam Jatowt
356
1
0
21 Feb 2025
CER: Confidence Enhanced Reasoning in LLMs
Ali Razghandi
Seyed Mohammad Hadi Hosseini
Mahdieh Soleymani Baghshah
LRM
182
5
0
20 Feb 2025
Can a Single Model Master Both Multi-turn Conversations and Tool Use? CoALM: A Unified Conversational Agentic Language Model
Emre Can Acikgoz
Jeremiah Greer
Akul Datta
Ze Yang
William Zeng
Oussama Elachqar
Emmanouil Koukoumidis
Dilek Hakkani-Tur
Gokhan Tur
LLMAG
186
3
0
20 Feb 2025
Reasoning and the Trusting Behavior of DeepSeek and GPT: An Experiment Revealing Hidden Fault Lines in Large Language Models
Rubing Li
João Sedoc
Arun Sundararajan
LRM
106
1
0
20 Feb 2025
DataSciBench: An LLM Agent Benchmark for Data Science
Dan Zhang
Sining Zhoubian
Min Cai
Fengzu Li
L. Yang
Wei Wang
Tianjiao Dong
Ziniu Hu
J. Tang
Yisong Yue
ALM
ELM
101
5
0
20 Feb 2025
A Mousetrap: Fooling Large Reasoning Models for Jailbreak with Chain of Iterative Chaos
Yang Yao
Xuan Tong
Ruofan Wang
Yixu Wang
Lujundong Li
Liang Liu
Yan Teng
Yun Wang
LRM
89
7
0
19 Feb 2025
SIFT: Grounding LLM Reasoning in Contexts via Stickers
Zihao Zeng
Xuyao Huang
Boxiu Li
Zhijie Deng
LRM
60
2
0
19 Feb 2025
Do we still need Human Annotators? Prompting Large Language Models for Aspect Sentiment Quad Prediction
Nils Constantin Hellwig
Jakob Fehle
Udo Kruschwitz
Christian Wolff
AI4MH
143
0
0
18 Feb 2025
R2-KG: General-Purpose Dual-Agent Framework for Reliable Reasoning on Knowledge Graphs
Sumin Jo
Junseong Choi
Jiho Kim
Edward Choi
129
0
0
18 Feb 2025
HPSS: Heuristic Prompting Strategy Search for LLM Evaluators
Bosi Wen
Pei Ke
Yufei Sun
C. Wang
Xiaotao Gu
Jinfeng Zhou
Jie Tang
Hongning Wang
Minlie Huang
41
0
0
18 Feb 2025
Investigating the Impact of Quantization Methods on the Safety and Reliability of Large Language Models
Artyom Kharinaev
Viktor Moskvoretskii
Egor Shvetsov
Kseniia Studenikina
Bykov Mikhail
Evgeny Burnaev
MQ
100
0
0
18 Feb 2025
Atom of Thoughts for Markov LLM Test-Time Scaling
Fengwei Teng
Zhaoyang Yu
Quan Shi
Jiayi Zhang
Chenglin Wu
Yuyu Luo
MU
LRM
134
23
0
17 Feb 2025
SELP: Generating Safe and Efficient Task Plans for Robot Agents with Large Language Models
Yi Wu
Z. Xiong
Yiran Hu
Shreyash S. Iyengar
Nan Jiang
Aniket Bera
Lin Tan
Suresh Jagannathan
LM&Ro
LLMAG
168
4
0
17 Feb 2025
MathFimer: Enhancing Mathematical Reasoning by Expanding Reasoning Steps through Fill-in-the-Middle Task
Yuchen Yan
Yongliang Shen
Yang Liu
Jin Jiang
Xin Xu
Hao Fei
Jian Shao
Yueting Zhuang
ReLM
LRM
102
2
0
17 Feb 2025
Towards Top-Down Reasoning: An Explainable Multi-Agent Approach for Visual Question Answering
Zeqing Wang
Wentao Wan
Qiqing Lao
Runmeng Chen
Minjie Lang
Keze Wang
Liang Lin
LRM
234
3
0
17 Feb 2025
Table-Critic: A Multi-Agent Framework for Collaborative Criticism and Refinement in Table Reasoning
Peiying Yu
Guoxin Chen
Jingjing Wang
LLMAG
LMTD
LRM
134
8
0
17 Feb 2025
Counterfactual-Consistency Prompting for Relative Temporal Understanding in Large Language Models
Jongho Kim
Seung-won Hwang
LRM
AI4CE
159
1
0
17 Feb 2025
KnowPath: Knowledge-enhanced Reasoning via LLM-generated Inference Paths over Knowledge Graphs
Qi Zhao
Hongyu Yang
Qi Song
Xinwei Yao
Xiangyang Li
124
0
0
17 Feb 2025
Preference Optimization for Reasoning with Pseudo Feedback
Fangkai Jiao
Geyang Guo
Xingxing Zhang
Nancy F. Chen
Shafiq Joty
Furu Wei
LRM
212
16
0
17 Feb 2025
Revisiting the Test-Time Scaling of o1-like Models: Do they Truly Possess Test-Time Scaling Capabilities?
Zhiyuan Zeng
Qinyuan Cheng
Zhangyue Yin
Yunhua Zhou
Xipeng Qiu
LRM
178
20
0
17 Feb 2025
AURORA: Automated Training Framework of Universal Process Reward Models via Ensemble Prompting and Reverse Verification
Jue Chen
Tianchu Yao
Chao Qu
Bin Li
Minghao Yang
...
Haozhe Wang
Xihe Qiu
Wei Chu
Yinghui Xu
Yuan Qi
OffRL
LRM
106
2
0
17 Feb 2025
GraphThought: Graph Combinatorial Optimization with Thought Generation
Zixiao Huang
Lifeng Guo
Wenhao Li
Junjie Sheng
Chuyun Shen
Haosheng Chen
Bo Jin
Changhong Lu
Xiangfeng Wang
LRM
AI4CE
68
0
0
17 Feb 2025
Evaluating Step-by-step Reasoning Traces: A Survey
Jinu Lee
Julia Hockenmaier
LRM
ELM
155
2
0
17 Feb 2025
Beyond the Singular: The Essential Role of Multiple Generations in Effective Benchmark Evaluation and Analysis
Wenbo Zhang
Hengrui Cai
Wenyu Chen
110
1
0
17 Feb 2025
Investigating Inference-time Scaling for Chain of Multi-modal Thought: A Preliminary Study
Yujie Lin
Ante Wang
Moye Chen
Jingyao Liu
Hao Liu
Jinsong Su
Xinyan Xiao
LRM
143
3
0
17 Feb 2025
Evaluating the Meta- and Object-Level Reasoning of Large Language Models for Question Answering
Nick Ferguson
Liane Guillou
A. Bundy
Kwabena Nuamah
LRM
ELM
141
1
0
17 Feb 2025