2203.11171
Self-Consistency Improves Chain of Thought Reasoning in Language Models
21 March 2022
Xuezhi Wang
Jason W. Wei
Dale Schuurmans
Quoc Le
Ed H. Chi
Sharan Narang
Aakanksha Chowdhery
Denny Zhou
ReLM
BDL
LRM
AI4CE
Papers citing
"Self-Consistency Improves Chain of Thought Reasoning in Language Models"
50 / 920 papers shown
Title
Correcting Negative Bias in Large Language Models through Negative Attention Score Alignment
Sangwon Yu
Jongyoon Song
Bongkyu Hwang
Hoyoung Kang
Sooah Cho
Junhwa Choi
Seongho Joe
Taehee Lee
Youngjune Gwon
Sungroh Yoon
225
6
0
31 Jul 2024
Pyramid Coder: Hierarchical Code Generator for Compositional Visual Question Answering
Ruoyue Shen
Nakamasa Inoue
Koichi Shinoda
71
1
0
30 Jul 2024
Automated Review Generation Method Based on Large Language Models
Shican Wu
Xiao Ma
Dehui Luo
Lulu Li
Xiangcheng Shi
...
Ran Luo
Chunlei Pei
Zhi-Jian Zhao
Jinlong Gong
170
0
0
30 Jul 2024
A Survey on Employing Large Language Models for Text-to-SQL Tasks
Liang Shi
Zhengju Tang
Nan Zhang
Xiaotong Zhang
Zhi Yang
212
32
0
21 Jul 2024
Reflective Instruction Tuning: Mitigating Hallucinations in Large Vision-Language Models
Jinrui Zhang
Teng Wang
Haigang Zhang
Ping Lu
Feng Zheng
MLLM
LRM
VLM
90
4
0
16 Jul 2024
MetaLLM: A High-performant and Cost-efficient Dynamic Framework for Wrapping LLMs
Quang H. Nguyen
Duy C. Hoang
Juliette Decugis
Saurav Manchanda
Nitesh Chawla
Khoa D. Doan
238
11
0
15 Jul 2024
Cohesive Conversations: Enhancing Authenticity in Multi-Agent Simulated Dialogues
Kuanchao Chu
Yi-Pei Chen
Hideki Nakayama
LLMAG
84
5
0
13 Jul 2024
Internet of Agents: Weaving a Web of Heterogeneous Agents for Collaborative Intelligence
Weize Chen
Ziming You
Ran Li
Yitong Guan
Chen Qian
Chenyang Zhao
Cheng Yang
Ruobing Xie
Zhiyuan Liu
Maosong Sun
LLMAG
103
40
0
09 Jul 2024
Prompting Techniques for Secure Code Generation: A Systematic Investigation
Catherine Tony
Nicolás E. Díaz Ferreyra
Markus Mutas
Salem Dhiff
Riccardo Scandariato
SILM
157
14
0
09 Jul 2024
On the Limitations of Compute Thresholds as a Governance Strategy
Sara Hooker
128
19
0
08 Jul 2024
STOC-TOT: Stochastic Tree-of-Thought with Constrained Decoding for Complex Reasoning in Multi-Hop Question Answering
Zhenyu Bi
Daniel Hajialigol
Zhongkai Sun
Jie Hao
Xuan Wang
LRM
84
1
0
04 Jul 2024
RVISA: Reasoning and Verification for Implicit Sentiment Analysis
Wenna Lai
H. Xie
Guandong Xu
Qing Li
LRM
88
3
0
02 Jul 2024
Beyond Numeric Rewards: In-Context Dueling Bandits with LLM Agents
Fanzeng Xia
Hao Liu
Yisong Yue
Tongxin Li
186
1
0
02 Jul 2024
AutoFlow: Automated Workflow Generation for Large Language Model Agents
Zelong Li
Shuyuan Xu
Kai Mei
Wenyue Hua
Balaji Rama
Om Raheja
Hao Wang
He Zhu
Yongfeng Zhang
AIFin
AI4CE
LLMAG
100
19
0
01 Jul 2024
Eliminating Position Bias of Language Models: A Mechanistic Approach
Ziqi Wang
Hanlin Zhang
Xiner Li
Kuan-Hao Huang
Chi Han
Shuiwang Ji
Sham Kakade
Hao Peng
Heng Ji
159
20
0
01 Jul 2024
MathOdyssey: Benchmarking Mathematical Problem-Solving Skills in Large Language Models Using Odyssey Math Data
Meng Fang
Xiangpeng Wan
Fei Lu
Fei Xing
Kai Zou
74
28
0
26 Jun 2024
Weak Reward Model Transforms Generative Models into Robust Causal Event Extraction Systems
Italo Luis da Silva
Hanqi Yan
Lin Gui
Yulan He
CML
109
0
0
26 Jun 2024
Visual Reasoning and Multi-Agent Approach in Multimodal Large Language Models (MLLMs): Solving TSP and mTSP Combinatorial Challenges
Mohammed Elhenawy
Ahmad Abutahoun
Taqwa I. Alhadidi
Ahmed Jaber
Huthaifa I. Ashqar
Shadi Jaradat
Ahmed Abdelhay
Sébastien Glaser
A. Rakotonirainy
LLMAG
LRM
92
12
0
26 Jun 2024
PaCoST: Paired Confidence Significance Testing for Benchmark Contamination Detection in Large Language Models
Huixuan Zhang
Yun Lin
Xiaojun Wan
144
0
0
26 Jun 2024
VarBench: Robust Language Model Benchmarking Through Dynamic Variable Perturbation
Kun Qian
Shunji Wan
Claudia Tang
Youzhi Wang
Xuanming Zhang
Maximillian Chen
Zhou Yu
AAML
93
12
0
25 Jun 2024
Autonomous Prompt Engineering in Large Language Models
Daan Kepel
Konstantina Valogianni
LLMAG
96
8
0
25 Jun 2024
NormTab: Improving Symbolic Reasoning in LLMs Through Tabular Data Normalization
Md Mahadi Hasan Nahid
Davood Rafiei
LMTD
118
5
0
25 Jun 2024
Teaching LLMs to Abstain across Languages via Multilingual Feedback
Shangbin Feng
Weijia Shi
Yike Wang
Wenxuan Ding
Orevaoghene Ahia
Shuyue Stella Li
Vidhisha Balachandran
Sunayana Sitaram
Yulia Tsvetkov
161
8
0
22 Jun 2024
Logicbreaks: A Framework for Understanding Subversion of Rule-based Inference
Anton Xue
Avishree Khare
Rajeev Alur
Surbhi Goel
Eric Wong
174
3
0
21 Jun 2024
Reasoning Like a Doctor: Improving Medical Dialogue Systems via Diagnostic Reasoning Process Alignment
Kaishuai Xu
Yi Cheng
Wenjun Hou
Qiaoyu Tan
Wenjie Li
99
8
0
20 Jun 2024
medIKAL: Integrating Knowledge Graphs as Assistants of LLMs for Enhanced Clinical Diagnosis on EMRs
Mingyi Jia
Junwen Duan
Yan Song
Jianxin Wang
101
9
0
20 Jun 2024
APEER: Automatic Prompt Engineering Enhances Large Language Model Reranking
Can Jin
Hongwu Peng
Shiyu Zhao
Zhenting Wang
Wujiang Xu
Ligong Han
Jiahui Zhao
Kai Zhong
Sanguthevar Rajasekaran
Dimitris N. Metaxas
KELM
153
33
0
20 Jun 2024
From Single Agent to Multi-Agent: Improving Traffic Signal Control
Maksim Tislenko
Dmitrii Kisilev
48
0
0
19 Jun 2024
Counterfactual Debating with Preset Stances for Hallucination Elimination of LLMs
Yi Fang
Moxin Li
Wenjie Wang
Hui Lin
Fuli Feng
LRM
121
8
0
17 Jun 2024
A Survey on Large Language Models from General Purpose to Medical Applications: Datasets, Methodologies, and Evaluations
Jinqiang Wang
Huansheng Ning
Yi Peng
Qikai Wei
Daniel Tesfai
Wenwei Mao
Tao Zhu
Runhe Huang
LM&MA
AI4MH
ELM
145
8
0
14 Jun 2024
Chain-of-Though (CoT) prompting strategies for medical error detection and correction
Zhaolong Wu
Abul Hasan
Jinge Wu
Yunsoo Kim
Jason PY Cheung
Teng Zhang
Honghan Wu
LRM
63
4
0
13 Jun 2024
Language Models are Crossword Solvers
Soumadeep Saha
Sutanoya Chakraborty
Saptarshi Saha
Utpal Garain
LRM
ReLM
121
3
0
13 Jun 2024
Next-Generation Database Interfaces: A Survey of LLM-based Text-to-SQL
Zijin Hong
Zheng Yuan
Qinggang Zhang
Hao Chen
Junnan Dong
Feiran Huang
Xiao Huang
195
74
0
12 Jun 2024
LLM-Craft: Robotic Crafting of Elasto-Plastic Objects with Large Language Models
Alison Bartsch
A. Farimani
160
7
0
12 Jun 2024
A Probabilistic Framework for LLM Hallucination Detection via Belief Tree Propagation
Bairu Hou
Yang Zhang
Jacob Andreas
Shiyu Chang
161
7
0
11 Jun 2024
Improving Autoformalization using Type Checking
Auguste Poiroux
Gail Weiss
Viktor Kunčak
Antoine Bosselut
123
4
0
11 Jun 2024
The BiGGen Bench: A Principled Benchmark for Fine-grained Evaluation of Language Models with Language Models
Seungone Kim
Juyoung Suk
Ji Yong Cho
Shayne Longpre
Chaeeun Kim
...
Sean Welleck
Graham Neubig
Moontae Lee
Kyungjae Lee
Minjoon Seo
ELM
ALM
LM&MA
208
44
0
09 Jun 2024
LLMs Are Not Intelligent Thinkers: Introducing Mathematical Topic Tree Benchmark for Comprehensive Evaluation of LLMs
Arash Gholami Davoodi
Seyed Pouyan Mousavi Davoudi
Pouya Pezeshkpour
ELM
LRM
97
4
0
07 Jun 2024
MARS: Benchmarking the Metaphysical Reasoning Abilities of Language Models with a Multi-task Evaluation Dataset
Weiqi Wang
Yangqiu Song
LRM
129
10
0
04 Jun 2024
Alice in Wonderland: Simple Tasks Showing Complete Reasoning Breakdown in State-Of-the-Art Large Language Models
Marianna Nezhurina
Lucia Cipolina-Kun
Mehdi Cherti
J. Jitsev
LLMAG
LRM
ELM
ReLM
191
37
0
04 Jun 2024
Re-ReST: Reflection-Reinforced Self-Training for Language Agents
Zi-Yi Dou
Cheng-Fu Yang
Xueqing Wu
Kai-Wei Chang
Nanyun Peng
LRM
166
10
0
03 Jun 2024
Unraveling and Mitigating Retriever Inconsistencies in Retrieval-Augmented Large Language Models
Mingda Li
Xinyu Li
Yifan Chen
Wenfeng Xuan
Weinan Zhang
RALM
96
2
0
31 May 2024
OR-Bench: An Over-Refusal Benchmark for Large Language Models
Justin Cui
Wei-Lin Chiang
Ion Stoica
Cho-Jui Hsieh
ALM
161
55
0
31 May 2024
ParSEL: Parameterized Shape Editing with Language
Aditya Ganeshan
Ryan Y. Huang
Xianghao Xu
R. K. Jones
Daniel E. Ritchie
KELM
81
3
0
30 May 2024
Reasoning about concepts with LLMs: Inconsistencies abound
Rosario A. Uceda-Sosa
Karthikeyan N. Ramamurthy
Maria Chang
Moninder Singh
93
4
0
30 May 2024
Beyond Imitation: Learning Key Reasoning Steps from Dual Chain-of-Thoughts in Reasoning Distillation
Chengwei Dai
Kun Li
Wei Zhou
Song Hu
LRM
98
7
0
30 May 2024
Faithful Logical Reasoning via Symbolic Chain-of-Thought
Jundong Xu
Hao Fei
Liangming Pan
Qian Liu
Mong Li Lee
Wynne Hsu
OffRL
LRM
LLMAG
156
65
0
28 May 2024
VoCoT: Unleashing Visually Grounded Multi-Step Reasoning in Large Multi-Modal Models
Zejun Li
Ruipu Luo
Jiwen Zhang
Minghui Qiu
Zhongyu Wei
LRM
MLLM
185
17
0
27 May 2024
HippoRAG: Neurobiologically Inspired Long-Term Memory for Large Language Models
Bernal Jiménez Gutiérrez
Yiheng Shu
Yu Gu
Michihiro Yasunaga
Yu-Chuan Su
RALM
CLL
146
48
0
23 May 2024
Can LLMs Solve longer Math Word Problems Better?
Xin Xu
Tong Xiao
Zitong Chao
Zhenya Huang
Can Yang
Yang Wang
173
14
0
23 May 2024