Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2306.13063
Cited By
Can LLMs Express Their Uncertainty? An Empirical Evaluation of Confidence Elicitation in LLMs
22 June 2023
Miao Xiong
Zhiyuan Hu
Xinyang Lu
Yifei Li
Jie Fu
Junxian He
Bryan Hooi
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Can LLMs Express Their Uncertainty? An Empirical Evaluation of Confidence Elicitation in LLMs"
38 / 88 papers shown
Title
Cycles of Thought: Measuring LLM Confidence through Stable Explanations
Evan Becker
Stefano Soatto
45
6
0
05 Jun 2024
Evaluating Uncertainty-based Failure Detection for Closed-Loop LLM Planners
Zhi Zheng
Qian Feng
Hang Li
Alois C. Knoll
Jianxiang Feng
54
6
0
01 Jun 2024
A Survey on RAG Meeting LLMs: Towards Retrieval-Augmented Large Language Models
Wenqi Fan
Yujuan Ding
Liang-bo Ning
Shijie Wang
Hengyun Li
Dawei Yin
Tat-Seng Chua
Qing Li
RALM
3DV
40
191
0
10 May 2024
"I'm Not Sure, But...": Examining the Impact of Large Language Models' Uncertainty Expression on User Reliance and Trust
Sunnie S. Y. Kim
Q. V. Liao
Mihaela Vorvoreanu
Steph Ballard
Jennifer Wortman Vaughan
40
51
0
01 May 2024
BIRD: A Trustworthy Bayesian Inference Framework for Large Language Models
Yu Feng
Ben Zhou
Weidong Lin
Dan Roth
76
5
0
18 Apr 2024
Confidence Calibration and Rationalization for LLMs via Multi-Agent Deliberation
Ruixin Yang
Dheeraj Rajagopal
S. Hayati
Bin Hu
Dongyeop Kang
LLMAG
43
5
0
14 Apr 2024
Online Safety Analysis for LLMs: a Benchmark, an Assessment, and a Path Forward
Xuan Xie
Jiayang Song
Zhehua Zhou
Yuheng Huang
Da Song
Lei Ma
OffRL
53
6
0
12 Apr 2024
Calibrating the Confidence of Large Language Models by Eliciting Fidelity
Mozhi Zhang
Mianqiu Huang
Rundong Shi
Linsen Guo
Chong Peng
Peng Yan
Yaqian Zhou
Xipeng Qiu
29
10
0
03 Apr 2024
Are large language models superhuman chemists?
Adrian Mirza
Nawaf Alampara
Sreekanth Kunchapu
Benedict Emoekabu
Aswanth Krishnan
...
Leanne M. Stafast
Dinga Wonanke
Michael Pieler
P. Schwaller
Kevin Maik Jablonka
ELM
AI4MH
LRM
LM&MA
31
5
0
01 Apr 2024
Hallucination Detection in Foundation Models for Decision-Making: A Flexible Definition and Review of the State of the Art
Neeloy Chakraborty
Melkior Ornik
Katherine Driggs-Campbell
LRM
57
9
0
25 Mar 2024
Self-Consistency Boosts Calibration for Math Reasoning
Ante Wang
Linfeng Song
Ye Tian
Baolin Peng
Lifeng Jin
Haitao Mi
Jinsong Su
Dong Yu
LRM
29
5
0
14 Mar 2024
On the Challenges and Opportunities in Generative AI
Laura Manduchi
Kushagra Pandey
Robert Bamler
Ryan Cotterell
Sina Daubener
...
F. Wenzel
Frank Wood
Stephan Mandt
Vincent Fortuin
Vincent Fortuin
56
17
0
28 Feb 2024
Fine-Grained Self-Endorsement Improves Factuality and Reasoning
Ante Wang
Linfeng Song
Baolin Peng
Ye Tian
Lifeng Jin
Haitao Mi
Jinsong Su
Dong Yu
HILM
LRM
23
6
0
23 Feb 2024
Calibrating Large Language Models with Sample Consistency
Qing Lyu
Kumar Shridhar
Chaitanya Malaviya
Li Zhang
Yanai Elazar
Niket Tandon
Marianna Apidianaki
Mrinmaya Sachan
Chris Callison-Burch
51
23
0
21 Feb 2024
Soft Self-Consistency Improves Language Model Agents
Han Wang
Archiki Prasad
Elias Stengel-Eskin
Mohit Bansal
LLMAG
24
8
0
20 Feb 2024
Enabling Weak LLMs to Judge Response Reliability via Meta Ranking
Zijun Liu
Boqun Kou
Peng Li
Ming Yan
Ji Zhang
Fei Huang
Yang Liu
32
2
0
19 Feb 2024
Overconfident and Unconfident AI Hinder Human-AI Collaboration
Jingshu Li
Yitian Yang
Renwen Zhang
Yi-Chieh Lee
40
1
0
12 Feb 2024
The Tyranny of Possibilities in the Design of Task-Oriented LLM Systems: A Scoping Survey
Dhruv Dhamani
Mary Lou Maher
40
1
0
29 Dec 2023
Reducing LLM Hallucinations using Epistemic Neural Networks
Shreyas Verma
Kien Tran
Yusuf Ali
Guangyu Min
38
8
0
25 Dec 2023
Robust Knowledge Extraction from Large Language Models using Social Choice Theory
Nico Potyka
Yuqicheng Zhu
Yunjie He
Evgeny Kharlamov
Steffen Staab
32
3
0
22 Dec 2023
Methods to Estimate Large Language Model Confidence
Maia Kotelanski
Robert Gallo
Ashwin Nayak
Thomas Savage
LM&MA
29
6
0
28 Nov 2023
A Survey on Multimodal Large Language Models for Autonomous Driving
Can Cui
Yunsheng Ma
Xu Cao
Wenqian Ye
Yang Zhou
...
Xinrui Yan
Shuqi Mei
Jianguo Cao
Ziran Wang
Chao Zheng
43
255
0
21 Nov 2023
Examining LLMs' Uncertainty Expression Towards Questions Outside Parametric Knowledge
Genglin Liu
Xingyao Wang
Lifan Yuan
Yangyi Chen
Hao Peng
29
16
0
16 Nov 2023
Towards A Unified View of Answer Calibration for Multi-Step Reasoning
Shumin Deng
Ningyu Zhang
Nay Oo
Bryan Hooi
LRM
48
2
0
15 Nov 2023
Llamas Know What GPTs Don't Show: Surrogate Models for Confidence Estimation
Vaishnavi Shrivastava
Percy Liang
Ananya Kumar
28
28
0
15 Nov 2023
Quantifying Uncertainty in Natural Language Explanations of Large Language Models
Sree Harsha Tanneru
Chirag Agarwal
Himabindu Lakkaraju
LRM
27
14
0
06 Nov 2023
RoleLLM: Benchmarking, Eliciting, and Enhancing Role-Playing Abilities of Large Language Models
Zekun Wang
Zhongyuan Peng
Haoran Que
Jiaheng Liu
Wangchunshu Zhou
...
Wanli Ouyang
Ke Xu
Wenhu Chen
Jie Fu
Junran Peng
LLMAG
47
85
0
01 Oct 2023
Siren's Song in the AI Ocean: A Survey on Hallucination in Large Language Models
Yue Zhang
Yafu Li
Leyang Cui
Deng Cai
Lemao Liu
...
Longyue Wang
A. Luu
Wei Bi
Freda Shi
Shuming Shi
RALM
LRM
HILM
48
522
0
03 Sep 2023
Great Models Think Alike: Improving Model Reliability via Inter-Model Latent Agreement
Ailin Deng
Miao Xiong
Bryan Hooi
44
6
0
02 May 2023
Out-of-Distribution Detection and Selective Generation for Conditional Language Models
Jie Jessie Ren
Jiaming Luo
Yao-Min Zhao
Kundan Krishna
Mohammad Saleh
Balaji Lakshminarayanan
Peter J. Liu
OODD
75
96
0
30 Sep 2022
Re-Examining Calibration: The Case of Question Answering
Chenglei Si
Chen Zhao
Sewon Min
Jordan L. Boyd-Graber
67
30
0
25 May 2022
Large Language Models are Zero-Shot Reasoners
Takeshi Kojima
S. Gu
Machel Reid
Yutaka Matsuo
Yusuke Iwasawa
ReLM
LRM
328
4,077
0
24 May 2022
Self-Consistency Improves Chain of Thought Reasoning in Language Models
Xuezhi Wang
Jason W. Wei
Dale Schuurmans
Quoc Le
Ed H. Chi
Sharan Narang
Aakanksha Chowdhery
Denny Zhou
ReLM
BDL
LRM
AI4CE
326
3,273
0
21 Mar 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro
LRM
AI4CE
ReLM
413
8,559
0
28 Jan 2022
Did Aristotle Use a Laptop? A Question Answering Benchmark with Implicit Reasoning Strategies
Mor Geva
Daniel Khashabi
Elad Segal
Tushar Khot
Dan Roth
Jonathan Berant
RALM
259
678
0
06 Jan 2021
Reducing conversational agents' overconfidence through linguistic calibration
Sabrina J. Mielke
Arthur Szlam
Emily Dinan
Y-Lan Boureau
209
154
0
30 Dec 2020
Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles
Balaji Lakshminarayanan
Alexander Pritzel
Charles Blundell
UQCV
BDL
276
5,675
0
05 Dec 2016
Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning
Y. Gal
Zoubin Ghahramani
UQCV
BDL
285
9,145
0
06 Jun 2015
Previous
1
2