arXiv:2305.14975
Just Ask for Calibration: Strategies for Eliciting Calibrated Confidence Scores from Language Models Fine-Tuned with Human Feedback
Katherine Tian, E. Mitchell, Allan Zhou, Archit Sharma, Rafael Rafailov, Huaxiu Yao, Chelsea Finn, Christopher D. Manning. 24 May 2023.
Papers citing "Just Ask for Calibration: Strategies for Eliciting Calibrated Confidence Scores from Language Models Fine-Tuned with Human Feedback" (50 of 101 papers shown):
- Interpretable Text-Guided Image Clustering via Iterative Search. Bingchen Zhao, Oisin Mac Aodha. 14 Jun 2025.
- Probably Approximately Correct Labels. Emmanuel J. Candès, Andrew Ilyas, Tijana Zrnic. 12 Jun 2025.
- Expert-in-the-Loop Systems with Cross-Domain and In-Domain Few-Shot Learning for Software Vulnerability Detection. David Farr, Kevin Talty, Alexandra Farr, John Stockdale, Iain Cruickshank, Jevin West. 11 Jun 2025.
- Inv-Entropy: A Fully Probabilistic Framework for Uncertainty Quantification in Language Models. Haoyi Song, Ruihan Ji, Naichen Shi, Fan Lai, Raed Al Kontar. 11 Jun 2025.
- Know What You Don't Know: Uncertainty Calibration of Process Reward Models. Young-Jin Park, Kristjan Greenewald, Kaveh Alim, Hao Wang, Navid Azizan. 11 Jun 2025. [LRM]
- AbstentionBench: Reasoning LLMs Fail on Unanswerable Questions. Polina Kirichenko, Mark Ibrahim, Kamalika Chaudhuri, Samuel J. Bell. 10 Jun 2025. [LRM]
- From Calibration to Collaboration: LLM Uncertainty Quantification Should Be More Human-Centered. Siddartha Devic, Tejas Srinivasan, Jesse Thomason, Willie Neiswanger, Vatsal Sharan. 09 Jun 2025.
- Bridging External and Parametric Knowledge: Mitigating Hallucination of LLMs with Shared-Private Semantic Synergy in Dual-Stream Knowledge. Yi Sui, Chaozhuo Li, Chen Zhang, D. Song, Qiuchi Li. 06 Jun 2025.
- Ignoring Directionality Leads to Compromised Graph Neural Network Explanations. Changsheng Sun, Xinke Li, Jin Song Dong. 05 Jun 2025. [AAML]
- Prompt Candidates, then Distill: A Teacher-Student Framework for LLM-driven Data Annotation. Mingxuan Xia, Haobo Wang, Yixuan Li, Zewei Yu, Jindong Wang, Junbo Zhao, Runze Wu. 04 Jun 2025.
- Delta-KNN: Improving Demonstration Selection in In-Context Learning for Alzheimer's Disease Detection. Chuyuan Li, Raymond Li, Thalia S. Field, Giuseppe Carenini. 04 Jun 2025.
- SQLens: An End-to-End Framework for Error Detection and Correction in Text-to-SQL. Yue Gong, Chuan Lei, X. Qin, Kapil Vaidya, Balakrishnan Narayanaswamy, Tim Kraska. 04 Jun 2025.
- High Accuracy, Less Talk (HALT): Reliable LLMs through Capability-Aligned Finetuning. Tim Franzmeyer, Archie Sravankumar, Lijuan Liu, Yuning Mao, Rui Hou, Sinong Wang, Jakob Foerster, Luke Zettlemoyer, Madian Khabsa. 04 Jun 2025. [KELM, ALM]
- Shaking to Reveal: Perturbation-Based Detection of LLM Hallucinations. Jinyuan Luo, Zhen Fang, Yixuan Li, Seongheon Park, Ling Chen. 03 Jun 2025. [AAML, HILM]
- Quantitative LLM Judges. Aishwarya Sahoo, Jeevana Kruthi Karnuthala, Tushar Parmanand Budhwani, Pranchal Agarwal, Sankaran Vaidyanathan, ..., Jennifer Healey, Nedim Lipka, Ryan Rossi, Uttaran Bhattacharya, Branislav Kveton. 03 Jun 2025. [ELM]
- Reconsidering LLM Uncertainty Estimation Methods in the Wild. Yavuz Faruk Bakman, D. Yaldiz, Sungmin Kang, Tuo Zhang, Baturalp Buyukates, Salman Avestimehr, Sai Praneeth Karimireddy. 01 Jun 2025.
- Improving the Calibration of Confidence Scores in Text Generation Using the Output Distribution's Characteristics. Lorenzo Jaime Yu Flores, Ori Ernst, Jackie Chi Kit Cheung. 31 May 2025.
- CausalAbstain: Enhancing Multilingual LLMs with Causal Reasoning for Trustworthy Abstention. Yuxi Sun, Aoqi Zuo, Wei Gao, Jing Ma. 31 May 2025.
- MetaFaith: Faithful Natural Language Uncertainty Expression in LLMs. Gabrielle Kaili-May Liu, Gal Yona, Avi Caciularu, Idan Szpektor, Tim G. J. Rudner, Arman Cohan. 30 May 2025.
- Reinforcement Learning for Better Verbalized Confidence in Long-Form Generation. Caiqi Zhang, Xiaochen Zhu, Chengzu Li, Nigel Collier, Andreas Vlachos. 29 May 2025. [OffRL, HILM]
- Revisiting Uncertainty Estimation and Calibration of Large Language Models. Linwei Tao, Yi-Fan Yeh, Minjing Dong, Tao Huang, Philip Torr, Chang Xu. 29 May 2025.
- Read Your Own Mind: Reasoning Helps Surface Self-Confidence Signals in LLMs. Jakub Podolak, Rajeev Verma. 28 May 2025. [ReLM, LRM]
- Maximizing Confidence Alone Improves Reasoning. Mihir Prabhudesai, Lili Chen, Alex Ippoliti, Katerina Fragkiadaki, Hao Liu, Deepak Pathak. 28 May 2025. [OOD, OffRL, ReLM, LRM]
- Calibrating LLMs for Text-to-SQL Parsing by Leveraging Sub-clause Frequencies. Terrance Liu, Shuyi Wang, Daniel Preotiuc-Pietro, Yash Chandarana, Chirag Gupta. 27 May 2025.
- VeriTrail: Closed-Domain Hallucination Detection with Traceability. Dasha Metropolitansky, Jonathan Larson. 27 May 2025. [HILM]
- Do We Know What LLMs Don't Know? A Study of Consistency in Knowledge Probing. Raoyuan Zhao, Abdullatif Köksal, Ali Modarressi, Michael A. Hedderich, Hinrich Schutze. 27 May 2025.
- PEDANTIC: A Dataset for the Automatic Examination of Definiteness in Patent Claims. Valentin Knappich, Annemarie Friedrich, Anna Hätty, Simon Razniewski. 27 May 2025.
- Writing Like the Best: Exemplar-Based Expository Text Generation. Yuxiang Liu, Kevin Chen-Chuan Chang. 24 May 2025.
- Too Consistent to Detect: A Study of Self-Consistent Errors in LLMs. Hexiang Tan, Fei Sun, Sha Liu, Du Su, Qi Cao, ..., Jingang Wang, Xunliang Cai, Yuanzhuo Wang, Huawei Shen, Xueqi Cheng. 23 May 2025. [HILM]
- Explaining Sources of Uncertainty in Automated Fact-Checking. Jingyi Sun, Greta Warren, Irina Shklovski, Isabelle Augenstein. 23 May 2025.
- The Effects of Data Augmentation on Confidence Estimation for LLMs. Rui Wang, Renyu Zhu, Minmin Lin, R. Wu, Tangjie Lv, Changjie Fan, Haobo Wang. 21 May 2025.
- Conformal Language Model Reasoning with Coherent Factuality. Maxon Rubin-Toles, Maya Gambhir, Keshav Ramji, Aaron Roth, Surbhi Goel. 21 May 2025. [HILM, LRM]
- Eliminating Hallucination-Induced Errors in LLM Code Generation with Functional Clustering. Chaitanya Ravuri, Saman Amarasinghe. 16 May 2025.
- Restoring Calibration for Aligned Large Language Models: A Calibration-Aware Fine-Tuning Approach. Jiancong Xiao, Bojian Hou, Zhanliang Wang, Ruochen Jin, Q. Long, Weijie Su, Li Shen. 04 May 2025.
- Learning from Reference Answers: Versatile Language Model Alignment without Binary Human Preference Data. Shuai Zhao, Linchao Zhu, Yi Yang. 14 Apr 2025.
- Reasoning Models Know When They're Right: Probing Hidden States for Self-Verification. Anqi Zhang, Yulin Chen, Jane Pan, Chen Zhao, Aurojit Panda, Jinyang Li, He He. 07 Apr 2025. [ReLM, LRM]
- Language Model Uncertainty Quantification with Attention Chain. Yinghao Li, Rushi Qiang, Lama Moukheiber, Chao Zhang. 24 Mar 2025. [LRM]
- Uncertainty Quantification and Confidence Calibration in Large Language Models: A Survey. Xiaoou Liu, Tiejin Chen, Longchao Da, Chacha Chen, Zhen Lin, Hua Wei. 20 Mar 2025. [HILM]
- Uncertainty Distillation: Teaching Language Models to Express Semantic Confidence. Sophia Hager, David Mueller, Kevin Duh, Nicholas Andrews. 18 Mar 2025.
- Enough Coin Flips Can Make LLMs Act Bayesian. Ritwik Gupta, Rodolfo Corona, Jiaxin Ge, Eric Wang, Dan Klein, Trevor Darrell, David M. Chan. 06 Mar 2025. [BDL, LRM]
- Enhancing LLM Reliability via Explicit Knowledge Boundary Modeling. Hang Zheng, Hongshen Xu, Yuncong Liu, Lu Chen, Pascale Fung, Kai Yu. 04 Mar 2025.
- A Survey of Uncertainty Estimation Methods on Large Language Models. Zhiqiu Xia, Jinxuan Xu, Yuqian Zhang, Hang Liu. 28 Feb 2025.
- Conformal Linguistic Calibration: Trading-off between Factuality and Specificity. Zhengping Jiang, Anqi Liu, Benjamin Van Durme. 26 Feb 2025.
- Uncertainty Quantification in Retrieval Augmented Question Answering. Laura Perez-Beltrachini, Mirella Lapata. 25 Feb 2025. [RALM]
- CoT-UQ: Improving Response-wise Uncertainty Quantification in LLMs with Chain-of-Thought. Boxuan Zhang, Ruqi Zhang. 24 Feb 2025. [LRM]
- Text-to-SQL Domain Adaptation via Human-LLM Collaborative Data Annotation. Yuan Tian, Daniel Lee, Fei Wu, Tung Mai, Kun Qian, Siddhartha Sahai, Tianyi Zhang, Yunyao Li. 21 Feb 2025. [SyDa]
- CER: Confidence Enhanced Reasoning in LLMs. Ali Razghandi, Seyed Mohammad Hadi Hosseini, Mahdieh Soleymani Baghshah. 20 Feb 2025. [LRM]
- Towards Fully Exploiting LLM Internal States to Enhance Knowledge Boundary Perception. Shiyu Ni, Keping Bi, Jiafeng Guo, Lulu Yu, Baolong Bi, Xueqi Cheng. 17 Feb 2025.
- Mind the Confidence Gap: Overconfidence, Calibration, and Distractor Effects in Large Language Models. Prateek Chhikara. 16 Feb 2025.
- An Empirical Analysis of Uncertainty in Large Language Model Evaluations. Qiujie Xie, Qingqiu Li, Zhuohao Yu, Yuejie Zhang, Yue Zhang, Linyi Yang. 15 Feb 2025. [ELM]