Just Ask for Calibration: Strategies for Eliciting Calibrated Confidence Scores from Language Models Fine-Tuned with Human Feedback
arXiv:2305.14975 (v2, latest)

24 May 2023
Katherine Tian, Eric Mitchell, Allan Zhou, Archit Sharma, Rafael Rafailov, Huaxiu Yao, Chelsea Finn, Christopher D. Manning
Links: arXiv abstract · PDF · HTML
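The paper's headline finding is that RLHF-tuned language models can produce surprisingly well-calibrated confidence estimates when simply prompted to verbalize them, often better calibrated than the models' raw token probabilities. As a rough illustration of what "just asking" looks like, here is a minimal sketch of a verbalized-confidence prompt and parser; the `generate` callable is an assumed stand-in for any prompt-in, text-out API, and the prompt wording paraphrases the verbalized-confidence style studied in the paper rather than quoting it.

```python
import re

def elicit_verbalized_confidence(generate, question: str) -> tuple[str, float]:
    """Ask a model for an answer plus a verbalized probability of correctness.

    `generate` is an assumed stand-in for any text-generation API
    (prompt string in, model text out), not an API from the paper.
    """
    prompt = (
        f"Question: {question}\n"
        "Provide your best guess and the probability that it is correct "
        "(0.0 to 1.0), in exactly this format:\n"
        "Guess: <your guess>\n"
        "Probability: <probability>"
    )
    reply = generate(prompt)
    # Parse the two fields back out of the model's reply.
    guess = re.search(r"Guess:\s*(.+)", reply)
    prob = re.search(r"Probability:\s*([01](?:\.\d+)?)", reply)
    return (
        guess.group(1).strip() if guess else reply.strip(),
        float(prob.group(1)) if prob else 0.5,  # uninformative fallback
    )
```

Calibration of the elicited probabilities can then be checked on a labeled evaluation set, for example with expected calibration error: bin predictions by stated confidence and compare each bin's average confidence to its empirical accuracy.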

Papers citing "Just Ask for Calibration: Strategies for Eliciting Calibrated Confidence Scores from Language Models Fine-Tuned with Human Feedback"

Showing 50 of 101 citing papers. Bracketed codes (e.g., [LRM], [HILM]) are ResearchTrend.AI community tags.
  • Interpretable Text-Guided Image Clustering via Iterative Search. Bingchen Zhao, Oisin Mac Aodha. 14 Jun 2025.
  • Probably Approximately Correct Labels. Emmanuel J. Candès, Andrew Ilyas, Tijana Zrnic. 12 Jun 2025.
  • Expert-in-the-Loop Systems with Cross-Domain and In-Domain Few-Shot Learning for Software Vulnerability Detection. David Farr, Kevin Talty, Alexandra Farr, John Stockdale, Iain Cruickshank, Jevin West. 11 Jun 2025.
  • Inv-Entropy: A Fully Probabilistic Framework for Uncertainty Quantification in Language Models. Haoyi Song, Ruihan Ji, Naichen Shi, Fan Lai, Raed Al Kontar. 11 Jun 2025.
  • Know What You Don't Know: Uncertainty Calibration of Process Reward Models. Young-Jin Park, Kristjan Greenewald, Kaveh Alim, Hao Wang, Navid Azizan. 11 Jun 2025. [LRM]
  • AbstentionBench: Reasoning LLMs Fail on Unanswerable Questions. Polina Kirichenko, Mark Ibrahim, Kamalika Chaudhuri, Samuel J. Bell. 10 Jun 2025. [LRM]
  • From Calibration to Collaboration: LLM Uncertainty Quantification Should Be More Human-Centered. Siddartha Devic, Tejas Srinivasan, Jesse Thomason, Willie Neiswanger, Vatsal Sharan. 09 Jun 2025.
  • Bridging External and Parametric Knowledge: Mitigating Hallucination of LLMs with Shared-Private Semantic Synergy in Dual-Stream Knowledge. Yi Sui, Chaozhuo Li, Chen Zhang, D. Song, Qiuchi Li. 06 Jun 2025.
  • Ignoring Directionality Leads to Compromised Graph Neural Network Explanations. Changsheng Sun, Xinke Li, Jin Song Dong. 05 Jun 2025. [AAML]
  • Prompt Candidates, then Distill: A Teacher-Student Framework for LLM-driven Data Annotation. Mingxuan Xia, Haobo Wang, Yixuan Li, Zewei Yu, Jindong Wang, Junbo Zhao, Runze Wu. 04 Jun 2025.
  • Delta-KNN: Improving Demonstration Selection in In-Context Learning for Alzheimer's Disease Detection. Chuyuan Li, Raymond Li, Thalia S. Field, Giuseppe Carenini. 04 Jun 2025.
  • SQLens: An End-to-End Framework for Error Detection and Correction in Text-to-SQL. Yue Gong, Chuan Lei, X. Qin, Kapil Vaidya, Balakrishnan Narayanaswamy, Tim Kraska. 04 Jun 2025.
  • High Accuracy, Less Talk (HALT): Reliable LLMs through Capability-Aligned Finetuning. Tim Franzmeyer, Archie Sravankumar, Lijuan Liu, Yuning Mao, Rui Hou, Sinong Wang, Jakob Foerster, Luke Zettlemoyer, Madian Khabsa. 04 Jun 2025. [KELM, ALM]
  • Shaking to Reveal: Perturbation-Based Detection of LLM Hallucinations. Jinyuan Luo, Zhen Fang, Yixuan Li, Seongheon Park, Ling Chen. 03 Jun 2025. [AAML, HILM]
  • Quantitative LLM Judges. Aishwarya Sahoo, Jeevana Kruthi Karnuthala, Tushar Parmanand Budhwani, Pranchal Agarwal, Sankaran Vaidyanathan, ..., Jennifer Healey, Nedim Lipka, Ryan Rossi, Uttaran Bhattacharya, Branislav Kveton. 03 Jun 2025. [ELM]
  • Reconsidering LLM Uncertainty Estimation Methods in the Wild. Yavuz Faruk Bakman, D. Yaldiz, Sungmin Kang, Tuo Zhang, Baturalp Buyukates, Salman Avestimehr, Sai Praneeth Karimireddy. 01 Jun 2025.
  • Improving the Calibration of Confidence Scores in Text Generation Using the Output Distribution's Characteristics. Lorenzo Jaime Yu Flores, Ori Ernst, Jackie Chi Kit Cheung. 31 May 2025.
  • CausalAbstain: Enhancing Multilingual LLMs with Causal Reasoning for Trustworthy Abstention. Yuxi Sun, Aoqi Zuo, Wei Gao, Jing Ma. 31 May 2025.
  • MetaFaith: Faithful Natural Language Uncertainty Expression in LLMs. Gabrielle Kaili-May Liu, Gal Yona, Avi Caciularu, Idan Szpektor, Tim G. J. Rudner, Arman Cohan. 30 May 2025.
  • Reinforcement Learning for Better Verbalized Confidence in Long-Form Generation. Caiqi Zhang, Xiaochen Zhu, Chengzu Li, Nigel Collier, Andreas Vlachos. 29 May 2025. [OffRL, HILM]
  • Revisiting Uncertainty Estimation and Calibration of Large Language Models. Linwei Tao, Yi-Fan Yeh, Minjing Dong, Tao Huang, Philip Torr, Chang Xu. 29 May 2025.
  • Read Your Own Mind: Reasoning Helps Surface Self-Confidence Signals in LLMs. Jakub Podolak, Rajeev Verma. 28 May 2025. [ReLM, LRM]
  • Maximizing Confidence Alone Improves Reasoning. Mihir Prabhudesai, Lili Chen, Alex Ippoliti, Katerina Fragkiadaki, Hao Liu, Deepak Pathak. 28 May 2025. [OOD, OffRL, ReLM, LRM]
  • Calibrating LLMs for Text-to-SQL Parsing by Leveraging Sub-clause Frequencies. Terrance Liu, Shuyi Wang, Daniel Preotiuc-Pietro, Yash Chandarana, Chirag Gupta. 27 May 2025.
  • VeriTrail: Closed-Domain Hallucination Detection with Traceability. Dasha Metropolitansky, Jonathan Larson. 27 May 2025. [HILM]
  • Do We Know What LLMs Don't Know? A Study of Consistency in Knowledge Probing. Raoyuan Zhao, Abdullatif Köksal, Ali Modarressi, Michael A. Hedderich, Hinrich Schütze. 27 May 2025.
  • PEDANTIC: A Dataset for the Automatic Examination of Definiteness in Patent Claims. Valentin Knappich, Annemarie Friedrich, Anna Hätty, Simon Razniewski. 27 May 2025.
  • Writing Like the Best: Exemplar-Based Expository Text Generation. Yuxiang Liu, Kevin Chen-Chuan Chang. 24 May 2025.
  • Too Consistent to Detect: A Study of Self-Consistent Errors in LLMs. Hexiang Tan, Fei Sun, Sha Liu, Du Su, Qi Cao, ..., Jingang Wang, Xunliang Cai, Yuanzhuo Wang, Huawei Shen, Xueqi Cheng. 23 May 2025. [HILM]
  • Explaining Sources of Uncertainty in Automated Fact-Checking. Jingyi Sun, Greta Warren, Irina Shklovski, Isabelle Augenstein. 23 May 2025.
  • The Effects of Data Augmentation on Confidence Estimation for LLMs. Rui Wang, Renyu Zhu, Minmin Lin, R. Wu, Tangjie Lv, Changjie Fan, Haobo Wang. 21 May 2025.
  • Conformal Language Model Reasoning with Coherent Factuality. Maxon Rubin-Toles, Maya Gambhir, Keshav Ramji, Aaron Roth, Surbhi Goel. 21 May 2025. [HILM, LRM]
  • Eliminating Hallucination-Induced Errors in LLM Code Generation with Functional Clustering. Chaitanya Ravuri, Saman Amarasinghe. 16 May 2025.
  • Restoring Calibration for Aligned Large Language Models: A Calibration-Aware Fine-Tuning Approach. Jiancong Xiao, Bojian Hou, Zhanliang Wang, Ruochen Jin, Q. Long, Weijie Su, Li Shen. 04 May 2025.
  • Learning from Reference Answers: Versatile Language Model Alignment without Binary Human Preference Data. Shuai Zhao, Linchao Zhu, Yi Yang. 14 Apr 2025.
  • Reasoning Models Know When They're Right: Probing Hidden States for Self-Verification. Anqi Zhang, Yulin Chen, Jane Pan, Chen Zhao, Aurojit Panda, Jinyang Li, He He. 07 Apr 2025. [ReLM, LRM]
  • Language Model Uncertainty Quantification with Attention Chain. Yinghao Li, Rushi Qiang, Lama Moukheiber, Chao Zhang. 24 Mar 2025. [LRM]
  • Uncertainty Quantification and Confidence Calibration in Large Language Models: A Survey. Xiaoou Liu, Tiejin Chen, Longchao Da, Chacha Chen, Zhen Lin, Hua Wei. 20 Mar 2025. [HILM]
  • Uncertainty Distillation: Teaching Language Models to Express Semantic Confidence. Sophia Hager, David Mueller, Kevin Duh, Nicholas Andrews. 18 Mar 2025.
  • Enough Coin Flips Can Make LLMs Act Bayesian. Ritwik Gupta, Rodolfo Corona, Jiaxin Ge, Eric Wang, Dan Klein, Trevor Darrell, David M. Chan. 06 Mar 2025. [BDL, LRM]
  • Enhancing LLM Reliability via Explicit Knowledge Boundary Modeling. Hang Zheng, Hongshen Xu, Yuncong Liu, Lu Chen, Pascale Fung, Kai Yu. 04 Mar 2025.
  • A Survey of Uncertainty Estimation Methods on Large Language Models. Zhiqiu Xia, Jinxuan Xu, Yuqian Zhang, Hang Liu. 28 Feb 2025.
  • Conformal Linguistic Calibration: Trading-off between Factuality and Specificity. Zhengping Jiang, Anqi Liu, Benjamin Van Durme. 26 Feb 2025.
  • Uncertainty Quantification in Retrieval Augmented Question Answering. Laura Perez-Beltrachini, Mirella Lapata. 25 Feb 2025. [RALM]
  • CoT-UQ: Improving Response-wise Uncertainty Quantification in LLMs with Chain-of-Thought. Boxuan Zhang, Ruqi Zhang. 24 Feb 2025. [LRM]
  • Text-to-SQL Domain Adaptation via Human-LLM Collaborative Data Annotation. Yuan Tian, Daniel Lee, Fei Wu, Tung Mai, Kun Qian, Siddhartha Sahai, Tianyi Zhang, Yunyao Li. 21 Feb 2025. [SyDa]
  • CER: Confidence Enhanced Reasoning in LLMs. Ali Razghandi, Seyed Mohammad Hadi Hosseini, Mahdieh Soleymani Baghshah. 20 Feb 2025. [LRM]
  • Towards Fully Exploiting LLM Internal States to Enhance Knowledge Boundary Perception. Shiyu Ni, Keping Bi, Jiafeng Guo, Lulu Yu, Baolong Bi, Xueqi Cheng. 17 Feb 2025.
  • Mind the Confidence Gap: Overconfidence, Calibration, and Distractor Effects in Large Language Models. Prateek Chhikara. 16 Feb 2025.
  • An Empirical Analysis of Uncertainty in Large Language Model Evaluations. Qiujie Xie, Qingqiu Li, Zhuohao Yu, Yuejie Zhang, Yue Zhang, Linyi Yang. 15 Feb 2025. [ELM]