Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2411.02454
Cited By
Graph-based Confidence Calibration for Large Language Models
3 November 2024
Yukun Li
Sijia Wang
Lifu Huang
Li-Ping Liu
UQCV
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Graph-based Confidence Calibration for Large Language Models"
37 / 37 papers shown
Title
Calibrating Uncertainty Quantification of Multi-Modal LLMs using Grounding
Trilok Padhi
R. Kaur
Adam D. Cobb
Manoj Acharya
Anirban Roy
Colin Samplawski
Brian Matejek
Alexander M. Berenbeim
Nathaniel D. Bastian
Susmit Jha
57
0
0
30 Apr 2025
Reference-Guided Verdict: LLMs-as-Judges in Automatic Evaluation of Free-Form Text
Sher Badshah
Hassan Sajjad
ELM
60
13
0
17 Aug 2024
Uncertainty Estimation and Quantification for LLMs: A Simple Supervised Approach
Linyu Liu
Yu Pan
Xiaocheng Li
Guanting Chen
71
35
0
24 Apr 2024
Uncertainty in Language Models: Assessment through Rank-Calibration
Xinmeng Huang
Shuo Li
Mengxin Yu
Matteo Sesia
Hamed Hassani
Insup Lee
Osbert Bastani
Yan Sun
57
19
0
04 Apr 2024
LUQ: Long-text Uncertainty Quantification for LLMs
Caiqi Zhang
Fangyu Liu
Marco Basaldella
Nigel Collier
HILM
67
38
0
29 Mar 2024
Calibrating Large Language Models Using Their Generations Only
Dennis Ulmer
Martin Gubri
Hwaran Lee
Sangdoo Yun
Seong Joon Oh
UQLM
474
28
1
09 Mar 2024
MARS: Meaning-Aware Response Scoring for Uncertainty Estimation in Generative LLMs
Yavuz Faruk Bakman
D. Yaldiz
Baturalp Buyukates
Chenyang Tao
Dimitrios Dimitriadis
A. Avestimehr
59
23
0
19 Feb 2024
Reconfidencing LLMs from the Grouping Loss Perspective
Lihu Chen
Alexandre Perez-Lebel
Fabian M. Suchanek
Gaël Varoquaux
261
12
0
07 Feb 2024
Benchmarking LLMs via Uncertainty Quantification
Fanghua Ye
Mingming Yang
Jianhui Pang
Longyue Wang
Derek F. Wong
Emine Yilmaz
Shuming Shi
Zhaopeng Tu
ELM
209
55
0
23 Jan 2024
Can AI Write Classical Chinese Poetry like Humans? An Empirical Study Inspired by Turing Test
Zekun Deng
Haoxia Yang
Jun Wang
56
2
0
10 Jan 2024
Methods to Estimate Large Language Model Confidence
Maia Kotelanski
Robert Gallo
Ashwin Nayak
Thomas Savage
LM&MA
49
6
0
28 Nov 2023
LM-Polygraph: Uncertainty Estimation for Language Models
Ekaterina Fadeeva
Roman Vashurin
Akim Tsvigun
Artem Vazhentsev
Sergey Petrakov
...
Elizaveta Goncharova
Alexander Panchenko
Maxim Panov
Timothy Baldwin
Artem Shelmanov
48
67
0
13 Nov 2023
Detecting Pretraining Data from Large Language Models
Weijia Shi
Anirudh Ajith
Mengzhou Xia
Yangsibo Huang
Daogao Liu
Terra Blevins
Danqi Chen
Luke Zettlemoyer
MIALM
71
188
0
25 Oct 2023
A Confederacy of Models: a Comprehensive Evaluation of LLMs on Creative Writing
Carlos Gómez-Rodríguez
Paul Williams
55
82
0
12 Oct 2023
Quantifying Uncertainty in Answers from any Language Model and Enhancing their Trustworthiness
Jiuhai Chen
Jonas W. Mueller
89
69
0
30 Aug 2023
Can LLMs Express Their Uncertainty? An Empirical Evaluation of Confidence Elicitation in LLMs
Miao Xiong
Zhiyuan Hu
Xinyang Lu
Yifei Li
Jie Fu
Junxian He
Bryan Hooi
195
433
0
22 Jun 2023
Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena
Lianmin Zheng
Wei-Lin Chiang
Ying Sheng
Siyuan Zhuang
Zhanghao Wu
...
Dacheng Li
Eric Xing
Haotong Zhang
Joseph E. Gonzalez
Ion Stoica
ALM
OSLM
ELM
335
4,298
0
09 Jun 2023
How Ready are Pre-trained Abstractive Models and LLMs for Legal Case Judgement Summarization?
Aniket Deroy
Kripabandhu Ghosh
Saptarshi Ghosh
ELM
AILaw
44
60
0
02 Jun 2023
Factually Consistent Summarization via Reinforcement Learning with Textual Entailment Feedback
Paul Roit
Johan Ferret
Lior Shani
Roee Aharoni
Geoffrey Cideron
...
Olivier Bachem
G. Elidan
Avinatan Hassidim
Olivier Pietquin
Idan Szpektor
HILM
73
85
0
31 May 2023
Generating with Confidence: Uncertainty Quantification for Black-box Large Language Models
Zhen Lin
Shubhendu Trivedi
Jimeng Sun
HILM
176
147
0
30 May 2023
Just Ask for Calibration: Strategies for Eliciting Calibrated Confidence Scores from Language Models Fine-Tuned with Human Feedback
Katherine Tian
E. Mitchell
Allan Zhou
Archit Sharma
Rafael Rafailov
Huaxiu Yao
Chelsea Finn
Christopher D. Manning
105
346
0
24 May 2023
WebCPM: Interactive Web Search for Chinese Long-form Question Answering
Yujia Qin
Zihan Cai
Di Jin
Lan Yan
Shi Liang
...
Ruobing Xie
Fanchao Qi
Zhiyuan Liu
Maosong Sun
Jie Zhou
RALM
59
92
0
11 May 2023
G-Eval: NLG Evaluation using GPT-4 with Better Human Alignment
Yang Liu
Dan Iter
Yichong Xu
Shuohang Wang
Ruochen Xu
Chenguang Zhu
ELM
ALM
LM&MA
163
1,187
0
29 Mar 2023
SelfCheckGPT: Zero-Resource Black-Box Hallucination Detection for Generative Large Language Models
Potsawee Manakul
Adian Liusie
Mark Gales
HILM
LRM
189
427
0
15 Mar 2023
A Survey on Uncertainty Quantification Methods for Deep Learning
Wenchong He
Zhe Jiang
Tingsong Xiao
Zelin Xu
Yukun Li
BDL
UQCV
AI4CE
80
23
0
26 Feb 2023
Semantic Uncertainty: Linguistic Invariances for Uncertainty Estimation in Natural Language Generation
Lorenz Kuhn
Y. Gal
Sebastian Farquhar
UQLM
187
295
0
19 Feb 2023
Evaluating the Factual Consistency of Large Language Models Through News Summarization
Derek Tam
Anisha Mascarenhas
Shiyue Zhang
Sarah Kwan
Joey Tianyi Zhou
Colin Raffel
HILM
56
105
0
15 Nov 2022
Exploring Predictive Uncertainty and Calibration in NLP: A Study on the Impact of Method & Data Scarcity
Dennis Ulmer
J. Frellsen
Christian Hardmeier
218
23
0
20 Oct 2022
Out-of-Distribution Detection and Selective Generation for Conditional Language Models
Jie Jessie Ren
Jiaming Luo
Yao-Min Zhao
Kundan Krishna
Mohammad Saleh
Balaji Lakshminarayanan
Peter J. Liu
OODD
108
109
0
30 Sep 2022
Teaching Models to Express Their Uncertainty in Words
Stephanie C. Lin
Jacob Hilton
Owain Evans
OOD
70
414
0
28 May 2022
TruthfulQA: Measuring How Models Mimic Human Falsehoods
Stephanie C. Lin
Jacob Hilton
Owain Evans
HILM
140
1,897
0
08 Sep 2021
Finetuned Language Models Are Zero-Shot Learners
Jason W. Wei
Maarten Bosma
Vincent Zhao
Kelvin Guu
Adams Wei Yu
Brian Lester
Nan Du
Andrew M. Dai
Quoc V. Le
ALM
UQCV
184
3,743
0
03 Sep 2021
Reducing conversational agents' overconfidence through linguistic calibration
Sabrina J. Mielke
Arthur Szlam
Emily Dinan
Y-Lan Boureau
245
168
0
30 Dec 2020
DeBERTa: Decoding-enhanced BERT with Disentangled Attention
Pengcheng He
Xiaodong Liu
Jianfeng Gao
Weizhu Chen
AAML
138
2,731
0
05 Jun 2020
Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks
Nils Reimers
Iryna Gurevych
1.3K
12,193
0
27 Aug 2019
CoQA: A Conversational Question Answering Challenge
Siva Reddy
Danqi Chen
Christopher D. Manning
RALM
HAI
98
1,202
0
21 Aug 2018
TriviaQA: A Large Scale Distantly Supervised Challenge Dataset for Reading Comprehension
Mandar Joshi
Eunsol Choi
Daniel S. Weld
Luke Zettlemoyer
RALM
204
2,646
0
09 May 2017
1