2207.05221
Cited By
Language Models (Mostly) Know What They Know
11 July 2022
Saurav Kadavath
Tom Conerly
Amanda Askell
T. Henighan
Dawn Drain
Ethan Perez
Nicholas Schiefer
Zac Hatfield-Dodds
Nova Dassarma
Eli Tran-Johnson
Scott Johnston
S. E. Showk
Andy Jones
Nelson Elhage
Tristan Hume
Anna Chen
Yuntao Bai
Sam Bowman
Stanislav Fort
Deep Ganguli
Danny Hernandez
Josh Jacobson
John Kernion
Shauna Kravec
Liane Lovitt
Kamal Ndousse
Catherine Olsson
Sam Ringer
Dario Amodei
Tom B. Brown
Jack Clark
Nicholas Joseph
Benjamin Mann
Sam McCandlish
C. Olah
Jared Kaplan
ELM
Papers citing
"Language Models (Mostly) Know What They Know"
50 / 135 papers shown
Title
Why Uncertainty Estimation Methods Fall Short in RAG: An Axiomatic Analysis
Heydar Soudani
Evangelos Kanoulas
Faegheh Hasibi
31
0
0
12 May 2025
Uncertainty Profiles for LLMs: Uncertainty Source Decomposition and Adaptive Model-Metric Selection
Pei-Fu Guo
Yun-Da Tsai
Shou-De Lin
UD
46
0
0
12 May 2025
Restoring Calibration for Aligned Large Language Models: A Calibration-Aware Fine-Tuning Approach
Jiancong Xiao
Bojian Hou
Zhanliang Wang
Ruochen Jin
Q. Long
Weijie Su
Li Shen
30
0
0
04 May 2025
Calibrating Uncertainty Quantification of Multi-Modal LLMs using Grounding
Trilok Padhi
R. Kaur
Adam D. Cobb
Manoj Acharya
Anirban Roy
Colin Samplawski
Brian Matejek
Alexander M. Berenbeim
Nathaniel D. Bastian
Susmit Jha
28
0
0
30 Apr 2025
Bi-directional Model Cascading with Proxy Confidence
David Warren
Mark Dras
44
0
0
27 Apr 2025
Random-Set Large Language Models
Muhammad Mubashar
Shireen Kudukkil Manchingal
Fabio Cuzzolin
66
0
0
25 Apr 2025
Comparing Uncertainty Measurement and Mitigation Methods for Large Language Models: A Systematic Review
Toghrul Abbasli
Kentaroh Toyoda
Yuan Wang
Leon Witt
Muhammad Asif Ali
Yukai Miao
Dan Li
Qingsong Wei
UQCV
92
0
0
25 Apr 2025
Hallucination Detection in LLMs via Topological Divergence on Attention Graphs
Alexandra Bazarova
Aleksandr Yugay
Andrey Shulga
A. Ermilova
Andrei Volodichev
...
Dmitry Simakov
M. Savchenko
Andrey Savchenko
Serguei Barannikov
Alexey Zaytsev
HILM
30
0
0
14 Apr 2025
CCSK: Cognitive Convection of Self-Knowledge Based Retrieval Augmentation for Large Language Models
Jianling Lu
Mingqi Lv
Tieming Chen
RALM
45
0
0
07 Apr 2025
A Perplexity and Menger Curvature-Based Approach for Similarity Evaluation of Large Language Models
Yuantao Zhang
Zhankui Yang
AAML
35
0
0
05 Apr 2025
Calibrating Verbal Uncertainty as a Linear Feature to Reduce Hallucinations
Ziwei Ji
L. Yu
Yeskendir Koishekenov
Yejin Bang
Anthony Hartshorn
Alan Schelten
Cheng Zhang
Pascale Fung
Nicola Cancedda
49
1
0
18 Mar 2025
Don't lie to your friends: Learning what you know from collaborative self-play
Jacob Eisenstein
Reza Aghajani
Adam Fisch
Dheeru Dua
Fantine Huot
Mirella Lapata
Vicky Zayats
Jonathan Berant
70
0
0
18 Mar 2025
Learning on LLM Output Signatures for gray-box LLM Behavior Analysis
Guy Bar-Shalom
Fabrizio Frasca
Derek Lim
Yoav Gelberg
Yftah Ziser
Ran El-Yaniv
Gal Chechik
Haggai Maron
64
0
0
18 Mar 2025
Enhancing LLM Reliability via Explicit Knowledge Boundary Modeling
Hang Zheng
Hongshen Xu
Yuncong Liu
Lu Chen
Pascale Fung
Kai Yu
104
2
0
04 Mar 2025
How Well do LLMs Compress Their Own Chain-of-Thought? A Token Complexity Approach
Ayeong Lee
Ethan Che
Tianyi Peng
LRM
42
10
0
03 Mar 2025
Towards Efficient Educational Chatbots: Benchmarking RAG Frameworks
Umar Ali Khan
Ekram Khan
Fiza Khan
A. A. Moinuddin
48
0
0
02 Mar 2025
Semantic Volume: Quantifying and Detecting both External and Internal Uncertainty in LLMs
Xiaomin Li
Zhou Yu
Ziji Zhang
Yingying Zhuang
S.
Narayanan Sadagopan
Anurag Beniwal
HILM
58
0
0
28 Feb 2025
END: Early Noise Dropping for Efficient and Effective Context Denoising
Hongye Jin
Pei Chen
Jingfeng Yang
Z. Wang
Meng-Long Jiang
...
X. Zhang
Zheng Li
Tianyi Liu
Huasheng Li
Bing Yin
125
0
0
26 Feb 2025
Monte Carlo Temperature: a robust sampling strategy for LLM's uncertainty quantification methods
Nicola Cecere
Andrea Bacciu
Ignacio Fernández Tobías
Amin Mantrach
66
1
0
25 Feb 2025
Adaptive Retrieval Without Self-Knowledge? Bringing Uncertainty Back Home
Viktor Moskvoretskii
M. Lysyuk
Mikhail Salnikov
Nikolay Ivanov
Sergey Pletenev
Daria Galimzianova
Nikita Krayko
Vasily Konovalov
Irina Nikishina
Alexander Panchenko
RALM
74
4
0
24 Feb 2025
Large Language Model Confidence Estimation via Black-Box Access
Tejaswini Pedapati
Amit Dhurandhar
Soumya Ghosh
Soham Dan
P. Sattigeri
89
3
0
21 Feb 2025
A Survey on Feedback-based Multi-step Reasoning for Large Language Models on Mathematics
Ting-Ruen Wei
Haowei Liu
Xuyang Wu
Yi Fang
LRM
AI4CE
ReLM
KELM
185
1
0
21 Feb 2025
Hallucination Detection in Large Language Models with Metamorphic Relations
Borui Yang
Md Afif Al Mamun
Jie M. Zhang
Gias Uddin
HILM
64
0
0
20 Feb 2025
SMART: Self-Aware Agent for Tool Overuse Mitigation
Cheng Qian
Emre Can Acikgoz
H. Wang
X. Chen
Avirup Sil
Dilek Hakkani-Tür
Gökhan Tür
Heng Ji
LLMAG
KELM
LRM
69
4
0
17 Feb 2025
Can Your Uncertainty Scores Detect Hallucinated Entity?
Min-Hsuan Yeh
Max Kamachee
Seongheon Park
Yixuan Li
HILM
49
1
0
17 Feb 2025
Uncertainty-Aware Step-wise Verification with Generative Reward Models
Zihuiwen Ye
L. Melo
Younesse Kaddar
Phil Blunsom
S. Kamath S
Yarin Gal
LRM
46
0
0
16 Feb 2025
Has My System Prompt Been Used? Large Language Model Prompt Membership Inference
Roman Levin
Valeriia Cherepanova
Abhimanyu Hans
Avi Schwarzschild
Tom Goldstein
140
1
0
14 Feb 2025
Cost-Saving LLM Cascades with Early Abstention
Michael J. Zellinger
Rex Liu
Matt Thomson
102
0
0
13 Feb 2025
Can ChatGPT Diagnose Alzheimer's Disease?
Quoc Toan Nguyen
Linh Le
Xuan-The Tran
T. Do
Chin-Teng Lin
LM&MA
216
0
0
10 Feb 2025
Enhancing Hallucination Detection through Noise Injection
Litian Liu
Reza Pourreza
Sunny Panchal
Apratim Bhattacharyya
Yao Qin
Roland Memisevic
HILM
73
2
0
06 Feb 2025
IAO Prompting: Making Knowledge Flow Explicit in LLMs through Structured Reasoning Templates
Aissatou Diallo
Antonis Bikakis
Luke Dickens
Anthony Hunter
Rob Miller
LRM
34
0
0
05 Feb 2025
A statistically consistent measure of Semantic Variability using Language Models
Yi Liu
71
0
0
01 Feb 2025
Unleashing the Power of Data Tsunami: A Comprehensive Survey on Data Assessment and Selection for Instruction Tuning of Language Models
Yulei Qin
Yuncheng Yang
Pengcheng Guo
Gang Li
Hang Shao
Yuchen Shi
Zihan Xu
Yun Gu
Ke Li
Xing Sun
ALM
90
12
0
31 Dec 2024
Unlocking Historical Clinical Trial Data with ALIGN: A Compositional Large Language Model System for Medical Coding
Nabeel Seedat
Caterina Tozzi
Andrea Hita Ardiaca
M. Schaar
James Weatherall
Adam Taylor
162
0
0
20 Nov 2024
Prompt-Guided Internal States for Hallucination Detection of Large Language Models
Fujie Zhang
Peiqi Yu
Biao Yi
Baolei Zhang
Tong Li
Zheli Liu
HILM
LRM
57
0
0
07 Nov 2024
Dynamic Strategy Planning for Efficient Question Answering with Large Language Models
Tanmay Parekh
Pradyot Prakash
Alexander Radovic
Akshay Shekher
Denis Savenkov
LRM
51
1
0
30 Oct 2024
Are LLM-Judges Robust to Expressions of Uncertainty? Investigating the effect of Epistemic Markers on LLM-based Evaluation
Dongryeol Lee
Yerin Hwang
Yongil Kim
Joonsuk Park
Kyomin Jung
ELM
70
5
0
28 Oct 2024
ToW: Thoughts of Words Improve Reasoning in Large Language Models
Zhikun Xu
Ming Shen
Jacob Dineen
Zhaonan Li
Xiao Ye
Shijie Lu
Aswin Rrv
Chitta Baral
Ben Zhou
LRM
132
1
0
21 Oct 2024
Do LLMs estimate uncertainty well in instruction-following?
Juyeon Heo
Miao Xiong
Christina Heinze-Deml
Jaya Narain
ELM
50
3
0
18 Oct 2024
Latent Space Chain-of-Embedding Enables Output-free LLM Self-Evaluation
Yiming Wang
Pei Zhang
Baosong Yang
Derek F. Wong
Rui-cang Wang
LRM
45
4
0
17 Oct 2024
Modeling Future Conversation Turns to Teach LLMs to Ask Clarifying Questions
Michael J.Q. Zhang
W. Bradley Knox
Eunsol Choi
48
3
0
17 Oct 2024
FIRE: Fact-checking with Iterative Retrieval and Verification
Zhuohan Xie
Rui Xing
Yuxia Wang
Jiahui Geng
Hasan Iqbal
Dhruv Sahnan
Iryna Gurevych
Preslav Nakov
HILM
52
2
0
17 Oct 2024
ReDeEP: Detecting Hallucination in Retrieval-Augmented Generation via Mechanistic Interpretability
ZhongXiang Sun
Xiaoxue Zang
Kai Zheng
Yang Song
Jun Xu
Xiao Zhang
Weijie Yu
Yang Song
Han Li
55
7
0
15 Oct 2024
On Calibration of LLM-based Guard Models for Reliable Content Moderation
Hongfu Liu
Hengguan Huang
Hao Wang
Xiangming Gu
Ye Wang
55
2
0
14 Oct 2024
Taming Overconfidence in LLMs: Reward Calibration in RLHF
Jixuan Leng
Chengsong Huang
Banghua Zhu
Jiaxin Huang
26
7
0
13 Oct 2024
RMB: Comprehensively Benchmarking Reward Models in LLM Alignment
Enyu Zhou
Guodong Zheng
B. Wang
Zhiheng Xi
Shihan Dou
...
Yurong Mou
Rui Zheng
Tao Gui
Qi Zhang
Xuanjing Huang
ALM
56
17
0
13 Oct 2024
Frame-Voyager: Learning to Query Frames for Video Large Language Models
Sicheng Yu
Chengkai Jin
Huanyu Wang
Zhenghao Chen
Sheng Jin
...
Zhenbang Sun
Bingni Zhang
Jiawei Wu
Hao Zhang
Qianru Sun
67
5
0
04 Oct 2024
Can Language Model Understand Word Semantics as A Chatbot? An Empirical Study of Language Model Internal External Mismatch
Jinman Zhao
Xueyan Zhang
Xingyu Yue
Weizhe Chen
Zifan Qian
Ruiyu Wang
LRM
34
0
0
21 Sep 2024
MAQA: Evaluating Uncertainty Quantification in LLMs Regarding Data Uncertainty
Yongjin Yang
Haneul Yoo
Hwaran Lee
60
1
0
13 Aug 2024
Cost-Effective Hallucination Detection for LLMs
Simon Valentin
Jinmiao Fu
Gianluca Detommaso
Shaoyuan Xu
Giovanni Zappella
Bryan Wang
HILM
37
4
0
31 Jul 2024