arXiv: 2412.11803 (v2, latest)
UAlign: Leveraging Uncertainty Estimations for Factuality Alignment on Large Language Models
Boyang Xue, Fei Mi, Qi Zhu, Hongru Wang, Rui Wang, Sheng Wang, Erxin Yu, Xuming Hu, Kam-Fai Wong
16 December 2024 · HILM
Papers citing "UAlign: Leveraging Uncertainty Estimations for Factuality Alignment on Large Language Models"
50 / 50 papers shown
Toward a Theory of Agents as Tool-Use Decision-Makers. Hongru Wang, Cheng Qian, Manling Li, Jiahao Qiu, Boyang Xue, Mengdi Wang, Heng Ji, Kam-Fai Wong. 01 Jun 2025.
BARREL: Boundary-Aware Reasoning for Factual and Reliable LRMs. Junxiao Yang, Jinzhe Tu, Haoran Liu, Xiaoce Wang, Chujie Zheng, ..., Caishun Chen, Tiantian He, Hongning Wang, Yew-Soon Ong, Minlie Huang. 18 May 2025. [LRM]
Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters. Charlie Snell, Jaehoon Lee, Kelvin Xu, Aviral Kumar. 06 Aug 2024. [LRM]
LLM Internal States Reveal Hallucination Risk Faced With a Query. Ziwei Ji, Delong Chen, Etsuko Ishii, Samuel Cahyawijaya, Yejin Bang, Bryan Wilie, Pascale Fung. 03 Jul 2024. [HILM, LRM]
CoSafe: Evaluating Large Language Model Safety in Multi-Turn Dialogue Coreference. Erxin Yu, Jing Li, Ming Liao, Siqi Wang, Zuchen Gao, Fei Mi, Lanqing Hong. 25 Jun 2024. [ELM, LRM]
Kernel Language Entropy: Fine-grained Uncertainty Quantification for LLMs from Semantic Similarities. Alexander Nikitin, Jannik Kossen, Yarin Gal, Pekka Marttinen. 30 May 2024. [UQCV]
Does Fine-Tuning LLMs on New Knowledge Encourage Hallucinations? Zorik Gekhman, G. Yona, Roee Aharoni, Matan Eyal, Amir Feder, Roi Reichart, Jonathan Herzig. 09 May 2024.
Enhancing Confidence Expression in Large Language Models Through Learning from Past Experience. Haixia Han, Tingyun Li, Shisong Chen, Jie Shi, Chengyu Du, Yanghua Xiao, Jiaqing Liang, Xin Lin. 16 Apr 2024.
Rejection Improves Reliability: Training LLMs to Refuse Unknown Questions Using RL from Knowledge Feedback. Hongshen Xu, Zichen Zhu, Situo Zhang, Da Ma, Shuai Fan, Lu Chen, Kai Yu. 27 Mar 2024. [HILM]
Knowledge Conflicts for LLMs: A Survey. Rongwu Xu, Zehan Qi, Zhijiang Guo, Cunxiang Wang, Hongru Wang, Yue Zhang, Wei Xu. 13 Mar 2024.
PopALM: Popularity-Aligned Language Models for Social Media Trendy Response Prediction. Erxin Yu, Jing Li, Chunpu Xu. 29 Feb 2024.
Calibrating Large Language Models with Sample Consistency. Qing Lyu, Kumar Shridhar, Chaitanya Malaviya, Li Zhang, Yanai Elazar, Niket Tandon, Marianna Apidianaki, Mrinmaya Sachan, Chris Callison-Burch. 21 Feb 2024.
Self-Alignment for Factuality: Mitigating Hallucinations in LLMs via Self-Evaluation. Xiaoying Zhang, Baolin Peng, Ye Tian, Jingyan Zhou, Lifeng Jin, Linfeng Song, Haitao Mi, Helen Meng. 14 Feb 2024. [HILM]
Learning to Trust Your Feelings: Leveraging Self-awareness in LLMs for Hallucination Mitigation. Yuxin Liang, Zhuoyang Song, Hao Wang, Jiaxing Zhang. 27 Jan 2024. [HILM]
Examining LLMs' Uncertainty Expression Towards Questions Outside Parametric Knowledge. Genglin Liu, Xingyao Wang, Lifan Yuan, Yangyi Chen, Hao Peng. 16 Nov 2023.
Fine-tuning Language Models for Factuality. Katherine Tian, Eric Mitchell, Huaxiu Yao, Christopher D. Manning, Chelsea Finn. 14 Nov 2023. [KELM, HILM, SyDa]
Improving Factual Consistency for Knowledge-Grounded Dialogue Systems via Knowledge Enhancement and Alignment. Boyang Xue, Weichao Wang, Hongru Wang, Fei Mi, Rui Wang, Yasheng Wang, Lifeng Shang, Xin Jiang, Qun Liu, Kam-Fai Wong. 12 Oct 2023. [KELM, HILM]
Improving the Reliability of Large Language Models by Leveraging Uncertainty-Aware In-Context Learning. Yuchen Yang, Houqiang Li, Yanfeng Wang, Yu Wang. 07 Oct 2023.
A Long Way to Go: Investigating Length Correlations in RLHF. Prasann Singhal, Tanya Goyal, Jiacheng Xu, Greg Durrett. 05 Oct 2023.
Reinforced Self-Training (ReST) for Language Modeling. Çağlar Gülçehre, T. Paine, S. Srinivasan, Ksenia Konyushkova, L. Weerts, ..., Chenjie Gu, Wolfgang Macherey, Arnaud Doucet, Orhan Firat, Nando de Freitas. 17 Aug 2023. [OffRL]
Investigating the Factual Knowledge Boundary of Large Language Models with Retrieval Augmentation. Ruiyang Ren, Yuhao Wang, Yingqi Qu, Wayne Xin Zhao, Qingbin Liu, Hao Tian, Huaqin Wu, Ji-Rong Wen, Haifeng Wang. 20 Jul 2023. [RALM, KELM]
A Stitch in Time Saves Nine: Detecting and Mitigating Hallucinations of LLMs by Validating Low-Confidence Generation. Neeraj Varshney, Wenlin Yao, Hongming Zhang, Jianshu Chen, Dong Yu. 08 Jul 2023. [HILM]
Can LLMs Express Their Uncertainty? An Empirical Evaluation of Confidence Elicitation in LLMs. Miao Xiong, Zhiyuan Hu, Xinyang Lu, Yifei Li, Jie Fu, Junxian He, Bryan Hooi. 22 Jun 2023.
Inference-Time Intervention: Eliciting Truthful Answers from a Language Model. Kenneth Li, Oam Patel, Fernanda Viégas, Hanspeter Pfister, Martin Wattenberg. 06 Jun 2023. [KELM, HILM]
Direct Preference Optimization: Your Language Model is Secretly a Reward Model. Rafael Rafailov, Archit Sharma, E. Mitchell, Stefano Ermon, Christopher D. Manning, Chelsea Finn. 29 May 2023. [ALM]
Just Ask for Calibration: Strategies for Eliciting Calibrated Confidence Scores from Language Models Fine-Tuned with Human Feedback. Katherine Tian, E. Mitchell, Allan Zhou, Archit Sharma, Rafael Rafailov, Huaxiu Yao, Chelsea Finn, Christopher D. Manning. 24 May 2023.
Enhancing Large Language Models Against Inductive Instructions with Dual-critique Prompting. Rui Wang, Hongru Wang, Fei Mi, Yi Chen, Boyang Xue, Kam-Fai Wong, Rui-Lan Xu. 23 May 2023.
LLaMA: Open and Efficient Foundation Language Models. Hugo Touvron, Thibaut Lavril, Gautier Izacard, Xavier Martinet, Marie-Anne Lachaux, ..., Faisal Azhar, Aurelien Rodriguez, Armand Joulin, Edouard Grave, Guillaume Lample. 27 Feb 2023. [ALM, PILM]
Navigating the Grey Area: How Expressions of Uncertainty and Overconfidence Affect Language Models. Kaitlyn Zhou, Dan Jurafsky, Tatsunori Hashimoto. 26 Feb 2023.
Constitutional AI: Harmlessness from AI Feedback. Yuntao Bai, Saurav Kadavath, Sandipan Kundu, Amanda Askell, John Kernion, ..., Dario Amodei, Nicholas Joseph, Sam McCandlish, Tom B. Brown, Jared Kaplan. 15 Dec 2022. [SyDa, MoMe]
CoP: Factual Inconsistency Detection by Controlling the Preference. Shuaijie She, Xiang Geng, Shujian Huang, Jiajun Chen. 03 Dec 2022.
Uncertainty Quantification with Pre-trained Language Models: A Large-Scale Empirical Analysis. Yuxin Xiao, Paul Pu Liang, Umang Bhatt, Willie Neiswanger, Ruslan Salakhutdinov, Louis-Philippe Morency. 10 Oct 2022.
Bayesian Neural Network Language Modeling for Speech Recognition. Boyang Xue, Shoukang Hu, Junhao Xu, Mengzhe Geng, Xunying Liu, Helen M. Meng. 28 Aug 2022. [UQCV, BDL]
Language Models (Mostly) Know What They Know. Saurav Kadavath, Tom Conerly, Amanda Askell, T. Henighan, Dawn Drain, ..., Nicholas Joseph, Benjamin Mann, Sam McCandlish, C. Olah, Jared Kaplan. 11 Jul 2022. [ELM]
Teaching Models to Express Their Uncertainty in Words. Stephanie C. Lin, Jacob Hilton, Owain Evans. 28 May 2022. [OOD]
Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback. Yuntao Bai, Andy Jones, Kamal Ndousse, Amanda Askell, Anna Chen, ..., Jack Clark, Sam McCandlish, C. Olah, Benjamin Mann, Jared Kaplan. 12 Apr 2022.
Training language models to follow instructions with human feedback. Long Ouyang, Jeff Wu, Xu Jiang, Diogo Almeida, Carroll L. Wainwright, ..., Amanda Askell, Peter Welinder, Paul Christiano, Jan Leike, Ryan J. Lowe. 04 Mar 2022. [OSLM, ALM]
TruthfulQA: Measuring How Models Mimic Human Falsehoods. Stephanie C. Lin, Jacob Hilton, Owain Evans. 08 Sep 2021. [HILM]
Reducing conversational agents' overconfidence through linguistic calibration. Sabrina J. Mielke, Arthur Szlam, Emily Dinan, Y-Lan Boureau. 30 Dec 2020.
A Review of Uncertainty Quantification in Deep Learning: Techniques, Applications and Challenges. Moloud Abdar, Farhad Pourpanah, Sadiq Hussain, Dana Rezazadegan, Li Liu, ..., Xiaochun Cao, Abbas Khosravi, U. Acharya, V. Makarenkov, S. Nahavandi. 12 Nov 2020. [BDL, UQCV]
Language Models are Few-Shot Learners. Tom B. Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, ..., Christopher Berner, Sam McCandlish, Alec Radford, Ilya Sutskever, Dario Amodei. 28 May 2020. [BDL]
Fine-Tuning Language Models from Human Preferences. Daniel M. Ziegler, Nisan Stiennon, Jeff Wu, Tom B. Brown, Alec Radford, Dario Amodei, Paul Christiano, G. Irving. 18 Sep 2019. [ALM]
Correcting Length Bias in Neural Machine Translation. Kenton W. Murray, David Chiang. 29 Aug 2018. [AIMat]
A Simple Unified Framework for Detecting Out-of-Distribution Samples and Adversarial Attacks. Kimin Lee, Kibok Lee, Honglak Lee, Jinwoo Shin. 10 Jul 2018. [OODD]
Proximal Policy Optimization Algorithms. John Schulman, Filip Wolski, Prafulla Dhariwal, Alec Radford, Oleg Klimov. 20 Jul 2017. [OffRL]
Crowdsourcing Multiple Choice Science Questions. Johannes Welbl, Nelson F. Liu, Matt Gardner. 19 Jul 2017. [AI4Ed]
On Calibration of Modern Neural Networks. Chuan Guo, Geoff Pleiss, Yu Sun, Kilian Q. Weinberger. 14 Jun 2017. [UQCV]
TriviaQA: A Large Scale Distantly Supervised Challenge Dataset for Reading Comprehension. Mandar Joshi, Eunsol Choi, Daniel S. Weld, Luke Zettlemoyer. 09 May 2017. [RALM]
Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles. Balaji Lakshminarayanan, Alexander Pritzel, Charles Blundell. 05 Dec 2016. [UQCV, BDL]
Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning. Y. Gal, Zoubin Ghahramani. 06 Jun 2015. [UQCV, BDL]