ResearchTrend.AI

© 2025 ResearchTrend.AI, All rights reserved.

arXiv: 2412.11803 — Cited By
UAlign: Leveraging Uncertainty Estimations for Factuality Alignment on Large Language Models

16 December 2024
Boyang Xue
Fei Mi
Qi Zhu
Hongru Wang
Rui Wang
Sheng Wang
Erxin Yu
Xuming Hu
Kam-Fai Wong
    HILM

Papers citing "UAlign: Leveraging Uncertainty Estimations for Factuality Alignment on Large Language Models"

50 / 50 papers shown
Toward a Theory of Agents as Tool-Use Decision-Makers
Hongru Wang
Cheng Qian
Manling Li
Jiahao Qiu
Boyang Xue
Mengdi Wang
Heng Ji
Kam-Fai Wong
63
0
0
01 Jun 2025
BARREL: Boundary-Aware Reasoning for Factual and Reliable LRMs
Junxiao Yang
Jinzhe Tu
Haoran Liu
Xiaoce Wang
Chujie Zheng
...
Caishun Chen
Tiantian He
Hongning Wang
Yew-Soon Ong
Minlie Huang
LRM
107
0
0
18 May 2025
Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters
Charlie Snell
Jaehoon Lee
Kelvin Xu
Aviral Kumar
LRM
268
702
0
06 Aug 2024
LLM Internal States Reveal Hallucination Risk Faced With a Query
Ziwei Ji
Delong Chen
Etsuko Ishii
Samuel Cahyawijaya
Yejin Bang
Bryan Wilie
Pascale Fung
HILM, LRM
115
35
0
03 Jul 2024
CoSafe: Evaluating Large Language Model Safety in Multi-Turn Dialogue Coreference
Erxin Yu
Jing Li
Ming Liao
Siqi Wang
Zuchen Gao
Fei Mi
Lanqing Hong
ELM, LRM
122
19
0
25 Jun 2024
Kernel Language Entropy: Fine-grained Uncertainty Quantification for LLMs from Semantic Similarities
Alexander Nikitin
Jannik Kossen
Yarin Gal
Pekka Marttinen
UQCV
140
45
0
30 May 2024
Does Fine-Tuning LLMs on New Knowledge Encourage Hallucinations?
Zorik Gekhman
G. Yona
Roee Aharoni
Matan Eyal
Amir Feder
Roi Reichart
Jonathan Herzig
163
137
0
09 May 2024
Enhancing Confidence Expression in Large Language Models Through Learning from Past Experience
Haixia Han
Tingyun Li
Shisong Chen
Jie Shi
Chengyu Du
Yanghua Xiao
Jiaqing Liang
Xin Lin
89
11
0
16 Apr 2024
Rejection Improves Reliability: Training LLMs to Refuse Unknown Questions Using RL from Knowledge Feedback
Hongshen Xu
Zichen Zhu
Situo Zhang
Da Ma
Shuai Fan
Lu Chen
Kai Yu
HILM
110
45
0
27 Mar 2024
Knowledge Conflicts for LLMs: A Survey
Rongwu Xu
Zehan Qi
Zhijiang Guo
Cunxiang Wang
Hongru Wang
Yue Zhang
Wei Xu
321
122
0
13 Mar 2024
PopALM: Popularity-Aligned Language Models for Social Media Trendy Response Prediction
Erxin Yu
Jing Li
Chunpu Xu
63
6
0
29 Feb 2024
Calibrating Large Language Models with Sample Consistency
Qing Lyu
Kumar Shridhar
Chaitanya Malaviya
Li Zhang
Yanai Elazar
Niket Tandon
Marianna Apidianaki
Mrinmaya Sachan
Chris Callison-Burch
103
29
0
21 Feb 2024
Self-Alignment for Factuality: Mitigating Hallucinations in LLMs via Self-Evaluation
Xiaoying Zhang
Baolin Peng
Ye Tian
Jingyan Zhou
Lifeng Jin
Linfeng Song
Haitao Mi
Helen Meng
HILM
91
52
0
14 Feb 2024
Learning to Trust Your Feelings: Leveraging Self-awareness in LLMs for Hallucination Mitigation
Yuxin Liang
Zhuoyang Song
Hao Wang
Jiaxing Zhang
HILM
102
36
0
27 Jan 2024
Examining LLMs' Uncertainty Expression Towards Questions Outside Parametric Knowledge
Genglin Liu
Xingyao Wang
Lifan Yuan
Yangyi Chen
Hao Peng
100
19
0
16 Nov 2023
Fine-tuning Language Models for Factuality
Katherine Tian
Eric Mitchell
Huaxiu Yao
Christopher D. Manning
Chelsea Finn
KELM, HILM, SyDa
92
185
0
14 Nov 2023
Improving Factual Consistency for Knowledge-Grounded Dialogue Systems via Knowledge Enhancement and Alignment
Boyang Xue
Weichao Wang
Hongru Wang
Fei Mi
Rui Wang
Yasheng Wang
Lifeng Shang
Xin Jiang
Qun Liu
Kam-Fai Wong
KELM, HILM
299
18
0
12 Oct 2023
Improving the Reliability of Large Language Models by Leveraging Uncertainty-Aware In-Context Learning
Yuchen Yang
Houqiang Li
Yanfeng Wang
Yu Wang
80
25
0
07 Oct 2023
A Long Way to Go: Investigating Length Correlations in RLHF
Prasann Singhal
Tanya Goyal
Jiacheng Xu
Greg Durrett
163
161
0
05 Oct 2023
Reinforced Self-Training (ReST) for Language Modeling
Çağlar Gülçehre
T. Paine
S. Srinivasan
Ksenia Konyushkova
L. Weerts
...
Chenjie Gu
Wolfgang Macherey
Arnaud Doucet
Orhan Firat
Nando de Freitas
OffRL
140
309
0
17 Aug 2023
Investigating the Factual Knowledge Boundary of Large Language Models with Retrieval Augmentation
Ruiyang Ren
Yuhao Wang
Yingqi Qu
Wayne Xin Zhao
Qingbin Liu
Hao Tian
Huaqin Wu
Ji-Rong Wen
Haifeng Wang
RALM, KELM
129
136
0
20 Jul 2023
A Stitch in Time Saves Nine: Detecting and Mitigating Hallucinations of LLMs by Validating Low-Confidence Generation
Neeraj Varshney
Wenlin Yao
Hongming Zhang
Jianshu Chen
Dong Yu
HILM
132
175
0
08 Jul 2023
Can LLMs Express Their Uncertainty? An Empirical Evaluation of Confidence Elicitation in LLMs
Miao Xiong
Zhiyuan Hu
Xinyang Lu
Yifei Li
Jie Fu
Junxian He
Bryan Hooi
241
452
0
22 Jun 2023
Inference-Time Intervention: Eliciting Truthful Answers from a Language Model
Kenneth Li
Oam Patel
Fernanda Viégas
Hanspeter Pfister
Martin Wattenberg
KELM, HILM
195
584
0
06 Jun 2023
Direct Preference Optimization: Your Language Model is Secretly a Reward Model
Rafael Rafailov
Archit Sharma
E. Mitchell
Stefano Ermon
Christopher D. Manning
Chelsea Finn
ALM
405
4,190
0
29 May 2023
Just Ask for Calibration: Strategies for Eliciting Calibrated Confidence Scores from Language Models Fine-Tuned with Human Feedback
Katherine Tian
E. Mitchell
Allan Zhou
Archit Sharma
Rafael Rafailov
Huaxiu Yao
Chelsea Finn
Christopher D. Manning
183
357
0
24 May 2023
Enhancing Large Language Models Against Inductive Instructions with Dual-critique Prompting
Rui Wang
Hongru Wang
Fei Mi
Yi Chen
Boyang Xue
Kam-Fai Wong
Rui-Lan Xu
93
17
0
23 May 2023
LLaMA: Open and Efficient Foundation Language Models
Hugo Touvron
Thibaut Lavril
Gautier Izacard
Xavier Martinet
Marie-Anne Lachaux
...
Faisal Azhar
Aurelien Rodriguez
Armand Joulin
Edouard Grave
Guillaume Lample
ALM, PILM
1.7K
13,558
0
27 Feb 2023
Navigating the Grey Area: How Expressions of Uncertainty and Overconfidence Affect Language Models
Kaitlyn Zhou
Dan Jurafsky
Tatsunori Hashimoto
125
69
0
26 Feb 2023
Constitutional AI: Harmlessness from AI Feedback
Yuntao Bai
Saurav Kadavath
Sandipan Kundu
Amanda Askell
John Kernion
...
Dario Amodei
Nicholas Joseph
Sam McCandlish
Tom B. Brown
Jared Kaplan
SyDa, MoMe
313
1,651
0
15 Dec 2022
CoP: Factual Inconsistency Detection by Controlling the Preference
Shuaijie She
Xiang Geng
Shujian Huang
Jiajun Chen
103
5
0
03 Dec 2022
Uncertainty Quantification with Pre-trained Language Models: A Large-Scale Empirical Analysis
Yuxin Xiao
Paul Pu Liang
Umang Bhatt
Willie Neiswanger
Ruslan Salakhutdinov
Louis-Philippe Morency
273
99
0
10 Oct 2022
Bayesian Neural Network Language Modeling for Speech Recognition
Boyang Xue
Shoukang Hu
Junhao Xu
Mengzhe Geng
Xunying Liu
Helen M. Meng
UQCV, BDL
127
18
0
28 Aug 2022
Language Models (Mostly) Know What They Know
Saurav Kadavath
Tom Conerly
Amanda Askell
T. Henighan
Dawn Drain
...
Nicholas Joseph
Benjamin Mann
Sam McCandlish
C. Olah
Jared Kaplan
ELM
150
836
0
11 Jul 2022
Teaching Models to Express Their Uncertainty in Words
Stephanie C. Lin
Jacob Hilton
Owain Evans
OOD
144
425
0
28 May 2022
Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback
Yuntao Bai
Andy Jones
Kamal Ndousse
Amanda Askell
Anna Chen
...
Jack Clark
Sam McCandlish
C. Olah
Benjamin Mann
Jared Kaplan
307
2,632
0
12 Apr 2022
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM, ALM
1.3K
13,290
0
04 Mar 2022
TruthfulQA: Measuring How Models Mimic Human Falsehoods
Stephanie C. Lin
Jacob Hilton
Owain Evans
HILM
163
1,956
0
08 Sep 2021
Reducing conversational agents' overconfidence through linguistic calibration
Sabrina J. Mielke
Arthur Szlam
Emily Dinan
Y-Lan Boureau
335
171
0
30 Dec 2020
A Review of Uncertainty Quantification in Deep Learning: Techniques, Applications and Challenges
Moloud Abdar
Farhad Pourpanah
Sadiq Hussain
Dana Rezazadegan
Li Liu
...
Xiaochun Cao
Abbas Khosravi
U. Acharya
V. Makarenkov
S. Nahavandi
BDL, UQCV
382
1,952
0
12 Nov 2020
Language Models are Few-Shot Learners
Tom B. Brown
Benjamin Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
...
Christopher Berner
Sam McCandlish
Alec Radford
Ilya Sutskever
Dario Amodei
BDL
1.3K
42,754
0
28 May 2020
Fine-Tuning Language Models from Human Preferences
Daniel M. Ziegler
Nisan Stiennon
Jeff Wu
Tom B. Brown
Alec Radford
Dario Amodei
Paul Christiano
G. Irving
ALM
615
1,776
0
18 Sep 2019
Correcting Length Bias in Neural Machine Translation
Kenton W. Murray
David Chiang
AIMat
88
158
0
29 Aug 2018
A Simple Unified Framework for Detecting Out-of-Distribution Samples and Adversarial Attacks
Kimin Lee
Kibok Lee
Honglak Lee
Jinwoo Shin
OODD
203
2,078
0
10 Jul 2018
Proximal Policy Optimization Algorithms
John Schulman
Filip Wolski
Prafulla Dhariwal
Alec Radford
Oleg Klimov
OffRL
715
19,378
0
20 Jul 2017
Crowdsourcing Multiple Choice Science Questions
Johannes Welbl
Nelson F. Liu
Matt Gardner
AI4Ed
149
522
0
19 Jul 2017
On Calibration of Modern Neural Networks
Chuan Guo
Geoff Pleiss
Yu Sun
Kilian Q. Weinberger
UQCV
303
5,901
0
14 Jun 2017
TriviaQA: A Large Scale Distantly Supervised Challenge Dataset for Reading Comprehension
Mandar Joshi
Eunsol Choi
Daniel S. Weld
Luke Zettlemoyer
RALM
442
2,696
0
09 May 2017
Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles
Balaji Lakshminarayanan
Alexander Pritzel
Charles Blundell
UQCV, BDL
1.1K
5,863
0
05 Dec 2016
Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning
Y. Gal
Zoubin Ghahramani
UQCV, BDL
1.0K
9,393
0
06 Jun 2015