arXiv: 2403.04696
Fact-Checking the Output of Large Language Models via Token-Level Uncertainty Quantification

7 March 2024
Ekaterina Fadeeva
Aleksandr Rubashevskii
Artem Shelmanov
Sergey Petrakov
Haonan Li
Hamdy Mubarak
Evgenii Tsymbalov
Gleb Kuzmin
Alexander Panchenko
Timothy Baldwin
Preslav Nakov
Maxim Panov
    HILM

Papers citing "Fact-Checking the Output of Large Language Models via Token-Level Uncertainty Quantification"

27 / 27 papers shown
Uncertainty Quantification for Language Models: A Suite of Black-Box, White-Box, LLM Judge, and Ensemble Scorers
Dylan Bouchard
Mohit Singh Chauhan
HILM
143
0
0
27 Apr 2025
Adaptive Retrieval Without Self-Knowledge? Bringing Uncertainty Back Home
Viktor Moskvoretskii
M. Lysyuk
Mikhail Salnikov
Nikolay Ivanov
Sergey Pletenev
Daria Galimzianova
Nikita Krayko
Vasily Konovalov
Irina Nikishina
Alexander Panchenko
RALM
144
7
0
24 Feb 2025
CER: Confidence Enhanced Reasoning in LLMs
Ali Razghandi
Seyed Mohammad Hadi Hosseini
Mahdieh Soleymani Baghshah
LRM
167
5
0
20 Feb 2025
Can Your Uncertainty Scores Detect Hallucinated Entity?
Min-Hsuan Yeh
Max Kamachee
Seongheon Park
Yixuan Li
HILM
119
3
0
17 Feb 2025
Is ChatGPT Good at Search? Investigating Large Language Models as Re-Ranking Agents
Weiwei Sun
Lingyong Yan
Xinyu Ma
Shuaiqiang Wang
Pengjie Ren
Zhumin Chen
Dawei Yin
Zhaochun Ren
RALM, ALM, ELM, LRM, LM&MA
220
313
0
31 Dec 2024
Decoding Secret Memorization in Code LLMs Through Token-Level Characterization
Yuqing Nie
Chong Wang
Kaidi Wang
Guoai Xu
Guosheng Xu
Haoyu Wang
OffRL
432
3
0
11 Oct 2024
Loki: An Open-Source Tool for Fact Verification
Haonan Li
Xudong Han
Hao Wang
Yuxia Wang
Minghan Wang
Rui Xing
Yilin Geng
Zenan Zhai
Preslav Nakov
Timothy Baldwin
SyDa, HILM
330
5
0
02 Oct 2024
Bridging and Modeling Correlations in Pairwise Data for Direct Preference Optimization
Yuxin Jiang
Bo Huang
Yufei Wang
Xingshan Zeng
Liangyou Li
Yasheng Wang
Xin Jiang
Lifeng Shang
Ruiming Tang
Wei Wang
123
7
0
14 Aug 2024
Benchmarking Uncertainty Quantification Methods for Large Language Models with LM-Polygraph
Roman Vashurin
Ekaterina Fadeeva
Artem Vazhentsev
Akim Tsvigun
Daniil Vasilev
...
Timothy Baldwin
Maxim Panov
Artem Shelmanov
HILM
149
28
0
21 Jun 2024
A Probabilistic Framework for LLM Hallucination Detection via Belief Tree Propagation
Bairu Hou
Yang Zhang
Jacob Andreas
Shiyu Chang
153
7
0
11 Jun 2024
LM-Polygraph: Uncertainty Estimation for Language Models
Ekaterina Fadeeva
Roman Vashurin
Akim Tsvigun
Artem Vazhentsev
Sergey Petrakov
...
Elizaveta Goncharova
Alexander Panchenko
Maxim Panov
Timothy Baldwin
Artem Shelmanov
62
68
0
13 Nov 2023
A Stitch in Time Saves Nine: Detecting and Mitigating Hallucinations of LLMs by Validating Low-Confidence Generation
Neeraj Varshney
Wenlin Yao
Hongming Zhang
Jianshu Chen
Dong Yu
HILM
111
175
0
08 Jul 2023
Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena
Lianmin Zheng
Wei-Lin Chiang
Ying Sheng
Siyuan Zhuang
Zhanghao Wu
...
Dacheng Li
Eric Xing
Haotong Zhang
Joseph E. Gonzalez
Ion Stoica
ALM, OSLM, ELM
469
4,444
0
09 Jun 2023
FActScore: Fine-grained Atomic Evaluation of Factual Precision in Long Form Text Generation
Sewon Min
Kalpesh Krishna
Xinxi Lyu
M. Lewis
Wen-tau Yih
Pang Wei Koh
Mohit Iyyer
Luke Zettlemoyer
Hannaneh Hajishirzi
HILM, ALM
153
703
0
23 May 2023
Fact-Checking Complex Claims with Program-Guided Reasoning
Liangming Pan
Xiaobao Wu
Xinyuan Lu
Anh Tuan Luu
William Yang Wang
Min-Yen Kan
Preslav Nakov
LRM
92
136
0
22 May 2023
Detecting and Mitigating Hallucinations in Machine Translation: Model Internal Workings Alone Do Well, Sentence Similarity Even Better
David Dale
Elena Voita
Loïc Barrault
Marta R. Costa-jussá
HILM
222
73
0
16 Dec 2022
Mutual Information Alleviates Hallucinations in Abstractive Summarization
Liam van der Poel
Ryan Cotterell
Clara Meister
HILM
100
61
0
24 Oct 2022
Out-of-Distribution Detection and Selective Generation for Conditional Language Models
Jie Jessie Ren
Jiaming Luo
Yao-Min Zhao
Kundan Krishna
Mohammad Saleh
Balaji Lakshminarayanan
Peter J. Liu
OODD
124
114
0
30 Sep 2022
Language Models (Mostly) Know What They Know
Saurav Kadavath
Tom Conerly
Amanda Askell
T. Henighan
Dawn Drain
...
Nicholas Joseph
Benjamin Mann
Sam McCandlish
C. Olah
Jared Kaplan
ELM
133
833
0
11 Jul 2022
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM, ALM
900
13,228
0
04 Mar 2022
A Survey on Automated Fact-Checking
Zhijiang Guo
Michael Schlichtkrull
Andreas Vlachos
109
495
0
26 Aug 2021
DeBERTa: Decoding-enhanced BERT with Disentangled Attention
Pengcheng He
Xiaodong Liu
Jianfeng Gao
Weizhu Chen
AAML
169
2,761
0
05 Jun 2020
Unsupervised Quality Estimation for Neural Machine Translation
M. Fomicheva
Shuo Sun
Lisa Yankovskaya
Frédéric Blain
Francisco Guzmán
Mark Fishel
Nikolaos Aletras
Vishrav Chaudhary
Lucia Specia
UQLM
91
209
0
21 May 2020
Fact or Fiction: Verifying Scientific Claims
David Wadden
Shanchuan Lin
Kyle Lo
Lucy Lu Wang
Madeleine van Zuylen
Arman Cohan
Hannaneh Hajishirzi
HAI
186
465
0
30 Apr 2020
Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks
Nils Reimers
Iryna Gurevych
1.3K
12,332
0
27 Aug 2019
A Simple Unified Framework for Detecting Out-of-Distribution Samples and Adversarial Attacks
Kimin Lee
Kibok Lee
Honglak Lee
Jinwoo Shin
OODD
199
2,064
0
10 Jul 2018
Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles
Balaji Lakshminarayanan
Alexander Pritzel
Charles Blundell
UQCV, BDL
850
5,849
0
05 Dec 2016