A Stitch in Time Saves Nine: Detecting and Mitigating Hallucinations of LLMs by Validating Low-Confidence Generation
Neeraj Varshney, Wenlin Yao, Hongming Zhang, Jianshu Chen, Dong Yu
arXiv:2307.03987, 8 July 2023. [HILM]
Papers citing "A Stitch in Time Saves Nine: Detecting and Mitigating Hallucinations of LLMs by Validating Low-Confidence Generation" (showing 50 of 120)
Adaptive Stress Testing Black-Box LLM Planners. Neeloy Chakraborty, John Pohovey, Melkior Ornik, Katherine Driggs-Campbell. 08 May 2025.

Sailing AI by the Stars: A Survey of Learning from Rewards in Post-Training and Test-Time Scaling of Large Language Models. Xiaobao Wu. 05 May 2025. [LRM]

Invoke Interfaces Only When Needed: Adaptive Invocation for Large Language Models in Question Answering. Jihao Zhao, Chunlai Zhou, Biao Qin. 05 May 2025.

Uncertainty Quantification for Language Models: A Suite of Black-Box, White-Box, LLM Judge, and Ensemble Scorers. Dylan Bouchard, Mohit Singh Chauhan. 27 Apr 2025. [HILM]

Evaluating Evaluation Metrics -- The Mirage of Hallucination Detection. Atharva Kulkarni, Yuan-kang Zhang, Joel Ruben Antony Moniz, Xiou Ge, Bo-Hsiang Tseng, Dhivya Piraviperumal, Shri Kiran Srinivasan, Hong-ye Yu. 25 Apr 2025. [HILM]

Meta-Thinking in LLMs via Multi-Agent Reinforcement Learning: A Survey. Ahsan Bilal, Muhammad Ahmed Mohsin, Muhammad Umer, Muhammad Awais Khan Bangash, Muhammad Ali Jamshed. 20 Apr 2025. [LLMAG, LRM, AI4CE]

C-FAITH: A Chinese Fine-Grained Benchmark for Automated Hallucination Evaluation. Xu Zhang, Zhifei Liu, Jiahao Wang, Huixuan Zhang, Fan Xu, Junzhe Zhang, Xiaojun Wan. 14 Apr 2025. [HILM]

How to Detect and Defeat Molecular Mirage: A Metric-Driven Benchmark for Hallucination in LLM-based Molecular Comprehension. Hao Li, Liuzhenghao Lv, He Cao, Zijing Liu, Zhiyuan Yan, Yu Wang, Yonghong Tian, Y. Li, Li Yuan. 10 Apr 2025.

Don't Let It Hallucinate: Premise Verification via Retrieval-Augmented Logical Reasoning. Yuehan Qin, Shawn Li, Yi Nian, Xinyan Velocity Yu, Yue Zhao, Xuezhe Ma. 08 Apr 2025. [HILM, LRM]

KSHSeek: Data-Driven Approaches to Mitigating and Detecting Knowledge-Shortcut Hallucinations in Generative Models. Z. Wang, Zhongxin Liu, Ying Li, Hongyu Sun, Meng Xu, Yuqing Zhang. 25 Mar 2025. [HILM]

Learning on LLM Output Signatures for gray-box LLM Behavior Analysis. Guy Bar-Shalom, Fabrizio Frasca, Derek Lim, Yoav Gelberg, Yftah Ziser, Ran El-Yaniv, Gal Chechik, Haggai Maron. 18 Mar 2025.

Calibrating Verbal Uncertainty as a Linear Feature to Reduce Hallucinations. Ziwei Ji, L. Yu, Yeskendir Koishekenov, Yejin Bang, Anthony Hartshorn, Alan Schelten, Cheng Zhang, Pascale Fung, Nicola Cancedda. 18 Mar 2025.

HalluVerse25: Fine-grained Multilingual Benchmark Dataset for LLM Hallucinations. Samir Abdaljalil, Hasan Kurban, Erchin Serpedin. 10 Mar 2025. [HILM]

Benchmarking Chinese Medical LLMs: A Medbench-based Analysis of Performance Gaps and Hierarchical Optimization Strategies. Luyi Jiang, J. Chen, Lu Lu, Xinwei Peng, Lihao Liu, Junjun He, Jie Xu. 10 Mar 2025. [ELM, LM&MA]

Bián: A Bilingual Benchmark and Model for Hallucination Detection in Retrieval-Augmented Generation. Zhouyu Jiang, Mengshu Sun, Qing Cui, Lei Liang. 26 Feb 2025. [RALM, 3DV]

CoT-UQ: Improving Response-wise Uncertainty Quantification in LLMs with Chain-of-Thought. Boxuan Zhang, Ruqi Zhang. 24 Feb 2025. [LRM]

Hallucination Detection in Large Language Models with Metamorphic Relations. Borui Yang, Md Afif Al Mamun, Jie M. Zhang, Gias Uddin. 20 Feb 2025. [HILM]

Verify when Uncertain: Beyond Self-Consistency in Black Box Hallucination Detection. Yihao Xue, Kristjan Greenewald, Youssef Mroueh, Baharan Mirzasoleiman. 20 Feb 2025. [HILM]

Revisiting Robust RAG: Do We Still Need Complex Robust Training in the Era of Powerful LLMs? Hanxing Ding, Shuchang Tao, Liang Pang, Zihao Wei, Liwei Chen, Kun Xu, Huawei Shen, Xueqi Cheng. 17 Feb 2025.

An Empirical Analysis of Uncertainty in Large Language Model Evaluations. Qiujie Xie, Qingqiu Li, Zhuohao Yu, Yuejie Zhang, Yue Zhang, Linyi Yang. 15 Feb 2025. [ELM]

UAlign: Leveraging Uncertainty Estimations for Factuality Alignment on Large Language Models. Boyang Xue, Fei Mi, Qi Zhu, Hongru Wang, Rui Wang, Sheng Wang, Erxin Yu, Xuming Hu, Kam-Fai Wong. 16 Dec 2024. [HILM]

HalluCana: Fixing LLM Hallucination with A Canary Lookahead. Tianyi Li, Erenay Dayanik, Shubhi Tyagi, Andrea Pierleoni. 10 Dec 2024. [HILM]

Do I Know This Entity? Knowledge Awareness and Hallucinations in Language Models. Javier Ferrando, Oscar Obeso, Senthooran Rajamanoharan, Neel Nanda. 21 Nov 2024.

VALTEST: Automated Validation of Language Model Generated Test Cases. Hamed Taherkhani, Hadi Hemmati. 13 Nov 2024.

LLM-R: A Framework for Domain-Adaptive Maintenance Scheme Generation Combining Hierarchical Agents and RAG. Laifa Tao, Qixuan Huang, Xianjun Wu, Weiwei Zhang, Yunlong Wu, Bin Li, Chen Lu, Xingshuo Hai. 07 Nov 2024.

Provenance: A Light-weight Fact-checker for Retrieval Augmented LLM Generation Output. Hithesh Sankararaman, Mohammed Nasheed Yasin, Tanner Sorensen, Alessandro Di Bari, Andreas Stolcke. 01 Nov 2024. [HILM]

Coarse-to-Fine Highlighting: Reducing Knowledge Hallucination in Large Language Models. Qitan Lv, Jie Wang, Hanzhu Chen, Bin Li, Yongdong Zhang, Feng Wu. 19 Oct 2024. [HILM]

MlingConf: A Comprehensive Study of Multilingual Confidence Estimation on Large Language Models. Boyang Xue, Hongru Wang, Rui Wang, Sheng Wang, Zezhong Wang, Yiming Du, Bin Liang, Kam-Fai Wong. 16 Oct 2024.

The Effects of Hallucinations in Synthetic Training Data for Relation Extraction. Steven Rogulsky, Nicholas Popovic, Michael Färber. 10 Oct 2024. [HILM]

LLM Self-Correction with DeCRIM: Decompose, Critique, and Refine for Enhanced Following of Instructions with Multiple Constraints. Thomas Palmeira Ferraz, Kartik Mehta, Yu-Hsiang Lin, Haw-Shiuan Chang, Shereen Oraby, Sijia Liu, Vivek Subramanian, Tagyoung Chung, Mohit Bansal, Nanyun Peng. 09 Oct 2024.

Interpreting and Editing Vision-Language Representations to Mitigate Hallucinations. Nick Jiang, Anish Kachinthaya, Suzie Petryk, Yossi Gandelsman. 03 Oct 2024. [VLM]

A Survey on the Honesty of Large Language Models. Siheng Li, Cheng Yang, Taiqiang Wu, Chufan Shi, Yuji Zhang, ..., Jie Zhou, Yujiu Yang, Ngai Wong, Xixin Wu, Wai Lam. 27 Sep 2024. [HILM]

Pre-trained Language Models Return Distinguishable Probability Distributions to Unfaithfully Hallucinated Texts. Taehun Cha, Donghun Lee. 25 Sep 2024. [HILM]

Zero-resource Hallucination Detection for Text Generation via Graph-based Contextual Knowledge Triples Modeling. Xinyue Fang, Zhen Huang, Zhiliang Tian, Minghui Fang, Ziyi Pan, Quntian Fang, Zhihua Wen, Hengyue Pan, Dongsheng Li. 17 Sep 2024. [HILM]

CLUE: Concept-Level Uncertainty Estimation for Large Language Models. Yu-Hsiang Wang, Andrew Bai, Che-Ping Tsai, Cho-Jui Hsieh. 04 Sep 2024. [LRM]

Defining Boundaries: A Spectrum of Task Feasibility for Large Language Models. Wenbo Zhang, Zihang Xu, Hengrui Cai. 11 Aug 2024.

Adaptive Retrieval-Augmented Generation for Conversational Systems. Xi Wang, Procheta Sen, Ruizhe Li, Emine Yilmaz. 31 Jul 2024. [RALM]

On Mitigating Code LLM Hallucinations with API Documentation. Nihal Jain, Robert Kwiatkowski, Baishakhi Ray, M. K. Ramanathan, Varun Kumar. 13 Jul 2024.

ANAH-v2: Scaling Analytical Hallucination Annotation of Large Language Models. Yuzhe Gu, Ziwei Ji, Wenwei Zhang, Chengqi Lyu, Dahua Lin, Kai Chen. 05 Jul 2024. [HILM]

Leveraging Graph Structures to Detect Hallucinations in Large Language Models. Noa Nonkes, Sergei Agaronian, Evangelos Kanoulas, Roxana Petcu. 05 Jul 2024.

Entropy-Based Decoding for Retrieval-Augmented Large Language Models. Zexuan Qiu, Zijing Ou, Bin Wu, Jingjing Li, Aiwei Liu, Irwin King. 25 Jun 2024. [KELM, RALM]

Semantic Entropy Probes: Robust and Cheap Hallucination Detection in LLMs. Jannik Kossen, Jiatong Han, Muhammed Razzak, Lisa Schut, Shreshth A. Malik, Yarin Gal. 22 Jun 2024. [HILM]

R-Eval: A Unified Toolkit for Evaluating Domain Knowledge of Retrieval Augmented Large Language Models. Shangqing Tu, Yuanchun Wang, Jifan Yu, Yuyang Xie, Yaran Shi, Xiaozhi Wang, Jing Zhang, Lei Hou, Juanzi Li. 17 Jun 2024. [ELM]

Mitigating Large Language Model Hallucination with Faithful Finetuning. Minda Hu, Bowei He, Yufei Wang, Liangyou Li, Chen-li Ma, Irwin King. 17 Jun 2024. [HILM]

Teaching Large Language Models to Express Knowledge Boundary from Their Own Signals. Lida Chen, Zujie Liang, Xintao Wang, Jiaqing Liang, Yanghua Xiao, Feng Wei, Jinglei Chen, Zhenghong Hao, Bing Han, Wei Wang. 16 Jun 2024.

Large language model validity via enhanced conformal prediction methods. John J. Cherian, Isaac Gibbs, Emmanuel J. Candès. 14 Jun 2024.

REAL Sampling: Boosting Factuality and Diversity of Open-Ended Generation via Asymptotic Entropy. Haw-Shiuan Chang, Nanyun Peng, Mohit Bansal, Anil Ramakrishna, Tagyoung Chung. 11 Jun 2024. [HILM]

Estimating the Hallucination Rate of Generative AI. Andrew Jesson, Nicolas Beltran-Velez, Quentin Chu, Sweta Karlekar, Jannik Kossen, Yarin Gal, John P. Cunningham, David M. Blei. 11 Jun 2024.

A Probabilistic Framework for LLM Hallucination Detection via Belief Tree Propagation. Bairu Hou, Yang Zhang, Jacob Andreas, Shiyu Chang. 11 Jun 2024.

Investigating and Addressing Hallucinations of LLMs in Tasks Involving Negation. Neeraj Varshney, Satyam Raj, Venkatesh Mishra, Agneet Chatterjee, Ritika Sarkar, Amir Saeidi, Chitta Baral. 08 Jun 2024. [LRM]