Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2104.08704
Cited By
A Token-level Reference-free Hallucination Detection Benchmark for Free-form Text Generation
18 April 2021
Tianyu Liu
Yizhe Zhang
Chris Brockett
Yi Mao
Zhifang Sui
Weizhu Chen
W. Dolan
HILM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"A Token-level Reference-free Hallucination Detection Benchmark for Free-form Text Generation"
50 / 108 papers shown
Title
Hallucination Diversity-Aware Active Learning for Text Summarization
Yu Xia
Xu Liu
Tong Yu
Sungchul Kim
Ryan A. Rossi
Anup B. Rao
Tung Mai
Shuai Li
HILM
40
3
0
02 Apr 2024
FACTOID: FACtual enTailment fOr hallucInation Detection
Vipula Rawte
S. M. Towhidul
Krishnav Rajbangshi
Shravani Nag
Aman Chadha
Amit P. Sheth
Amitava Das
HILM
42
3
0
28 Mar 2024
Rejection Improves Reliability: Training LLMs to Refuse Unknown Questions Using RL from Knowledge Feedback
Hongshen Xu
Zichen Zhu
Situo Zhang
Da Ma
Shuai Fan
Lu Chen
Kai Yu
HILM
36
34
0
27 Mar 2024
Hallucination Detection in Foundation Models for Decision-Making: A Flexible Definition and Review of the State of the Art
Neeloy Chakraborty
Melkior Ornik
Katherine Driggs-Campbell
LRM
57
9
0
25 Mar 2024
DRAGIN: Dynamic Retrieval Augmented Generation based on the Information Needs of Large Language Models
Weihang Su
Yichen Tang
Qingyao Ai
Zhijing Wu
Yiqun Liu
3DV
RALM
AI4TS
SyDa
51
18
0
15 Mar 2024
SemEval-2024 Shared Task 6: SHROOM, a Shared-task on Hallucinations and Related Observable Overgeneration Mistakes
Timothee Mickus
Elaine Zosa
Raúl Vázquez
Teemu Vahtola
Jörg Tiedemann
Vincent Segonne
Alessandro Raganato
Marianna Apidianaki
HILM
LRM
43
21
0
12 Mar 2024
Unsupervised Real-Time Hallucination Detection based on the Internal States of Large Language Models
Weihang Su
Changyue Wang
Qingyao Ai
Hu Yiran
Zhijing Wu
Yujia Zhou
Yiqun Liu
HILM
39
28
0
11 Mar 2024
ERBench: An Entity-Relationship based Automatically Verifiable Hallucination Benchmark for Large Language Models
Jio Oh
Soyeon Kim
Junseok Seo
Jindong Wang
Ruochen Xu
Xing Xie
Steven Euijong Whang
38
1
0
08 Mar 2024
In Search of Truth: An Interrogation Approach to Hallucination Detection
Yakir Yehuda
Itzik Malkiel
Oren Barkan
Jonathan Weill
Royi Ronen
Noam Koenigstein
HILM
21
1
0
05 Mar 2024
DiaHalu: A Dialogue-level Hallucination Evaluation Benchmark for Large Language Models
Kedi Chen
Qin Chen
Jie Zhou
Yishen He
Liang He
HILM
38
1
0
01 Mar 2024
Successfully Guiding Humans with Imperfect Instructions by Highlighting Potential Errors and Suggesting Corrections
Lingjun Zhao
Khanh Nguyen
Hal Daumé
37
1
0
26 Feb 2024
Fast Adversarial Attacks on Language Models In One GPU Minute
Vinu Sankar Sadasivan
Shoumik Saha
Gaurang Sriramanan
Priyatham Kattakinda
Atoosa Malemir Chegini
S. Feizi
MIALM
40
34
0
23 Feb 2024
Merging Facts, Crafting Fallacies: Evaluating the Contradictory Nature of Aggregated Factual Claims in Long-Form Generations
Cheng-Han Chiang
Hung-yi Lee
HILM
69
8
0
08 Feb 2024
Benchmarking Large Multimodal Models against Common Corruptions
Jiawei Zhang
Tianyu Pang
Chao Du
Yi Ren
Bo-wen Li
Min-Bin Lin
MLLM
32
14
0
22 Jan 2024
On the Audio Hallucinations in Large Audio-Video Language Models
Taichi Nishimura
Shota Nakada
Masayoshi Kondo
VLM
27
5
0
18 Jan 2024
Hallucination Detection and Hallucination Mitigation: An Investigation
Junliang Luo
Tianyu Li
Di Wu
Michael R. M. Jenkin
Steve Liu
Gregory Dudek
HILM
LLMAG
44
22
0
16 Jan 2024
Small Language Model Can Self-correct
Haixia Han
Jiaqing Liang
Jie Shi
Qi He
Yanghua Xiao
LRM
SyDa
ReLM
KELM
40
11
0
14 Jan 2024
Parameter-Efficient Detoxification with Contrastive Decoding
Tong Niu
Caiming Xiong
Semih Yavuz
Yingbo Zhou
25
12
0
13 Jan 2024
Risk Taxonomy, Mitigation, and Assessment Benchmarks of Large Language Model Systems
Tianyu Cui
Yanling Wang
Chuanpu Fu
Yong Xiao
Sijia Li
...
Junwu Xiong
Xinyu Kong
Zujie Wen
Ke Xu
Qi Li
57
56
0
11 Jan 2024
AI Hallucinations: A Misnomer Worth Clarifying
Negar Maleki
Balaji Padmanabhan
Kaushik Dutta
28
34
0
09 Jan 2024
User Modeling in the Era of Large Language Models: Current Research and Future Directions
Zhaoxuan Tan
Meng Jiang
28
8
0
11 Dec 2023
DelucionQA: Detecting Hallucinations in Domain-specific Question Answering
Mobashir Sadat
Zhengyu Zhou
Lukas Lange
Jun Araki
Arsalan Gundroo
Bingqing Wang
Rakesh R Menon
Md. Rizwan Parvez
Zhe Feng
HILM
37
36
0
08 Dec 2023
UHGEval: Benchmarking the Hallucination of Chinese Large Language Models via Unconstrained Generation
Xun Liang
Shichao Song
Simin Niu
Zhiyu Li
Feiyu Xiong
...
Zhaohui Wy
Dawei He
Peng Cheng
Zhonghao Wang
Haiying Deng
HILM
34
19
0
26 Nov 2023
Enhancing Uncertainty-Based Hallucination Detection with Stronger Focus
Tianhang Zhang
Lin Qiu
Qipeng Guo
Cheng Deng
Yue Zhang
Zheng-Wei Zhang
Cheng Zhou
Xinbing Wang
Luoyi Fu
HILM
77
48
0
22 Nov 2023
Extrinsically-Focused Evaluation of Omissions in Medical Summarization
Elliot Schumacher
Daniel Rosenthal
Varun Nair
Luladay Price
Geoffrey Tso
Anitha Kannan
16
2
0
14 Nov 2023
RECALL: A Benchmark for LLMs Robustness against External Counterfactual Knowledge
Yi Liu
Lianzhe Huang
Shicheng Li
Sishuo Chen
Hao Zhou
Fandong Meng
Jie Zhou
Xu Sun
RALM
62
33
0
14 Nov 2023
SAC3: Reliable Hallucination Detection in Black-Box Language Models via Semantic-aware Cross-check Consistency
Jiaxin Zhang
Zhuohang Li
Kamalika Das
Bradley Malin
Kumar Sricharan
HILM
LRM
24
56
0
03 Nov 2023
Sequence-Level Certainty Reduces Hallucination In Knowledge-Grounded Dialogue Generation
Yixin Wan
Fanyou Wu
Weijie Xu
Srinivasan H. Sengamedu
HILM
24
5
0
28 Oct 2023
Hallucination Detection for Grounded Instruction Generation
Lingjun Zhao
Khanh Nguyen
Hal Daumé
HILM
39
7
0
23 Oct 2023
Language Models Hallucinate, but May Excel at Fact Verification
Jian-Yu Guan
Jesse Dodge
David Wadden
Minlie Huang
Hao Peng
LRM
HILM
31
28
0
23 Oct 2023
LUNA: A Model-Based Universal Analysis Framework for Large Language Models
Da Song
Xuan Xie
Jiayang Song
Derui Zhu
Yuheng Huang
Felix Juefei Xu
Lei Ma
ALM
35
3
0
22 Oct 2023
Core Building Blocks: Next Gen Geo Spatial GPT Application
Ashley Fernandez
Swaraj Dube
24
5
0
17 Oct 2023
DORIS-MAE: Scientific Document Retrieval using Multi-level Aspect-based Queries
Jianyou Wang
Kaicheng Wang
Xiaoyue Wang
Prudhviraj Naidu
Leon Bergen
R. Paturi
42
11
0
07 Oct 2023
ExpertQA: Expert-Curated Questions and Attributed Answers
Chaitanya Malaviya
Subin Lee
Sihao Chen
Elizabeth Sieber
Mark Yatskar
Dan Roth
ELM
HILM
22
50
0
14 Sep 2023
Zero-Resource Hallucination Prevention for Large Language Models
Junyu Luo
Cao Xiao
Fenglong Ma
HILM
29
16
0
06 Sep 2023
Siren's Song in the AI Ocean: A Survey on Hallucination in Large Language Models
Yue Zhang
Yafu Li
Leyang Cui
Deng Cai
Lemao Liu
...
Longyue Wang
A. Luu
Wei Bi
Freda Shi
Shuming Shi
RALM
LRM
HILM
46
520
0
03 Sep 2023
Med-HALT: Medical Domain Hallucination Test for Large Language Models
Ankit Pal
Logesh Kumar Umapathi
Malaikannan Sankarasubbu
HILM
LM&MA
VLM
28
128
0
28 Jul 2023
FLASK: Fine-grained Language Model Evaluation based on Alignment Skill Sets
Seonghyeon Ye
Doyoung Kim
Sungdong Kim
Hyeonbin Hwang
Seungone Kim
Yongrae Jo
James Thorne
Juho Kim
Minjoon Seo
ALM
40
98
0
20 Jul 2023
Generating Benchmarks for Factuality Evaluation of Language Models
Dor Muhlgay
Ori Ram
Inbal Magar
Yoav Levine
Nir Ratner
Yonatan Belinkov
Omri Abend
Kevin Leyton-Brown
Amnon Shashua
Y. Shoham
HILM
25
91
0
13 Jul 2023
Red Teaming Language Model Detectors with Language Models
Zhouxing Shi
Yihan Wang
Fan Yin
Xiangning Chen
Kai-Wei Chang
Cho-Jui Hsieh
DeLMO
14
50
0
31 May 2023
Inference-Time Policy Adapters (IPA): Tailoring Extreme-Scale LMs without Fine-tuning
Ximing Lu
Faeze Brahman
Peter West
Jaehun Jang
Khyathi Raghavi Chandu
...
Bill Yuchen Lin
Skyler Hallinan
Xiang Ren
Sean Welleck
Yejin Choi
22
26
0
24 May 2023
Distilling Script Knowledge from Large Language Models for Constrained Language Planning
Siyu Yuan
Jiangjie Chen
Ziquan Fu
Xuyang Ge
Soham Shah
C. R. Jankowski
Yanghua Xiao
Deqing Yang
43
47
0
09 May 2023
Decoder-Only or Encoder-Decoder? Interpreting Language Model as a Regularized Encoder-Decoder
Z. Fu
W. Lam
Qian Yu
Anthony Man-Cho So
Shengding Hu
Zhiyuan Liu
Nigel Collier
AuLLM
36
41
0
08 Apr 2023
Assessing Language Model Deployment with Risk Cards
Leon Derczynski
Hannah Rose Kirk
Vidhisha Balachandran
Sachin Kumar
Yulia Tsvetkov
M. Leiser
Saif Mohammad
28
42
0
31 Mar 2023
SelfCheckGPT: Zero-Resource Black-Box Hallucination Detection for Generative Large Language Models
Potsawee Manakul
Adian Liusie
Mark J. F. Gales
HILM
LRM
152
391
0
15 Mar 2023
R-U-SURE? Uncertainty-Aware Code Suggestions By Maximizing Utility Across Random User Intents
Daniel D. Johnson
Daniel Tarlow
Christian J. Walder
21
6
0
01 Mar 2023
Event knowledge in large language models: the gap between the impossible and the unlikely
Carina Kauf
Anna A. Ivanova
Giulia Rambelli
Emmanuele Chersoni
Jingyuan Selena She
Zawad Chowdhury
Evelina Fedorenko
Alessandro Lenci
37
67
0
02 Dec 2022
RuCoLA: Russian Corpus of Linguistic Acceptability
Vladislav Mikhailov
T. Shamardina
Max Ryabinin
A. Pestova
I. Smurov
Ekaterina Artemova
30
28
0
23 Oct 2022
ThinkSum: Probabilistic reasoning over sets using large language models
Batu Mehmet Ozturkler
Nikolay Malkin
Zhen Wang
Nebojsa Jojic
ReLM
LRM
49
22
0
04 Oct 2022
Generating Full Length Wikipedia Biographies: The Impact of Gender Bias on the Retrieval-Based Generation of Women Biographies
Angela Fan
Claire Gardent
22
4
0
12 Apr 2022
Previous
1
2
3
Next