Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2305.14251
Cited By
FActScore: Fine-grained Atomic Evaluation of Factual Precision in Long Form Text Generation
23 May 2023
Sewon Min
Kalpesh Krishna
Xinxi Lyu
M. Lewis
Wen-tau Yih
Pang Wei Koh
Mohit Iyyer
Luke Zettlemoyer
Hannaneh Hajishirzi
HILM
ALM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"FActScore: Fine-grained Atomic Evaluation of Factual Precision in Long Form Text Generation"
50 / 455 papers shown
Title
Comparing Hallucination Detection Metrics for Multilingual Generation
Haoqiang Kang
Terra Blevins
Luke Zettlemoyer
HILM
37
16
0
16 Feb 2024
Measuring and Reducing LLM Hallucination without Gold-Standard Answers
Jiaheng Wei
Yuanshun Yao
Jean-François Ton
Hongyi Guo
Andrew Estornell
Yang Liu
HILM
55
18
0
16 Feb 2024
Language Models with Conformal Factuality Guarantees
Christopher Mohri
Tatsunori Hashimoto
HILM
44
33
0
15 Feb 2024
Do LLMs Know about Hallucination? An Empirical Investigation of LLM's Hidden States
Hanyu Duan
Yi Yang
Kar Yan Tam
HILM
27
28
0
15 Feb 2024
Self-Alignment for Factuality: Mitigating Hallucinations in LLMs via Self-Evaluation
Xiaoying Zhang
Baolin Peng
Ye Tian
Jingyan Zhou
Lifeng Jin
Linfeng Song
Haitao Mi
Helen Meng
HILM
42
45
0
14 Feb 2024
Into the Unknown: Self-Learning Large Language Models
Teddy Ferdinan
Jan Kocoñ
P. Kazienko
33
2
0
14 Feb 2024
InstructGraph: Boosting Large Language Models via Graph-centric Instruction Tuning and Preference Alignment
Jianing Wang
Junda Wu
Yupeng Hou
Yao Liu
Ming Gao
Julian McAuley
33
32
0
13 Feb 2024
Towards Faithful and Robust LLM Specialists for Evidence-Based Question-Answering
Tobias Schimanski
Jingwei Ni
Mathias Kraus
Elliott Ash
Markus Leippold
29
4
0
13 Feb 2024
Is it safe to cross? Interpretable Risk Assessment with GPT-4V for Safety-Aware Street Crossing
Hochul Hwang
Sunjae Kwon
Yekyung Kim
Donghyun Kim
32
11
0
09 Feb 2024
Calibrating Long-form Generations from Large Language Models
Yukun Huang
Yixin Liu
Raghuveer Thirukovalluru
Arman Cohan
Bhuwan Dhingra
27
7
0
09 Feb 2024
Large Language Models: A Survey
Shervin Minaee
Tomáš Mikolov
Narjes Nikzad
M. Asgari-Chenaghlu
R. Socher
Xavier Amatriain
Jianfeng Gao
ALM
LM&MA
ELM
134
371
0
09 Feb 2024
Merging Facts, Crafting Fallacies: Evaluating the Contradictory Nature of Aggregated Factual Claims in Long-Form Generations
Cheng-Han Chiang
Hung-yi Lee
HILM
75
8
0
08 Feb 2024
Training Language Models to Generate Text with Citations via Fine-grained Rewards
Chengyu Huang
Zeqiu Wu
Yushi Hu
Wenya Wang
HILM
LRM
79
26
0
06 Feb 2024
Factuality of Large Language Models in the Year 2024
Yuxia Wang
Minghan Wang
Muhammad Arslan Manzoor
Fei Liu
Georgi Georgiev
Rocktim Jyoti Das
Preslav Nakov
LRM
HILM
38
7
0
04 Feb 2024
How well do LLMs cite relevant medical references? An evaluation framework and analyses
Kevin Wu
Eric Wu
Ally Cassasola
Angela Zhang
Kevin Wei
Teresa Nguyen
Sith Riantawan
Patricia Shi Riantawan
Daniel E. Ho
James Zou
LM&MA
ELM
AI4MH
31
26
0
03 Feb 2024
Rethinking the Role of Proxy Rewards in Language Model Alignment
Sungdong Kim
Minjoon Seo
SyDa
ALM
31
0
0
02 Feb 2024
A Survey on Hallucination in Large Vision-Language Models
Hanchao Liu
Wenyuan Xue
Yifei Chen
Dapeng Chen
Xiutian Zhao
Ke Wang
Liping Hou
Rong-Zhi Li
Wei Peng
LRM
MLLM
32
115
0
01 Feb 2024
Corrective Retrieval Augmented Generation
Shi-Qi Yan
Jia-Chen Gu
Yun Zhu
Zhen-Hua Ling
RALM
155
72
0
29 Jan 2024
PROXYQA: An Alternative Framework for Evaluating Long-Form Text Generation with Large Language Models
Haochen Tan
Zhijiang Guo
Zhan Shi
Lu Xu
Zhili Liu
...
Xiaoguang Li
Yasheng Wang
Lifeng Shang
Qun Liu
Linqi Song
43
12
0
26 Jan 2024
K-QA: A Real-World Medical Q&A Benchmark
Itay Manes
Naama Ronn
David Cohen
Ran Ilan Ber
Zehavi Horowitz-Kugler
Gabriel Stanovsky
LM&MA
HILM
AI4MH
28
11
0
25 Jan 2024
Code Prompting Elicits Conditional Reasoning Abilities in Text+Code LLMs
Haritz Puerto
Martin Tutek
Somak Aditya
Xiaodan Zhu
Iryna Gurevych
ReCod
ReLM
LRM
56
10
0
18 Jan 2024
RAG vs Fine-tuning: Pipelines, Tradeoffs, and a Case Study on Agriculture
M. A. D. L. Balaguer
Vinamra Benara
Renato Luiz de Freitas Cunha
Roberto de M. Estevao Filho
Todd Hendry
...
Morris Sharp
B. Silva
Swati Sharma
Vijay Aski
Ranveer Chandra
FaML
38
82
0
16 Jan 2024
Leveraging Large Language Models for NLG Evaluation: Advances and Challenges
Zhen Li
Xiaohan Xu
Tao Shen
Can Xu
Jia-Chen Gu
Yuxuan Lai
Chongyang Tao
Shuai Ma
LM&MA
ELM
39
9
0
13 Jan 2024
Fine-grained Hallucination Detection and Editing for Language Models
Abhika Mishra
Akari Asai
Vidhisha Balachandran
Yizhong Wang
Graham Neubig
Yulia Tsvetkov
Hannaneh Hajishirzi
HILM
37
79
0
12 Jan 2024
EASYTOOL: Enhancing LLM-based Agents with Concise Tool Instruction
Siyu Yuan
Kaitao Song
Jiangjie Chen
Xu Tan
Yongliang Shen
Ren Kan
Dongsheng Li
Deqing Yang
LLMAG
28
54
0
11 Jan 2024
Risk Taxonomy, Mitigation, and Assessment Benchmarks of Large Language Model Systems
Tianyu Cui
Yanling Wang
Chuanpu Fu
Yong Xiao
Sijia Li
...
Junwu Xiong
Xinyu Kong
Zujie Wen
Ke Xu
Qi Li
60
56
0
11 Jan 2024
LightHouse: A Survey of AGI Hallucination
Feng Wang
LRM
HILM
VLM
32
3
0
08 Jan 2024
Large Language Models for Social Networks: Applications, Challenges, and Solutions
Jingying Zeng
Richard Huang
Waleed Malik
Langxuan Yin
Bojan Babic
Danny Shacham
Xiao Yan
Jaewon Yang
Qi He
22
7
0
04 Jan 2024
Large Legal Fictions: Profiling Legal Hallucinations in Large Language Models
Matthew Dahl
Varun Magesh
Mirac Suzgun
Daniel E. Ho
HILM
AILaw
25
73
0
02 Jan 2024
Do Androids Know They're Only Dreaming of Electric Sheep?
Sky CH-Wang
Benjamin Van Durme
Jason Eisner
Chris Kedzie
HILM
35
27
0
28 Dec 2023
Alleviating Hallucinations of Large Language Models through Induced Hallucinations
Yue Zhang
Leyang Cui
Wei Bi
Shuming Shi
HILM
42
50
0
25 Dec 2023
DSPy Assertions: Computational Constraints for Self-Refining Language Model Pipelines
Arnav Singhvi
Manish Shetty
Shangyin Tan
Christopher Potts
Koushik Sen
Matei A. Zaharia
Omar Khattab
25
16
0
20 Dec 2023
Towards Verifiable Text Generation with Evolving Memory and Self-Reflection
Hao Sun
Hengyi Cai
Bo Wang
Yingyan Hou
Xiaochi Wei
Shuaiqiang Wang
Yan Zhang
Dawei Yin
44
9
0
14 Dec 2023
Evaluating Large Language Models for Health-related Queries with Presuppositions
Navreet Kaur
Monojit Choudhury
Danish Pruthi
HILM
ELM
38
2
0
14 Dec 2023
Alignment for Honesty
Yuqing Yang
Ethan Chern
Xipeng Qiu
Graham Neubig
Pengfei Liu
44
30
0
12 Dec 2023
Dense X Retrieval: What Retrieval Granularity Should We Use?
Tong Chen
Hongwei Wang
Sihao Chen
Wenhao Yu
Kaixin Ma
Xinran Zhao
Hongming Zhang
Dong Yu
27
30
0
11 Dec 2023
User Modeling in the Era of Large Language Models: Current Research and Future Directions
Zhaoxuan Tan
Meng Jiang
30
8
0
11 Dec 2023
ChatGPT's One-year Anniversary: Are Open-Source Large Language Models Catching up?
Hailin Chen
Fangkai Jiao
Xingxuan Li
Chengwei Qin
Mathieu Ravaut
Ruochen Zhao
Caiming Xiong
Chenyu You
ELM
CLL
AI4MH
LRM
ALM
85
27
0
28 Nov 2023
RELIC: Investigating Large Language Model Responses using Self-Consistency
Furui Cheng
Vilém Zouhar
Simran Arora
Mrinmaya Sachan
Hendrik Strobelt
Mennatallah El-Assady
HILM
24
18
0
28 Nov 2023
A Survey of the Evolution of Language Model-Based Dialogue Systems
Hongru Wang
Lingzhi Wang
Yiming Du
Liang Chen
Jing Zhou
Yufei Wang
Kam-Fai Wong
LRM
67
21
0
28 Nov 2023
Deficiency of Large Language Models in Finance: An Empirical Examination of Hallucination
Haoqiang Kang
Xiao-Yang Liu
RALM
47
27
0
27 Nov 2023
UHGEval: Benchmarking the Hallucination of Chinese Large Language Models via Unconstrained Generation
Xun Liang
Shichao Song
Simin Niu
Zhiyu Li
Zhiyu Li
...
Zhaohui Wy
Dawei He
Peng Cheng
Zhonghao Wang
Haiying Deng
HILM
34
19
0
26 Nov 2023
Enhancing Uncertainty-Based Hallucination Detection with Stronger Focus
Tianhang Zhang
Lin Qiu
Qipeng Guo
Cheng Deng
Yue Zhang
Zheng-Wei Zhang
Cheng Zhou
Xinbing Wang
Luoyi Fu
HILM
77
49
0
22 Nov 2023
Deceptive Semantic Shortcuts on Reasoning Chains: How Far Can Models Go without Hallucination?
Bangzheng Li
Ben Zhou
Fei Wang
Xingyu Fu
Dan Roth
Muhao Chen
HILM
LRM
29
15
0
16 Nov 2023
DocLens: Multi-aspect Fine-grained Evaluation for Medical Text Generation
Yiqing Xie
Sheng Zhang
Hao Cheng
Pengfei Liu
Zelalem Gero
Cliff Wong
Tristan Naumann
Hoifung Poon
Carolyn Rose
MedIm
24
4
0
16 Nov 2023
Effective Large Language Model Adaptation for Improved Grounding and Citation Generation
Xi Ye
Ruoxi Sun
Sercan Ö. Arik
Tomas Pfister
HILM
34
25
0
16 Nov 2023
AMRFact: Enhancing Summarization Factuality Evaluation with AMR-Driven Negative Samples Generation
Haoyi Qiu
Kung-Hsiang Huang
Jingnong Qu
Nanyun Peng
HILM
28
6
0
16 Nov 2023
ARES: An Automated Evaluation Framework for Retrieval-Augmented Generation Systems
Jon Saad-Falcon
Omar Khattab
Christopher Potts
Matei A. Zaharia
RALM
30
106
0
16 Nov 2023
Ever: Mitigating Hallucination in Large Language Models through Real-Time Verification and Rectification
Haoqiang Kang
Juntong Ni
Huaxiu Yao
HILM
LRM
32
34
0
15 Nov 2023
How Well Do Large Language Models Truly Ground?
Hyunji Lee
Se June Joo
Chaeeun Kim
Joel Jang
Doyoung Kim
Kyoung-Woon On
Minjoon Seo
HILM
33
6
0
15 Nov 2023
Previous
1
2
3
...
10
7
8
9
Next