ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2305.14251
  4. Cited By
FActScore: Fine-grained Atomic Evaluation of Factual Precision in Long
  Form Text Generation
v1v2 (latest)

FActScore: Fine-grained Atomic Evaluation of Factual Precision in Long Form Text Generation

23 May 2023
Sewon Min
Kalpesh Krishna
Xinxi Lyu
M. Lewis
Wen-tau Yih
Pang Wei Koh
Mohit Iyyer
Luke Zettlemoyer
Hannaneh Hajishirzi
    HILMALM
ArXiv (abs)PDFHTML

Papers citing "FActScore: Fine-grained Atomic Evaluation of Factual Precision in Long Form Text Generation"

50 / 513 papers shown
Title
A Survey of the Evolution of Language Model-Based Dialogue Systems
A Survey of the Evolution of Language Model-Based Dialogue Systems
Hongru Wang
Lingzhi Wang
Yiming Du
Liang Chen
Jing Zhou
Yufei Wang
Kam-Fai Wong
LRM
149
23
0
28 Nov 2023
Deficiency of Large Language Models in Finance: An Empirical Examination
  of Hallucination
Deficiency of Large Language Models in Finance: An Empirical Examination of Hallucination
Haoqiang Kang
Xiao-Yang Liu
RALM
90
33
0
27 Nov 2023
UHGEval: Benchmarking the Hallucination of Chinese Large Language Models
  via Unconstrained Generation
UHGEval: Benchmarking the Hallucination of Chinese Large Language Models via Unconstrained Generation
Xun Liang
Shichao Song
Pengnian Qi
Zhiyu Li
Feiyu Xiong
...
Zhaohui Wy
Dawei He
Peng Cheng
Zhonghao Wang
Haiying Deng
HILM
80
22
0
26 Nov 2023
Enhancing Uncertainty-Based Hallucination Detection with Stronger Focus
Enhancing Uncertainty-Based Hallucination Detection with Stronger Focus
Tianhang Zhang
Lin Qiu
Qipeng Guo
Cheng Deng
Yue Zhang
Zheng Zhang
Cheng Zhou
Xinbing Wang
Luoyi Fu
HILM
138
59
0
22 Nov 2023
Deceptive Semantic Shortcuts on Reasoning Chains: How Far Can Models Go
  without Hallucination?
Deceptive Semantic Shortcuts on Reasoning Chains: How Far Can Models Go without Hallucination?
Bangzheng Li
Ben Zhou
Fei Wang
Xingyu Fu
Dan Roth
Muhao Chen
HILMLRM
104
22
0
16 Nov 2023
DocLens: Multi-aspect Fine-grained Evaluation for Medical Text
  Generation
DocLens: Multi-aspect Fine-grained Evaluation for Medical Text Generation
Yiqing Xie
Sheng Zhang
Hao Cheng
Pengfei Liu
Zelalem Gero
Cliff Wong
Tristan Naumann
Hoifung Poon
Carolyn Rose
MedIm
73
5
0
16 Nov 2023
Effective Large Language Model Adaptation for Improved Grounding and
  Citation Generation
Effective Large Language Model Adaptation for Improved Grounding and Citation Generation
Xi Ye
Ruoxi Sun
Sercan O. Arik
Tomas Pfister
HILM
112
30
0
16 Nov 2023
AMRFact: Enhancing Summarization Factuality Evaluation with AMR-Driven
  Negative Samples Generation
AMRFact: Enhancing Summarization Factuality Evaluation with AMR-Driven Negative Samples Generation
Haoyi Qiu
Kung-Hsiang Huang
Jingnong Qu
Nanyun Peng
HILM
72
6
0
16 Nov 2023
ARES: An Automated Evaluation Framework for Retrieval-Augmented
  Generation Systems
ARES: An Automated Evaluation Framework for Retrieval-Augmented Generation Systems
Jon Saad-Falcon
Omar Khattab
Christopher Potts
Matei A. Zaharia
RALM
108
121
0
16 Nov 2023
Ever: Mitigating Hallucination in Large Language Models through
  Real-Time Verification and Rectification
Ever: Mitigating Hallucination in Large Language Models through Real-Time Verification and Rectification
Haoqiang Kang
Juntong Ni
Huaxiu Yao
HILMLRM
111
37
0
15 Nov 2023
How Well Do Large Language Models Truly Ground?
How Well Do Large Language Models Truly Ground?
Hyunji Lee
Se June Joo
Chaeeun Kim
Joel Jang
Doyoung Kim
Kyoung-Woon On
Minjoon Seo
HILM
95
8
0
15 Nov 2023
Factcheck-Bench: Fine-Grained Evaluation Benchmark for Automatic
  Fact-checkers
Factcheck-Bench: Fine-Grained Evaluation Benchmark for Automatic Fact-checkers
Yuxia Wang
Revanth Gangi Reddy
Zain Muhammad Mujahid
Arnav Arora
Aleksandr Rubashevskii
...
Nadav Borenstein
Aditya Pillai
Isabelle Augenstein
Iryna Gurevych
Preslav Nakov
HILM
125
42
0
15 Nov 2023
Fine-tuning Language Models for Factuality
Fine-tuning Language Models for Factuality
Katherine Tian
Eric Mitchell
Huaxiu Yao
Christopher D. Manning
Chelsea Finn
KELMHILMSyDa
85
185
0
14 Nov 2023
Extrinsically-Focused Evaluation of Omissions in Medical Summarization
Extrinsically-Focused Evaluation of Omissions in Medical Summarization
Elliot Schumacher
Daniel Rosenthal
Varun Nair
Luladay Price
Geoffrey Tso
Anitha Kannan
44
2
0
14 Nov 2023
LLatrieval: LLM-Verified Retrieval for Verifiable Generation
LLatrieval: LLM-Verified Retrieval for Verifiable Generation
Xiaonan Li
Changtai Zhu
Linyang Li
Zhangyue Yin
Tianxiang Sun
Xipeng Qiu
RALM
93
31
0
14 Nov 2023
A Survey on Hallucination in Large Language Models: Principles,
  Taxonomy, Challenges, and Open Questions
A Survey on Hallucination in Large Language Models: Principles, Taxonomy, Challenges, and Open Questions
Lei Huang
Weijiang Yu
Weitao Ma
Weihong Zhong
Zhangyin Feng
...
Qianglong Chen
Weihua Peng
Xiaocheng Feng
Bing Qin
Ting Liu
LRMHILM
142
939
0
09 Nov 2023
SEMQA: Semi-Extractive Multi-Source Question Answering
SEMQA: Semi-Extractive Multi-Source Question Answering
Tal Schuster
Á. Lelkes
Haitian Sun
Jai Gupta
Jonathan Berant
W. Cohen
Donald Metzler
70
14
0
08 Nov 2023
Sub-Sentence Encoder: Contrastive Learning of Propositional Semantic
  Representations
Sub-Sentence Encoder: Contrastive Learning of Propositional Semantic Representations
Sihao Chen
Hongming Zhang
Tong Chen
Ben Zhou
Wenhao Yu
Dian Yu
Baolin Peng
Hongwei Wang
Dan Roth
Dong Yu
SSL
102
15
0
07 Nov 2023
A Survey of Large Language Models Attribution
A Survey of Large Language Models Attribution
Dongfang Li
Zetian Sun
Xinshuo Hu
Zhenyu Liu
Ziyang Chen
Baotian Hu
Aiguo Wu
Min Zhang
HILM
96
54
0
07 Nov 2023
FAITHSCORE: Evaluating Hallucinations in Large Vision-Language Models
FAITHSCORE: Evaluating Hallucinations in Large Vision-Language Models
Liqiang Jing
Ruosen Li
Yunmo Chen
Mengzhao Jia
Xinya Du
MLLM
93
7
0
02 Nov 2023
LitCab: Lightweight Language Model Calibration over Short- and Long-form
  Responses
LitCab: Lightweight Language Model Calibration over Short- and Long-form Responses
Xin Liu
Muhammad Khalifa
Lu Wang
ALM
86
27
0
30 Oct 2023
Davidsonian Scene Graph: Improving Reliability in Fine-grained
  Evaluation for Text-to-Image Generation
Davidsonian Scene Graph: Improving Reliability in Fine-grained Evaluation for Text-to-Image Generation
Jaemin Cho
Yushi Hu
Roopal Garg
Peter Anderson
Ranjay Krishna
Jason Baldridge
Mohit Bansal
Jordi Pont-Tuset
Su Wang
EGVM
84
81
0
27 Oct 2023
Language Models Hallucinate, but May Excel at Fact Verification
Language Models Hallucinate, but May Excel at Fact Verification
Jian Guan
Jesse Dodge
David Wadden
Minlie Huang
Hao Peng
LRMHILM
111
33
0
23 Oct 2023
Large Language Models Help Humans Verify Truthfulness -- Except When
  They Are Convincingly Wrong
Large Language Models Help Humans Verify Truthfulness -- Except When They Are Convincingly Wrong
Chenglei Si
Navita Goyal
Sherry Tongshuang Wu
Chen Zhao
Shi Feng
Hal Daumé
Jordan L. Boyd-Graber
LRM
107
41
0
19 Oct 2023
Quantifying Self-diagnostic Atomic Knowledge in Chinese Medical
  Foundation Model: A Computational Analysis
Quantifying Self-diagnostic Atomic Knowledge in Chinese Medical Foundation Model: A Computational Analysis
Yaxin Fan
Feng Jiang
Benyou Wang
Peifeng Li
Haizhou Li
78
1
0
18 Oct 2023
Self-RAG: Learning to Retrieve, Generate, and Critique through
  Self-Reflection
Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection
Akari Asai
Zeqiu Wu
Yizhong Wang
Avirup Sil
Hannaneh Hajishirzi
RALM
283
783
0
17 Oct 2023
KGQuiz: Evaluating the Generalization of Encoded Knowledge in Large
  Language Models
KGQuiz: Evaluating the Generalization of Encoded Knowledge in Large Language Models
Yuyang Bai
Shangbin Feng
Vidhisha Balachandran
Zhaoxuan Tan
Shiqi Lou
Tianxing He
Yulia Tsvetkov
ELM
95
3
0
15 Oct 2023
KCTS: Knowledge-Constrained Tree Search Decoding with Token-Level
  Hallucination Detection
KCTS: Knowledge-Constrained Tree Search Decoding with Token-Level Hallucination Detection
Sehyun Choi
Tianqing Fang
Zhaowei Wang
Yangqiu Song
85
39
0
13 Oct 2023
Prometheus: Inducing Fine-grained Evaluation Capability in Language
  Models
Prometheus: Inducing Fine-grained Evaluation Capability in Language Models
Seungone Kim
Jamin Shin
Yejin Cho
Joel Jang
Shayne Longpre
...
Sangdoo Yun
Seongjin Shin
Sungdong Kim
James Thorne
Minjoon Seo
ALMLM&MAELM
113
240
0
12 Oct 2023
Survey on Factuality in Large Language Models: Knowledge, Retrieval and
  Domain-Specificity
Survey on Factuality in Large Language Models: Knowledge, Retrieval and Domain-Specificity
Cunxiang Wang
Xiaoze Liu
Yuanhao Yue
Xiangru Tang
Tianhang Zhang
...
Linyi Yang
Jindong Wang
Xing Xie
Zheng Zhang
Yue Zhang
HILMKELM
172
201
0
11 Oct 2023
Beyond Factuality: A Comprehensive Evaluation of Large Language Models
  as Knowledge Generators
Beyond Factuality: A Comprehensive Evaluation of Large Language Models as Knowledge Generators
Liang Chen
Yang Deng
Yatao Bian
Zeyu Qin
Bingzhe Wu
Tat-Seng Chua
Kam-Fai Wong
HILMELM
119
47
0
11 Oct 2023
Teaching Language Models to Hallucinate Less with Synthetic Tasks
Teaching Language Models to Hallucinate Less with Synthetic Tasks
Erik Jones
Hamid Palangi
Clarisse Simoes
Varun Chandrasekaran
Subhabrata Mukherjee
Arindam Mitra
Ahmed Hassan Awadallah
Ece Kamar
HILM
87
27
0
10 Oct 2023
Factuality Challenges in the Era of Large Language Models
Factuality Challenges in the Era of Large Language Models
Isabelle Augenstein
Timothy Baldwin
Meeyoung Cha
Tanmoy Chakraborty
Giovanni Luca Ciampaglia
...
Rubén Míguez
Preslav Nakov
Dietram A. Scheufele
Shivam Sharma
Giovanni Zagni
HILM
81
44
0
08 Oct 2023
Knowledge Crosswords: Geometric Knowledge Reasoning with Large Language
  Models
Knowledge Crosswords: Geometric Knowledge Reasoning with Large Language Models
Wenxuan Ding
Shangbin Feng
Yuhan Liu
Zhaoxuan Tan
Vidhisha Balachandran
Tianxing He
Yulia Tsvetkov
LRM
77
6
0
02 Oct 2023
BooookScore: A systematic exploration of book-length summarization in
  the era of LLMs
BooookScore: A systematic exploration of book-length summarization in the era of LLMs
Yapei Chang
Kyle Lo
Tanya Goyal
Mohit Iyyer
ALM
154
117
0
01 Oct 2023
FELM: Benchmarking Factuality Evaluation of Large Language Models
FELM: Benchmarking Factuality Evaluation of Large Language Models
Shiqi Chen
Yiran Zhao
Jinghan Zhang
Ethan Chern
Siyang Gao
Pengfei Liu
Junxian He
HILM
126
41
0
01 Oct 2023
STRONG -- Structure Controllable Legal Opinion Summary Generation
STRONG -- Structure Controllable Legal Opinion Summary Generation
Yang Zhong
Diane Litman
ELMAILaw
60
3
0
29 Sep 2023
Creating Trustworthy LLMs: Dealing with Hallucinations in Healthcare AI
Creating Trustworthy LLMs: Dealing with Hallucinations in Healthcare AI
Muhammad Aurangzeb Ahmad
Ilker Yaramis
Taposh Dutta Roy
LM&MAHILM
68
37
0
26 Sep 2023
Attention Satisfies: A Constraint-Satisfaction Lens on Factual Errors of
  Language Models
Attention Satisfies: A Constraint-Satisfaction Lens on Factual Errors of Language Models
Mert Yuksekgonul
Varun Chandrasekaran
Erik Jones
Suriya Gunasekar
Ranjita Naik
Hamid Palangi
Ece Kamar
Besmira Nushi
HILM
60
49
0
26 Sep 2023
Large Language Model Alignment: A Survey
Large Language Model Alignment: A Survey
Tianhao Shen
Renren Jin
Yufei Huang
Chuang Liu
Weilong Dong
Zishan Guo
Xinwei Wu
Yan Liu
Deyi Xiong
LM&MA
112
206
0
26 Sep 2023
Ragas: Automated Evaluation of Retrieval Augmented Generation
Ragas: Automated Evaluation of Retrieval Augmented Generation
ES Shahul
Jithin James
Luis Espinosa-Anke
Steven Schockaert
145
205
0
26 Sep 2023
Calibrating LLM-Based Evaluator
Calibrating LLM-Based Evaluator
Yuxuan Liu
Tianchi Yang
Shaohan Huang
Zihan Zhang
Haizhen Huang
Furu Wei
Weiwei Deng
Feng Sun
Qi Zhang
114
33
0
23 Sep 2023
LongDocFACTScore: Evaluating the Factuality of Long Document Abstractive
  Summarisation
LongDocFACTScore: Evaluating the Factuality of Long Document Abstractive Summarisation
Jennifer A Bishop
Qianqian Xie
Sophia Ananiadou
HILM
82
12
0
21 Sep 2023
Chain-of-Verification Reduces Hallucination in Large Language Models
Chain-of-Verification Reduces Hallucination in Large Language Models
Shehzaad Dhuliawala
M. Komeili
Jing Xu
Roberta Raileanu
Xian Li
Asli Celikyilmaz
Jason Weston
LRMHILM
70
206
0
20 Sep 2023
Exploring the impact of low-rank adaptation on the performance,
  efficiency, and regularization of RLHF
Exploring the impact of low-rank adaptation on the performance, efficiency, and regularization of RLHF
Simeng Sun
Dhawal Gupta
Mohit Iyyer
89
20
0
16 Sep 2023
ExpertQA: Expert-Curated Questions and Attributed Answers
ExpertQA: Expert-Curated Questions and Attributed Answers
Chaitanya Malaviya
Subin Lee
Sihao Chen
Elizabeth Sieber
Mark Yatskar
Dan Roth
ELMHILM
116
58
0
14 Sep 2023
Zero-shot Audio Topic Reranking using Large Language Models
Zero-shot Audio Topic Reranking using Large Language Models
Mengjie Qian
Rao Ma
Adian Liusie
Erfan Loweimi
Kate Knill
Mark Gales
80
1
0
14 Sep 2023
Cognitive Mirage: A Review of Hallucinations in Large Language Models
Cognitive Mirage: A Review of Hallucinations in Large Language Models
Hongbin Ye
Tong Liu
Aijia Zhang
Wei Hua
Weiqiang Jia
HILM
122
81
0
13 Sep 2023
Retrieving Evidence from EHRs with LLMs: Possibilities and Challenges
Retrieving Evidence from EHRs with LLMs: Possibilities and Challenges
Hiba Ahsan
Denis Jered McInerney
Jisoo Kim
Christopher Potter
Geoffrey S. Young
Silvio Amir
Byron C. Wallace
72
12
0
08 Sep 2023
Zero-Resource Hallucination Prevention for Large Language Models
Zero-Resource Hallucination Prevention for Large Language Models
Junyu Luo
Cao Xiao
Fenglong Ma
HILM
130
17
0
06 Sep 2023
Previous
123...10119
Next