ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2305.14251
  4. Cited By
FActScore: Fine-grained Atomic Evaluation of Factual Precision in Long
  Form Text Generation
v1v2 (latest)

FActScore: Fine-grained Atomic Evaluation of Factual Precision in Long Form Text Generation

23 May 2023
Sewon Min
Kalpesh Krishna
Xinxi Lyu
M. Lewis
Wen-tau Yih
Pang Wei Koh
Mohit Iyyer
Luke Zettlemoyer
Hannaneh Hajishirzi
    HILMALM
ArXiv (abs)PDFHTML

Papers citing "FActScore: Fine-grained Atomic Evaluation of Factual Precision in Long Form Text Generation"

50 / 513 papers shown
Title
HalluDial: A Large-Scale Benchmark for Automatic Dialogue-Level
  Hallucination Evaluation
HalluDial: A Large-Scale Benchmark for Automatic Dialogue-Level Hallucination Evaluation
Wen Luo
Tianshu Shen
Wei Li
Guangyue Peng
Richeng Xuan
Houfeng Wang
Xi Yang
HILM
111
12
0
11 Jun 2024
Post-Hoc Answer Attribution for Grounded and Trustworthy Long Document
  Comprehension: Task, Insights, and Challenges
Post-Hoc Answer Attribution for Grounded and Trustworthy Long Document Comprehension: Task, Insights, and Challenges
Abhilasha Sancheti
Koustava Goswami
Balaji Vasan Srinivasan
RALM
92
1
0
11 Jun 2024
A Probabilistic Framework for LLM Hallucination Detection via Belief Tree Propagation
A Probabilistic Framework for LLM Hallucination Detection via Belief Tree Propagation
Bairu Hou
Yang Zhang
Jacob Andreas
Shiyu Chang
161
7
0
11 Jun 2024
Husky: A Unified, Open-Source Language Agent for Multi-Step Reasoning
Husky: A Unified, Open-Source Language Agent for Multi-Step Reasoning
Joongwon Kim
Bhargavi Paranjape
Tushar Khot
Hannaneh Hajishirzi
LM&RoELMLLMAGLRM
85
9
0
10 Jun 2024
Verifiable Generation with Subsentence-Level Fine-Grained Citations
Verifiable Generation with Subsentence-Level Fine-Grained Citations
Shuyang Cao
Lu Wang
94
7
0
10 Jun 2024
Investigating and Addressing Hallucinations of LLMs in Tasks Involving
  Negation
Investigating and Addressing Hallucinations of LLMs in Tasks Involving Negation
Neeraj Varshney
Satyam Raj
Venkatesh Mishra
Agneet Chatterjee
Ritika Sarkar
Amir Saeidi
Chitta Baral
LRM
96
11
0
08 Jun 2024
WildBench: Benchmarking LLMs with Challenging Tasks from Real Users in
  the Wild
WildBench: Benchmarking LLMs with Challenging Tasks from Real Users in the Wild
Bill Yuchen Lin
Yuntian Deng
Khyathi Chandu
Faeze Brahman
Abhilasha Ravichander
Valentina Pyatkin
Nouha Dziri
Ronan Le Bras
Yejin Choi
108
82
0
07 Jun 2024
MAIRA-2: Grounded Radiology Report Generation
MAIRA-2: Grounded Radiology Report Generation
Shruthi Bannur
Kenza Bouzid
Daniel Coelho De Castro
Anton Schwaighofer
Sam Bond-Taylor
...
Anja Thieme
M. Lungren
Maria T. A. Wetscherek
Javier Alvarez-Valle
Stephanie L. Hyland
82
44
0
06 Jun 2024
PaCE: Parsimonious Concept Engineering for Large Language Models
PaCE: Parsimonious Concept Engineering for Large Language Models
Jinqi Luo
Tianjiao Ding
Kwan Ho Ryan Chan
D. Thaker
Aditya Chattopadhyay
Chris Callison-Burch
René Vidal
CVBM
98
12
0
06 Jun 2024
AI Agents Under Threat: A Survey of Key Security Challenges and Future
  Pathways
AI Agents Under Threat: A Survey of Key Security Challenges and Future Pathways
Zehang Deng
Yongjian Guo
Changzhou Han
Wanlun Ma
Junwu Xiong
Sheng Wen
Yang Xiang
157
49
0
04 Jun 2024
Safeguarding Large Language Models: A Survey
Safeguarding Large Language Models: A Survey
Yi Dong
Ronghui Mu
Yanghao Zhang
Siqi Sun
Tianle Zhang
...
Yi Qi
Jinwei Hu
Jie Meng
Saddek Bensalem
Xiaowei Huang
OffRLKELMAILaw
99
26
0
03 Jun 2024
When Can LLMs Actually Correct Their Own Mistakes? A Critical Survey of
  Self-Correction of LLMs
When Can LLMs Actually Correct Their Own Mistakes? A Critical Survey of Self-Correction of LLMs
Ryo Kamoi
Yusen Zhang
Nan Zhang
Jiawei Han
Rui Zhang
LRM
173
85
0
03 Jun 2024
CtrlA: Adaptive Retrieval-Augmented Generation via Probe-Guided Control
CtrlA: Adaptive Retrieval-Augmented Generation via Probe-Guided Control
Huanshuo Liu
Hao Zhang
Zhijiang Guo
Kuicai Dong
Xiangyang Li
Yi Quan Lee
Cong Zhang
Yong Liu
3DV
94
1
0
29 May 2024
Nearest Neighbor Speculative Decoding for LLM Generation and Attribution
Nearest Neighbor Speculative Decoding for LLM Generation and Attribution
Minghan Li
Xilun Chen
Ari Holtzman
Beidi Chen
Jimmy Lin
Wen-tau Yih
Xi Lin
RALMBDL
240
14
0
29 May 2024
TimeChara: Evaluating Point-in-Time Character Hallucination of
  Role-Playing Large Language Models
TimeChara: Evaluating Point-in-Time Character Hallucination of Role-Playing Large Language Models
Jaewoo Ahn
Taehyun Lee
Junyoung Lim
Jin-Hwa Kim
Sangdoo Yun
Hwaran Lee
Gunhee Kim
LLMAGHILM
88
14
0
28 May 2024
GRAG: Graph Retrieval-Augmented Generation
GRAG: Graph Retrieval-Augmented Generation
Yuntong Hu
Zhihan Lei
Zhengwu Zhang
Bo Pan
Chen Ling
Liang Zhao
121
31
0
26 May 2024
Accelerating Inference of Retrieval-Augmented Generation via Sparse
  Context Selection
Accelerating Inference of Retrieval-Augmented Generation via Sparse Context Selection
Yun Zhu
Jia-Chen Gu
Caitlin Sikora
Ho Ko
Yinxiao Liu
...
Lei Shu
Liangchen Luo
Lei Meng
Bang Liu
Jindong Chen
RALM
99
19
0
25 May 2024
Certifiably Robust RAG against Retrieval Corruption
Certifiably Robust RAG against Retrieval Corruption
Chong Xiang
Tong Wu
Zexuan Zhong
David Wagner
Danqi Chen
Prateek Mittal
SILM
99
58
0
24 May 2024
AGRaME: Any-Granularity Ranking with Multi-Vector Embeddings
AGRaME: Any-Granularity Ranking with Multi-Vector Embeddings
R. Reddy
Omar Attia
Yunyao Li
Heng Ji
Saloni Potdar
61
1
0
23 May 2024
RefChecker: Reference-based Fine-grained Hallucination Checker and
  Benchmark for Large Language Models
RefChecker: Reference-based Fine-grained Hallucination Checker and Benchmark for Large Language Models
Xiangkun Hu
Dongyu Ru
Lin Qiu
Qipeng Guo
Tianhang Zhang
Yang Xu
Yun Luo
Pengfei Liu
Yue Zhang
Zheng Zhang
HILMLRM
98
9
0
23 May 2024
Can LLMs Solve longer Math Word Problems Better?
Can LLMs Solve longer Math Word Problems Better?
Xin Xu
Tong Xiao
Zitong Chao
Zhenya Huang
Can Yang
Yang Wang
173
14
0
23 May 2024
CrossCheckGPT: Universal Hallucination Ranking for Multimodal Foundation
  Models
CrossCheckGPT: Universal Hallucination Ranking for Multimodal Foundation Models
Guangzhi Sun
Potsawee Manakul
Adian Liusie
Kunat Pipatanakul
Chao Zhang
P. Woodland
Mark Gales
HILMMLLM
84
9
0
22 May 2024
Automated Evaluation of Retrieval-Augmented Language Models with
  Task-Specific Exam Generation
Automated Evaluation of Retrieval-Augmented Language Models with Task-Specific Exam Generation
Gauthier Guinet
Behrooz Omidvar-Tehrani
Anoop Deoras
Laurent Callot
RALM
117
23
0
22 May 2024
Atomic Self-Consistency for Better Long Form Generations
Atomic Self-Consistency for Better Long Form Generations
Raghuveer Thirukovalluru
Yukun Huang
Bhuwan Dhingra
80
5
0
21 May 2024
Large Language Models Meet NLP: A Survey
Large Language Models Meet NLP: A Survey
Libo Qin
Qiguang Chen
Xiachong Feng
Yang Wu
Yongheng Zhang
Hai-Tao Zheng
Min Li
Wanxiang Che
Philip S. Yu
ALMLM&MAELMLRM
123
59
0
21 May 2024
OLAPH: Improving Factuality in Biomedical Long-form Question Answering
OLAPH: Improving Factuality in Biomedical Long-form Question Answering
Minbyul Jeong
Hyeon Hwang
Chanwoong Yoon
Taewhoo Lee
Jaewoo Kang
MedImHILMLM&MA
123
12
0
21 May 2024
Question-Based Retrieval using Atomic Units for Enterprise RAG
Question-Based Retrieval using Atomic Units for Enterprise RAG
Vatsal Raina
Mark Gales
74
14
0
20 May 2024
SciQAG: A Framework for Auto-Generated Science Question Answering
  Dataset with Fine-grained Evaluation
SciQAG: A Framework for Auto-Generated Science Question Answering Dataset with Fine-grained Evaluation
Yuwei Wan
Yixuan Liu
Aswathy Ajith
Clara Grazian
B. Hoex
Wenjie Zhang
Chunyu Kit
Tong Xie
Ian Foster
95
10
0
16 May 2024
LLMs can learn self-restraint through iterative self-reflection
LLMs can learn self-restraint through iterative self-reflection
Alexandre Piché
Aristides Milios
Dzmitry Bahdanau
Chris Pal
74
6
0
15 May 2024
Does Fine-Tuning LLMs on New Knowledge Encourage Hallucinations?
Does Fine-Tuning LLMs on New Knowledge Encourage Hallucinations?
Zorik Gekhman
G. Yona
Roee Aharoni
Matan Eyal
Amir Feder
Roi Reichart
Jonathan Herzig
146
137
0
09 May 2024
OpenFactCheck: A Unified Framework for Factuality Evaluation of LLMs
OpenFactCheck: A Unified Framework for Factuality Evaluation of LLMs
Yuxia Wang
Minghan Wang
Hasan Iqbal
Georgi Georgiev
Jiahui Geng
Preslav Nakov
HILM
107
2
0
09 May 2024
One vs. Many: Comprehending Accurate Information from Multiple Erroneous
  and Inconsistent AI Generations
One vs. Many: Comprehending Accurate Information from Multiple Erroneous and Inconsistent AI Generations
Yoonjoo Lee
Kihoon Son
Tae Soo Kim
Jisu Kim
John Joon Young Chung
Eytan Adar
Juho Kim
93
13
0
09 May 2024
Lory: Fully Differentiable Mixture-of-Experts for Autoregressive
  Language Model Pre-training
Lory: Fully Differentiable Mixture-of-Experts for Autoregressive Language Model Pre-training
Zexuan Zhong
Mengzhou Xia
Danqi Chen
Mike Lewis
MoE
110
19
0
06 May 2024
Recall Them All: Retrieval-Augmented Language Models for Long Object List Extraction from Long Documents
Recall Them All: Retrieval-Augmented Language Models for Long Object List Extraction from Long Documents
Sneha Singhania
Simon Razniewski
Gerhard Weikum
RALM
129
1
0
04 May 2024
FLAME: Factuality-Aware Alignment for Large Language Models
FLAME: Factuality-Aware Alignment for Large Language Models
Sheng-Chieh Lin
Luyu Gao
Barlas Oğuz
Wenhan Xiong
Jimmy Lin
Wen-tau Yih
Xilun Chen
HILM
95
20
0
02 May 2024
On the Evaluation of Machine-Generated Reports
On the Evaluation of Machine-Generated Reports
James Mayfield
Eugene Yang
Dawn J Lawrie
Sean MacAvaney
Paul McNamee
...
Orion Weller
Efsun Kayi
Kate Sanders
Marc Mason
Noah Hibbler
ALM
184
17
0
02 May 2024
GRAMMAR: Grounded and Modular Methodology for Assessment of
  Closed-Domain Retrieval-Augmented Language Model
GRAMMAR: Grounded and Modular Methodology for Assessment of Closed-Domain Retrieval-Augmented Language Model
Xinzhe Li
Ming Liu
Shang Gao
RALM
103
0
0
30 Apr 2024
From Matching to Generation: A Survey on Generative Information Retrieval
From Matching to Generation: A Survey on Generative Information Retrieval
Xiaoxi Li
Jiajie Jin
Yujia Zhou
Yuyao Zhang
Peitian Zhang
Yutao Zhu
Zhicheng Dou
3DV
212
61
0
23 Apr 2024
ISQA: Informative Factuality Feedback for Scientific Summarization
ISQA: Informative Factuality Feedback for Scientific Summarization
Zekai Li
Yanxia Qin
Qian Liu
Min-Yen Kan
HILM
60
1
0
20 Apr 2024
AmbigDocs: Reasoning across Documents on Different Entities under the
  Same Name
AmbigDocs: Reasoning across Documents on Different Entities under the Same Name
Yoonsang Lee
Xi Ye
Eunsol Choi
79
14
0
18 Apr 2024
Unifying Bias and Unfairness in Information Retrieval: A Survey of
  Challenges and Opportunities with Large Language Models
Unifying Bias and Unfairness in Information Retrieval: A Survey of Challenges and Opportunities with Large Language Models
Sunhao Dai
Chen Xu
Shicheng Xu
Liang Pang
Zhenhua Dong
Jun Xu
117
83
0
17 Apr 2024
FIZZ: Factual Inconsistency Detection by Zoom-in Summary and Zoom-out
  Document
FIZZ: Factual Inconsistency Detection by Zoom-in Summary and Zoom-out Document
Joonho Yang
Seunghyun Yoon
Byeongjeong Kim
Hwanhee Lee
HILM
114
7
0
17 Apr 2024
MiniCheck: Efficient Fact-Checking of LLMs on Grounding Documents
MiniCheck: Efficient Fact-Checking of LLMs on Grounding Documents
Liyan Tang
Philippe Laban
Greg Durrett
HILMSyDa
86
103
0
16 Apr 2024
NoticIA: A Clickbait Article Summarization Dataset in Spanish
NoticIA: A Clickbait Article Summarization Dataset in Spanish
Iker García-Ferrero
Begoña Altuna
93
2
0
11 Apr 2024
Best Practices and Lessons Learned on Synthetic Data for Language Models
Best Practices and Lessons Learned on Synthetic Data for Language Models
Ruibo Liu
Jerry W. Wei
Fangyu Liu
Chenglei Si
Yanzhe Zhang
...
Steven Zheng
Daiyi Peng
Diyi Yang
Denny Zhou
Andrew M. Dai
SyDaEgoV
126
96
0
11 Apr 2024
Pitfalls of Conversational LLMs on News Debiasing
Pitfalls of Conversational LLMs on News Debiasing
Ipek Baris Schlicht
Defne Altiok
Maryanne Taouk
Lucie Flek
95
4
0
09 Apr 2024
Characterizing Multimodal Long-form Summarization: A Case Study on
  Financial Reports
Characterizing Multimodal Long-form Summarization: A Case Study on Financial Reports
Tianyu Cao
Natraj Raman
Danial Dervovic
Chenhao Tan
62
5
0
09 Apr 2024
Know When To Stop: A Study of Semantic Drift in Text Generation
Know When To Stop: A Study of Semantic Drift in Text Generation
Ava Spataru
Eric Hambro
Elena Voita
Nicola Cancedda
61
3
0
08 Apr 2024
FGAIF: Aligning Large Vision-Language Models with Fine-grained AI Feedback
FGAIF: Aligning Large Vision-Language Models with Fine-grained AI Feedback
Liqiang Jing
Xinya Du
187
18
0
07 Apr 2024
PoLLMgraph: Unraveling Hallucinations in Large Language Models via State
  Transition Dynamics
PoLLMgraph: Unraveling Hallucinations in Large Language Models via State Transition Dynamics
Derui Zhu
Dingfan Chen
Qing Li
Zongxiong Chen
Lei Ma
Jens Grossklags
Mario Fritz
HILM
89
14
0
06 Apr 2024
Previous
123...10116789
Next