FActScore: Fine-grained Atomic Evaluation of Factual Precision in Long Form Text Generation

23 May 2023
Sewon Min
Kalpesh Krishna
Xinxi Lyu
M. Lewis
Wen-tau Yih
Pang Wei Koh
Mohit Iyyer
Luke Zettlemoyer
Hannaneh Hajishirzi
HILM, ALM

Papers citing "FActScore: Fine-grained Atomic Evaluation of Factual Precision in Long Form Text Generation"

50 / 513 papers shown
Title
Medico: Towards Hallucination Detection and Correction with Multi-source Evidence Fusion
Xinping Zhao
Jindi Yu
Zhenyu Liu
Jifang Wang
Dongfang Li
Yibin Chen
Baotian Hu
Min Zhang
HILM
53
0
0
14 Oct 2024
BookWorm: A Dataset for Character Description and Analysis
Argyrios Papoudakis
Mirella Lapata
Frank Keller
54
2
0
14 Oct 2024
Can Knowledge Graphs Make Large Language Models More Trustworthy? An Empirical Study Over Open-ended Question Answering
Yuan Sui
Yufei He
Zifeng Ding
Bryan Hooi
HILM, RALM, ELM
150
10
0
10 Oct 2024
ReIFE: Re-evaluating Instruction-Following Evaluation
Yixin Liu
Kejian Shi
Alexander R. Fabbri
Yilun Zhao
Peifeng Wang
Chien-Sheng Wu
Shafiq Joty
Arman Cohan
95
6
0
09 Oct 2024
Uncovering Factor Level Preferences to Improve Human-Model Alignment
Juhyun Oh
Eunsu Kim
Jiseon Kim
Wenda Xu
Inha Cha
William Yang Wang
Alice Oh
75
1
0
09 Oct 2024
LLM Self-Correction with DeCRIM: Decompose, Critique, and Refine for Enhanced Following of Instructions with Multiple Constraints
Thomas Palmeira Ferraz
Kartik Mehta
Yu-Hsiang Lin
Haw-Shiuan Chang
Shereen Oraby
Sijia Liu
Vivek Subramanian
Tagyoung Chung
Mohit Bansal
Nanyun Peng
104
14
0
09 Oct 2024
ReFIR: Grounding Large Restoration Models with Retrieval Augmentation
Hang Guo
Tao Dai
Zhihao Ouyang
Taolin Zhang
Yaohua Zha
Bin Chen
Shu-Tao Xia
DiffM
83
6
0
08 Oct 2024
Why am I seeing this: Democratizing End User Auditing for Online Content Recommendations
Chaoran Chen
Leyang Li
Luke Cao
Yanfang Ye
Tianshi Li
Yaxing Yao
Toby Jia-jun Li
MLAU
85
2
0
07 Oct 2024
Realizing Video Summarization from the Path of Language-based Semantic Understanding
Kuan-Chen Mu
Zhi-Yi Chin
Wei-Chen Chiu
49
0
0
06 Oct 2024
Locating Information Gaps and Narrative Inconsistencies Across Languages: A Case Study of LGBT People Portrayals on Wikipedia
Farhan Samir
Chan Young Park
Anjalie Field
Vered Shwartz
Yulia Tsvetkov
122
2
0
05 Oct 2024
CS4: Measuring the Creativity of Large Language Models Automatically by Controlling the Number of Story-Writing Constraints
Anirudh Atmakuru
Jatin Nainani
Rohith Siddhartha Reddy Bheemreddy
Anirudh Lakkaraju
Zonghai Yao
Hamed Zamani
Haw-Shiuan Chang
201
3
0
05 Oct 2024
ECon: On the Detection and Resolution of Evidence Conflicts
Cheng Jiayang
Chunkit Chan
Qianqian Zhuang
Lin Qiu
Tianhang Zhang
Tengxiao Liu
Yangqiu Song
Yue Zhang
Pengfei Liu
Zheng Zhang
106
5
0
05 Oct 2024
FactCheckmate: Preemptively Detecting and Mitigating Hallucinations in LMs
Deema Alnuhait
Neeraja Kirtane
Muhammad Khalifa
Hao Peng
LRM, HILM
107
4
0
03 Oct 2024
Loki: An Open-Source Tool for Fact Verification
Haonan Li
Xudong Han
Hao Wang
Yuxia Wang
Minghan Wang
Rui Xing
Yilin Geng
Zenan Zhai
Preslav Nakov
Timothy Baldwin
SyDa, HILM
338
5
0
02 Oct 2024
Open-RAG: Enhanced Retrieval-Augmented Reasoning with Open-Source Large Language Models
Shayekh Bin Islam
Md Asib Rahman
K S M Tozammel Hossain
Enamul Hoque
Shafiq Joty
Md. Rizwan Parvez
RALM, AIFin, LRM, VLM
74
16
0
02 Oct 2024
FactAlign: Long-form Factuality Alignment of Large Language Models
Chao-Wei Huang
Yun-Nung Chen
HILM
67
4
0
02 Oct 2024
FaithEval: Can Your Language Model Stay Faithful to Context, Even If "The Moon is Made of Marshmallows"
Yifei Ming
Senthil Purushwalkam
Shrey Pandit
Zixuan Ke
Xuan-Phi Nguyen
Caiming Xiong
Shafiq Joty
HILM
285
24
0
30 Sep 2024
CoTKR: Chain-of-Thought Enhanced Knowledge Rewriting for Complex Knowledge Graph Question Answering
Yike Wu
Yi Huang
Nan Hu
Yuncheng Hua
Guilin Qi
Jiaoyan Chen
Jeff Z. Pan
118
9
0
29 Sep 2024
Model-based Preference Optimization in Abstractive Summarization without Human Feedback
Jaepill Choi
Kyubyung Chae
Jiwoo Song
Yohan Jo
Taesup Kim
68
2
0
27 Sep 2024
HaloScope: Harnessing Unlabeled LLM Generations for Hallucination Detection
Xuefeng Du
Chaowei Xiao
Yixuan Li
HILM
79
27
0
26 Sep 2024
Enhancing Post-Hoc Attributions in Long Document Comprehension via Coarse Grained Answer Decomposition
Pritika Ramu
Koustava Goswami
Apoorv Saxena
Balaji Vasan Srinivasan
66
3
0
25 Sep 2024
LINKAGE: Listwise Ranking among Varied-Quality References for Non-Factoid QA Evaluation via LLMs
Sihui Yang
Keping Bi
Wanqing Cui
Jiafeng Guo
Xueqi Cheng
100
3
0
23 Sep 2024
The Ability of Large Language Models to Evaluate Constraint-satisfaction in Agent Responses to Open-ended Requests
Lior Madmoni
Amir Zait
Ilia Labzovsky
Danny Karmon
ELM
61
0
0
22 Sep 2024
The Factuality of Large Language Models in the Legal Domain
Rajaa El Hamdani
Thomas Bonald
Fragkiskos D. Malliaros
Nils Holzenberger
Fabian M. Suchanek
AILaw, HILM
110
1
0
18 Sep 2024
LLM-as-a-Judge & Reward Model: What They Can and Cannot Do
Guijin Son
Hyunwoo Ko
Hoyoung Lee
Yewon Kim
Seunghyeok Hong
ALM, ELM
100
11
0
17 Sep 2024
Gaps or Hallucinations? Gazing into Machine-Generated Legal Analysis for Fine-grained Text Evaluations
Abe Bohan Hou
William Jurayj
Nils Holzenberger
Andrew Blair-Stanek
Benjamin Van Durme
ELM
59
0
0
16 Sep 2024
NovAScore: A New Automated Metric for Evaluating Document Level Novelty
Lin Ai
Ziwei Gong
Harshsaiprasad Deshpande
Alexander Johnson
Emmy Phung
Ahmad Emami
Julia Hirschberg
42
1
0
14 Sep 2024
When Context Leads but Parametric Memory Follows in Large Language Models
Yufei Tao
Adam Hiatt
Erik Haake
Antonie J. Jetter
Ameeta Agrawal
KELM
94
1
0
13 Sep 2024
AI-LieDar: Examine the Trade-off Between Utility and Truthfulness in LLM Agents
Zhe Su
Xuhui Zhou
Sanketh Rangreji
Anubha Kabra
Julia Mendelsohn
Faeze Brahman
Maarten Sap
LLMAG
185
7
0
13 Sep 2024
Synthetic continued pretraining
Zitong Yang
Neil Band
Shuangping Li
Emmanuel Candès
Tatsunori Hashimoto
CLL, SyDa
100
16
0
11 Sep 2024
GroUSE: A Benchmark to Evaluate Evaluators in Grounded Question Answering
Sacha Muller
António Loison
Bilel Omrani
Gautier Viaud
RALM, ELM
110
2
0
10 Sep 2024
Enhancing Temporal Understanding in Audio Question Answering for Large Audio Language Models
A. Sridhar
Yinyi Guo
Erik M. Visser
AuLLM
105
0
0
10 Sep 2024
What is the Role of Small Models in the LLM Era: A Survey
Lihu Chen
Gaël Varoquaux
ALM
248
32
0
10 Sep 2024
Hallucination Detection in LLMs: Fast and Memory-Efficient Finetuned Models
Gabriel Y. Arteaga
Thomas B. Schon
Nicolas Pielawski
115
9
0
04 Sep 2024
Generating Media Background Checks for Automated Source Critical Reasoning
Michael Schlichtkrull
82
4
0
01 Sep 2024
ContextCite: Attributing Model Generation to Context
Benjamin Cohen-Wang
Harshay Shah
Kristian Georgiev
Aleksander Madry
LRM
93
30
0
01 Sep 2024
LoraMap: Harnessing the Power of LoRA Connections
Hyeryun Park
Jeongwon Kwak
Dongsuk Jang
Sumin Park
Jinwook Choi
MoMe
80
0
0
29 Aug 2024
Measuring text summarization factuality using atomic facts entailment metrics in the context of retrieval augmented generation
N. E. Kriman
HILM
89
0
0
27 Aug 2024
What Makes a Good Story and How Can We Measure It? A Comprehensive Survey of Story Evaluation
Dingyi Yang
Qin Jin
130
7
0
26 Aug 2024
Claim Verification in the Age of Large Language Models: A Survey
A. Dmonte
Roland Oruche
Marcos Zampieri
Prasad Calyam
Isabelle Augenstein
187
11
0
26 Aug 2024
SLM Meets LLM: Balancing Latency, Interpretability and Consistency in Hallucination Detection
Mengya Hu
Rui Xu
Deren Lei
Yaxi Li
Mingyu Wang
Emily Ching
Eslam Kamal
Alex Deng
67
3
0
22 Aug 2024
RAGLAB: A Modular and Research-Oriented Unified Framework for Retrieval-Augmented Generation
Xuanwang Zhang
Yunze Song
Yidong Wang
Shuyun Tang
Xinfeng Li
...
Wei Dong
Yue Zhang
Xinyu Dai
Shikun Zhang
Qingsong Wen
103
5
0
21 Aug 2024
Unconditional Truthfulness: Learning Conditional Dependency for Uncertainty Quantification of Large Language Models
Artem Vazhentsev
Ekaterina Fadeeva
Rui Xing
Alexander Panchenko
Preslav Nakov
Timothy Baldwin
Maxim Panov
Artem Shelmanov
HILM
64
3
0
20 Aug 2024
Analysis of Plan-based Retrieval for Grounded Text Generation
Ameya Godbole
Nicholas Monath
Seungyeon Kim
A. S. Rawat
Andrew McCallum
Manzil Zaheer
RALM
101
3
0
20 Aug 2024
Web Retrieval Agents for Evidence-Based Misinformation Detection
Jacob-Junqi Tian
Hao Yu
Yury Orlovskiy
Tyler Vergho
Mauricio Rivera
Mayank Goel
Zachary Yang
Jean-Francois Godbout
Reihaneh Rabbany
Kellin Pelrine
LLMAG, OffRL
77
6
0
15 Aug 2024
Zero-shot Factual Consistency Evaluation Across Domains
Raunak Agarwal
HILM
118
0
0
07 Aug 2024
DebateQA: Evaluating Question Answering on Debatable Knowledge
Rongwu Xu
Xuan Qi
Zehan Qi
Wei Xu
Zhijiang Guo
ELM
83
7
0
02 Aug 2024
Misinforming LLMs: vulnerabilities, challenges and opportunities
Jaroslaw Kornowicz
Daniel Geissler
Kirsten Thommes
58
3
0
02 Aug 2024
A Course Shared Task on Evaluating LLM Output for Clinical Questions
Yufang Hou
Thy Thy Tran
Doan Nam Long Vu
Yiwen Cao
Kai Li
Lukas Rohde
Iryna Gurevych
LM&MA, ELM
52
0
0
31 Jul 2024
CLR-Fact: Evaluating the Complex Logical Reasoning Capability of Large Language Models over Factual Knowledge
Tianshi Zheng
Jiaxin Bai
Yicheng Wang
Tianqing Fang
Yue Guo
Yauwai Yim
Yangqiu Song
ELM, LRM
101
5
0
30 Jul 2024