
FaStFACT: Faster, Stronger Long-Form Factuality Evaluations in LLMs

Main: 8 pages · 5 figures · Bibliography: 4 pages · 9 tables · Appendix: 30 pages

Abstract

Evaluating the factuality of long-form generations from Large Language Models (LLMs) remains challenging because automated evaluation is prone to inaccuracy and human assessment is costly. Prior work approaches the task by decomposing text into claims, searching for evidence, and verifying each claim, but suffers from two critical drawbacks: (1) inefficiency, because complex pipeline components are ill-suited to long LLM outputs, and (2) ineffectiveness, stemming from inaccurate claim sets and insufficient evidence that is limited to one-line snippets.
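
The decompose, search, verify pattern that the abstract attributes to prior work can be illustrated with a toy sketch. This is not FaStFACT's method or any published system's implementation: every name here (`decompose`, `search_evidence`, `verify`, `factuality_score`) is hypothetical, and the sentence splitting, keyword retrieval over a local `corpus`, and word-overlap check are assumed stand-ins for the LLM-based claim extraction, web search, and LLM-based verification that real pipelines would use.

```python
from __future__ import annotations

import re
from dataclasses import dataclass


@dataclass
class Verdict:
    claim: str
    supported: bool


def content_words(text: str) -> set[str]:
    # Lowercased alphanumeric tokens longer than three characters.
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if len(w) > 3}


def decompose(text: str) -> list[str]:
    # Assumption: one sentence = one claim (real systems prompt an LLM here).
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def search_evidence(claim: str, corpus: list[str], min_overlap: int = 2) -> list[str]:
    # Assumption: keyword overlap against a local corpus replaces web search.
    words = content_words(claim)
    return [p for p in corpus if len(words & content_words(p)) >= min_overlap]


def verify(claim: str, evidence: list[str]) -> bool:
    # Toy check: a claim is supported if one passage contains all of its
    # content words (real systems ask an LLM to judge claim vs. evidence).
    words = content_words(claim)
    return any(words <= content_words(p) for p in evidence)


def factuality_score(response: str, corpus: list[str]) -> tuple[float, list[Verdict]]:
    # Fraction of extracted claims that are supported by retrieved evidence.
    verdicts = [
        Verdict(c, verify(c, search_evidence(c, corpus))) for c in decompose(response)
    ]
    precision = sum(v.supported for v in verdicts) / len(verdicts) if verdicts else 0.0
    return precision, verdicts
```

Even in this toy form, both drawbacks named in the abstract are visible: the per-claim loop issues one retrieval and one verification per claim, so cost grows with response length, and a verifier that only sees short passages can miss evidence that a longer document would supply.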
