FActScore: Fine-grained Atomic Evaluation of Factual Precision in Long Form Text Generation

23 May 2023
Sewon Min
Kalpesh Krishna
Xinxi Lyu
M. Lewis
Wen-tau Yih
Pang Wei Koh
Mohit Iyyer
Luke Zettlemoyer
Hannaneh Hajishirzi
    HILM, ALM

Papers citing "FActScore: Fine-grained Atomic Evaluation of Factual Precision in Long Form Text Generation"

50 / 513 papers shown
Chain-of-Thought Prompting Obscures Hallucination Cues in Large Language Models: An Empirical Evaluation
Jiahao Cheng
Tiancheng Su
Jia Yuan
Guoxiu He
Jiawei Liu
Xinqi Tao
Jingwen Xie
Huaxia Li
HILM, LRM
26
0
0
20 Jun 2025
MinosEval: Distinguishing Factoid and Non-Factoid for Tailored Open-Ended QA Evaluation with LLMs
Yongqi Fan
Yating Wang
Guandong Wang
Jie Zhai
Jingping Liu
Qi Ye
Tong Ruan
23
0
0
18 Jun 2025
MoR: Better Handling Diverse Queries with a Mixture of Sparse, Dense, and Human Retrievers
Jushaan Singh Kalra
Xinran Zhao
To Eun Kim
Fengyu Cai
Fernando Diaz
Tongshuang Wu
VLM
20
0
0
18 Jun 2025
A Vision for Geo-Temporal Deep Research Systems: Towards Comprehensive, Transparent, and Reproducible Geo-Temporal Information Synthesis
Bruno Martins
Piotr Szymański
Piotr Gramacki
27
0
0
17 Jun 2025
GenerationPrograms: Fine-grained Attribution with Executable Programs
David Wan
Eran Hirsch
Elias Stengel-Eskin
Ido Dagan
Mohit Bansal
25
0
0
17 Jun 2025
Improving Factuality for Dialogue Response Generation via Graph-Based Knowledge Augmentation
Xiangyan Chen
Yujian Gan
Matthew Purver
HILM
35
0
0
14 Jun 2025
RealFactBench: A Benchmark for Evaluating Large Language Models in Real-World Fact-Checking
Shuo Yang
Yuqin Dai
Guoqing Wang
Xinran Zheng
Jinfeng Xu
Jinze Li
ZhenZhe Ying
Weiqiang Wang
Edith C. -H. Ngai
HILM, LRM
26
0
0
14 Jun 2025
How Grounded is Wikipedia? A Study on Structured Evidential Support
William Walden
Kathryn Ricci
Miriam Wanner
Zhengping Jiang
Chandler May
Rongkun Zhou
Benjamin Van Durme
HILM
31
0
0
14 Jun 2025
Textual Bayes: Quantifying Uncertainty in LLM-Based Systems
Brendan Leigh Ross
Noël Vouitsis
Atiyeh Ashari Ghomi
Rasa Hosseinzadeh
Ji Xin
...
Yi Sui
Shiyi Hou
Kin Kwan Leung
Gabriel Loaiza-Ganem
Jesse C. Cresswell
72
0
0
11 Jun 2025
KG-Infused RAG: Augmenting Corpus-Based RAG with External Knowledge Graphs
Dingjun Wu
Y. Yan
Zhenghao Liu
Zhiyuan Liu
Maosong Sun
49
0
0
11 Jun 2025
LLM-as-a-qualitative-judge: automating error analysis in natural language generation
Nadezhda Chirkova
Tunde Oluwaseyi Ajayi
Seth Aycock
Zain Muhammad Mujahid
Vladana Perlić
Ekaterina Borisova
Markarit Vartampetian
ELM
30
0
0
10 Jun 2025
MiniCPM4: Ultra-Efficient LLMs on End Devices
MiniCPM Team
Chaojun Xiao
Yuxuan Li
Xu Han
Yuzhuo Bai
...
Zhiyuan Liu
Guoyang Zeng
Chao Jia
Dahai Li
Maosong Sun
MLLM
34
0
0
09 Jun 2025
ConfQA: Answer Only If You Are Confident
Yin Huang
Yifan Ethan Xu
Kai Sun
Vera Yan
Alicia Sun
...
Yue Liu
Aaron Colak
Anuj Kumar
Wen-tau Yih
Xin Luna Dong
HILM
20
0
0
08 Jun 2025
Beyond Facts: Evaluating Intent Hallucination in Large Language Models
Yijie Hao
Haofei Yu
Jiaxuan You
HILM, LRM
28
0
0
06 Jun 2025
Generating Grounded Responses to Counter Misinformation via Learning Efficient Fine-Grained Critiques
Xiaofei Xu
Xiuzhen Zhang
Ke Deng
HILM
50
0
0
06 Jun 2025
CLATTER: Comprehensive Entailment Reasoning for Hallucination Detection
Ron Eliav
Arie Cattan
Eran Hirsch
Shahaf Bassan
Elias Stengel-Eskin
Mohit Bansal
Ido Dagan
LRM
86
0
0
05 Jun 2025
SPARTA ALIGNMENT: Collectively Aligning Multiple Language Models through Combat
Yuru Jiang
Wenxuan Ding
Shangbin Feng
Greg Durrett
Yulia Tsvetkov
90
0
0
05 Jun 2025
SUCEA: Reasoning-Intensive Retrieval for Adversarial Fact-checking through Claim Decomposition and Editing
Hongjun Liu
Yilun Zhao
Arman Cohan
Chen Zhao
AAML, LRM
106
0
0
05 Jun 2025
Micro-Act: Mitigate Knowledge Conflict in Question Answering via Actionable Self-Reasoning
Nan Huo
Jinyang Li
Bowen Qin
Ge Qu
Xiaolong Li
Xiaodong Li
Chenhao Ma
Reynold Cheng
RALM
118
0
0
05 Jun 2025
High Accuracy, Less Talk (HALT): Reliable LLMs through Capability-Aligned Finetuning
Tim Franzmeyer
Archie Sravankumar
Lijuan Liu
Yuning Mao
Rui Hou
Sinong Wang
Jakob Foerster
Luke Zettlemoyer
Madian Khabsa
KELM, ALM
83
0
0
04 Jun 2025
TracLLM: A Generic Framework for Attributing Long Context LLMs
Yanting Wang
Wei Zou
Runpeng Geng
Jinyuan Jia
LLMAG
126
0
0
04 Jun 2025
Shaking to Reveal: Perturbation-Based Detection of LLM Hallucinations
Jinyuan Luo
Zhen Fang
Yixuan Li
Seongheon Park
Ling Chen
AAML, HILM
61
0
0
03 Jun 2025
Reconsidering LLM Uncertainty Estimation Methods in the Wild
Yavuz Faruk Bakman
D. Yaldiz
Sungmin Kang
Tuo Zhang
Baturalp Buyukates
Salman Avestimehr
Sai Praneeth Karimireddy
45
0
0
01 Jun 2025
LAQuer: Localized Attribution Queries in Content-grounded Generation
Eran Hirsch
Aviv Slobodkin
David Wan
Elias Stengel-Eskin
Mohit Bansal
Ido Dagan
38
0
0
01 Jun 2025
Vid2Coach: Transforming How-To Videos into Task Assistants
Mina Huh
Zihui Xue
Ujjaini Das
Kumar Ashutosh
Kristen Grauman
Amy Pavel
19
0
0
31 May 2025
Inter-Passage Verification for Multi-evidence Multi-answer QA
Bingsen Chen
Shengjie Wang
Xi Ye
Chen Zhao
RALM
35
0
0
31 May 2025
WikiGap: Promoting Epistemic Equity by Surfacing Knowledge Gaps Between English Wikipedia and other Language Editions
Zining Wang
Yuxuan Zhang
Dongwook Yoon
Nicholas Vincent
Farhan Samir
Vered Shwartz
KELM
26
0
0
30 May 2025
HD-NDEs: Neural Differential Equations for Hallucination Detection in LLMs
Qing Li
Jiahui Geng
Zongxiong Chen
Derui Zhu
Yuxia Wang
Congbo Ma
Chenyang Lyu
Fakhri Karray
9
0
0
30 May 2025
Improving Reliability and Explainability of Medical Question Answering through Atomic Fact Checking in Retrieval-Augmented LLMs
Juraj Vladika
Annika Domres
Mai Nguyen
Rebecca Moser
Jana Nano
...
Denise Bernhardt
Stephanie E. Combs
Kai J. Borm
Florian Matthes
J. Peeken
HILM
24
0
0
30 May 2025
LaMP-QA: A Benchmark for Personalized Long-form Question Answering
Alireza Salemi
Hamed Zamani
20
0
0
30 May 2025
Reinforcement Learning for Better Verbalized Confidence in Long-Form Generation
Caiqi Zhang
Xiaochen Zhu
Chengzu Li
Nigel Collier
Andreas Vlachos
OffRL, HILM
53
1
0
29 May 2025
How Does Response Length Affect Long-Form Factuality
James Xu Zhao
Jimmy Z.J. Liu
Bryan Hooi
See-Kiong Ng
HILM, KELM
61
0
0
29 May 2025
ARC: Argument Representation and Coverage Analysis for Zero-Shot Long Document Summarization with Instruction Following LLMs
Mohamed S. Elaraby
Diane Litman
LLMAG
36
0
0
29 May 2025
From Parameters to Prompts: Understanding and Mitigating the Factuality Gap between Fine-Tuned LLMs
Xuan Gong
Hanbo Huang
Shiyu Liang
42
0
0
29 May 2025
LegalSearchLM: Rethinking Legal Case Retrieval as Legal Elements Generation
Chaeeun Kim
Jinu Lee
Wonseok Hwang
AILaw, RALM, ELM
30
0
0
28 May 2025
Retrieval-Augmented Generation: A Comprehensive Survey of Architectures, Enhancements, and Robustness Frontiers
Chaitanya Sharma
RALM, 3DV
40
0
0
28 May 2025
Found in Translation: Measuring Multilingual LLM Consistency as Simple as Translate then Evaluate
Ashim Gupta
Maitrey Mehta
Zhichao Xu
Vivek Srikumar
49
0
0
28 May 2025
MedHELM: Holistic Evaluation of Large Language Models for Medical Tasks
Suhana Bedi
Hejie Cui
Miguel Fuentes
Alyssa Unell
Michael Wornow
...
M. Lungren
Eric Horvitz
Percy Liang
M. Pfeffer
N. Shah
ELM, LM&MA, AI4MH
49
0
0
26 May 2025
Uncertainty-Aware Attention Heads: Efficient Unsupervised Uncertainty Quantification for LLMs
Artem Vazhentsev
Lyudmila Rvanova
Gleb Kuzmin
Ekaterina Fadeeva
Ivan Lazichny
...
Maxim Panov
Timothy Baldwin
Mrinmaya Sachan
Preslav Nakov
Artem Shelmanov
EDL, HILM
84
0
0
26 May 2025
Does quantization affect models' performance on long-context tasks?
Anmol Mekala
Anirudh Atmakuru
Yixiao Song
Marzena Karpinska
Mohit Iyyer
MQ
47
0
0
26 May 2025
Reasoning Is Not All You Need: Examining LLMs for Multi-Turn Mental Health Conversations
Mohit Chandra
Siddharth Sriraman
Harneet Singh Khanuja
Yiqiao Jin
Munmun De Choudhury
LM&MA, AI4MH, LRM
34
0
0
26 May 2025
ExAnte: A Benchmark for Ex-Ante Inference in Large Language Models
Yachuan Liu
Xiaochun Wei
Lin Shi
Xinnuo Li
Bohan Zhang
Paramveer S. Dhillon
Qiaozhu Mei
65
0
0
26 May 2025
DeepResearchGym: A Free, Transparent, and Reproducible Evaluation Sandbox for Deep Research
Joao Coelho
Jingjie Ning
Jingyuan He
Kangrui Mao
Abhijay Paladugu
...
Jiahe Jin
Jamie Callan
João Magalhães
Bruno Martins
Chenyan Xiong
76
2
0
25 May 2025
Dynamic Manifold Evolution Theory: Modeling and Stability Analysis of Latent Representations in Large Language Models
Yukun Zhang
Qi Dong
AI4CE
28
0
0
24 May 2025
Writing Like the Best: Exemplar-Based Expository Text Generation
Yuxiang Liu
Kevin Chen-Chuan Chang
44
0
0
24 May 2025
MedScore: Factuality Evaluation of Free-Form Medical Answers
Heyuan Huang
Alexandra DeLucia
Vijay Murari Tiyyala
Mark Dredze
HILM, MedIm
51
0
0
24 May 2025
CUB: Benchmarking Context Utilisation Techniques for Language Models
Lovisa Hagström
Youna Kim
Haeun Yu
Sang-goo Lee
Richard Johansson
Hyunsoo Cho
Isabelle Augenstein
63
1
0
22 May 2025
UNCLE: Uncertainty Expressions in Long-Form Generation
Ruihan Yang
Caiqi Zhang
Zhisong Zhang
Xinting Huang
Dong Yu
Nigel Collier
Deqing Yang
ELM
69
2
0
22 May 2025
Advancing the Scientific Method with Large Language Models: From Hypothesis to Discovery
Yanbo Zhang
S. Khan
Adnan Mahmud
Huck Yang
Alexander Lavin
...
James A. Evans
Alan R. Bundy
Jannis Brugger
Jesper Tegner
Hector Zenil
LM&MA
100
1
0
22 May 2025
HASH-RAG: Bridging Deep Hashing with Retriever for Efficient, Fine Retrieval and Augmented Generation
Jinyu Guo
Xunlei Chen
Qiyang Xia
Zhaokun Wang
Jie Ou
Libo Qin
Shunyu Yao
Wenhong Tian
207
0
0
22 May 2025