
FActScore: Fine-grained Atomic Evaluation of Factual Precision in Long Form Text Generation

23 May 2023
Sewon Min
Kalpesh Krishna
Xinxi Lyu
M. Lewis
Wen-tau Yih
Pang Wei Koh
Mohit Iyyer
Luke Zettlemoyer
Hannaneh Hajishirzi
HILM, ALM

Papers citing "FActScore: Fine-grained Atomic Evaluation of Factual Precision in Long Form Text Generation"

50 / 513 papers shown
Title
Assisting in Writing Wikipedia-like Articles From Scratch with Large Language Models
Yijia Shao
Yucheng Jiang
Theodore A. Kanell
Peter Xu
Omar Khattab
Monica S. Lam
LLMAG, KELM
120
51
0
22 Feb 2024
RefuteBench: Evaluating Refuting Instruction-Following for Large Language Models
Jianhao Yan
Yun Luo
Yue Zhang
ALM, LRM
105
10
0
21 Feb 2024
Factual consistency evaluation of summarization in the Era of large language models
Zheheng Luo
Qianqian Xie
Sophia Ananiadou
HILM
59
2
0
21 Feb 2024
Identifying Factual Inconsistencies in Summaries: Grounding Model Inference via Task Taxonomy
Liyan Xu
Zhenlin Su
Mo Yu
Jin Xu
Jinho D. Choi
Jie Zhou
Fei Liu
HILM
95
2
0
20 Feb 2024
GenAudit: Fixing Factual Errors in Language Model Outputs with Evidence
Kundan Krishna
S. Ramprasad
Prakhar Gupta
Byron C. Wallace
Zachary Chase Lipton
Jeffrey P. Bigham
HILM, KELM, SyDa
126
9
0
19 Feb 2024
Small Models, Big Insights: Leveraging Slim Proxy Models To Decide When and What to Retrieve for LLMs
Jiejun Tan
Zhicheng Dou
Yutao Zhu
Peidong Guo
Kun Fang
Ji-Rong Wen
129
30
0
19 Feb 2024
Ask Optimal Questions: Aligning Large Language Models with Retriever's Preference in Conversation
Chanwoong Yoon
Gangwoo Kim
Byeongguk Jeon
Sungdong Kim
Yohan Jo
Jaewoo Kang
KELM, RALM
137
14
0
19 Feb 2024
FactPICO: Factuality Evaluation for Plain Language Summarization of Medical Evidence
Sebastian Antony Joseph
Lily Chen
Jan Trienes
Hannah Louisa Göke
Monika Coers
Wei Xu
Byron C. Wallace
Junyi Jessy Li
LM&MA, HILM
77
11
0
18 Feb 2024
KnowTuning: Knowledge-aware Fine-tuning for Large Language Models
Yougang Lyu
Lingyong Yan
Shuaiqiang Wang
Haibo Shi
D. Yin
Fajie Yuan
Zhumin Chen
Maarten de Rijke
Zhaochun Ren
82
7
0
17 Feb 2024
GenRES: Rethinking Evaluation for Generative Relation Extraction in the Era of Large Language Models
Pengcheng Jiang
Jiacheng Lin
Zifeng Wang
Jimeng Sun
Jiawei Han
64
6
0
16 Feb 2024
Retrieve Only When It Needs: Adaptive Retrieval Augmentation for Hallucination Mitigation in Large Language Models
Hanxing Ding
Liang Pang
Zihao Wei
Huawei Shen
Xueqi Cheng
HILM, RALM
144
18
0
16 Feb 2024
Comparing Hallucination Detection Metrics for Multilingual Generation
Haoqiang Kang
Terra Blevins
Luke Zettlemoyer
HILM
105
20
0
16 Feb 2024
Measuring and Reducing LLM Hallucination without Gold-Standard Answers
Jiaheng Wei
Yuanshun Yao
Jean-François Ton
Hongyi Guo
Andrew Estornell
Yang Liu
HILM
137
26
0
16 Feb 2024
Language Models with Conformal Factuality Guarantees
Christopher Mohri
Tatsunori Hashimoto
HILM
219
50
0
15 Feb 2024
Do LLMs Know about Hallucination? An Empirical Investigation of LLM's Hidden States
Hanyu Duan
Yi Yang
Kar Yan Tam
HILM
82
38
0
15 Feb 2024
Self-Alignment for Factuality: Mitigating Hallucinations in LLMs via Self-Evaluation
Xiaoying Zhang
Baolin Peng
Ye Tian
Jingyan Zhou
Lifeng Jin
Linfeng Song
Haitao Mi
Helen Meng
HILM
84
52
0
14 Feb 2024
Into the Unknown: Self-Learning Large Language Models
Teddy Ferdinan
Jan Kocoń
P. Kazienko
79
3
0
14 Feb 2024
InstructGraph: Boosting Large Language Models via Graph-centric Instruction Tuning and Preference Alignment
Jianing Wang
Junda Wu
Yupeng Hou
Yao Liu
Ming Gao
Julian McAuley
96
35
0
13 Feb 2024
Towards Faithful and Robust LLM Specialists for Evidence-Based Question-Answering
Tobias Schimanski
Jingwei Ni
Mathias Kraus
Elliott Ash
Markus Leippold
70
4
0
13 Feb 2024
Is it safe to cross? Interpretable Risk Assessment with GPT-4V for Safety-Aware Street Crossing
Hochul Hwang
Sunjae Kwon
Yekyung Kim
Donghyun Kim
67
13
0
09 Feb 2024
Calibrating Long-form Generations from Large Language Models
Yukun Huang
Yixin Liu
Raghuveer Thirukovalluru
Arman Cohan
Bhuwan Dhingra
67
14
0
09 Feb 2024
Large Language Models: A Survey
Shervin Minaee
Tomas Mikolov
Narjes Nikzad
M. Asgari-Chenaghlu
R. Socher
Xavier Amatriain
Jianfeng Gao
ALM, LM&MA, ELM
248
425
0
09 Feb 2024
Merging Facts, Crafting Fallacies: Evaluating the Contradictory Nature of Aggregated Factual Claims in Long-Form Generations
Cheng-Han Chiang
Hung-yi Lee
HILM
92
9
0
08 Feb 2024
Training Language Models to Generate Text with Citations via Fine-grained Rewards
Chengyu Huang
Zeqiu Wu
Yushi Hu
Wenya Wang
HILM, LRM
136
30
0
06 Feb 2024
Factuality of Large Language Models in the Year 2024
Yuxia Wang
Minghan Wang
Muhammad Arslan Manzoor
Fei Liu
Georgi Georgiev
Rocktim Jyoti Das
Preslav Nakov
LRM, HILM
105
35
0
04 Feb 2024
How well do LLMs cite relevant medical references? An evaluation framework and analyses
Kevin Wu
Eric Wu
Ally Cassasola
Angela Zhang
Kevin Wei
Teresa Nguyen
Sith Riantawan
Patricia Shi Riantawan
Daniel E. Ho
James Zou
LM&MA, ELM, AI4MH
89
32
0
03 Feb 2024
Rethinking the Role of Proxy Rewards in Language Model Alignment
Sungdong Kim
Minjoon Seo
SyDa, ALM
58
2
0
02 Feb 2024
A Survey on Hallucination in Large Vision-Language Models
Hanchao Liu
Wenyuan Xue
Yifei Chen
Dapeng Chen
Xiutian Zhao
Ke Wang
Liping Hou
Rong-Zhi Li
Wei Peng
LRM, MLLM
85
137
0
01 Feb 2024
Corrective Retrieval Augmented Generation
Shi-Qi Yan
Jia-Chen Gu
Yun Zhu
Zhen-Hua Ling
RALM
250
89
0
29 Jan 2024
PROXYQA: An Alternative Framework for Evaluating Long-Form Text Generation with Large Language Models
Haochen Tan
Zhijiang Guo
Zhan Shi
Lu Xu
Zhili Liu
...
Xiaoguang Li
Yasheng Wang
Lifeng Shang
Qun Liu
Linqi Song
101
16
0
26 Jan 2024
K-QA: A Real-World Medical Q&A Benchmark
Itay Manes
Naama Ronn
David Cohen
Ran Ilan Ber
Zehavi Horowitz-Kugler
Gabriel Stanovsky
LM&MA, HILM, AI4MH
95
13
0
25 Jan 2024
Code Prompting Elicits Conditional Reasoning Abilities in Text+Code LLMs
Haritz Puerto
Martin Tutek
Somak Aditya
Xiaodan Zhu
Iryna Gurevych
ReCod, ReLM, LRM
110
15
0
18 Jan 2024
RAG vs Fine-tuning: Pipelines, Tradeoffs, and a Case Study on Agriculture
M. A. D. L. Balaguer
Vinamra Benara
Renato Luiz de Freitas Cunha
Roberto de M. Estevao Filho
Todd Hendry
...
Morris Sharp
B. Silva
Swati Sharma
Vijay Aski
Ranveer Chandra
FaML
119
92
0
16 Jan 2024
Leveraging Large Language Models for NLG Evaluation: Advances and Challenges
Zhen Li
Xiaohan Xu
Tao Shen
Can Xu
Jia-Chen Gu
Yuxuan Lai
Chongyang Tao
Shuai Ma
LM&MA, ELM
134
15
0
13 Jan 2024
Fine-grained Hallucination Detection and Editing for Language Models
Abhika Mishra
Akari Asai
Vidhisha Balachandran
Yizhong Wang
Graham Neubig
Yulia Tsvetkov
Hannaneh Hajishirzi
HILM
114
87
0
12 Jan 2024
EASYTOOL: Enhancing LLM-based Agents with Concise Tool Instruction
Siyu Yuan
Kaitao Song
Jiangjie Chen
Xu Tan
Yongliang Shen
Ren Kan
Dongsheng Li
Deqing Yang
LLMAG
97
68
0
11 Jan 2024
Risk Taxonomy, Mitigation, and Assessment Benchmarks of Large Language Model Systems
Tianyu Cui
Yanling Wang
Chuanpu Fu
Yong Xiao
Sijia Li
...
Junwu Xiong
Xinyu Kong
ZuJie Wen
Ke Xu
Qi Li
165
64
0
11 Jan 2024
LightHouse: A Survey of AGI Hallucination
Feng Wang
LRM, HILM, VLM
99
3
0
08 Jan 2024
Large Language Models for Social Networks: Applications, Challenges, and Solutions
Jingying Zeng
Richard Huang
Waleed Malik
Langxuan Yin
Bojan Babic
Danny Shacham
Xiao Yan
Jaewon Yang
Qi He
70
9
0
04 Jan 2024
Large Legal Fictions: Profiling Legal Hallucinations in Large Language Models
Matthew Dahl
Varun Magesh
Mirac Suzgun
Daniel E. Ho
HILM, AILaw
126
86
0
02 Jan 2024
Do Androids Know They're Only Dreaming of Electric Sheep?
Sky CH-Wang
Benjamin Van Durme
Jason Eisner
Chris Kedzie
HILM
96
35
0
28 Dec 2023
Alleviating Hallucinations of Large Language Models through Induced Hallucinations
Yue Zhang
Leyang Cui
Wei Bi
Shuming Shi
HILM
108
57
0
25 Dec 2023
DSPy Assertions: Computational Constraints for Self-Refining Language Model Pipelines
Arnav Singhvi
Manish Shetty
Shangyin Tan
Christopher Potts
Koushik Sen
Matei A. Zaharia
Omar Khattab
80
21
0
20 Dec 2023
Towards Verifiable Text Generation with Evolving Memory and Self-Reflection
Hao Sun
Hengyi Cai
Bo Wang
Yingyan Hou
Xiaochi Wei
Shuaiqiang Wang
Yan Zhang
D. Yin
148
10
0
14 Dec 2023
Evaluating Large Language Models for Health-related Queries with Presuppositions
Navreet Kaur
Monojit Choudhury
Danish Pruthi
HILM, ELM
72
4
0
14 Dec 2023
Alignment for Honesty
Yuqing Yang
Ethan Chern
Xipeng Qiu
Graham Neubig
Pengfei Liu
87
35
0
12 Dec 2023
Dense X Retrieval: What Retrieval Granularity Should We Use?
Tong Chen
Hongwei Wang
Sihao Chen
Wenhao Yu
Kaixin Ma
Xinran Zhao
Hongming Zhang
Dong Yu
107
36
0
11 Dec 2023
User Modeling in the Era of Large Language Models: Current Research and Future Directions
Zhaoxuan Tan
Meng Jiang
116
12
0
11 Dec 2023
ChatGPT's One-year Anniversary: Are Open-Source Large Language Models Catching up?
Hailin Chen
Fangkai Jiao
Xingxuan Li
Chengwei Qin
Mathieu Ravaut
Ruochen Zhao
Caiming Xiong
Shafiq Joty
ELM, CLL, AI4MH, LRM, ALM
146
27
0
28 Nov 2023
RELIC: Investigating Large Language Model Responses using Self-Consistency
Furui Cheng
Vilém Zouhar
Simran Arora
Mrinmaya Sachan
Hendrik Strobelt
Mennatallah El-Assady
HILM
78
22
0
28 Nov 2023