ResearchTrend.AI
Evaluating Factuality in Generation with Dependency-level Entailment

12 October 2020
Tanya Goyal
Greg Durrett

Papers citing "Evaluating Factuality in Generation with Dependency-level Entailment"

30 / 30 papers shown
Improving Factuality in Large Language Models via Decoding-Time Hallucinatory and Truthful Comparators
Dingkang Yang
Dongling Xiao
Jinjie Wei
Mingcheng Li
Zhaoyu Chen
Ke Li
Li Zhang
HILM
94
3
0
28 Jan 2025
From Single to Multi: How LLMs Hallucinate in Multi-Document Summarization
Catarina G. Belem
Pouya Pezeshkpour
Hayate Iso
Seiji Maekawa
Nikita Bhutani
Estevam R. Hruschka
HILM
73
1
0
17 Oct 2024
FineSurE: Fine-grained Summarization Evaluation using LLMs
Hwanjun Song
Hang Su
Igor Shalyminov
Jason (Jinglun) Cai
Saab Mansour
HILM
41
31
0
01 Jul 2024
Factual Dialogue Summarization via Learning from Large Language Models
Rongxin Zhu
Jey Han Lau
Jianzhong Qi
HILM
52
1
0
20 Jun 2024
Length-Controlled AlpacaEval: A Simple Way to Debias Automatic Evaluators
Yann Dubois
Balázs Galambosi
Percy Liang
Tatsunori Hashimoto
ALM
55
321
0
06 Apr 2024
A Closer Look at Claim Decomposition
Miriam Wanner
Seth Ebner
Zhengping Jiang
Mark Dredze
Benjamin Van Durme
49
18
0
18 Mar 2024
A Comprehensive Survey on Process-Oriented Automatic Text Summarization with Exploration of LLM-Based Methods
Hanlei Jin
Yang Zhang
Dan Meng
Jun Wang
Jinghua Tan
68
80
0
05 Mar 2024
PROXYQA: An Alternative Framework for Evaluating Long-Form Text Generation with Large Language Models
Haochen Tan
Zhijiang Guo
Zhan Shi
Lu Xu
Zhili Liu
...
Xiaoguang Li
Yasheng Wang
Lifeng Shang
Qun Liu
Linqi Song
35
12
0
26 Jan 2024
STRONG -- Structure Controllable Legal Opinion Summary Generation
Yang Zhong
Diane Litman
ELM
AILaw
30
1
0
29 Sep 2023
A Critical Evaluation of Evaluations for Long-form Question Answering
Fangyuan Xu
Yixiao Song
Mohit Iyyer
Eunsol Choi
ELM
37
96
0
29 May 2023
Annotating and Detecting Fine-grained Factual Errors for Dialogue Summarization
Rongxin Zhu
Jianzhong Qi
Jey Han Lau
36
9
0
26 May 2023
Evaluating Factual Consistency of Summaries with Large Language Models
Shiqi Chen
Siyang Gao
Junxian He
ELM
LRM
HILM
32
6
0
23 May 2023
Benchmarking Large Language Models for News Summarization
Tianyi Zhang
Faisal Ladhak
Esin Durmus
Percy Liang
Kathleen McKeown
Tatsunori B. Hashimoto
ELM
28
478
0
31 Jan 2023
On Improving Summarization Factual Consistency from Natural Language Feedback
Yixin Liu
Budhaditya Deb
Milagro Teruel
Aaron L Halfaker
Dragomir R. Radev
Ahmed Hassan Awadallah
HILM
27
35
0
20 Dec 2022
BUMP: A Benchmark of Unfaithful Minimal Pairs for Meta-Evaluation of Faithfulness Metrics
Liang Ma
Shuyang Cao
Robert L. Logan IV
Di Lu
Shihao Ran
Kecheng Zhang
Joel R. Tetreault
A. Jaimes
17
6
0
20 Dec 2022
HaRiM$^+$: Evaluating Summary Quality with Hallucination Risk
Seonil Son
Junsoo Park
J. Hwang
Junghwa Lee
Hyungjong Noh
Yeonsoo Lee
HILM
6
8
0
22 Nov 2022
Questioning the Validity of Summarization Datasets and Improving Their Factual Consistency
Yanzhu Guo
Chloé Clavel
Moussa Kamal Eddine
Michalis Vazirgiannis
HILM
27
11
0
31 Oct 2022
How Far are We from Robust Long Abstractive Summarization?
Huan Yee Koh
Jiaxin Ju
He Zhang
Ming Liu
Shirui Pan
HILM
23
39
0
30 Oct 2022
Mutual Information Alleviates Hallucinations in Abstractive Summarization
Liam van der Poel
Ryan Cotterell
Clara Meister
HILM
16
56
0
24 Oct 2022
Analyzing and Evaluating Faithfulness in Dialogue Summarization
Bin Wang
Chen Zhang
Yan Zhang
Yiming Chen
Haizhou Li
HILM
41
14
0
21 Oct 2022
An Empirical Survey on Long Document Summarization: Datasets, Models and Metrics
Huan Yee Koh
Jiaxin Ju
Ming Liu
Shirui Pan
78
122
0
03 Jul 2022
QRelScore: Better Evaluating Generated Questions with Deeper Understanding of Context-aware Relevance
Xiaoqiang Wang
Bang Liu
Siliang Tang
Lingfei Wu
30
9
0
29 Apr 2022
Repro: An Open-Source Library for Improving the Reproducibility and Usability of Publicly Available Research Code
Daniel Deutsch
Dan Roth
AI4CE
39
2
0
29 Apr 2022
Evaluating Factuality in Text Simplification
Ashwin Devaraj
William Sheffield
Byron C. Wallace
Junyi Jessy Li
HILM
21
41
0
15 Apr 2022
Evaluation of Automatic Text Summarization using Synthetic Facts
J. Ahn
Foaad Khosmood
HILM
13
0
0
11 Apr 2022
Survey of Hallucination in Natural Language Generation
Ziwei Ji
Nayeon Lee
Rita Frieske
Tiezheng Yu
D. Su
...
Delong Chen
Wenliang Dai
Ho Shu Chan
Andrea Madotto
Pascale Fung
HILM
LRM
49
2,243
0
08 Feb 2022
Understanding Factuality in Abstractive Summarization with FRANK: A Benchmark for Factuality Metrics
Artidoro Pagnoni
Vidhisha Balachandran
Yulia Tsvetkov
HILM
231
305
0
27 Apr 2021
Improving Faithfulness in Abstractive Summarization with Contrast Candidate Generation and Selection
Sihao Chen
Fan Zhang
Kazoo Sone
Dan Roth
HILM
36
104
0
19 Apr 2021
Annotating and Modeling Fine-grained Factuality in Summarization
Tanya Goyal
Greg Durrett
HILM
13
153
0
09 Apr 2021
Evaluation of Text Generation: A Survey
Asli Celikyilmaz
Elizabeth Clark
Jianfeng Gao
ELM
LM&MA
19
376
0
26 Jun 2020