2204.10216
Cited By
Re-Examining System-Level Correlations of Automatic Summarization Evaluation Metrics
21 April 2022
Daniel Deutsch
Rotem Dror
Dan Roth
Papers citing "Re-Examining System-Level Correlations of Automatic Summarization Evaluation Metrics"
11 / 11 papers shown
Summarization Metrics for Spanish and Basque: Do Automatic Scores and LLM-Judges Correlate with Humans?
Jeremy Barnes
Naiara Perez
Alba Bonet-Jover
Begoña Altuna
62
1
0
21 Mar 2025
Beyond correlation: The Impact of Human Uncertainty in Measuring the Effectiveness of Automatic Evaluation and LLM-as-a-Judge
Aparna Elangovan
Jongwoo Ko
Lei Xu
Mahsa Elyasi
Ling Liu
S. Bodapati
Dan Roth
49
5
0
28 Jan 2025
Analyzing and Evaluating Correlation Measures in NLG Meta-Evaluation
Mingqi Gao
Xinyu Hu
Li Lin
Xiaojun Wan
28
1
0
28 Jan 2025
How to Train Long-Context Language Models (Effectively)
Tianyu Gao
Alexander Wettig
Howard Yen
Danqi Chen
RALM
72
38
0
03 Oct 2024
HELMET: How to Evaluate Long-Context Language Models Effectively and Thoroughly
Howard Yen
Tianyu Gao
Minmin Hou
Ke Ding
Daniel Fleischer
Peter Izsak
Moshe Wasserblat
Danqi Chen
ALM
ELM
62
25
0
03 Oct 2024
On the Challenges and Opportunities in Generative AI
Laura Manduchi
Kushagra Pandey
Robert Bamler
Ryan Cotterell
Sina Daubener
...
F. Wenzel
Frank Wood
Stephan Mandt
Vincent Fortuin
56
17
0
28 Feb 2024
Evaluating Factual Consistency of Texts with Semantic Role Labeling
Jing Fan
Dennis Aumiller
Michael Gertz
HILM
34
4
0
22 May 2023
LENS: A Learnable Evaluation Metric for Text Simplification
Mounica Maddela
Yao Dou
David Heineman
Wei-ping Xu
29
62
0
19 Dec 2022
Revisiting the Gold Standard: Grounding Summarization Evaluation with Robust Human Evaluation
Yixin Liu
Alexander R. Fabbri
Pengfei Liu
Yilun Zhao
Linyong Nan
...
Simeng Han
Shafiq R. Joty
Chien-Sheng Wu
Caiming Xiong
Dragomir R. Radev
ALM
24
132
0
15 Dec 2022
ROSCOE: A Suite of Metrics for Scoring Step-by-Step Reasoning
O. Yu. Golovneva
Moya Chen
Spencer Poff
Martin Corredor
Luke Zettlemoyer
Maryam Fazel-Zarandi
Asli Celikyilmaz
ReLM
LRM
34
138
0
15 Dec 2022
Understanding the Extent to which Summarization Evaluation Metrics Measure the Information Quality of Summaries
Daniel Deutsch
Dan Roth
56
7
0
23 Oct 2020