2409.09598
Improving Statistical Significance in Human Evaluation of Automatic Metrics via Soft Pairwise Accuracy
15 September 2024
Brian Thompson
Nitika Mathur
Daniel Deutsch
Huda Khayrallah
Papers citing "Improving Statistical Significance in Human Evaluation of Automatic Metrics via Soft Pairwise Accuracy"
6 / 6 papers shown
Same evaluation, more tokens: On the effect of input length for machine translation evaluation using Large Language Models
Tobias Domhan
Dawei Zhu
03 May 2025
Remedy: Learning Machine Translation Evaluation from Human Preferences with Reward Modeling
Shaomu Tan
Christof Monz
18 Apr 2025
Zero-shot Benchmarking: A Framework for Flexible and Scalable Automatic Evaluation of Language Models
José P. Pombal
Nuno M. Guerreiro
Ricardo Rei
André F. T. Martins
ALM
01 Apr 2025
Adding Chocolate to Mint: Mitigating Metric Interference in Machine Translation
José P. Pombal
Nuno M. Guerreiro
Ricardo Rei
André F. T. Martins
11 Mar 2025
MetaMetrics-MT: Tuning Meta-Metrics for Machine Translation via Human Preference Calibration
David Anugraha
Garry Kuwanto
Lucky Susanto
Derry Wijaya
Genta Indra Winata
OSLM
01 Nov 2024
Adapters for Altering LLM Vocabularies: What Languages Benefit the Most?
HyoJung Han
Akiko Eriguchi
Haoran Xu
Hieu T. Hoang
Marine Carpuat
Huda Khayrallah
VLM
12 Oct 2024