Ties Matter: Meta-Evaluating Modern Metrics with Pairwise Accuracy and Tie Calibration

23 May 2023

Papers citing "Ties Matter: Meta-Evaluating Modern Metrics with Pairwise Accuracy and Tie Calibration"

5 / 5 papers shown

Title
CROC: Evaluating and Training T2I Metrics with Pseudo- and Human-Labeled Contrastive Robustness Checks Christoph Leiter Yuki M. Asano M. Keuper Steffen Eger 22 0 0 16 May 2025
M-MAD: Multidimensional Multi-Agent Debate for Advanced Machine Translation Evaluation Zhaopeng Feng Jiayuan Su Jiamei Zheng Jiahan Ren Yan Zhang Jian Wu Hongwei Wang Zuozhu Liu ELM 208 0 0 21 Feb 2025
Beyond correlation: The Impact of Human Uncertainty in Measuring the Effectiveness of Automatic Evaluation and LLM-as-a-Judge Aparna Elangovan Jongwoo Ko Lei Xu Mahsa Elyasi Ling Liu S. Bodapati Dan Roth 52 6 0 28 Jan 2025
Analyzing and Evaluating Correlation Measures in NLG Meta-Evaluation Mingqi Gao Xinyu Hu Li Lin Xiaojun Wan 28 1 0 28 Jan 2025
Error Analysis Prompting Enables Human-Like Translation Evaluation in Large Language Models Qingyu Lu Baopu Qiu Liang Ding Liping Xie Tom Kocmi Dacheng Tao LRM ALM ELM 26 108 0 24 Mar 2023