Metrology for AI: From Benchmarks to Instruments

5 November 2019

Praveen K. Paritosh

Lora Aroyo

Papers citing "Metrology for AI: From Benchmarks to Instruments"

Title | Authors | Tags | Counts | Date
Evaluation for Change | Rishi Bommasani | ELM | 59, 0, 0 | 20 Dec 2022
DataPerf: Benchmarks for Data-Centric AI Development | Mark Mazumder, Colby R. Banbury, Xiaozhe Yao, Bojan Karlaš, W. G. Rojas, ..., Carole-Jean Wu, Cody Coleman, Andrew Y. Ng, Peter Mattson, Vijay Janapa Reddi | VLM | 87, 104, 0 | 20 Jul 2022
The Dangers of Underclaiming: Reasons for Caution When Reporting How NLP Systems Fail | Sam Bowman | OffRL | 115, 45, 0 | 15 Oct 2021
What Will it Take to Fix Benchmarking in Natural Language Understanding? | Samuel R. Bowman, George E. Dahl | ELM, ALM | 76, 164, 0 | 05 Apr 2021
GENIE: Toward Reproducible and Standardized Human Evaluation for Text Generation | Daniel Khashabi, Gabriel Stanovsky, Jonathan Bragg, Nicholas Lourie, Jungo Kasai, Yejin Choi, Noah A. Smith, Daniel S. Weld | | 127, 21, 0 | 17 Jan 2021