Wikibench: Community-Driven Data Curation for AI Evaluation on Wikipedia

21 February 2024

Aaron L Halfaker

Kenneth Holstein

Haiyi Zhu

Papers citing "Wikibench: Community-Driven Data Curation for AI Evaluation on Wikipedia"

DCAD-2000: A Multilingual Dataset across 2000+ Languages with Data Cleaning as Anomaly Detection
Yingli Shen, Wen Lai, Shuo Wang, Xueren Zhang, Kangyang Luo, Alexander M. Fraser, Maosong Sun
49 0 0 17 Feb 2025

Understanding the LLM-ification of CHI: Unpacking the Impact of LLMs at CHI through a Systematic Literature Review
Rock Yuren Pang, Hope Schroeder, Kynnedy Simone Smith, Solon Barocas, Ziang Xiao, Emily Tseng, Danielle Bragg
77 3 0 22 Jan 2025

A Roadmap to Pluralistic Alignment
Taylor Sorensen, Jared Moore, Jillian R. Fisher, Mitchell L. Gordon, Niloofar Mireshghallah, ..., Liwei Jiang, Ximing Lu, Nouha Dziri, Tim Althoff, Yejin Choi
65 80 0 07 Feb 2024

Discovering and Validating AI Errors With Crowdsourced Failure Reports
Ángel Alexander Cabrera, Abraham J. Druck, Jason I. Hong, Adam Perer
HAI 60 54 0 23 Sep 2021

Mitigating Dataset Harms Requires Stewardship: Lessons from 1000 Papers
Kenny Peng, Arunesh Mathur, Arvind Narayanan
99 93 0 06 Aug 2021