CrowdWorkSheets: Accounting for Individual and Collective Identities Underlying Crowdsourced Dataset Annotation

9 June 2022

Vinodkumar Prabhakaran

Emily L. Denton

Papers citing "CrowdWorkSheets: Accounting for Individual and Collective Identities Underlying Crowdsourced Dataset Annotation"

How do datasets, developers, and models affect biases in a low-resourced language? Dipto Das Shion Guha Bryan Semaan 21 0 0 07 Jun 2025
TEDI: Trustworthy and Ethical Dataset Indicators to Analyze and Compare Dataset Documentation Wiebke Hutiri Mircea Cimpoi M. Scheuerman Victoria Matthews Alice Xiang 167 0 0 23 May 2025
Improving User Behavior Prediction: Leveraging Annotator Metadata in Supervised Machine Learning Models Lynnette Ng Kokil Jaidka Kaiyuan Tay Hansin Ahuja Niyati Chhaya 131 1 0 26 Mar 2025
The Case for "Thick Evaluations" of Cultural Representation in AI Rida Qadri Mark Díaz Ding Wang Michael Madaio 91 4 0 24 Mar 2025
Talking About the Assumption in the Room Ramaravind Kommiya Mothilal Faisal M. Lalani Syed Ishtiaque Ahmed Shion Guha Sharifa Sultana 90 0 0 20 Feb 2025
AI Alignment at Your Discretion Maarten Buyl Hadi Khalaf C. M. Verdun Lucas Monteiro Paes Caio Vieira Machado Flavio du Pin Calmon 114 1 0 10 Feb 2025
What Makes An Expert? Reviewing How ML Researchers Define "Expert" Mark Díaz Angela D. R. Smith 54 2 0 31 Oct 2024
Surveys Considered Harmful? Reflecting on the Use of Surveys in AI Research, Development, and Governance Mohammad Tahaei Daricia Wilkinson Alisa Frik Chi Lok Yu Ruba Abu-Salma Lauren Wilcox 71 4 0 26 Jul 2024
Exploring the Capability of ChatGPT to Reproduce Human Labels for Social Computing Tasks (Extended Version) Yiming Zhu Peixian Zhang Ehsan-ul Haq Pan Hui Gareth Tyson ALM AI4MH 100 0 0 08 Jul 2024
Documentation Practices of Artificial Intelligence Stefan Arnold Dilara Yesilbas Rene Gröbner Dominik Riedelbauch Maik Horn Sven Weinzierl AI4TS 48 0 0 26 Jun 2024
Let Guidelines Guide You: A Prescriptive Guideline-Centered Data Annotation Methodology Federico Ruggeri Eleonora Misino Arianna Muti Katerina Korre Paolo Torroni Alberto Barrón-Cedeño 108 1 0 20 Jun 2024
Prompt Design Matters for Computational Social Science Tasks but in Unpredictable Ways Shubham Atreja Joshua Ashkinaze Lingyao Li Julia Mendelsohn Libby Hemphill 97 14 0 17 Jun 2024
A Taxonomy of Challenges to Curating Fair Datasets Dora Zhao M. Scheuerman Pooja Chitre Jerone T. A. Andrews Georgia Panagiotidou Shawn Walker Kathleen H. Pine Alice Xiang 97 2 0 10 Jun 2024
Real Risks of Fake Data: Synthetic Data, Diversity-Washing and Consent Circumvention Cedric Deslandes Whitney Justin Norman 77 24 0 03 May 2024
Blind Spots and Biases: Exploring the Role of Annotator Cognitive Biases in NLP Sanjana Gautam Mukund Srinath 100 6 0 29 Apr 2024
From Model Performance to Claim: How a Change of Focus in Machine Learning Replicability Can Help Bridge the Responsibility Gap Tianqi Kou 126 1 0 19 Apr 2024
D3CODE: Disentangling Disagreements in Data across Cultures on Offensiveness Detection and Evaluation Aida Mostafazadeh Davani Mark Díaz Dylan K. Baker Vinodkumar Prabhakaran 74 10 0 16 Apr 2024
Racial/Ethnic Categories in AI and Algorithmic Fairness: Why They Matter and What They Represent Jennifer Mickel 60 7 0 10 Apr 2024
Using Large Language Models to Enrich the Documentation of Datasets for Machine Learning Joan Giner-Miguelez Abel Gómez Jordi Cabot LLMAG 78 4 0 04 Apr 2024
If in a Crowdsourced Data Annotation Pipeline, a GPT-4 Zeyu He Chieh-Yang Huang C. C. Ding Shaurya Rohatgi Ting-Hao 'Kenneth' Huang 112 33 0 26 Feb 2024
Farsight: Fostering Responsible AI Awareness During AI Application Prototyping Zijie J. Wang Chinmay Kulkarni Lauren Wilcox Michael Terry Michael A. Madaio 77 50 0 23 Feb 2024
Discipline and Label: A WEIRD Genealogy and Social Theory of Data Annotation Andrew Smart Ding Wang Ellis Monk Mark Díaz Atoosa Kasirzadeh Erin van Liemt Sonja Schmer-Galunder 83 8 0 09 Feb 2024
Copycats: the many lives of a publicly available medical imaging dataset Amelia Jiménez-Sánchez Natalia-Rozalia Avlona Dovile Juodelyte Théo Sourget Caroline Vang-Larsen Anna Rogers Hubert Dariusz Zając Veronika Cheplygina 107 2 0 09 Feb 2024
A Scoping Study of Evaluation Practices for Responsible AI Tools: Steps Towards Effectiveness Evaluations G. Berman Nitesh Goyal Michael A. Madaio ELM 73 23 0 30 Jan 2024
Generalized People Diversity: Learning a Human Perception-Aligned Diversity Representation for People Images Hansa Srinivasan Candice Schumann Aradhana Sinha David Madras Gbolahan O. Olanubi Alex Beutel Susanna Ricco Jilin Chen 96 6 0 25 Jan 2024
On the Readiness of Scientific Data for a Fair and Transparent Use in Machine Learning Joan Giner-Miguelez Abel Gómez Jordi Cabot 60 0 0 18 Jan 2024
Disentangling Perceptions of Offensiveness: Cultural and Moral Correlates Aida Mostafazadeh Davani Mark Díaz Dylan K. Baker Vinodkumar Prabhakaran AAML 63 18 0 11 Dec 2023
SoUnD Framework: Analyzing (So)cial Representation in (Un)structured (D)ata Mark Díaz Sunipa Dev Emily Reif Remi Denton Vinodkumar Prabhakaran 103 4 0 28 Nov 2023
Is a Seat at the Table Enough? Engaging Teachers and Students in Dataset Specification for ML in Education Mei Tan Hansol Lee Dakuo Wang Hariharan Subramonyam 60 8 0 09 Nov 2023
Modeling subjectivity (by Mimicking Annotator Annotation) in toxic comment identification across diverse communities Senjuti Dutta Sid Mittal Sherol Chen Deepak Ramachandran Ravi Rajakumar Ian D Kivlichan Sunny Mak Alena Butryna Praveen Paritosh 106 7 0 01 Nov 2023
Sensitivity, Performance, Robustness: Deconstructing the Effect of Sociodemographic Prompting Tilman Beck Hendrik Schuff Anne Lauscher Iryna Gurevych 103 37 0 13 Sep 2023
FACET: Fairness in Computer Vision Evaluation Benchmark Laura Gustafson Chloe Rolland Nikhila Ravi Quentin Duval Aaron B. Adcock Cheng-Yang Fu Melissa Hall Candace Ross VLM EGVM 106 40 0 31 Aug 2023
Collect, Measure, Repeat: Reliability Factors for Responsible AI Data Collection Oana Inel Tim Draws Lora Aroyo 103 6 0 22 Aug 2023
Scaling Laws Do Not Scale Fernando Diaz Michael A. Madaio 104 12 0 05 Jul 2023
Going public: the role of public participation approaches in commercial AI labs Lara Groves Aidan Peppin A. Strait Jenny Brennan 72 30 0 16 Jun 2023
Evaluating the Social Impact of Generative AI Systems in Systems and Society Irene Solaiman Zeerak Talat William Agnew Lama Ahmad Dylan K. Baker ... Marie-Therese Png Shubham Singh A. Strait Lukas Struppek Arjun Subramonian ELM EGVM 139 117 0 09 Jun 2023
The ethical ambiguity of AI data enrichment: Measuring gaps in research ethics norms and practices Will Hawkins Brent Mittelstadt 97 10 0 01 Jun 2023
AI's Regimes of Representation: A Community-centered Study of Text-to-Image Models in South Asia Rida Qadri Renee Shelby Cynthia L. Bennett Emily Denton 77 75 0 19 May 2023
SeeGULL: A Stereotype Benchmark with Broad Geo-Cultural Coverage Leveraging Generative Models Akshita Jha Aida Mostafazadeh Davani Chandan K. Reddy Shachi Dave Vinodkumar Prabhakaran Sunipa Dev 87 50 0 19 May 2023
PaLM 2 Technical Report Rohan Anil Andrew M. Dai Orhan Firat Melvin Johnson Dmitry Lepikhin ... Ce Zheng Wei Zhou Denny Zhou Slav Petrov Yonghui Wu ReLM LRM 260 1,209 0 17 May 2023
Consensus and Subjectivity of Skin Tone Annotation for ML Fairness Candice Schumann Gbolahan O. Olanubi Auriel Wright Ellis P. Monk Courtney Heldreth Susanna Ricco 110 24 0 16 May 2023
"HOT" ChatGPT: The promise of ChatGPT in detecting and discriminating hateful, offensive, and toxic comments on social media Lingyao Li Lizhou Fan Shubham Atreja Libby Hemphill AI4MH 99 97 0 20 Apr 2023
Segment Anything A. Kirillov Eric Mintun Nikhila Ravi Hanzi Mao Chloe Rolland ... Spencer Whitehead Alexander C. Berg Wan-Yen Lo Piotr Dollár Ross B. Girshick MLLM VLM 459 7,452 0 05 Apr 2023
City-Wide Perceptions of Neighbourhood Quality using Street View Images Emily Muller Emily Gemmell Ishmam Choudhury Ricky Nathvani A. B. Metzler J. Bennett Emily L. Denton Seth Flaxman M. Ezzati 28 2 0 22 Nov 2022
Cultural Incongruencies in Artificial Intelligence Vinodkumar Prabhakaran Rida Qadri Ben Hutchinson 56 22 0 19 Nov 2022
Making Intelligence: Ethical Values in IQ and ML Benchmarks Borhane Blili-Hamelin Leif Hancox-Li 69 18 0 01 Sep 2022
A domain-specific language for describing machine learning datasets Joan Giner-Miguelez Abel Gómez Jordi Cabot ALM 55 26 0 05 Jul 2022
Eliciting and Learning with Soft Labels from Every Annotator Katherine M. Collins Umang Bhatt Adrian Weller 86 47 0 02 Jul 2022
Saliency Cards: A Framework to Characterize and Compare Saliency Methods Angie Boggust Harini Suresh Hendrik Strobelt John Guttag Arvindmani Satyanarayan FAtt XAI 71 10 0 07 Jun 2022
Repairing the Cracked Foundation: A Survey of Obstacles in Evaluation Practices for Generated Text Sebastian Gehrmann Elizabeth Clark Thibault Sellam ELM AI4CE 157 193 0 14 Feb 2022