ResearchTrend.AI
Data and its (dis)contents: A survey of dataset development and use in machine learning research

9 December 2020
Amandalynne Paullada
Inioluwa Deborah Raji
Emily M. Bender
Emily L. Denton
A. Hanna

Papers citing "Data and its (dis)contents: A survey of dataset development and use in machine learning research"

28 / 78 papers shown

  1. Data Smells: Categories, Causes and Consequences, and Detection of Suspicious Data in AI-based Systems
     Harald Foidl, Michael Felderer, Rudolf Ramler (19 Mar 2022)
  2. Sex Trouble: Common pitfalls in incorporating sex/gender in medical machine learning and how to avoid them
     Kendra Albert, Maggie K. Delano (15 Mar 2022) [FaML]
  3. FairLex: A Multilingual Benchmark for Evaluating Fairness in Legal Text Processing
     Ilias Chalkidis, Tommaso Pasini, Shenmin Zhang, Letizia Tomada, Sebastian Felix Schwemer, Anders Søgaard (14 Mar 2022) [AILaw]
  4. A streamable large-scale clinical EEG dataset for Deep Learning
     Dung Truong, Manisha Sinha, K. Venkataraju, M. Milham, Arnaud Delorme (04 Mar 2022)
  5. 3D Common Corruptions and Data Augmentation
     Oğuzhan Fatih Kar, Teresa Yeo, Andrei Atanov, Amir Zamir (02 Mar 2022) [3DPC]
  6. Language technology practitioners as language managers: arbitrating data bias and predictive bias in ASR
     Nina Markl, S. McNulty (25 Feb 2022)
  7. The craft and coordination of data curation: complicating "workflow" views of data science
     A. Thomer, Dharma Akmon, J. York, Allison R. B. Tyler, Faye O. Polasek, Sara Lafia, Libby Hemphill, E. Yakel (09 Feb 2022)
  8. Post-Hoc Explanations Fail to Achieve their Purpose in Adversarial Contexts
     Sebastian Bordt, Michèle Finck, Eric Raidl, U. V. Luxburg (25 Jan 2022) [AILaw]
  9. There is an elephant in the room: Towards a critique on the use of fairness in biometrics
     Ana Valdivia, Júlia Corbera Serrajòrdia, Aneta Swianiewicz (16 Dec 2021)
  10. Reduced, Reused and Recycled: The Life of a Dataset in Machine Learning Research
      Bernard Koch, Emily L. Denton, A. Hanna, J. Foster (03 Dec 2021)
  11. RedCaps: web-curated image-text data created by the people, for the people
      Karan Desai, Gaurav Kaul, Zubin Aysola, Justin Johnson (22 Nov 2021)
  12. Who Decides if AI is Fair? The Labels Problem in Algorithmic Auditing
      Abhilash Mishra, Yash Gorana (16 Nov 2021)
  13. Building Legal Datasets
      Jerrold Soh (03 Nov 2021) [ELM, AILaw]
  14. A survey on datasets for fairness-aware machine learning
      Tai Le Quy, Arjun Roy, Vasileios Iosifidis, Wenbin Zhang, Eirini Ntoutsi (01 Oct 2021) [FaML]
  15. PASS: An ImageNet replacement for self-supervised pretraining without humans
      Yuki M. Asano, Christian Rupprecht, Andrew Zisserman, Andrea Vedaldi (27 Sep 2021) [VLM, SSL]
  16. Studying Up Machine Learning Data: Why Talk About Bias When We Mean Power?
      Milagros Miceli, Julian Posada, Tianling Yang (16 Sep 2021)
  17. 'Just What do You Think You're Doing, Dave?' A Checklist for Responsible Data Use in NLP
      Anna Rogers, Timothy Baldwin, Kobi Leins (14 Sep 2021)
  18. Retiring Adult: New Datasets for Fair Machine Learning
      Frances Ding, Moritz Hardt, John Miller, Ludwig Schmidt (10 Aug 2021)
  19. Mitigating Dataset Harms Requires Stewardship: Lessons from 1000 Papers
      Kenny Peng, Arunesh Mathur, Arvind Narayanan (06 Aug 2021)
  20. How to avoid machine learning pitfalls: a guide for academic researchers
      M. Lones (05 Aug 2021) [VLM, FaML, OnRL]
  21. Process for Adapting Language Models to Society (PALMS) with Values-Targeted Datasets
      Irene Solaiman, Christy Dennison (18 Jun 2021)
  22. Understanding and Evaluating Racial Biases in Image Captioning
      Dora Zhao, Angelina Wang, Olga Russakovsky (16 Jun 2021)
  23. A Study of Face Obfuscation in ImageNet
      Kaiyu Yang, Jacqueline Yau, Li Fei-Fei, Jia Deng, Olga Russakovsky (10 Mar 2021) [PICV, CVBM]
  24. Extracting Training Data from Large Language Models
      Nicholas Carlini, Florian Tramèr, Eric Wallace, Matthew Jagielski, Ariel Herbert-Voss, ..., Tom B. Brown, D. Song, Ulfar Erlingsson, Alina Oprea, Colin Raffel (14 Dec 2020) [MLAU, SILM]
  25. Are We Modeling the Task or the Annotator? An Investigation of Annotator Bias in Natural Language Understanding Datasets
      Mor Geva, Yoav Goldberg, Jonathan Berant (21 Aug 2019)
  26. Improving fairness in machine learning systems: What do industry practitioners need?
      Kenneth Holstein, Jennifer Wortman Vaughan, Hal Daumé, Miroslav Dudík, Hanna M. Wallach (13 Dec 2018) [FaML, HAI]
  27. Hypothesis Only Baselines in Natural Language Inference
      Adam Poliak, Jason Naradowsky, Aparajita Haldar, Rachel Rudinger, Benjamin Van Durme (02 May 2018)
  28. GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
      Alex Jinpeng Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, Samuel R. Bowman (20 Apr 2018) [ELM]