1803.09010
Cited By
v8 (latest)
Datasheets for Datasets
23 March 2018
Timnit Gebru
Jamie Morgenstern
Briana Vecchione
Jennifer Wortman Vaughan
Hanna M. Wallach
Hal Daumé
Kate Crawford
Papers citing "Datasheets for Datasets"
50 / 989 papers shown
Title
FigCaps-HF: A Figure-to-Caption Generative Framework and Benchmark with Human Feedback
Ashish Singh
Ashutosh Singh
Prateek R. Agarwal
Zixuan Huang
Arpita Singh
...
Ryan Rossi
Puneet Mathur
Erik Learned-Miller
Franck Dernoncourt
Ryan Rossi
104
8
0
20 Jul 2023
Europepolls: A Dataset of Country-Level Opinion Polling Data for the European Union and the UK
Konstantinos Pitas
15
0
0
19 Jul 2023
Test-takers have a say: understanding the implications of the use of AI in language tests
Dawen Zhang
Thong Hoang
Shidong Pan
Yongquan Hu
Zhenchang Xing
Mark Staples
Xiwei Xu
Qinghua Lu
Aaron Quigley
ELM
45
4
0
19 Jul 2023
Beyond the ML Model: Applying Safety Engineering Frameworks to Text-to-Image Development
Shalaleh Rismani
Renee Shelby
A. Smart
Renelito Delos Santos
AJung Moon
Negar Rostamzadeh
64
9
0
19 Jul 2023
Reflections from the Workshop on AI-Assisted Decision Making for Conservation
Lily Xu
Esther Rolf
Sara Beery
J. Bennett
T. Berger-Wolf
...
P. Moorcroft
Jonathan Palmer
Andrew Perrault
D. Thau
Milind Tambe
80
3
0
17 Jul 2023
Leveraging Recommender Systems to Reduce Content Gaps on Peer Production Platforms
M. Houtti
Isaac Johnson
Morten Warncke-Wang
Loren G. Terveen
38
1
0
17 Jul 2023
Fairness in KI-Systemen
Janine Strotherm
Alissa Müller
Barbara Hammer
Benjamin Paassen
FaML
58
1
0
17 Jul 2023
Where Did the President Visit Last Week? Detecting Celebrity Trips from News Articles
Kai Peng
Ying Zhang
Shuai Ling
Zhaoru Ke
Haipeng Zhang
GNN
51
1
0
17 Jul 2023
Analyzing Dataset Annotation Quality Management in the Wild
Jan-Christoph Klie
Richard Eckart de Castilho
Iryna Gurevych
84
26
0
16 Jul 2023
Bound by the Bounty: Collaboratively Shaping Evaluation Processes for Queer AI Harms
Organizers of QueerInAI
Nathaniel Dennler
Anaelia Ovalle
Ashwin Singh
Luca Soldaini
...
Kyra Yee
Irene Font Peradejordi
Zeerak Talat
Mayra Russo
Jessica de Jesus de Pinho Pinhal
69
16
0
15 Jul 2023
Othering and low status framing of immigrant cuisines in US restaurant reviews and large language models
Yiwei Luo
Kristina Gligorić
Dan Jurafsky
57
3
0
14 Jul 2023
Robotic Manipulation Datasets for Offline Compositional Reinforcement Learning
Marcel Hussing
Jorge Armando Mendez Mendez
Anisha Singrodia
Cassandra Kent
Eric Eaton
OffRL
97
7
0
13 Jul 2023
IntelliGraphs: Datasets for Benchmarking Knowledge Graph Generation
Thiviyan Thanapalasingam
Emile van Krieken
Peter Bloem
Paul T. Groth
53
1
0
13 Jul 2023
Machine Learning practices and infrastructures
G. Berman
60
4
0
13 Jul 2023
Objaverse-XL: A Universe of 10M+ 3D Objects
Matt Deitke
Ruoshi Liu
Matthew Wallingford
Huong Ngo
Oscar Michel
...
Carl Vondrick
Georgia Gkioxari
Kiana Ehsani
Ludwig Schmidt
Ali Farhadi
104
422
0
11 Jul 2023
HA-ViD: A Human Assembly Video Dataset for Comprehensive Assembly Knowledge Understanding
Hao Zheng
R. Lee
Yuqian Lu
VGen
28
17
0
09 Jul 2023
Frontier AI Regulation: Managing Emerging Risks to Public Safety
Markus Anderljung
Joslyn Barnhart
Anton Korinek
Jade Leung
Cullen O'Keefe
...
Jonas Schuett
Yonadav Shavit
Divya Siddarth
Robert F. Trager
Kevin J. Wolf
SILM
135
125
0
06 Jul 2023
BuildingsBench: A Large-Scale Dataset of 900K Buildings and Benchmark for Short-Term Load Forecasting
Patrick Emami
A. Sahu
Peter Graf
AI4TS
102
15
0
30 Jun 2023
A Massive Scale Semantic Similarity Dataset of Historical English
Emily Silcock
Melissa Dell
66
5
0
30 Jun 2023
Next Steps for Human-Centered Generative AI: A Technical Perspective
Xiang 'Anthony' Chen
Jeff Burke
Andrea Colaço
Matthew K. Hong
Jennifer Jacobs
...
Dingzeyu Li
Nanyun Peng
Karl D. D. Willis
Chien-Sheng Wu
Bolei Zhou
LLMAG
84
35
0
27 Jun 2023
Physion++: Evaluating Physical Scene Understanding that Requires Online Inference of Different Physical Properties
H. Tung
Mingyu Ding
Zhenfang Chen
Daniel M. Bear
Chuang Gan
J. Tenenbaum
Daniel L. K. Yamins
Judy Fan
Kevin A. Smith
115
17
0
27 Jun 2023
LeanDojo: Theorem Proving with Retrieval-Augmented Language Models
Kaiyu Yang
Aidan M. Swope
Alex Gu
Rahul Chalamala
Peiyang Song
Shixing Yu
Saad Godil
R. Prenger
Anima Anandkumar
RALM
132
245
0
27 Jun 2023
Use case cards: a use case reporting framework inspired by the European AI Act
Isabelle Hupont
David Fernández Llorca
S. Baldassarri
Emilia Gómez
65
20
0
23 Jun 2023
Critical-Reflective Human-AI Collaboration: Exploring Computational Tools for Art Historical Image Retrieval
Katrin Glinka
Claudia Müller-Birn
24
9
0
22 Jun 2023
Realistic Synthetic Financial Transactions for Anti-Money Laundering Models
Erik Altman
Jovan Blanuša
Luc von Niederhäusern
Béni Egressy
Andreea Anghel
Kubilay Atasu
153
44
0
22 Jun 2023
Towards Regulatable AI Systems: Technical Gaps and Policy Opportunities
Xudong Shen
H. Brown
Jiashu Tao
Martin Strobel
Yao Tong
Akshay Narayan
Harold Soh
Finale Doshi-Velez
91
3
0
22 Jun 2023
VisoGender: A dataset for benchmarking gender bias in image-text pronoun resolution
Elizaveta Semenova
F. G. Abrantes
Hanwen Zhu
Grace A. Sodunke
Aleksandar Shtedritski
Hannah Rose Kirk
CoGe
111
46
0
21 Jun 2023
Benchmark data to study the influence of pre-training on explanation performance in MR image classification
Marta Oliveira
Rick Wilming
Benedict Clark
Céline Budding
Fabian Eitel
K. Ritter
Stefan Haufe
50
1
0
21 Jun 2023
An Overview of Catastrophic AI Risks
Dan Hendrycks
Mantas Mazeika
Thomas Woodside
SILM
82
186
0
21 Jun 2023
Event Stream GPT: A Data Pre-processing and Modeling Library for Generative, Pre-trained Transformers over Continuous-time Sequences of Complex Events
Matthew B. A. McDermott
Bret A. Nestor
Peniel Argaw
I. Kohane
AI4TS
119
24
0
20 Jun 2023
Quilt-1M: One Million Image-Text Pairs for Histopathology
Wisdom O. Ikezogwo
M. S. Seyfioglu
Fatemeh Ghezloo
Dylan Stefan Chan Geva
Fatwir Sheikh Mohammed
Pavan Kumar Anand
Ranjay Krishna
Linda G. Shapiro
CLIP
VLM
323
125
0
20 Jun 2023
CompanyKG: A Large-Scale Heterogeneous Graph for Company Similarity Quantification
Le-le Cao
Vilhelm von Ehrenheim
Mark Granroth-Wilding
Richard Anselmo Stahl
Andrew McCornack
Armin Catovic
Dhiana Deva Cavalcanti Rocha
99
4
0
18 Jun 2023
The Importance of Human-Labeled Data in the Era of LLMs
Yang Liu
ALM
75
10
0
18 Jun 2023
Reproducibility in NLP: What Have We Learned from the Checklist?
Ian H. Magnusson
Noah A. Smith
Jesse Dodge
27
12
0
16 Jun 2023
STARSS23: An Audio-Visual Dataset of Spatial Recordings of Real Scenes with Spatiotemporal Annotations of Sound Events
Kazuki Shimada
Archontis Politis
Parthasaarathy Sudarsanam
D. Krause
Kengo Uchida
...
Yuichiro Koyama
Naoya Takahashi
Shusuke Takahashi
Tuomas Virtanen
Yuki Mitsufuji
112
43
0
15 Jun 2023
Dissecting Multimodality in VideoQA Transformer Models by Impairing Modality Fusion
Isha Rawal
Alexander Matyasko
Shantanu Jaiswal
Basura Fernando
Cheston Tan
55
3
0
15 Jun 2023
LargeST: A Benchmark Dataset for Large-Scale Traffic Forecasting
Xu Liu
Yutong Xia
Yuxuan Liang
Junfeng Hu
Yiwei Wang
Lei Bai
Chaoqin Huang
Zhenguang Liu
Bryan Hooi
Roger Zimmermann
AI4TS
81
81
0
14 Jun 2023
V-LoL: A Diagnostic Dataset for Visual Logical Learning
Lukas Helff
Wolfgang Stammer
Hikaru Shindo
Devendra Singh Dhami
Kristian Kersting
NAI
89
5
0
13 Jun 2023
Unraveling the Interconnected Axes of Heterogeneity in Machine Learning for Democratic and Inclusive Advancements
Maryam Molamohammadi
Afaf Taik
Nicolas Le Roux
G. Farnadi
92
1
0
11 Jun 2023
Evaluating the Social Impact of Generative AI Systems in Systems and Society
Irene Solaiman
Zeerak Talat
William Agnew
Lama Ahmad
Dylan K. Baker
...
Marie-Therese Png
Shubham Singh
A. Strait
Lukas Struppek
Arjun Subramonian
ELM
EGVM
137
117
0
09 Jun 2023
AircraftVerse: A Large-Scale Multimodal Dataset of Aerial Vehicle Designs
Adam D. Cobb
Anirban Roy
Daniel Elenius
F. M. Heim
Brian Swenson
...
Theodore Bapty
Joseph Hite
K. Ramani
Christopher McComb
Susmit Jha
62
8
0
08 Jun 2023
Explainable Predictive Maintenance
Sepideh Pashami
Sławomir Nowaczyk
Yuantao Fan
Jakub Jakubowski
Nuno Paiva
...
Bruno Veloso
M. Sayed-Mouchaweh
L. Rajaoarisoa
Grzegorz J. Nalepa
João Gama
73
8
0
08 Jun 2023
MMSum: A Dataset for Multimodal Summarization and Thumbnail Generation of Videos
Jielin Qiu
Jiacheng Zhu
William Jongwon Han
Aditesh Kumar
Karthik Mittal
...
Linjie Li
Jianfeng Wang
Ding Zhao
Bo Li
Lijuan Wang
VGen
64
8
0
07 Jun 2023
Art and the science of generative AI: A deeper dive
Ziv Epstein
Aaron Hertzmann
L. Herman
Robert Mahari
M. Frank
...
Jessica Fjeld
Hany Farid
Neil Leach
Alex Pentland
Olga Russakovsky
98
320
0
07 Jun 2023
Applying Standards to Advance Upstream & Downstream Ethics in Large Language Models
Jose Berengueres
Marybeth Sandell
59
0
0
06 Jun 2023
AVIDa-hIL6: A Large-Scale VHH Dataset Produced from an Immunized Alpaca for Predicting Antigen-Antibody Interactions
Hirofumi Tsuruta
Hiroyuki Yamazaki
R. Maeda
Ryotaro Tamura
Jennifer Wei
...
Poomarin Phloyphisut
H. Shimokawa
J. Ledsam
Lucy J. Colwell
Akihiro Imura
46
7
0
06 Jun 2023
AHA!: Facilitating AI Impact Assessment by Generating Examples of Harms
Zana Buçinca
Chau Minh Pham
Maurice Jakesch
Marco Tulio Ribeiro
Alexandra Olteanu
Saleema Amershi
69
37
0
05 Jun 2023
NLPositionality: Characterizing Design Biases of Datasets and Models
Sebastin Santy
Jenny T Liang
Ronan Le Bras
Katharina Reinecke
Maarten Sap
116
82
0
02 Jun 2023
AI Transparency in the Age of LLMs: A Human-Centered Research Roadmap
Q. V. Liao
J. Vaughan
145
168
0
02 Jun 2023
Multilingual Conceptual Coverage in Text-to-Image Models
Michael Stephen Saxon
William Yang Wang
EGVM
78
9
0
02 Jun 2023