arXiv:2211.05764
DC-Check: A Data-Centric AI checklist to guide the development of reliable machine learning systems
9 November 2022
Nabeel Seedat
F. Imrie
M. Schaar
Papers citing
"DC-Check: A Data-Centric AI checklist to guide the development of reliable machine learning systems"
50 / 89 papers shown
Unlocking Historical Clinical Trial Data with ALIGN: A Compositional Large Language Model System for Medical Coding
Nabeel Seedat
Caterina Tozzi
Andrea Hita Ardiaca
Mihaela van der Schaar
James Weatherall
Adam Taylor
487
0
0
20 Nov 2024
Improving Adaptive Conformal Prediction Using Self-Supervised Learning
Nabeel Seedat
Alan Jeffares
F. Imrie
M. Schaar
SSL
98
16
0
23 Feb 2023
Synthcity: facilitating innovative use cases of synthetic data in different data modalities
Zhaozhi Qian
B. Cebere
M. Schaar
SyDa
86
63
0
18 Jan 2023
Data-IQ: Characterizing subgroups with heterogeneous outcomes in tabular data
Nabeel Seedat
Jonathan Crabbé
Ioana Bica
M. Schaar
74
25
0
24 Oct 2022
HyperImpute: Generalized Iterative Imputation with Automatic Model Selection
Daniel Jarrett
B. Cebere
Tennison Liu
Alicia Curth
M. Schaar
48
78
0
15 Jun 2022
Domino: Discovering Systematic Errors with Cross-Modal Embeddings
Sabri Eyuboglu
M. Varma
Khaled Kamal Saab
Jean-Benoit Delbrouck
Christopher Lee-Messer
Jared A. Dunnmon
James Zou
Christopher Ré
89
148
0
24 Mar 2022
From Concept Drift to Model Degradation: An Overview on Performance-Aware Drift Detectors
Firas Bayram
Bestoun S. Ahmed
A. Kassler
49
221
0
21 Mar 2022
Data-SUITE: Data-centric identification of in-distribution incongruous examples
Nabeel Seedat
Jonathan Crabbé
Mihaela van der Schaar
OOD
64
14
0
17 Feb 2022
Conditional Generation of Medical Time Series for Extrapolation to Underrepresented Populations
Simon Bing
Andrea Dittadi
Stefan Bauer
Patrick Schwab
SyDa
70
17
0
20 Jan 2022
MLOps -- Definitions, Tools and Challenges
Georgios Symeonidis
Evangelos Nerantzis
A. Kazakis
G. Papakostas
70
92
0
01 Jan 2022
What can Data-Centric AI Learn from Data and ML Engineering?
N. Polyzotis
Matei A. Zaharia
AI4CE
42
51
0
13 Dec 2021
Amazon SageMaker Model Monitor: A System for Real-Time Insights into Deployed Machine Learning Models
David Nigenda
Zohar Karnin
Muhammad Bilal Zafar
Raghu Ramesha
Alan Tan
Michele Donini
K. Kenthapadi
VLM
44
42
0
26 Nov 2021
Sample Selection for Fair and Robust Training
Yuji Roh
Kangwook Lee
Steven Euijong Whang
Changho Suh
66
65
0
27 Oct 2021
DECAF: Generating Fair Synthetic Data Using Causally-Aware Generative Networks
A. Saha
Trent Kyono
J. Linmans
M. Schaar
CML
74
111
0
25 Oct 2021
Tabular Data: Deep Learning is Not All You Need
Ravid Shwartz-Ziv
Amitai Armon
LMTD
162
1,288
0
06 Jun 2021
Counterfactual Invariance to Spurious Correlations: Why and How to Pass Stress Tests
Victor Veitch
Alexander D'Amour
Steve Yadlowsky
Jacob Eisenstein
OOD
68
93
0
31 May 2021
Pervasive Label Errors in Test Sets Destabilize Machine Learning Benchmarks
Curtis G. Northcutt
Anish Athalye
Jonas W. Mueller
92
537
0
26 Mar 2021
Detecting Spurious Correlations with Sanity Tests for Artificial Intelligence Guided Radiology Systems
U. Mahmood
Robik Shrestha
D. Bates
L. Mannelli
G. Corrias
Y. Erdi
Christopher Kanan
60
16
0
04 Mar 2021
A Data Quality-Driven View of MLOps
Cédric Renggli
Luka Rimanic
Nezihe Merve Gürel
Bojan Karlaš
Wentao Wu
Ce Zhang
AI4TS
44
65
0
15 Feb 2021
dalex: Responsible Machine Learning with Interactive Explainability and Fairness in Python
Hubert Baniecki
Wojciech Kretowicz
Piotr Piątyszek
J. Wiśniewski
P. Biecek
FaML
80
97
0
28 Dec 2020
Checklist for responsible deep learning modeling of medical images based on COVID-19 detection studies
Weronika Hryniewska
Przemysław Bombiński
P. Szatkowski
Paulina Tomaszewska
A. Przelaskowski
P. Biecek
OOD
85
47
0
11 Dec 2020
No Subclass Left Behind: Fine-Grained Robustness in Coarse-Grained Classification Problems
N. Sohoni
Jared A. Dunnmon
Geoffrey Angus
Albert Gu
Christopher Ré
87
252
0
25 Nov 2020
Energy-based Out-of-distribution Detection
Weitang Liu
Xiaoyun Wang
John Douglas Owens
Yixuan Li
OODD
273
1,375
0
08 Oct 2020
Dataset Cartography: Mapping and Diagnosing Datasets with Training Dynamics
Swabha Swayamdipta
Roy Schwartz
Nicholas Lourie
Yizhong Wang
Hannaneh Hajishirzi
Noah A. Smith
Yejin Choi
127
452
0
22 Sep 2020
Common pitfalls and recommendations for using machine learning to detect and prognosticate for COVID-19 using chest radiographs and CT scans
M. Roberts
D. Driggs
Matthew Thorpe
J. Gilbey
Michael Yeung
...
Kang Zhang
S. Stranks
James H. F. Rudd
Evis Sala
Carola-Bibiane Schönlieb
OOD
69
774
0
14 Aug 2020
Learning from Noisy Labels with Deep Neural Networks: A Survey
Hwanjun Song
Minseok Kim
Dongmin Park
Yooju Shin
Jae-Gil Lee
NoLa
117
998
0
16 Jul 2020
Normalized Loss Functions for Deep Learning with Noisy Labels
Xingjun Ma
Hanxun Huang
Yisen Wang
Simone Romano
S. Erfani
James Bailey
NoLa
76
445
0
24 Jun 2020
From ImageNet to Image Classification: Contextualizing Progress on Benchmarks
Dimitris Tsipras
Shibani Santurkar
Logan Engstrom
Andrew Ilyas
Aleksander Madry
77
135
0
22 May 2020
Beyond Accuracy: Behavioral Testing of NLP models with CheckList
Marco Tulio Ribeiro
Tongshuang Wu
Carlos Guestrin
Sameer Singh
ELM
210
1,110
0
08 May 2020
Open Graph Benchmark: Datasets for Machine Learning on Graphs
Weihua Hu
Matthias Fey
Marinka Zitnik
Yuxiao Dong
Hongyu Ren
Bowen Liu
Michele Catasta
J. Leskovec
311
2,752
0
02 May 2020
CheXbert: Combining Automatic Labelers and Expert Annotations for Accurate Radiology Report Labeling Using BERT
Akshay Smit
Saahil Jain
Pranav Rajpurkar
Anuj Pareek
A. Ng
M. Lungren
MedIm
60
333
0
20 Apr 2020
Learning Deep Kernels for Non-Parametric Two-Sample Tests
Feng Liu
Wenkai Xu
Jie Lu
Guangquan Zhang
Arthur Gretton
Danica J. Sutherland
76
188
0
21 Feb 2020
Identifying Mislabeled Data using the Area Under the Margin Ranking
Geoff Pleiss
Tianyi Zhang
Ethan R. Elenberg
Kilian Q. Weinberger
NoLa
94
274
0
28 Jan 2020
DP-CGAN: Differentially Private Synthetic Data and Label Generation
Reihaneh Torkzadehmahani
Peter Kairouz
B. Paten
SyDa
72
239
0
27 Jan 2020
Neural Machine Translation: A Review and Survey
Felix Stahlberg
3DV
AI4TS
MedIm
90
330
0
04 Dec 2019
Distributionally Robust Neural Networks for Group Shifts: On the Importance of Regularization for Worst-Case Generalization
Shiori Sagawa
Pang Wei Koh
Tatsunori B. Hashimoto
Percy Liang
OOD
108
1,249
0
20 Nov 2019
A Comprehensive Survey on Transfer Learning
Fuzhen Zhuang
Zhiyuan Qi
Keyu Duan
Dongbo Xi
Yongchun Zhu
Hengshu Zhu
Hui Xiong
Qing He
188
4,474
0
07 Nov 2019
Confident Learning: Estimating Uncertainty in Dataset Labels
Curtis G. Northcutt
Lu Jiang
Isaac L. Chuang
NoLa
181
699
0
31 Oct 2019
Aleatoric and Epistemic Uncertainty in Machine Learning: An Introduction to Concepts and Methods
Eyke Hüllermeier
Willem Waegeman
PER
UD
255
1,427
0
21 Oct 2019
Hidden Stratification Causes Clinically Meaningful Failures in Machine Learning for Medical Imaging
Luke Oakden-Rayner
Jared A. Dunnmon
G. Carneiro
Christopher Ré
OOD
76
384
0
27 Sep 2019
Data Valuation using Reinforcement Learning
Jinsung Yoon
Sercan O. Arik
Tomas Pfister
TDI
86
181
0
25 Sep 2019
Synthetic Data for Deep Learning
Sergey I. Nikolenko
138
357
0
25 Sep 2019
Modeling Tabular data using Conditional GAN
Lei Xu
Maria Skoularidou
Alfredo Cuesta-Infante
K. Veeramachaneni
CML
MU
SyDa
GAN
121
1,262
0
01 Jul 2019
Robust Bi-Tempered Logistic Loss Based on Bregman Divergences
Ehsan Amid
Manfred K. Warmuth
Rohan Anil
Tomer Koren
NoLa
55
131
0
08 Jun 2019
Likelihood Ratios for Out-of-Distribution Detection
Jie Jessie Ren
Peter J. Liu
Emily Fertig
Jasper Snoek
Ryan Poplin
M. DePristo
Joshua V. Dillon
Balaji Lakshminarayanan
OODD
209
728
0
07 Jun 2019
Evaluating time series forecasting models: An empirical study on performance estimation methods
Vítor Cerqueira
Luís Torgo
I. Mozetič
AI4TS
45
257
0
28 May 2019
Auditing ImageNet: Towards a Model-driven Framework for Annotating Demographic Attributes of Large-Scale Image Datasets
Chris Dulhanty
A. Wong
64
42
0
03 May 2019
Data Cleaning for Accurate, Fair, and Robust Models: A Big Data - AI Integration Approach
Ki Hyun Tae
Yuji Roh
Young H. Oh
Hyunsub Kim
Steven Euijong Whang
64
72
0
22 Apr 2019
Data Shapley: Equitable Valuation of Data for Machine Learning
Amirata Ghorbani
James Zou
TDI
FedML
85
791
0
05 Apr 2019
Benchmarking Neural Network Robustness to Common Corruptions and Perturbations
Dan Hendrycks
Thomas G. Dietterich
OOD
VLM
194
3,455
0
28 Mar 2019