DC-Check: A Data-Centric AI checklist to guide the development of reliable machine learning systems

9 November 2022
Nabeel Seedat
F. Imrie
M. Schaar
arXiv: 2211.05764

Papers citing "DC-Check: A Data-Centric AI checklist to guide the development of reliable machine learning systems"

50 / 89 papers shown
Title
Unlocking Historical Clinical Trial Data with ALIGN: A Compositional Large Language Model System for Medical Coding
Nabeel Seedat
Caterina Tozzi
Andrea Hita Ardiaca
Mihaela van der Schaar
James Weatherall
Adam Taylor
487
0
0
20 Nov 2024
Improving Adaptive Conformal Prediction Using Self-Supervised Learning
Nabeel Seedat
Alan Jeffares
F. Imrie
M. Schaar
SSL
98
16
0
23 Feb 2023
Synthcity: facilitating innovative use cases of synthetic data in different data modalities
Zhaozhi Qian
B. Cebere
M. Schaar
SyDa
86
63
0
18 Jan 2023
Data-IQ: Characterizing subgroups with heterogeneous outcomes in tabular data
Nabeel Seedat
Jonathan Crabbé
Ioana Bica
M. Schaar
74
25
0
24 Oct 2022
HyperImpute: Generalized Iterative Imputation with Automatic Model Selection
Daniel Jarrett
B. Cebere
Tennison Liu
Alicia Curth
M. Schaar
48
78
0
15 Jun 2022
Domino: Discovering Systematic Errors with Cross-Modal Embeddings
Sabri Eyuboglu
M. Varma
Khaled Kamal Saab
Jean-Benoit Delbrouck
Christopher Lee-Messer
Jared A. Dunnmon
James Zou
Christopher Ré
89
148
0
24 Mar 2022
From Concept Drift to Model Degradation: An Overview on Performance-Aware Drift Detectors
Firas Bayram
Bestoun S. Ahmed
A. Kassler
49
221
0
21 Mar 2022
Data-SUITE: Data-centric identification of in-distribution incongruous examples
Nabeel Seedat
Jonathan Crabbé
Mihaela van der Schaar
OOD
64
14
0
17 Feb 2022
Conditional Generation of Medical Time Series for Extrapolation to Underrepresented Populations
Simon Bing
Andrea Dittadi
Stefan Bauer
Patrick Schwab
SyDa
70
17
0
20 Jan 2022
MLOps -- Definitions, Tools and Challenges
Georgios Symeonidis
Evangelos Nerantzis
A. Kazakis
G. Papakostas
73
92
0
01 Jan 2022
What can Data-Centric AI Learn from Data and ML Engineering?
N. Polyzotis
Matei A. Zaharia
AI4CE
42
51
0
13 Dec 2021
Amazon SageMaker Model Monitor: A System for Real-Time Insights into Deployed Machine Learning Models
David Nigenda
Zohar Karnin
Muhammad Bilal Zafar
Raghu Ramesha
Alan Tan
Michele Donini
K. Kenthapadi
VLM
44
42
0
26 Nov 2021
Sample Selection for Fair and Robust Training
Yuji Roh
Kangwook Lee
Steven Euijong Whang
Changho Suh
66
65
0
27 Oct 2021
DECAF: Generating Fair Synthetic Data Using Causally-Aware Generative Networks
A. Saha
Trent Kyono
J. Linmans
M. Schaar
CML
74
111
0
25 Oct 2021
Tabular Data: Deep Learning is Not All You Need
Ravid Shwartz-Ziv
Amitai Armon
LMTD
162
1,288
0
06 Jun 2021
Counterfactual Invariance to Spurious Correlations: Why and How to Pass Stress Tests
Victor Veitch
Alexander D'Amour
Steve Yadlowsky
Jacob Eisenstein
OOD
68
93
0
31 May 2021
Pervasive Label Errors in Test Sets Destabilize Machine Learning Benchmarks
Curtis G. Northcutt
Anish Athalye
Jonas W. Mueller
94
537
0
26 Mar 2021
Detecting Spurious Correlations with Sanity Tests for Artificial Intelligence Guided Radiology Systems
U. Mahmood
Robik Shrestha
D. Bates
L. Mannelli
G. Corrias
Y. Erdi
Christopher Kanan
60
16
0
04 Mar 2021
A Data Quality-Driven View of MLOps
Cédric Renggli
Luka Rimanic
Nezihe Merve Gürel
Bojan Karlaš
Wentao Wu
Ce Zhang
AI4TS
44
65
0
15 Feb 2021
dalex: Responsible Machine Learning with Interactive Explainability and Fairness in Python
Hubert Baniecki
Wojciech Kretowicz
Piotr Piątyszek
J. Wiśniewski
P. Biecek
FaML
80
97
0
28 Dec 2020
Checklist for responsible deep learning modeling of medical images based on COVID-19 detection studies
Weronika Hryniewska
Przemysław Bombiński
P. Szatkowski
Paulina Tomaszewska
A. Przelaskowski
P. Biecek
OOD
85
47
0
11 Dec 2020
No Subclass Left Behind: Fine-Grained Robustness in Coarse-Grained Classification Problems
N. Sohoni
Jared A. Dunnmon
Geoffrey Angus
Albert Gu
Christopher Ré
87
252
0
25 Nov 2020
Energy-based Out-of-distribution Detection
Weitang Liu
Xiaoyun Wang
John Douglas Owens
Yixuan Li
OODD
273
1,375
0
08 Oct 2020
Dataset Cartography: Mapping and Diagnosing Datasets with Training Dynamics
Swabha Swayamdipta
Roy Schwartz
Nicholas Lourie
Yizhong Wang
Hannaneh Hajishirzi
Noah A. Smith
Yejin Choi
129
452
0
22 Sep 2020
Common pitfalls and recommendations for using machine learning to detect and prognosticate for COVID-19 using chest radiographs and CT scans
M. Roberts
D. Driggs
Matthew Thorpe
J. Gilbey
Michael Yeung
...
Kang Zhang
S. Stranks
James H. F. Rudd
Evis Sala
Carola-Bibiane Schönlieb
OOD
69
774
0
14 Aug 2020
Learning from Noisy Labels with Deep Neural Networks: A Survey
Hwanjun Song
Minseok Kim
Dongmin Park
Yooju Shin
Jae-Gil Lee
NoLa
117
998
0
16 Jul 2020
Normalized Loss Functions for Deep Learning with Noisy Labels
Xingjun Ma
Hanxun Huang
Yisen Wang
Simone Romano
S. Erfani
James Bailey
NoLa
76
445
0
24 Jun 2020
From ImageNet to Image Classification: Contextualizing Progress on Benchmarks
Dimitris Tsipras
Shibani Santurkar
Logan Engstrom
Andrew Ilyas
Aleksander Madry
77
135
0
22 May 2020
Beyond Accuracy: Behavioral Testing of NLP models with CheckList
Marco Tulio Ribeiro
Tongshuang Wu
Carlos Guestrin
Sameer Singh
ELM
210
1,110
0
08 May 2020
Open Graph Benchmark: Datasets for Machine Learning on Graphs
Weihua Hu
Matthias Fey
Marinka Zitnik
Yuxiao Dong
Hongyu Ren
Bowen Liu
Michele Catasta
J. Leskovec
311
2,752
0
02 May 2020
CheXbert: Combining Automatic Labelers and Expert Annotations for Accurate Radiology Report Labeling Using BERT
Akshay Smit
Saahil Jain
Pranav Rajpurkar
Anuj Pareek
A. Ng
M. Lungren
MedIm
62
333
0
20 Apr 2020
Learning Deep Kernels for Non-Parametric Two-Sample Tests
Feng Liu
Wenkai Xu
Jie Lu
Guangquan Zhang
Arthur Gretton
Danica J. Sutherland
81
188
0
21 Feb 2020
Identifying Mislabeled Data using the Area Under the Margin Ranking
Geoff Pleiss
Tianyi Zhang
Ethan R. Elenberg
Kilian Q. Weinberger
NoLa
94
274
0
28 Jan 2020
DP-CGAN: Differentially Private Synthetic Data and Label Generation
Reihaneh Torkzadehmahani
Peter Kairouz
B. Paten
SyDa
72
239
0
27 Jan 2020
Neural Machine Translation: A Review and Survey
Felix Stahlberg
3DV, AI4TS, MedIm
93
330
0
04 Dec 2019
Distributionally Robust Neural Networks for Group Shifts: On the Importance of Regularization for Worst-Case Generalization
Shiori Sagawa
Pang Wei Koh
Tatsunori B. Hashimoto
Percy Liang
OOD
108
1,249
0
20 Nov 2019
A Comprehensive Survey on Transfer Learning
Fuzhen Zhuang
Zhiyuan Qi
Keyu Duan
Dongbo Xi
Yongchun Zhu
Hengshu Zhu
Hui Xiong
Qing He
192
4,474
0
07 Nov 2019
Confident Learning: Estimating Uncertainty in Dataset Labels
Curtis G. Northcutt
Lu Jiang
Isaac L. Chuang
NoLa
191
699
0
31 Oct 2019
Aleatoric and Epistemic Uncertainty in Machine Learning: An Introduction to Concepts and Methods
Eyke Hüllermeier
Willem Waegeman
PERUD
255
1,427
0
21 Oct 2019
Hidden Stratification Causes Clinically Meaningful Failures in Machine Learning for Medical Imaging
Luke Oakden-Rayner
Jared A. Dunnmon
G. Carneiro
Christopher Ré
OOD
76
384
0
27 Sep 2019
Data Valuation using Reinforcement Learning
Jinsung Yoon
Sercan O. Arik
Tomas Pfister
TDI
86
181
0
25 Sep 2019
Synthetic Data for Deep Learning
Sergey I. Nikolenko
138
357
0
25 Sep 2019
Modeling Tabular data using Conditional GAN
Lei Xu
Maria Skoularidou
Alfredo Cuesta-Infante
K. Veeramachaneni
CML, MU, SyDa, GAN
121
1,262
0
01 Jul 2019
Robust Bi-Tempered Logistic Loss Based on Bregman Divergences
Ehsan Amid
Manfred K. Warmuth
Rohan Anil
Tomer Koren
NoLa
58
131
0
08 Jun 2019
Likelihood Ratios for Out-of-Distribution Detection
Jie Jessie Ren
Peter J. Liu
Emily Fertig
Jasper Snoek
Ryan Poplin
M. DePristo
Joshua V. Dillon
Balaji Lakshminarayanan
OODD
209
728
0
07 Jun 2019
Evaluating time series forecasting models: An empirical study on performance estimation methods
Vítor Cerqueira
Luís Torgo
I. Mozetič
AI4TS
48
257
0
28 May 2019
Auditing ImageNet: Towards a Model-driven Framework for Annotating Demographic Attributes of Large-Scale Image Datasets
Chris Dulhanty
A. Wong
64
42
0
03 May 2019
Data Cleaning for Accurate, Fair, and Robust Models: A Big Data - AI Integration Approach
Ki Hyun Tae
Yuji Roh
Young H. Oh
Hyunsub Kim
Steven Euijong Whang
64
72
0
22 Apr 2019
Data Shapley: Equitable Valuation of Data for Machine Learning
Amirata Ghorbani
James Zou
TDI, FedML
85
791
0
05 Apr 2019
Benchmarking Neural Network Robustness to Common Corruptions and Perturbations
Dan Hendrycks
Thomas G. Dietterich
OOD, VLM
196
3,455
0
28 Mar 2019