2009.10795
Dataset Cartography: Mapping and Diagnosing Datasets with Training Dynamics
22 September 2020
Swabha Swayamdipta
Roy Schwartz
Nicholas Lourie
Yizhong Wang
Hannaneh Hajishirzi
Noah A. Smith
Yejin Choi
Papers citing
"Dataset Cartography: Mapping and Diagnosing Datasets with Training Dynamics"
50 / 115 papers shown
Title
Revisiting Pre-training in Audio-Visual Learning
Ruoxuan Feng
Wenke Xia
Di Hu
30
1
0
07 Feb 2023
FUN with Fisher: Improving Generalization of Adapter-Based Cross-lingual Transfer with Scheduled Unfreezing
Chen Cecilia Liu
Jonas Pfeiffer
Ivan Vulić
Iryna Gurevych
CLL
26
9
0
13 Jan 2023
Understanding Difficulty-based Sample Weighting with a Universal Difficulty Measure
Xiaoling Zhou
Ou Wu
Weiyao Zhu
Ziyang Liang
27
2
0
12 Jan 2023
DISCO: Distilling Counterfactuals with Large Language Models
Zeming Chen
Qiyue Gao
Antoine Bosselut
Ashish Sabharwal
Kyle Richardson
34
25
0
20 Dec 2022
Learning from Training Dynamics: Identifying Mislabeled Data Beyond Manually Designed Features
Qingrui Jia
Xuhong Li
Lei Yu
Jiang Bian
Penghao Zhao
Shupeng Li
Haoyi Xiong
Dejing Dou
NoLa
35
5
0
19 Dec 2022
Azimuth: Systematic Error Analysis for Text Classification
Gabrielle Gauthier Melançon
Orlando Marquez Ayala
Lindsay D. Brin
Chris Tyler
Frederic Branchaud-Charron
Joseph Marinier
Karine Grande
Dieu-Thu Le
16
3
0
16 Dec 2022
Easy to Decide, Hard to Agree: Reducing Disagreements Between Saliency Methods
Josip Jukić
Martin Tutek
Jan Šnajder
FAtt
21
0
0
15 Nov 2022
BERT on a Data Diet: Finding Important Examples by Gradient-Based Pruning
Mohsen Fayyaz
Ehsan Aghazadeh
Ali Modarressi
Mohammad Taher Pilehvar
Yadollah Yaghoobzadeh
Samira Ebrahimi Kahou
22
18
0
10 Nov 2022
DC-Check: A Data-Centric AI checklist to guide the development of reliable machine learning systems
Nabeel Seedat
F. Imrie
M. Schaar
27
12
0
09 Nov 2022
The 'Problem' of Human Label Variation: On Ground Truth in Data, Modeling and Evaluation
Barbara Plank
30
97
0
04 Nov 2022
Exploring Mode Connectivity for Pre-trained Language Models
Yujia Qin
Cheng Qian
Jing Yi
Weize Chen
Yankai Lin
Xu Han
Zhiyuan Liu
Maosong Sun
Jie Zhou
29
20
0
25 Oct 2022
Training Dynamics for Curriculum Learning: A Study on Monolingual and Cross-lingual NLU
Fenia Christopoulou
Gerasimos Lampouras
Ignacio Iacobacci
45
3
0
22 Oct 2022
SimANS: Simple Ambiguous Negatives Sampling for Dense Text Retrieval
Kun Zhou
Yeyun Gong
Xiao Liu
Wayne Xin Zhao
Yelong Shen
...
Jing Lu
Rangan Majumder
Ji-Rong Wen
Nan Duan
Weizhu Chen
34
33
0
21 Oct 2022
Improving Data Quality with Training Dynamics of Gradient Boosting Decision Trees
M. Ponti
L. Oliveira
Mathias Esteban
Valentina Garcia
J. Román
Luis Argerich
TDI
30
4
0
20 Oct 2022
A Survey of Active Learning for Natural Language Processing
Zhisong Zhang
Emma Strubell
Eduard H. Hovy
LM&MA
33
65
0
18 Oct 2022
TiDAL: Learning Training Dynamics for Active Learning
Seong Min Kye
Kwanghee Choi
Hyeongmin Byun
Buru Chang
34
13
0
13 Oct 2022
SEAL : Interactive Tool for Systematic Error Analysis and Labeling
Nazneen Rajani
Weixin Liang
Lingjiao Chen
Margaret Mitchell
James Zou
45
16
0
11 Oct 2022
CORE: A Retrieve-then-Edit Framework for Counterfactual Data Generation
Tanay Dixit
Bhargavi Paranjape
Hannaneh Hajishirzi
Luke Zettlemoyer
SyDa
146
23
0
10 Oct 2022
State-of-the-art generalisation research in NLP: A taxonomy and review
Dieuwke Hupkes
Mario Giulianelli
Verna Dankers
Mikel Artetxe
Yanai Elazar
...
Leila Khalatbari
Maria Ryskina
Rita Frieske
Ryan Cotterell
Zhijing Jin
119
93
0
06 Oct 2022
PROD: Progressive Distillation for Dense Retrieval
Zhenghao Lin
Yeyun Gong
Xiao Liu
Hang Zhang
Chen Lin
...
Jian Jiao
Jing Lu
Daxin Jiang
Rangan Majumder
Nan Duan
51
27
0
27 Sep 2022
Metadata Archaeology: Unearthing Data Subsets by Leveraging Training Dynamics
Shoaib Ahmed Siddiqui
Nitarshan Rajkumar
Tegan Maharaj
David M. Krueger
Sara Hooker
44
27
0
20 Sep 2022
Efficient Methods for Natural Language Processing: A Survey
Marcos Vinícius Treviso
Ji-Ung Lee
Tianchu Ji
Betty van Aken
Qingqing Cao
...
Emma Strubell
Niranjan Balasubramanian
Leon Derczynski
Iryna Gurevych
Roy Schwartz
30
109
0
31 Aug 2022
The Value of Out-of-Distribution Data
Ashwin De Silva
Rahul Ramesh
Carey E. Priebe
Pratik Chaudhari
Joshua T. Vogelstein
OODD
23
11
0
23 Aug 2022
Evaluating and Crafting Datasets Effective for Deep Learning With Data Maps
Jay Bishnu
Andrew Gondoputro
13
1
0
22 Aug 2022
Annotation Error Detection: Analyzing the Past and Present for a More Coherent Future
Jan-Christoph Klie
Bonnie Webber
Iryna Gurevych
40
43
0
05 Jun 2022
Re-Examining Calibration: The Case of Question Answering
Chenglei Si
Chen Zhao
Sewon Min
Jordan L. Boyd-Graber
61
30
0
25 May 2022
An Empirical Investigation of Commonsense Self-Supervision with Knowledge Graphs
Jiarui Zhang
Filip Ilievski
Kaixin Ma
Jonathan M Francis
A. Oltramari
SSL
16
5
0
21 May 2022
Self-training with Two-phase Self-augmentation for Few-shot Dialogue Generation
Wanyu Du
Hanjie Chen
Yangfeng Ji
21
1
0
19 May 2022
ALLSH: Active Learning Guided by Local Sensitivity and Hardness
Shujian Zhang
Chengyue Gong
Xingchao Liu
Pengcheng He
Weizhu Chen
Mingyuan Zhou
27
26
0
10 May 2022
A Data Cartography based MixUp for Pre-trained Language Models
Seohong Park
Cornelia Caragea
13
6
0
06 May 2022
Optimising Equal Opportunity Fairness in Model Training
Aili Shen
Xudong Han
Trevor Cohn
Timothy Baldwin
Lea Frermann
FaML
32
28
0
05 May 2022
Don't Blame the Annotator: Bias Already Starts in the Annotation Instructions
Mihir Parmar
Swaroop Mishra
Mor Geva
Chitta Baral
30
55
0
01 May 2022
Adapting and Evaluating Influence-Estimation Methods for Gradient-Boosted Decision Trees
Jonathan Brophy
Zayd Hammoudeh
Daniel Lowd
TDI
27
22
0
30 Apr 2022
On the Limitations of Dataset Balancing: The Lost Battle Against Spurious Correlations
Roy Schwartz
Gabriel Stanovsky
32
25
0
27 Apr 2022
NumGLUE: A Suite of Fundamental yet Challenging Mathematical Reasoning Tasks
Swaroop Mishra
Arindam Mitra
Neeraj Varshney
Bhavdeep Singh Sachdeva
Peter Clark
Chitta Baral
A. Kalyan
AIMat
ReLM
ELM
LRM
27
102
0
12 Apr 2022
Adaptor: Objective-Centric Adaptation Framework for Language Models
Michal Štefánik
Vít Novotný
Nikola Groverová
Petr Sojka
32
10
0
08 Mar 2022
Feeding What You Need by Understanding What You Learned
Xiaoqiang Wang
Bang Liu
Fangli Xu
Bowei Long
Siliang Tang
Lingfei Wu
62
6
0
05 Mar 2022
MetaShift: A Dataset of Datasets for Evaluating Contextual Distribution Shifts and Training Conflicts
Weixin Liang
James Zou
OOD
40
82
0
14 Feb 2022
FORML: Learning to Reweight Data for Fairness
Bobby Yan
Skyler Seto
N. Apostoloff
FaML
23
11
0
03 Feb 2022
Handling Bias in Toxic Speech Detection: A Survey
Tanmay Garg
Sarah Masud
Tharun Suresh
Tanmoy Chakraborty
17
91
0
26 Jan 2022
WANLI: Worker and AI Collaboration for Natural Language Inference Dataset Creation
Alisa Liu
Swabha Swayamdipta
Noah A. Smith
Yejin Choi
64
211
0
16 Jan 2022
CommonsenseQA 2.0: Exposing the Limits of AI through Gamification
Alon Talmor
Ori Yoran
Ronan Le Bras
Chandrasekhar Bhagavatula
Yoav Goldberg
Yejin Choi
Jonathan Berant
ELM
19
141
0
14 Jan 2022
On the Impact of Hard Adversarial Instances on Overfitting in Adversarial Training
Chen Liu
Zhichao Huang
Mathieu Salzmann
Tong Zhang
Sabine Süsstrunk
AAML
23
13
0
14 Dec 2021
Dataset Geography: Mapping Language Data to Language Users
Fahim Faisal
Yinkai Wang
Antonios Anastasopoulos
62
23
0
07 Dec 2021
Multi-View Active Learning for Short Text Classification in User-Generated Data
Payam Karisani
Negin Karisani
Li Xiong
VLM
15
4
0
05 Dec 2021
Understanding Out-of-distribution: A Perspective of Data Dynamics
Dyah Adila
Dongyeop Kang
38
12
0
29 Nov 2021
Clean or Annotate: How to Spend a Limited Data Collection Budget
Derek Chen
Zhou Yu
Samuel R. Bowman
35
13
0
15 Oct 2021
Online Multi-horizon Transaction Metric Estimation with Multi-modal Learning in Payment Networks
Chin-Chia Michael Yeh
Zhongfang Zhuang
Junpeng Wang
Yan Zheng
J. Ebrahimi
Ryan Mercer
Liang Wang
Wei Zhang
AI4TS
24
4
0
21 Sep 2021
Training Dynamic based data filtering may not work for NLP datasets
Arka Talukdar
Monika Dagar
Prachi Gupta
Varun G. Menon
NoLa
45
3
0
19 Sep 2021
The Grammar-Learning Trajectories of Neural Language Models
Leshem Choshen
Guy Hacohen
D. Weinshall
Omri Abend
29
28
0
13 Sep 2021