Dataset Cartography: Mapping and Diagnosing Datasets with Training Dynamics

22 September 2020
Swabha Swayamdipta
Roy Schwartz
Nicholas Lourie
Yizhong Wang
Hannaneh Hajishirzi
Noah A. Smith
Yejin Choi

Papers citing "Dataset Cartography: Mapping and Diagnosing Datasets with Training Dynamics"

50 / 115 papers shown
Title
Revisiting Pre-training in Audio-Visual Learning
Ruoxuan Feng
Wenke Xia
Di Hu
30
1
0
07 Feb 2023
FUN with Fisher: Improving Generalization of Adapter-Based Cross-lingual Transfer with Scheduled Unfreezing
Chen Cecilia Liu
Jonas Pfeiffer
Ivan Vulić
Iryna Gurevych
CLL
26
9
0
13 Jan 2023
Understanding Difficulty-based Sample Weighting with a Universal Difficulty Measure
Xiaoling Zhou
Ou Wu
Weiyao Zhu
Ziyang Liang
27
2
0
12 Jan 2023
DISCO: Distilling Counterfactuals with Large Language Models
Zeming Chen
Qiyue Gao
Antoine Bosselut
Ashish Sabharwal
Kyle Richardson
34
25
0
20 Dec 2022
Learning from Training Dynamics: Identifying Mislabeled Data Beyond Manually Designed Features
Qingrui Jia
Xuhong Li
Lei Yu
Jiang Bian
Penghao Zhao
Shupeng Li
Haoyi Xiong
Dejing Dou
NoLa
35
5
0
19 Dec 2022
Azimuth: Systematic Error Analysis for Text Classification
Gabrielle Gauthier Melançon
Orlando Marquez Ayala
Lindsay D. Brin
Chris Tyler
Frederic Branchaud-Charron
Joseph Marinier
Karine Grande
Dieu-Thu Le
16
3
0
16 Dec 2022
Easy to Decide, Hard to Agree: Reducing Disagreements Between Saliency Methods
Josip Jukić
Martin Tutek
Jan Šnajder
FAtt
21
0
0
15 Nov 2022
BERT on a Data Diet: Finding Important Examples by Gradient-Based Pruning
Mohsen Fayyaz
Ehsan Aghazadeh
Ali Modarressi
Mohammad Taher Pilehvar
Yadollah Yaghoobzadeh
Samira Ebrahimi Kahou
22
18
0
10 Nov 2022
DC-Check: A Data-Centric AI checklist to guide the development of reliable machine learning systems
Nabeel Seedat
F. Imrie
M. Schaar
27
12
0
09 Nov 2022
The 'Problem' of Human Label Variation: On Ground Truth in Data, Modeling and Evaluation
Barbara Plank
30
97
0
04 Nov 2022
Exploring Mode Connectivity for Pre-trained Language Models
Yujia Qin
Cheng Qian
Jing Yi
Weize Chen
Yankai Lin
Xu Han
Zhiyuan Liu
Maosong Sun
Jie Zhou
29
20
0
25 Oct 2022
Training Dynamics for Curriculum Learning: A Study on Monolingual and Cross-lingual NLU
Fenia Christopoulou
Gerasimos Lampouras
Ignacio Iacobacci
45
3
0
22 Oct 2022
SimANS: Simple Ambiguous Negatives Sampling for Dense Text Retrieval
Kun Zhou
Yeyun Gong
Xiao Liu
Wayne Xin Zhao
Yelong Shen
...
Jing Lu
Rangan Majumder
Ji-Rong Wen
Nan Duan
Weizhu Chen
34
33
0
21 Oct 2022
Improving Data Quality with Training Dynamics of Gradient Boosting Decision Trees
M. Ponti
L. Oliveira
Mathias Esteban
Valentina Garcia
J. Román
Luis Argerich
TDI
30
4
0
20 Oct 2022
A Survey of Active Learning for Natural Language Processing
Zhisong Zhang
Emma Strubell
Eduard H. Hovy
LM&MA
33
65
0
18 Oct 2022
TiDAL: Learning Training Dynamics for Active Learning
Seong Min Kye
Kwanghee Choi
Hyeongmin Byun
Buru Chang
34
13
0
13 Oct 2022
SEAL : Interactive Tool for Systematic Error Analysis and Labeling
Nazneen Rajani
Weixin Liang
Lingjiao Chen
Margaret Mitchell
James Zou
45
16
0
11 Oct 2022
CORE: A Retrieve-then-Edit Framework for Counterfactual Data Generation
Tanay Dixit
Bhargavi Paranjape
Hannaneh Hajishirzi
Luke Zettlemoyer
SyDa
146
23
0
10 Oct 2022
State-of-the-art generalisation research in NLP: A taxonomy and review
Dieuwke Hupkes
Mario Giulianelli
Verna Dankers
Mikel Artetxe
Yanai Elazar
...
Leila Khalatbari
Maria Ryskina
Rita Frieske
Ryan Cotterell
Zhijing Jin
119
93
0
06 Oct 2022
PROD: Progressive Distillation for Dense Retrieval
Zhenghao Lin
Yeyun Gong
Xiao Liu
Hang Zhang
Chen Lin
...
Jian Jiao
Jing Lu
Daxin Jiang
Rangan Majumder
Nan Duan
51
27
0
27 Sep 2022
Metadata Archaeology: Unearthing Data Subsets by Leveraging Training Dynamics
Shoaib Ahmed Siddiqui
Nitarshan Rajkumar
Tegan Maharaj
David M. Krueger
Sara Hooker
44
27
0
20 Sep 2022
Efficient Methods for Natural Language Processing: A Survey
Marcos Vinícius Treviso
Ji-Ung Lee
Tianchu Ji
Betty van Aken
Qingqing Cao
...
Emma Strubell
Niranjan Balasubramanian
Leon Derczynski
Iryna Gurevych
Roy Schwartz
30
109
0
31 Aug 2022
The Value of Out-of-Distribution Data
Ashwin De Silva
Rahul Ramesh
Carey E. Priebe
Pratik Chaudhari
Joshua T. Vogelstein
OODD
23
11
0
23 Aug 2022
Evaluating and Crafting Datasets Effective for Deep Learning With Data Maps
Jay Bishnu
Andrew Gondoputro
13
1
0
22 Aug 2022
Annotation Error Detection: Analyzing the Past and Present for a More Coherent Future
Jan-Christoph Klie
Bonnie Webber
Iryna Gurevych
40
43
0
05 Jun 2022
Re-Examining Calibration: The Case of Question Answering
Chenglei Si
Chen Zhao
Sewon Min
Jordan L. Boyd-Graber
61
30
0
25 May 2022
An Empirical Investigation of Commonsense Self-Supervision with Knowledge Graphs
Jiarui Zhang
Filip Ilievski
Kaixin Ma
Jonathan M Francis
A. Oltramari
SSL
16
5
0
21 May 2022
Self-training with Two-phase Self-augmentation for Few-shot Dialogue Generation
Wanyu Du
Hanjie Chen
Yangfeng Ji
21
1
0
19 May 2022
ALLSH: Active Learning Guided by Local Sensitivity and Hardness
Shujian Zhang
Chengyue Gong
Xingchao Liu
Pengcheng He
Weizhu Chen
Mingyuan Zhou
27
26
0
10 May 2022
A Data Cartography based MixUp for Pre-trained Language Models
Seohong Park
Cornelia Caragea
13
6
0
06 May 2022
Optimising Equal Opportunity Fairness in Model Training
Aili Shen
Xudong Han
Trevor Cohn
Timothy Baldwin
Lea Frermann
FaML
32
28
0
05 May 2022
Don't Blame the Annotator: Bias Already Starts in the Annotation Instructions
Mihir Parmar
Swaroop Mishra
Mor Geva
Chitta Baral
30
55
0
01 May 2022
Adapting and Evaluating Influence-Estimation Methods for Gradient-Boosted Decision Trees
Jonathan Brophy
Zayd Hammoudeh
Daniel Lowd
TDI
27
22
0
30 Apr 2022
On the Limitations of Dataset Balancing: The Lost Battle Against Spurious Correlations
Roy Schwartz
Gabriel Stanovsky
32
25
0
27 Apr 2022
NumGLUE: A Suite of Fundamental yet Challenging Mathematical Reasoning Tasks
Swaroop Mishra
Arindam Mitra
Neeraj Varshney
Bhavdeep Singh Sachdeva
Peter Clark
Chitta Baral
A. Kalyan
AIMat
ReLM
ELM
LRM
27
102
0
12 Apr 2022
Adaptor: Objective-Centric Adaptation Framework for Language Models
Michal Štefánik
Vít Novotný
Nikola Groverová
Petr Sojka
32
10
0
08 Mar 2022
Feeding What You Need by Understanding What You Learned
Xiaoqiang Wang
Bang Liu
Fangli Xu
Bowei Long
Siliang Tang
Lingfei Wu
62
6
0
05 Mar 2022
MetaShift: A Dataset of Datasets for Evaluating Contextual Distribution Shifts and Training Conflicts
Weixin Liang
James Zou
OOD
40
82
0
14 Feb 2022
FORML: Learning to Reweight Data for Fairness
Bobby Yan
Skyler Seto
N. Apostoloff
FaML
23
11
0
03 Feb 2022
Handling Bias in Toxic Speech Detection: A Survey
Tanmay Garg
Sarah Masud
Tharun Suresh
Tanmoy Chakraborty
17
91
0
26 Jan 2022
WANLI: Worker and AI Collaboration for Natural Language Inference Dataset Creation
Alisa Liu
Swabha Swayamdipta
Noah A. Smith
Yejin Choi
64
211
0
16 Jan 2022
CommonsenseQA 2.0: Exposing the Limits of AI through Gamification
Alon Talmor
Ori Yoran
Ronan Le Bras
Chandrasekhar Bhagavatula
Yoav Goldberg
Yejin Choi
Jonathan Berant
ELM
19
141
0
14 Jan 2022
On the Impact of Hard Adversarial Instances on Overfitting in Adversarial Training
Chen Liu
Zhichao Huang
Mathieu Salzmann
Tong Zhang
Sabine Süsstrunk
AAML
23
13
0
14 Dec 2021
Dataset Geography: Mapping Language Data to Language Users
Fahim Faisal
Yinkai Wang
Antonios Anastasopoulos
62
23
0
07 Dec 2021
Multi-View Active Learning for Short Text Classification in User-Generated Data
Payam Karisani
Negin Karisani
Li Xiong
VLM
15
4
0
05 Dec 2021
Understanding Out-of-distribution: A Perspective of Data Dynamics
Dyah Adila
Dongyeop Kang
38
12
0
29 Nov 2021
Clean or Annotate: How to Spend a Limited Data Collection Budget
Derek Chen
Zhou Yu
Samuel R. Bowman
35
13
0
15 Oct 2021
Online Multi-horizon Transaction Metric Estimation with Multi-modal Learning in Payment Networks
Chin-Chia Michael Yeh
Zhongfang Zhuang
Junpeng Wang
Yan Zheng
J. Ebrahimi
Ryan Mercer
Liang Wang
Wei Zhang
AI4TS
24
4
0
21 Sep 2021
Training Dynamic based data filtering may not work for NLP datasets
Arka Talukdar
Monika Dagar
Prachi Gupta
Varun G. Menon
NoLa
45
3
0
19 Sep 2021
The Grammar-Learning Trajectories of Neural Language Models
Leshem Choshen
Guy Hacohen
D. Weinshall
Omri Abend
29
28
0
13 Sep 2021