2206.02280
Annotation Error Detection: Analyzing the Past and Present for a More Coherent Future
5 June 2022
Jan-Christoph Klie
Bonnie Webber
Iryna Gurevych
Papers citing
"Annotation Error Detection: Analyzing the Past and Present for a More Coherent Future"
31 / 31 papers shown
Title
Validating LLM-as-a-Judge Systems in the Absence of Gold Labels
Luke M. Guerdan
Solon Barocas
Kenneth Holstein
Hanna M. Wallach
Zhiwei Steven Wu
Alexandra Chouldechova
ALM
ELM
527
2
0
13 Mar 2025
VideoA11y: Method and Dataset for Accessible Video Description
Chaoyu Li
Sid Padmanabhuni
Maryam Cheema
H. Seifi
Pooyan Fazli
VGen
99
2
0
27 Feb 2025
On Evaluation of Vision Datasets and Models using Human Competency Frameworks
Rahul Ramachandran
Tejal Kulkarni
Charchit Sharma
Deepak Vijaykeerthy
Vineeth N Balasubramanian
59
1
0
06 Sep 2024
Poor-Supervised Evaluation for SuperLLM via Mutual Consistency
Peiwen Yuan
Shaoxiong Feng
Yiwei Li
Xinglin Wang
Boyuan Pan
Heda Wang
Yao Hu
Kan Li
77
1
0
25 Aug 2024
CoverBench: A Challenging Benchmark for Complex Claim Verification
Alon Jacovi
Moran Ambar
Eyal Ben-David
Uri Shaham
Amir Feder
Mor Geva
Dror Marcus
Avi Caciularu
LMTD
96
4
0
06 Aug 2024
Let Guidelines Guide You: A Prescriptive Guideline-Centered Data Annotation Methodology
Federico Ruggeri
Eleonora Misino
Arianna Muti
Katerina Korre
Paolo Torroni
Alberto Barrón-Cedeño
110
1
0
20 Jun 2024
DCA-Bench: A Benchmark for Dataset Curation Agents
Benhao Huang
Yingzhuo Yu
Jin Huang
Xingjian Zhang
Jiaqi Ma
122
1
0
11 Jun 2024
Multi-Label Classification for Implicit Discourse Relation Recognition
Wanqiu Long
N. Siddharth
Bonnie Webber
103
7
0
06 Jun 2024
Large Language Models: A New Approach for Privacy Policy Analysis at Scale
David Rodriguez
Ian Yang
J. M. D. Álamo
Norman M. Sadeh
62
11
0
31 May 2024
On Efficient and Statistical Quality Estimation for Data Annotation
Jan-Christoph Klie
Juan Haladjian
Marc Kirchner
Rahul Nair
75
3
0
20 May 2024
NoiseBench: Benchmarking the Impact of Real Label Noise on Named Entity Recognition
Elena Merdjanovska
Ansar Aynetdinov
Alan Akbik
NoLa
79
1
0
13 May 2024
VariErr NLI: Separating Annotation Error from Human Label Variation
Leon Weber-Genzel
Siyao Peng
M. Marneffe
Barbara Plank
91
11
0
04 Mar 2024
Interpreting Predictive Probabilities: Model Confidence or Human Label Variation?
Joris Baan
Raquel Fernández
Barbara Plank
Wilker Aziz
114
11
0
25 Feb 2024
ToMBench: Benchmarking Theory of Mind in Large Language Models
Zhuang Chen
Jincenzi Wu
Jinfeng Zhou
Bosi Wen
Guanqun Bi
...
Yaru Cao
Mengting Hu
Yunghwei Lai
Zexuan Xiong
Minlie Huang
111
21
0
23 Feb 2024
Different Tastes of Entities: Investigating Human Label Variation in Named Entity Annotations
Siyao Peng
Zihang Sun
Sebastian Loftus
Barbara Plank
69
3
0
02 Feb 2024
Elucidating and Overcoming the Challenges of Label Noise in Supervised Contrastive Learning
Zijun Long
George Killick
Lipeng Zhuang
R. McCreadie
Gerardo Aragon Camarasa
Paul Henderson
60
5
0
25 Nov 2023
Data Cleaning and Machine Learning: A Systematic Literature Review
Pierre-Olivier Coté
Amin Nikanjam
Nafisa Ahmed
D. Humeniuk
Foutse Khomh
91
31
0
03 Oct 2023
Donkii: Can Annotation Error Detection Methods Find Errors in Instruction-Tuning Datasets?
Leon Weber-Genzel
Robert Litschko
Ekaterina Artemova
Barbara Plank
100
2
0
04 Sep 2023
ObjectLab: Automated Diagnosis of Mislabeled Images in Object Detection Data
Ulyana Tkachenko
Aditya Thyagarajan
Jonas W. Mueller
125
5
0
02 Sep 2023
Controlling Federated Learning for Covertness
Adit Jain
Vikram Krishnamurthy
FedML
61
6
0
17 Aug 2023
Analyzing Dataset Annotation Quality Management in the Wild
Jan-Christoph Klie
Richard Eckart de Castilho
Iryna Gurevych
88
26
0
16 Jul 2023
Estimating label quality and errors in semantic segmentation data via any model
Vedang Lad
Jonas W. Mueller
UQCV
73
7
0
11 Jul 2023
Increasing Diversity While Maintaining Accuracy: Text Data Generation with Large Language Models and Human Interventions
John Joon Young Chung
Ece Kamar
Saleema Amershi
ALM
106
121
0
07 Jun 2023
ActiveAED: A Human in the Loop Improves Annotation Error Detection
Leon Weber
Barbara Plank
76
11
0
31 May 2023
What's the Meaning of Superhuman Performance in Today's NLU?
Simone Tedeschi
Johan Bos
T. Declerck
Jan Hajic
Daniel Hershcovich
...
Simon Krek
Steven Schockaert
Rico Sennrich
Ekaterina Shutova
Roberto Navigli
ELM
LM&MA
VLM
ReLM
LRM
96
27
0
15 May 2023
Identifying Incorrect Annotations in Multi-Label Classification Data
Aditya Thyagarajan
Elías Snorrason
Curtis G. Northcutt
Jonas W. Mueller
80
11
0
25 Nov 2022
The 'Problem' of Human Label Variation: On Ground Truth in Data, Modeling and Evaluation
Barbara Plank
93
100
0
04 Nov 2022
Stop Measuring Calibration When Humans Disagree
Joris Baan
Wilker Aziz
Barbara Plank
Raquel Fernández
94
56
0
28 Oct 2022
Detecting Label Errors in Token Classification Data
Wei-Chen Wang
Jonas W. Mueller
127
14
0
08 Oct 2022
Efficient Methods for Natural Language Processing: A Survey
Marcos Vinícius Treviso
Ji-Ung Lee
Tianchu Ji
Betty van Aken
Qingqing Cao
...
Emma Strubell
Niranjan Balasubramanian
Leon Derczynski
Iryna Gurevych
Roy Schwartz
156
114
0
31 Aug 2022
A Survey of Intent Classification and Slot-Filling Datasets for Task-Oriented Dialog
Stefan Larson
Kevin Leach
96
21
0
26 Jul 2022