Detecting Errors and Estimating Accuracy on Unlabeled Data with Self-training Ensembles

29 June 2021

Papers citing "Detecting Errors and Estimating Accuracy on Unlabeled Data with Self-training Ensembles"

Performance Estimation in Binary Classification Using Calibrated Confidence Juhani Kivimäki Jakub Białek W. Kuberski J. Nurminen 50 0 0 08 May 2025
Towards Unsupervised Model Selection for Domain Adaptive Object Detection Hengfu Yu Jinhong Deng Wen Li Lixin Duan 45 0 0 23 Dec 2024
Sequential Harmful Shift Detection Without Labels Salim I. Amoukou Tom Bewley Saumitra Mishra Freddy Lecue Daniele Magazzeni Manuela Veloso 88 1 0 17 Dec 2024
Poor-Supervised Evaluation for SuperLLM via Mutual Consistency Peiwen Yuan Shaoxiong Feng Yiwei Li Xinglin Wang Boyuan Pan Heda Wang Yao Hu Kan Li 33 1 0 25 Aug 2024
DECIDER: Leveraging Foundation Model Priors for Improved Model Failure Detection and Explanation Rakshith Subramanyam Kowshik Thopalli V. Narayanaswamy Jayaraman J. Thiagarajan 27 2 0 01 Aug 2024
MANO: Exploiting Matrix Norm for Unsupervised Accuracy Estimation Under Distribution Shifts Renchunzi Xie Ambroise Odonnat Vasilii Feofanov Weijian Deng Jianfeng Zhang Bo An 51 2 0 29 May 2024
Online Safety Analysis for LLMs: a Benchmark, an Assessment, and a Path Forward Xuan Xie Jiayang Song Zhehua Zhou Yuheng Huang Da Song Lei Ma OffRL 53 6 0 12 Apr 2024
Predicting the Performance of Foundation Models via Agreement-on-the-Line Aman Mehra Rahul Saxena Taeyoun Kim Christina Baek Zico Kolter Aditi Raghunathan UQCV 46 1 0 02 Apr 2024
AETTA: Label-Free Accuracy Estimation for Test-Time Adaptation Taeckyung Lee Sorn Chottananurak Taesik Gong Sung-Ju Lee 37 2 0 01 Apr 2024
A Survey on Evaluation of Out-of-Distribution Generalization Han Yu Jiashuo Liu Xingxuan Zhang Jiayun Wu Peng Cui OOD 47 8 0 04 Mar 2024
Leveraging Gradients for Unsupervised Accuracy Estimation under Distribution Shift Renchunzi Xie Ambroise Odonnat Vasilii Feofanov I. Redko Jianfeng Zhang Bo An UQCV 80 1 0 17 Jan 2024
A Review of Hybrid and Ensemble in Deep Learning for Natural Language Processing Jianguo Jia Wen-Chieh Liang Youzhi Liang VLM 17 17 0 09 Dec 2023
Leveraging Ensemble Diversity for Robust Self-Training in the Presence of Sample Selection Bias Ambroise Odonnat Vasilii Feofanov I. Redko 36 7 0 23 Oct 2023
Large Language Model Routing with Benchmark Datasets Tal Shnitzer Anthony Ou Mírian Silva Kate Soule Yuekai Sun Justin Solomon Neil Thompson Mikhail Yurochkin RALM 16 56 0 27 Sep 2023
Selecting which Dense Retriever to use for Zero-Shot Search Ekaterina Khramtsova Shengyao Zhuang Mahsa Baktashmotlagh Xi Wang Guido Zuccon 32 6 0 18 Sep 2023
Effective Proxy for Human Labeling: Ensemble Disagreement Scores in Large Language Models for Industrial NLP Wei Du Laksh Advani Yashmeet Gambhir Daniel J. Perry Prashant Shiralkar Zhengzheng Xing Aaron Colak ALM 30 1 0 11 Sep 2023
CAME: Contrastive Automated Model Evaluation Ru Peng Qiuyang Duan Haobo Wang Jiachen Ma Yanbo Jiang Yongjun Tu Xiu Jiang J. Zhao ELM 31 4 0 22 Aug 2023
Distance Matters For Improving Performance Estimation Under Covariate Shift Mélanie Roschewitz Ben Glocker 25 1 0 14 Aug 2023
Unsupervised Accuracy Estimation of Deep Visual Models using Domain-Adaptive Adversarial Perturbation without Source Samples JoonHo Lee J. Woo H. Moon Kwonho Lee 27 2 0 19 Jul 2023
LOVM: Language-Only Vision Model Selection O. Zohar Shih-Cheng Huang Kuan-Chieh Jackson Wang Serena Yeung MLLM 42 13 0 15 Jun 2023
(Almost) Provable Error Bounds Under Distribution Shift via Disagreement Discrepancy Elan Rosenfeld Saurabh Garg UQCV 34 4 0 01 Jun 2023
ASPEST: Bridging the Gap Between Active Learning and Selective Prediction Jiefeng Chen Jinsung Yoon Sayna Ebrahimi Sercan Ö. Arik S. Jha Tomas Pfister 36 1 0 07 Apr 2023
On the Efficacy of Generalization Error Prediction Scoring Functions Puja Trivedi Danai Koutra Jayaraman J. Thiagarajan 26 0 0 23 Mar 2023
A Bag-of-Prototypes Representation for Dataset-Level Applications Wei-Chih Tu Weijian Deng Tom Gedeon Liang Zheng 38 9 0 23 Mar 2023
Unsupervised Evaluation of Out-of-distribution Detection: A Data-centric Perspective Yuhang Zhang Weihong Deng Liang Zheng OODD 32 4 0 16 Feb 2023
RLSbench: Domain Adaptation Under Relaxed Label Shift Saurabh Garg Nick Erickson James Sharpnack Alexander J. Smola Sivaraman Balakrishnan Zachary Chase Lipton VLM 33 31 0 06 Feb 2023
Trust, but Verify: Using Self-Supervised Probing to Improve Trustworthiness Ailin Deng Shen Li Miao Xiong Zhirui Chen Bryan Hooi 16 4 0 06 Feb 2023
Confidence and Dispersity Speak: Characterising Prediction Matrix for Unsupervised Accuracy Estimation Weijian Deng Yumin Suh Stephen Gould Liang Zheng UQCV 29 12 0 02 Feb 2023
Demystifying Disagreement-on-the-Line in High Dimensions Dong-Hwan Lee Behrad Moniri Xinmeng Huang Yan Sun Hamed Hassani 21 8 0 31 Jan 2023
Improving the Reliability for Confidence Estimation Haoxuan Qu Yanchao Li Lin Geng Foo Jason Kuen Jiuxiang Gu Jun Liu UQCV 29 9 0 13 Oct 2022
HAPI: A Large-scale Longitudinal Dataset of Commercial ML API Predictions Lingjiao Chen Zhihua Jin Sabri Eyuboglu Christopher Ré Matei A. Zaharia James Zou 48 9 0 18 Sep 2022
Estimating and Explaining Model Performance When Both Covariates and Labels Shift Lingjiao Chen Matei A. Zaharia James Zou 30 15 0 18 Sep 2022
On the Strong Correlation Between Model Invariance and Generalization Weijian Deng Stephen Gould Liang Zheng OOD 32 16 0 14 Jul 2022
Agreement-on-the-Line: Predicting the Performance of Neural Networks under Distribution Shift Christina Baek Yiding Jiang Aditi Raghunathan Zico Kolter 29 79 0 27 Jun 2022
Understanding new tasks through the lens of training data via exponential tilting Subha Maity Mikhail Yurochkin Moulinath Banerjee Yuekai Sun 34 10 0 26 May 2022
Detecting Label Errors by using Pre-Trained Language Models Derek Chong Jenny Hong Christopher D. Manning NoLa 38 21 0 25 May 2022
Predicting Out-of-Distribution Error with the Projection Norm Yaodong Yu Zitong Yang Alexander Wei Yi Ma Jacob Steinhardt OODD 14 43 0 11 Feb 2022
A Note on "Assessing Generalization of SGD via Disagreement" Andreas Kirsch Y. Gal FedML UQCV 23 15 0 03 Feb 2022
Leveraging Unlabeled Data to Predict Out-of-Distribution Performance Saurabh Garg Sivaraman Balakrishnan Zachary Chase Lipton Behnam Neyshabur Hanie Sedghi OODD OOD 45 125 0 11 Jan 2022
Label-Free Model Evaluation with Semi-Structured Dataset Representations Xiaoxiao Sun Yunzhong Hou Hongdong Li Liang Zheng 13 11 0 01 Dec 2021
DAPPER: Label-Free Performance Estimation after Personalization for Heterogeneous Mobile Sensing Taesik Gong Yewon Kim Adiba Orzikulova Yunxin Liu Sung Ju Hwang Jinwoo Shin Sung-Ju Lee 22 8 0 22 Nov 2021
Assessing Generalization of SGD via Disagreement Yiding Jiang Vaishnavh Nagarajan Christina Baek J. Zico Kolter 67 108 0 25 Jun 2021
Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles Balaji Lakshminarayanan Alexander Pritzel Charles Blundell UQCV BDL 276 5,661 0 05 Dec 2016
Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning Y. Gal Zoubin Ghahramani UQCV BDL 285 9,138 0 06 Jun 2015