Machine Learning Testing: Survey, Landscapes and Horizons

19 June 2019

Yang Liu

Papers citing "Machine Learning Testing: Survey, Landscapes and Horizons"

50 / 224 papers shown

Facets of Disparate Impact: Evaluating Legally Consistent Bias in Machine Learning Jarren Briscoe Assefaw Gebremedhin FaML 93 3 0 08 May 2025
Testing Individual Fairness in Graph Neural Networks Roya Nasiri 24 0 0 25 Apr 2025
JurisCTC: Enhancing Legal Judgment Prediction via Cross-Domain Transfer and Contrastive Learning Zhaolu Kang Hongtian Cai Xiangyang Ji Jinzhe Li Nanfei Gu AILaw ELM 55 0 0 24 Apr 2025
Scalability and Maintainability Challenges and Solutions in Machine Learning: Systematic Literature Review Karthik Shivashankar Ghadi S. Al Hajj Antonio Martini 20 0 0 15 Apr 2025
Towards Assessing Deep Learning Test Input Generators Seif Mzoughi Ahmed Hajyahmed Mohamed Elshafei Foutse Khomh Diego Elias Costa AAML 37 0 0 03 Apr 2025
Rule-Guided Reinforcement Learning Policy Evaluation and Improvement Martin Tappler Ignacio D. Lopez-Miguel Sebastian Tschiatschek Ezio Bartocci 64 0 0 13 Mar 2025
Verification and Validation for Trustworthy Scientific Machine Learning John D. Jakeman Lorena A. Barba J. Martins Thomas O'Leary-Roseberry AI4CE 58 0 0 21 Feb 2025
Hallucination Detection in Large Language Models with Metamorphic Relations Borui Yang Md Afif Al Mamun Jie M. Zhang Gias Uddin HILM 64 0 0 20 Feb 2025
Hierarchical Fallback Architecture for High Risk Online Machine Learning Inference Gustavo Polleti Marlesson Santana Felipe Sassi Del Sant Eduardo Fontes 38 1 0 29 Jan 2025
Path Analysis for Effective Fault Localization in Deep Neural Networks Soroush Hashemifar Saeed Parsa A. Kalaee AAML 44 0 0 28 Jan 2025
Do Existing Testing Tools Really Uncover Gender Bias in Text-to-Image Models? Yunbo Lyu Zhou Yang Yuqing Niu Jing Jiang David Lo 37 1 0 28 Jan 2025
Predictable Artificial Intelligence Lexin Zhou Pablo Antonio Moreno Casares Fernando Martínez-Plumed John Burden Ryan Burnell ... Seán Ó hÉigeartaigh Danaja Rutar Wout Schellaert Konstantinos Voudouris José Hernández-Orallo 51 2 0 08 Jan 2025
Benchmarking Generative AI Models for Deep Learning Test Input Generation Maryam Matteo Biagiola Andrea Stocco Vincenzo Riccio VLM 41 3 0 23 Dec 2024
A Coverage-Guided Testing Framework for Quantum Neural Networks Minqi Shao Jianjun Zhao AAML 31 1 0 03 Nov 2024
Automated Trustworthiness Oracle Generation for Machine Learning Text Classifiers Lam Nguyen Tung Steven Cho Xiaoning Du Neelofar Neelofar Valerio Terragni Stefano Ruberto Aldeida Aleti 150 2 0 30 Oct 2024
Towards Better Open-Ended Text Generation: A Multicriteria Evaluation Framework Esteban Garces Arias Hannah Blocher Julian Rodemann Meimingwei Li Christian Heumann Matthias Aßenmacher 28 1 0 24 Oct 2024
Time to Retrain? Detecting Concept Drifts in Machine Learning Systems Tri Minh Triet Pham Karthikeyan Premkumar Mohamed Naili Jinqiu Yang AI4TS 23 0 0 11 Oct 2024
Leveraging generative models to characterize the failure conditions of image classifiers Adrien Le Coz Stéphane Herbin Faouzi Adjed GAN 29 1 0 01 Oct 2024
MILE: A Mutation Testing Framework of In-Context Learning Systems Zeming Wei Yihao Zhang Meng Sun 48 0 0 07 Sep 2024
Large Language Model-Based Agents for Software Engineering: A Survey Junwei Liu Kaixin Wang Yixuan Chen Xin Peng Zhenpeng Chen Lingming Zhang Yiling Lou AI4CE LLMAG LM&Ro 42 37 0 04 Sep 2024
A Catalog of Fairness-Aware Practices in Machine Learning Engineering Gianmario Voria Giulia Sellitto Carmine Ferrara Francesco Abate A. Lucia F. Ferrucci Gemma Catolino Fabio Palomba FaML 39 3 0 29 Aug 2024
Bridging the Gap between Real-world and Synthetic Images for Testing Autonomous Driving Systems Mohammad Hossein Amini Shiva Nejati 35 1 0 25 Aug 2024
LeCov: Multi-level Testing Criteria for Large Language Models Xuan Xie Jiayang Song Yuheng Huang Da Song Fuyuan Zhang Felix Juefei-Xu Lei Ma ELM 31 0 0 20 Aug 2024
Maintainability Challenges in ML: A Systematic Literature Review Karthik Shivashankar Antonio Martini 37 9 0 17 Aug 2024
A Conceptual Framework for Ethical Evaluation of Machine Learning Systems Neha R. Gupta Jessica Hullman Hari Subramonyam FaML 42 3 0 05 Aug 2024
Evaluating Human Trajectory Prediction with Metamorphic Testing Helge Spieker Nassim Belmecheri Arnaud Gotlieb Nadjib Lazaar 31 0 0 26 Jul 2024
Robust Neural Information Retrieval: An Adversarial and Out-of-distribution Perspective Yu-An Liu Ruqing Zhang Jiafeng Guo Maarten de Rijke Yixing Fan Xueqi Cheng 35 6 0 09 Jul 2024
Fuzzy Logic Guided Reward Function Variation: An Oracle for Testing Reinforcement Learning Programs Shiyu Zhang Haoyang Song Qixin Wang Yu Pei 42 0 0 28 Jun 2024
Unbiasing on the Fly: Explanation-Guided Human Oversight of Machine Learning System Decisions Hussaini Mamman S. Basri A. Balogun Abubakar Abdullahi Imam Ganesh M. Kumar L. F. Capretz FaML 33 0 0 25 Jun 2024
On Security Weaknesses and Vulnerabilities in Deep Learning Systems Zhongzheng Lai Huaming Chen Ruoxi Sun Yu Zhang Minhui Xue Dong Yuan AAML 43 2 0 12 Jun 2024
Statistical Multicriteria Benchmarking via the GSD-Front Christoph Jansen G. Schollmeyer Julian Rodemann Hannah Blocher Thomas Augustin 46 4 0 06 Jun 2024
System Safety Monitoring of Learned Components Using Temporal Metric Forecasting Sepehr Sharifi Andrea Stocco Lionel C. Briand AI4TS 48 1 0 21 May 2024
Inherent Trade-Offs between Diversity and Stability in Multi-Task Benchmarks Guanhua Zhang Moritz Hardt 42 7 0 02 May 2024
Predicting Fairness of ML Software Configurations Salvador Robles Herrera Verya Monjezi V. Kreinovich Ashutosh Trivedi Saeid Tizpaz-Niari 29 1 0 29 Apr 2024
Deep Learning Library Testing: Definition, Methods and Challenges Xiaoyu Zhang Weipeng Jiang Chao Shen Qi Li Qian Wang Chenhao Lin Xiaohong Guan AAML 38 1 0 27 Apr 2024
Online Safety Analysis for LLMs: a Benchmark, an Assessment, and a Path Forward Xuan Xie Jiayang Song Zhehua Zhou Yuheng Huang Da Song Lei Ma OffRL 53 6 0 12 Apr 2024
A Survey of Neural Network Robustness Assessment in Image Recognition Jie Wang Jun Ai Minyan Lu Haoran Su Dan Yu Yutao Zhang Junda Zhu Jingyu Liu AAML 30 3 0 12 Apr 2024
How to Evaluate Entity Resolution Systems: An Entity-Centric Framework with Application to Inventor Name Disambiguation Olivier Binette Youngsoo Baek Siddharth Engineer Christina Jones Abel Dasylva Jerome P. Reiter 37 2 0 08 Apr 2024
DeepKnowledge: Generalisation-Driven Deep Learning Testing S. Missaoui Simos Gerasimou Nikolaos Matragkas 40 0 0 25 Mar 2024
How do Machine Learning Projects use Continuous Integration Practices? An Empirical Study on GitHub Actions Joao Helis Bernardo Daniel Alencar Da Costa Sérgio Queiroz de Medeiros U. Kulesza 18 3 0 14 Mar 2024
Beyond Accuracy: An Empirical Study on Unit Testing in Open-source Deep Learning Projects Han Wang Sijia Yu Chunyang Chen Burak Turhan Xiaodong Zhu ELM MLAU 23 2 0 26 Feb 2024
LLM Comparator: Visual Analytics for Side-by-Side Evaluation of Large Language Models Minsuk Kahng Ian Tenney Mahima Pushkarna Michael Xieyang Liu James Wexler Emily Reif Krystal Kallarackal Minsuk Chang Michael Terry Lucas Dixon 56 21 0 16 Feb 2024
What About the Data? A Mapping Study on Data Engineering for AI Systems Petra Heck 26 3 0 07 Feb 2024
Outline of an Independent Systematic Blackbox Test for ML-based Systems H. Wiesbrock Jürgen Grossmann 27 0 0 30 Jan 2024
Data vs. Model Machine Learning Fairness Testing: An Empirical Study Arumoy Shome Luís Cruz A. van Deursen 42 3 0 15 Jan 2024
A Survey on Verification and Validation, Testing and Evaluations of Neurosymbolic Artificial Intelligence Justus Renkhoff Ke-ke Feng Marc Meier-Doernberg Alvaro Velasquez Houbing Herbert Song 34 8 0 06 Jan 2024
New Job, New Gender? Measuring the Social Bias in Image Generation Models Wenxuan Wang Haonan Bai Jen-tse Huang Yuxuan Wan Youliang Yuan Haoyi Qiu Nanyun Peng Michael R. Lyu 47 20 0 01 Jan 2024
The Earth is Flat? Unveiling Factual Errors in Large Language Models Wenxuan Wang Juluan Shi Zhaopeng Tu Youliang Yuan Jen-tse Huang Wenxiang Jiao Michael R. Lyu KELM HILM SyDa 47 1 0 01 Jan 2024
FetaFix: Automatic Fault Localization and Repair of Deep Learning Model Conversions Nikolaos Louloudakis Perry Gibson José Cano Ajitha Rajan 17 0 0 22 Dec 2023
FairSISA: Ensemble Post-Processing to Improve Fairness of Unlearning in LLMs S. Kadhe Anisa Halimi Ambrish Rawat Nathalie Baracaldo MU 22 7 0 12 Dec 2023