Machine Learning Testing: Survey, Landscapes and Horizons
arXiv:1906.10742

19 June 2019
Jie M. Zhang
Mark Harman
Lei Ma
Yang Liu
    VLM
    AILaw
Papers citing "Machine Learning Testing: Survey, Landscapes and Horizons"

50 / 224 papers shown
Facets of Disparate Impact: Evaluating Legally Consistent Bias in Machine Learning
Jarren Briscoe
Assefaw Gebremedhin
FaML
93
3
0
08 May 2025
Testing Individual Fairness in Graph Neural Networks
Roya Nasiri
24
0
0
25 Apr 2025
JurisCTC: Enhancing Legal Judgment Prediction via Cross-Domain Transfer and Contrastive Learning
Zhaolu Kang
Hongtian Cai
Xiangyang Ji
Jinzhe Li
Nanfei Gu
AILaw
ELM
55
0
0
24 Apr 2025
Scalability and Maintainability Challenges and Solutions in Machine Learning: Systematic Literature Review
Karthik Shivashankar
Ghadi S. Al Hajj
Antonio Martini
20
0
0
15 Apr 2025
Towards Assessing Deep Learning Test Input Generators
Seif Mzoughi
Ahmed Hajyahmed
Mohamed Elshafei
Foutse Khomh
Diego Elias Costa
AAML
37
0
0
03 Apr 2025
Rule-Guided Reinforcement Learning Policy Evaluation and Improvement
Martin Tappler
Ignacio D. Lopez-Miguel
Sebastian Tschiatschek
Ezio Bartocci
64
0
0
13 Mar 2025
Verification and Validation for Trustworthy Scientific Machine Learning
John D. Jakeman
Lorena A. Barba
J. Martins
Thomas O'Leary-Roseberry
AI4CE
58
0
0
21 Feb 2025
Hallucination Detection in Large Language Models with Metamorphic Relations
Borui Yang
Md Afif Al Mamun
Jie M. Zhang
Gias Uddin
HILM
64
0
0
20 Feb 2025
Hierarchical Fallback Architecture for High Risk Online Machine Learning Inference
Gustavo Polleti
Marlesson Santana
Felipe Sassi Del Sant
Eduardo Fontes
38
1
0
29 Jan 2025
Path Analysis for Effective Fault Localization in Deep Neural Networks
Soroush Hashemifar
Saeed Parsa
A. Kalaee
AAML
44
0
0
28 Jan 2025
Do Existing Testing Tools Really Uncover Gender Bias in Text-to-Image Models?
Yunbo Lyu
Zhou Yang
Yuqing Niu
Jing Jiang
David Lo
37
1
0
28 Jan 2025
Predictable Artificial Intelligence
Lexin Zhou
Pablo Antonio Moreno Casares
Fernando Martínez-Plumed
John Burden
Ryan Burnell
...
Seán Ó hÉigeartaigh
Danaja Rutar
Wout Schellaert
Konstantinos Voudouris
José Hernández-Orallo
51
2
0
08 Jan 2025
Benchmarking Generative AI Models for Deep Learning Test Input Generation
Maryam
Matteo Biagiola
Andrea Stocco
Vincenzo Riccio
VLM
41
3
0
23 Dec 2024
A Coverage-Guided Testing Framework for Quantum Neural Networks
Minqi Shao
Jianjun Zhao
AAML
31
1
0
03 Nov 2024
Automated Trustworthiness Oracle Generation for Machine Learning Text Classifiers
Lam Nguyen Tung
Steven Cho
Xiaoning Du
Neelofar Neelofar
Valerio Terragni
Stefano Ruberto
Aldeida Aleti
150
2
0
30 Oct 2024
Towards Better Open-Ended Text Generation: A Multicriteria Evaluation Framework
Esteban Garces Arias
Hannah Blocher
Julian Rodemann
Meimingwei Li
Christian Heumann
Matthias Aßenmacher
28
1
0
24 Oct 2024
Time to Retrain? Detecting Concept Drifts in Machine Learning Systems
Tri Minh Triet Pham
Karthikeyan Premkumar
Mohamed Naili
Jinqiu Yang
AI4TS
23
0
0
11 Oct 2024
Leveraging generative models to characterize the failure conditions of image classifiers
Adrien Le Coz
Stéphane Herbin
Faouzi Adjed
GAN
29
1
0
01 Oct 2024
MILE: A Mutation Testing Framework of In-Context Learning Systems
Zeming Wei
Yihao Zhang
Meng Sun
48
0
0
07 Sep 2024
Large Language Model-Based Agents for Software Engineering: A Survey
Junwei Liu
Kaixin Wang
Yixuan Chen
Xin Peng
Zhenpeng Chen
Lingming Zhang
Yiling Lou
AI4CE
LLMAG
LM&Ro
42
37
0
04 Sep 2024
A Catalog of Fairness-Aware Practices in Machine Learning Engineering
Gianmario Voria
Giulia Sellitto
Carmine Ferrara
Francesco Abate
A. Lucia
F. Ferrucci
Gemma Catolino
Fabio Palomba
FaML
39
3
0
29 Aug 2024
Bridging the Gap between Real-world and Synthetic Images for Testing Autonomous Driving Systems
Mohammad Hossein Amini
Shiva Nejati
35
1
0
25 Aug 2024
LeCov: Multi-level Testing Criteria for Large Language Models
Xuan Xie
Jiayang Song
Yuheng Huang
Da Song
Fuyuan Zhang
Felix Juefei-Xu
Lei Ma
ELM
31
0
0
20 Aug 2024
Maintainability Challenges in ML: A Systematic Literature Review
Karthik Shivashankar
Antonio Martini
37
9
0
17 Aug 2024
A Conceptual Framework for Ethical Evaluation of Machine Learning Systems
Neha R. Gupta
Jessica Hullman
Hari Subramonyam
FaML
42
3
0
05 Aug 2024
Evaluating Human Trajectory Prediction with Metamorphic Testing
Helge Spieker
Nassim Belmecheri
Arnaud Gotlieb
Nadjib Lazaar
31
0
0
26 Jul 2024
Robust Neural Information Retrieval: An Adversarial and Out-of-distribution Perspective
Yu-An Liu
Ruqing Zhang
Jiafeng Guo
Maarten de Rijke
Yixing Fan
Xueqi Cheng
35
6
0
09 Jul 2024
Fuzzy Logic Guided Reward Function Variation: An Oracle for Testing Reinforcement Learning Programs
Shiyu Zhang
Haoyang Song
Qixin Wang
Yu Pei
42
0
0
28 Jun 2024
Unbiasing on the Fly: Explanation-Guided Human Oversight of Machine Learning System Decisions
Hussaini Mamman
S. Basri
A. Balogun
Abubakar Abdullahi Imam
Ganesh M. Kumar
L. F. Capretz
FaML
33
0
0
25 Jun 2024
On Security Weaknesses and Vulnerabilities in Deep Learning Systems
Zhongzheng Lai
Huaming Chen
Ruoxi Sun
Yu Zhang
Minhui Xue
Dong Yuan
AAML
43
2
0
12 Jun 2024
Statistical Multicriteria Benchmarking via the GSD-Front
Christoph Jansen
G. Schollmeyer
Julian Rodemann
Hannah Blocher
Thomas Augustin
46
4
0
06 Jun 2024
System Safety Monitoring of Learned Components Using Temporal Metric Forecasting
Sepehr Sharifi
Andrea Stocco
Lionel C. Briand
AI4TS
48
1
0
21 May 2024
Inherent Trade-Offs between Diversity and Stability in Multi-Task Benchmarks
Guanhua Zhang
Moritz Hardt
42
7
0
02 May 2024
Predicting Fairness of ML Software Configurations
Salvador Robles Herrera
Verya Monjezi
V. Kreinovich
Ashutosh Trivedi
Saeid Tizpaz-Niari
29
1
0
29 Apr 2024
Deep Learning Library Testing: Definition, Methods and Challenges
Xiaoyu Zhang
Weipeng Jiang
Chao Shen
Qi Li
Qian Wang
Chenhao Lin
Xiaohong Guan
AAML
38
1
0
27 Apr 2024
Online Safety Analysis for LLMs: a Benchmark, an Assessment, and a Path Forward
Xuan Xie
Jiayang Song
Zhehua Zhou
Yuheng Huang
Da Song
Lei Ma
OffRL
53
6
0
12 Apr 2024
A Survey of Neural Network Robustness Assessment in Image Recognition
Jie Wang
Jun Ai
Minyan Lu
Haoran Su
Dan Yu
Yutao Zhang
Junda Zhu
Jingyu Liu
AAML
30
3
0
12 Apr 2024
How to Evaluate Entity Resolution Systems: An Entity-Centric Framework with Application to Inventor Name Disambiguation
Olivier Binette
Youngsoo Baek
Siddharth Engineer
Christina Jones
Abel Dasylva
Jerome P. Reiter
37
2
0
08 Apr 2024
DeepKnowledge: Generalisation-Driven Deep Learning Testing
S. Missaoui
Simos Gerasimou
Nikolaos Matragkas
40
0
0
25 Mar 2024
How do Machine Learning Projects use Continuous Integration Practices? An Empirical Study on GitHub Actions
Joao Helis Bernardo
Daniel Alencar Da Costa
Sérgio Queiroz de Medeiros
U. Kulesza
18
3
0
14 Mar 2024
Beyond Accuracy: An Empirical Study on Unit Testing in Open-source Deep Learning Projects
Han Wang
Sijia Yu
Chunyang Chen
Burak Turhan
Xiaodong Zhu
ELM
MLAU
23
2
0
26 Feb 2024
LLM Comparator: Visual Analytics for Side-by-Side Evaluation of Large Language Models
Minsuk Kahng
Ian Tenney
Mahima Pushkarna
Michael Xieyang Liu
James Wexler
Emily Reif
Krystal Kallarackal
Minsuk Chang
Michael Terry
Lucas Dixon
56
21
0
16 Feb 2024
What About the Data? A Mapping Study on Data Engineering for AI Systems
Petra Heck
26
3
0
07 Feb 2024
Outline of an Independent Systematic Blackbox Test for ML-based Systems
H. Wiesbrock
Jürgen Grossmann
27
0
0
30 Jan 2024
Data vs. Model Machine Learning Fairness Testing: An Empirical Study
Arumoy Shome
Luís Cruz
A. van Deursen
42
3
0
15 Jan 2024
A Survey on Verification and Validation, Testing and Evaluations of Neurosymbolic Artificial Intelligence
Justus Renkhoff
Ke-ke Feng
Marc Meier-Doernberg
Alvaro Velasquez
Houbing Herbert Song
34
8
0
06 Jan 2024
New Job, New Gender? Measuring the Social Bias in Image Generation Models
Wenxuan Wang
Haonan Bai
Jen-tse Huang
Yuxuan Wan
Youliang Yuan
Haoyi Qiu
Nanyun Peng
Michael R. Lyu
47
20
0
01 Jan 2024
The Earth is Flat? Unveiling Factual Errors in Large Language Models
Wenxuan Wang
Juluan Shi
Zhaopeng Tu
Youliang Yuan
Jen-tse Huang
Wenxiang Jiao
Michael R. Lyu
KELM
HILM
SyDa
47
1
0
01 Jan 2024
FetaFix: Automatic Fault Localization and Repair of Deep Learning Model Conversions
Nikolaos Louloudakis
Perry Gibson
José Cano
Ajitha Rajan
17
0
0
22 Dec 2023
FairSISA: Ensemble Post-Processing to Improve Fairness of Unlearning in LLMs
S. Kadhe
Anisa Halimi
Ambrish Rawat
Nathalie Baracaldo
MU
22
7
0
12 Dec 2023