arXiv:2206.01653 (v8)

Metrics reloaded: Pitfalls and recommendations for image analysis validation

3 June 2022
Lena Maier-Hein, Annika Reinke, Evangelia Christodoulou, M. Tizabi, Patrick Godau, Ben Glocker, Fabian Isensee, Jens Kleesiek, Michal Kozubek, M. Reyes, Michael Baumgartner, Manuel Wiesenfarth, A. Emre Kavur, Matthias Eisenmann, Doreen Heckmann-Nötzel, Tim Rädsch, Laura Acion, Michela Antonelli, Tal Arbel, Arriel Benis, Allison Benis, P. Bankhead, M. Jorge Cardoso, Veronika Cheplygina, Beth A. Cimini, Gary S. Collins, Keyvan Farahani, Luciana Ferrer, Adrian Galdran, Robert Haase, Daniel A. Hashimoto, Michael M. Hoffman, M. Huisman, Pierre Jannin, Charles E. Kahn, Dagmar Kainmueller, Bernhard Kainz, Alexandros Karargyris, Alan Karthikesalingam, H. Kenngott, A. Kopp-Schneider, Anna Kreshuk, Tahsin M. Kurc, David Moher, G. Litjens, Amin Madani, Anne L. Martel, Peter Mattson, Erik H. W. Meijering, Bjoern Menze, Karel G. M. Moons, Henning Müller, Brennan Nichyporuk, Felix Nickel, Jens Petersen, Nasir M. Rajpoot, Nicola Rieke, Julio Saez-Rodriguez, Clarisa Sánchez Gutiérrez, S. Shetty, Maarten van Smeden
Abstract

The field of automatic biomedical image analysis crucially depends on robust and meaningful performance metrics for algorithm validation. Current metric usage, however, is often ill-informed and does not reflect the underlying domain interest. Here, we present a comprehensive framework that guides researchers towards choosing performance metrics in a problem-aware manner. Specifically, we focus on biomedical image analysis problems that can be interpreted as a classification task at image, object or pixel level. The framework first compiles domain interest-, target structure-, data set- and algorithm output-related properties of a given problem into a problem fingerprint, while also mapping it to the appropriate problem category, namely image-level classification, semantic segmentation, instance segmentation, or object detection. It then guides users through the process of selecting and applying a set of appropriate validation metrics while making them aware of potential pitfalls related to individual choices. In this paper, we describe the current status of the Metrics Reloaded recommendation framework, with the goal of obtaining constructive feedback from the image analysis community. The current version has been developed within an international consortium of more than 60 image analysis experts and will be made openly available as a user-friendly toolkit after community-driven optimization.
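The abstract describes a pipeline from a problem fingerprint to a problem category to a set of candidate metrics. The sketch below is a minimal, hypothetical illustration of that idea in Python; the class names, fingerprint properties, and metric lists are assumptions made for illustration only and do not represent the actual Metrics Reloaded toolkit API or its full decision rules.

```python
# Hypothetical sketch of the fingerprint -> category -> candidate-metrics idea
# described in the abstract. All names and lists below are illustrative
# assumptions, NOT the Metrics Reloaded toolkit API.
from dataclasses import dataclass


@dataclass
class ProblemFingerprint:
    """A few of the problem properties the abstract groups into a 'fingerprint'.

    The real framework compiles many more domain-interest-, target-structure-,
    data-set-, and algorithm-output-related properties.
    """
    decision_level: str          # "image" | "object" | "pixel"
    distinguish_instances: bool  # are individual object instances of interest?


def problem_category(fp: ProblemFingerprint) -> str:
    """Map a fingerprint to one of the four categories named in the abstract."""
    if fp.decision_level == "image":
        return "image-level classification"
    if fp.decision_level == "pixel":
        return "instance segmentation" if fp.distinguish_instances else "semantic segmentation"
    return "object detection"


# Illustrative candidate metrics per category (not the framework's recommendations).
CANDIDATE_METRICS = {
    "image-level classification": ["Accuracy", "Balanced Accuracy", "AUROC"],
    "semantic segmentation": ["Dice Similarity Coefficient", "Hausdorff Distance (95th percentile)"],
    "instance segmentation": ["Panoptic Quality", "Per-instance Dice Similarity Coefficient"],
    "object detection": ["Average Precision", "Free-response ROC score"],
}


if __name__ == "__main__":
    fp = ProblemFingerprint(decision_level="pixel", distinguish_instances=False)
    category = problem_category(fp)
    print(category, "->", CANDIDATE_METRICS[category])
```

In the framework itself, the mapping is followed by pitfall-aware guidance for selecting and applying the metrics; the dictionary lookup above only stands in for that step.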
