Understanding metric-related pitfalls in image analysis validation


Authors	REINKE Annika, TIZABI Minu D, BAUMGARTNER Michael, EISENMANN Matthias, HECKMANN-NÖTZEL Doreen, KAVUR A Emre, RÄDSCH Tim, SUDRE Carole H, ACION Laura, ANTONELLI Michela, ARBEL Tal, BAKAS Spyridon, BENIS Arriel, BUETTNER Florian, CARDOSO M Jorge, CHEPLYGINA Veronika, CHEN Jianxu, CHRISTODOULOU Evangelia, CIMINI Beth A, FARAHANI Keyvan, FERRER Luciana, GALDRAN Adrian, GINNEKEN Bram van, GLOCKER Ben, GODAU Patrick, HASHIMOTO Daniel A, HOFFMAN Michael M, HUISMAN Merel, ISENSEE Fabian, JANNIN Pierre, KAHN Charles E, KAINMUELLER Dagmar, KAINZ Bernhard, KARARGYRIS Alexandros, KLEESIEK Jens, KOFLER Florian, KOOI Thijs, KOPP-SCHNEIDER Annette, KOZUBEK Michal, KRESHUK Anna, KURC Tahsin, LANDMAN Bennett A, LITJENS Geert, MADANI Amin, MAIER-HEIN Klaus, MARTEL Anne L, MEIJERING Erik, MENZE Bjoern, MOONS Karel GM, MÜLLER Henning, NICHYPORUK Brennan, NICKEL Felix, PETERSEN Jens, RAFELSKI Susanne M, RAJPOOT Nasir, REYES Mauricio, RIEGLER Michael A, RIEKE Nicola, SAEZ-RODRIGUEZ Julio, SÁNCHEZ Clara I, SHETTY Shravya, SUMMERS Ronald M, TAHA Abdel A, TIULPIN Aleksei, TSAFTARIS Sotirios A, CALSTER Ben Van, VAROQUAUX Gaël, YANIV Ziv R, JÄGER Paul F, MAIER-HEIN Lena
Year of publication	2024
Type	Article in Periodical
Magazine / Source	NATURE METHODS
MU Faculty or unit	Faculty of Informatics
Web	https://www.nature.com/articles/s41592-023-02150-0
Doi	http://dx.doi.org/10.1038/s41592-023-02150-0
Keywords	SEGMENTATION
Attached files	Reinke_NatMeth_2024.pdf
Description	Validation metrics are key for tracking scientific progress and bridging the current chasm between artificial intelligence research and its translation into practice. However, increasing evidence shows that, particularly in image analysis, metrics are often chosen inadequately. Although taking into account the individual strengths, weaknesses and limitations of validation metrics is a critical prerequisite to making educated choices, the relevant knowledge is currently scattered and poorly accessible to individual researchers. Based on a multistage Delphi process conducted by a multidisciplinary expert consortium as well as extensive community feedback, the present work provides a reliable and comprehensive common point of access to information on pitfalls related to validation metrics in image analysis. Although focused on biomedical image analysis, the addressed pitfalls generalize across application domains and are categorized according to a newly created, domain-agnostic taxonomy. The work serves to enhance global comprehension of a key topic in image analysis validation.
Related projects	National research infrastructure for biological and medical imaging