Every Component Counts: Rethinking the Measure of Success for Medical Semantic Segmentation in Multi-Instance Segmentation Tasks

Alexander Jaus; Constantin Marc Seibold; Simon Reiß; Zdravko Marinov; Keyi Li; Zeling Ye; Stefan Krieg; Jens Kleesiek; Rainer Stiefelhagen

doi:10.1609/aaai.v39i4.32408

Authors

Alexander Jaus Karlsruhe Institute of Technology, Karlsruhe, Germany
Constantin Marc Seibold Institute for AI in Medicine (IKIM), University Medicine Essen, Essen, Germany
Simon Reiß Karlsruhe Institute of Technology, Karlsruhe, Germany
Zdravko Marinov Karlsruhe Institute of Technology, Karlsruhe, Germany
Keyi Li Karlsruhe Institute of Technology, Karlsruhe, Germany
Zeling Ye Karlsruhe Institute of Technology, Karlsruhe, Germany
Stefan Krieg Karlsruhe Institute of Technology, Karlsruhe, Germany
Jens Kleesiek Institute for AI in Medicine (IKIM), University Medicine Essen, Essen, Germany
Rainer Stiefelhagen Karlsruhe Institute of Technology, Karlsruhe, Germany

Abstract

We present Connected-Component (CC)-Metrics, a novel semantic segmentation evaluation protocol designed to align existing semantic segmentation metrics with a multi-instance detection scenario in which each connected component matters. We motivate this setup with the common medical scenario of semantic metastasis segmentation in full-body PET/CT. We show that existing semantic segmentation metrics are biased towards larger connected components, which contradicts the clinical assessment of scans, in which tumor size and clinical relevance are uncorrelated. To rebalance existing segmentation metrics, we propose evaluating them on a per-component basis, thus giving each tumor the same weight irrespective of its size. To match predictions to ground-truth segments, we employ a proximity-based matching criterion and evaluate common metrics locally at the component of interest. Using this approach, we break free of the biases introduced by large metastases for overlap-based metrics such as Dice or Surface Dice. CC-Metrics also improves distance-based metrics such as Hausdorff distances, which are uninformative for small changes that do not influence the maximum or 95th percentile, and avoids the pitfalls introduced by directly combining counting-based metrics with overlap-based metrics, as is done in Panoptic Quality.
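
The per-component idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' reference implementation. It assumes that "proximity-based matching" means assigning every pixel to its nearest ground-truth connected component, and that the final score is the plain mean of the per-component Dice values; both are assumptions made for illustration. The helper names `dice` and `cc_dice` are hypothetical.

```python
import numpy as np
from scipy import ndimage


def dice(pred, gt):
    """Plain Dice score between two boolean masks (1.0 if both are empty)."""
    denom = pred.sum() + gt.sum()
    return 1.0 if denom == 0 else 2.0 * np.logical_and(pred, gt).sum() / denom


def cc_dice(pred, gt):
    """Mean Dice over ground-truth connected components (illustrative only)."""
    pred = np.asarray(pred, dtype=bool)
    gt = np.asarray(gt, dtype=bool)

    # Split the ground truth into connected components labelled 1..n.
    labels, n = ndimage.label(gt)
    if n == 0:
        return dice(pred, gt)

    # For every pixel, look up the coordinates of the nearest ground-truth
    # pixel, then read that pixel's component label. This partitions the
    # image into one region per component (assumed matching rule).
    idx = ndimage.distance_transform_edt(
        labels == 0, return_distances=False, return_indices=True
    )
    regions = labels[tuple(idx)]

    # Score each component only against the predictions inside its region,
    # so every component contributes equally regardless of its size.
    scores = [dice(pred & (regions == k), labels == k) for k in range(1, n + 1)]
    return float(np.mean(scores))


# Toy example: one large lesion (64 pixels) and one small lesion (4 pixels).
gt = np.zeros((10, 20), dtype=bool)
gt[1:9, 1:9] = True
gt[4:6, 16:18] = True

# The prediction recovers the large lesion perfectly and misses the small one.
pred = np.zeros_like(gt)
pred[1:9, 1:9] = True

print("global Dice:", dice(pred, gt))
print("per-component Dice:", cc_dice(pred, gt))
```

In this toy case the global Dice is 128/132, roughly 0.97, even though one of the two lesions is missed entirely, whereas the per-component mean is 0.5. This is the size bias the abstract refers to, shown under the assumptions stated above.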
