CheXmask-U: Quantifying uncertainty in landmark-based anatomical segmentation for X-ray images
Matias Cosarinsky, Nicolas Gaggion, Rodrigo Echeveste, Enzo Ferrante
2025-12-15
Summary
This paper focuses on making medical image analysis, specifically the segmentation of anatomical structures in chest X-rays, more reliable by figuring out when the computer is *unsure* about its predictions.
What's the problem?
When computers analyze medical images to help doctors, it's crucial to know when the computer might be wrong. Existing methods often focus on whether each individual pixel is correctly identified, but a different approach uses 'landmarks' to define shapes, which better preserves anatomical structure but hasn't been studied much in terms of how to measure uncertainty. Basically, we need a way to tell whether the computer is confidently outlining the correct structures in an X-ray, or just guessing.
What's the solution?
The researchers used a special type of neural network that combines image analysis with a 'graph-based' approach, which is good at maintaining the correct shape of things. This network has a hidden 'latent space' that represents its understanding of the image. They developed two ways to measure uncertainty: one by looking at how spread out the network's learned latent distribution is, and another by having the network make multiple slightly different predictions and seeing how much they vary. They tested this by intentionally adding noise to the X-rays and showed that the uncertainty measures increased as the images became more corrupted.
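The two measures can be sketched in a few lines of numpy. This is a minimal illustration, not the authors' implementation: the linear `decode` function stands in for the real graph-based generative decoder, and the latent dimensions, landmark count, and sample count are made-up values. The idea it shows is the same: latent uncertainty reads the spread of the variational distribution directly, while predictive uncertainty decodes many latent samples and measures how much the landmark positions vary.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for the graph-based decoder: maps a latent vector to
# (n_landmarks, 2) landmark coordinates. The real model is a learned
# network; this random linear map is purely illustrative.
n_latent, n_landmarks = 8, 5
W = rng.normal(size=(n_latent, n_landmarks * 2))

def decode(z):
    return (z @ W).reshape(n_landmarks, 2)

# Hypothetical encoder output for one image: mean and log-variance of
# the variational latent distribution q(z|x) = N(mu, diag(sigma^2)).
mu = rng.normal(size=n_latent)
logvar = rng.normal(scale=0.5, size=n_latent)
sigma = np.exp(0.5 * logvar)

# (i) Latent uncertainty: read directly from the learned spread sigma.
latent_uncertainty = sigma.mean()

# (ii) Predictive uncertainty: decode many latent samples and measure
# the variation of the predicted landmark positions.
n_samples = 100
zs = mu + sigma * rng.normal(size=(n_samples, n_latent))  # reparameterized samples
preds = np.stack([decode(z) for z in zs])                 # (n_samples, n_landmarks, 2)
per_node_uncertainty = preds.std(axis=0).mean(axis=-1)    # one value per landmark
```

A per-node summary like `per_node_uncertainty` is what makes it possible to flag *where* on the contour a segmentation is unreliable, rather than only giving a single score per image.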
Why it matters?
This work is important because it provides a way to assess the reliability of computer-aided diagnosis using landmark-based segmentation of chest X-rays. It’s not enough for a computer to *make* a prediction; we need to know *how confident* it is. The researchers also released a large dataset of chest X-rays with uncertainty estimates, allowing other researchers to build upon this work and improve the safety and accuracy of medical image analysis, ultimately helping doctors make better decisions.
Abstract
Uncertainty estimation is essential for the safe clinical deployment of medical image segmentation systems, enabling the identification of unreliable predictions and supporting human oversight. While prior work has largely focused on pixel-level uncertainty, landmark-based segmentation offers inherent topological guarantees yet remains underexplored from an uncertainty perspective. In this work, we study uncertainty estimation for anatomical landmark-based segmentation on chest X-rays. Inspired by hybrid neural network architectures that combine standard image convolutional encoders with graph-based generative decoders, and leveraging their variational latent space, we derive two complementary measures: (i) latent uncertainty, captured directly from the learned distribution parameters, and (ii) predictive uncertainty, obtained by generating multiple stochastic output predictions from latent samples. Through controlled corruption experiments we show that both uncertainty measures increase with perturbation severity, reflecting both global and local degradation. We demonstrate that these uncertainty signals can identify unreliable predictions by comparison with manual ground truth, and support out-of-distribution detection on the CheXmask dataset. More importantly, we release CheXmask-U (huggingface.co/datasets/mcosarinsky/CheXmask-U), a large-scale dataset of 657,566 chest X-ray landmark segmentations with per-node uncertainty estimates, enabling researchers to account for spatial variations in segmentation quality when using these anatomical masks. Our findings establish uncertainty estimation as a promising direction to enhance robustness and safe deployment of landmark-based anatomical segmentation methods in chest X-rays. A fully working interactive demo of the method is available at huggingface.co/spaces/matiasky/CheXmask-U and the source code at github.com/mcosarinsky/CheXmask-U.