Tell me why: Visual foundation models as self-explainable classifiers
Hugues Turbé, Mina Bjelogrlic, Gianmarco Mengaldo, Christian Lovis
2025-03-03
Summary
This paper introduces ProtoFM, a new approach for making image-recognition AI models explain their own decisions. It combines powerful Visual Foundation Models (VFMs) with a prototype-based design that breaks the model's choices down into understandable parts.
What's the problem?
While AI models have gotten really good at recognizing things in images, it's often hard to understand why they make a particular decision. This is a big issue for important applications where we need to trust and verify the AI's choices. Existing methods that try to explain AI decisions often aren't faithful, meaning the explanation doesn't reflect what the model actually relied on.
What's the solution?
The researchers created ProtoFM, which adds a small, lightweight head (about 1 million parameters) on top of existing powerful image-recognition models, which stay frozen. This head is trained to express the model's decisions in terms of easy-to-understand prototype concepts, and specialized training objectives ensure these explanations are accurate and meaningful.
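The idea of a lightweight head that turns frozen VFM features into a weighted sum of interpretable concept scores can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the dimensions, cosine similarity, and max-pooling over patches are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions (assumptions, not taken from the paper):
# the frozen VFM yields one 768-d feature vector per image patch.
n_patches, feat_dim = 16, 768
n_prototypes, n_classes = 10, 3

features = rng.standard_normal((n_patches, feat_dim))       # frozen VFM output
prototypes = rng.standard_normal((n_prototypes, feat_dim))  # learned concept vectors
class_weights = rng.standard_normal((n_prototypes, n_classes))  # learned linear layer

def normalize(x):
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

# Cosine similarity between every patch and every prototype.
sim = normalize(features) @ normalize(prototypes).T  # (n_patches, n_prototypes)

# Each concept's activation is its best-matching patch, so every
# concept score can be traced back to a region of the image.
concept_scores = sim.max(axis=0)  # (n_prototypes,)

# The prediction is a weighted sum of interpretable concept scores.
logits = concept_scores @ class_weights  # (n_classes,)
prediction = int(np.argmax(logits))
print(prediction)
```

Because only the prototypes and the linear layer are trained, the trainable part stays tiny relative to the frozen backbone, which is what keeps the approach efficient.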
Why does it matter?
This matters because it makes AI image recognition more trustworthy and transparent. By helping us understand why an AI makes certain decisions, ProtoFM could make it safer to use AI in critical areas like medical diagnosis or self-driving cars. It also performs well in both recognizing images and explaining its choices, which is a big step forward in creating AI that's both powerful and understandable.
Abstract
Visual foundation models (VFMs) have become increasingly popular due to their state-of-the-art performance. However, interpretability remains crucial for critical applications. In this sense, self-explainable models (SEM) aim to provide interpretable classifiers that decompose predictions into a weighted sum of interpretable concepts. Despite their promise, recent studies have shown that these explanations often lack faithfulness. In this work, we combine VFMs with a novel prototypical architecture and specialized training objectives. By training only a lightweight head (approximately 1M parameters) on top of frozen VFMs, our approach (ProtoFM) offers an efficient and interpretable solution. Evaluations demonstrate that our approach achieves competitive classification performance while outperforming existing models across a range of interpretability metrics derived from the literature. Code is available at https://github.com/hturbe/proto-fm.