Semantic Library Adaptation: LoRA Retrieval and Fusion for Open-Vocabulary Semantic Segmentation

Reza Qorbani, Gianluca Villani, Theodoros Panagiotakopoulos, Marc Botet Colomer, Linus Härenstam-Nielsen, Mattia Segu, Pier Luigi Dovesi, Jussi Karlgren, Daniel Cremers, Federico Tombari, Matteo Poggi

2025-03-28

Summary

This paper improves how AI models label the parts of images from unfamiliar domains by drawing on a library of small, pre-trained adapters.

What's the problem?

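Open-vocabulary segmentation models, which label each pixel of an image based on text queries, lose accuracy when the images they encounter differ substantially from their training data. Recovering that accuracy normally requires fine-tuning.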

What's the solution?

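The researchers created a system called Semantic Library Adaptation (SemLA). It keeps a library of LoRA-based adapters, each indexed with a CLIP embedding. For each new input, SemLA retrieves the adapters closest to that input in the embedding space and merges them into a model tailored to it, with no additional training.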

Why does it matter?

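This work matters because it allows AI to label images more accurately in real-world situations, even when those images are very different from what the model was originally trained on. Because adaptation works by selecting adapters from a library, the contribution of each adapter can be tracked, which aids explainability, and the authors state that the method inherently protects data privacy.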

Abstract

Open-vocabulary semantic segmentation models associate vision and text to label pixels from an undefined set of classes using textual queries, providing versatile performance on novel datasets. However, large shifts between training and test domains degrade their performance, requiring fine-tuning for effective real-world applications. We introduce Semantic Library Adaptation (SemLA), a novel framework for training-free, test-time domain adaptation. SemLA leverages a library of LoRA-based adapters indexed with CLIP embeddings, dynamically merging the most relevant adapters based on proximity to the target domain in the embedding space. This approach constructs an ad-hoc model tailored to each specific input without additional training. Our method scales efficiently, enhances explainability by tracking adapter contributions, and inherently protects data privacy, making it ideal for sensitive applications. Comprehensive experiments on a 20-domain benchmark built over 10 standard datasets demonstrate SemLA's superior adaptability and performance across diverse settings, establishing a new standard in domain adaptation for open-vocabulary semantic segmentation.
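The retrieve-and-merge step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes each adapter is stored with one embedding vector (standing in for a CLIP embedding) and a pair of LoRA matrices `(A, B)`, that proximity is Euclidean distance between L2-normalized embeddings, and that merge weights come from a softmax over negative distances. The function name `retrieve_and_fuse` and the parameters `k` and `temperature` are hypothetical, chosen for illustration only.

```python
import numpy as np


def retrieve_and_fuse(query, centroids, adapters, k=2, temperature=0.1):
    """Pick the k adapters nearest to `query` and merge their LoRA updates.

    query:     (d,) embedding of the target input (assumed CLIP-like).
    centroids: (n, d) one index embedding per adapter in the library.
    adapters:  list of n pairs (A, B), A of shape (r, d_in), B of shape (d_out, r).
    Returns the selected indices, their merge weights, and the fused
    weight update of shape (d_out, d_in).
    """
    # Normalize so that distance depends on direction only (an assumption).
    q = query / np.linalg.norm(query)
    c = centroids / np.linalg.norm(centroids, axis=1, keepdims=True)

    # Distance from the query to every adapter's index embedding.
    dist = np.linalg.norm(c - q, axis=1)

    # Retrieval: indices of the k nearest adapters, closest first.
    top = np.argsort(dist)[:k]

    # Weights: softmax over negative distance (closer adapters weigh more).
    logits = -dist[top] / temperature
    logits = logits - logits.max()  # numerical stability
    weights = np.exp(logits)
    weights = weights / weights.sum()

    # Fusion: weighted sum of the low-rank updates B @ A.
    delta = sum(w * (adapters[i][1] @ adapters[i][0]) for w, i in zip(weights, top))
    return top, weights, delta


# Toy library: 5 adapters, 8-dim index embeddings, rank-2 updates for a 3x4 layer.
rng = np.random.default_rng(0)
centroids = rng.normal(size=(5, 8))
adapters = [(rng.normal(size=(2, 4)), rng.normal(size=(3, 2))) for _ in range(5)]

top, weights, delta = retrieve_and_fuse(centroids[3], centroids, adapters, k=2)
```

In this sketch the returned `weights` also serve as a record of how much each retrieved adapter contributed to the merged update, which is the kind of bookkeeping the abstract refers to when it mentions tracking adapter contributions.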
