
Generalized Out-of-Distribution Detection and Beyond in Vision Language Model Era: A Survey

Atsuyuki Miyai, Jingkang Yang, Jingyang Zhang, Yifei Ming, Yueqian Lin, Qing Yu, Go Irie, Shafiq Joty, Yixuan Li, Hai Li, Ziwei Liu, Toshihiko Yamasaki, Kiyoharu Aizawa

2024-08-02


Summary

This paper surveys the topic of out-of-distribution (OOD) detection in machine learning, especially in the context of vision-language models. It discusses how to identify data that doesn't match what a model was trained on and explores related problems like anomaly detection and open set recognition.

What's the problem?

Detecting out-of-distribution samples is essential for ensuring that machine learning systems work safely and effectively. However, OOD detection is often conflated with closely related problems, such as anomaly detection and novelty detection, and many existing methods address only narrow settings. This confusion can lead to misunderstandings and ineffective solutions in real-world applications.

What's the solution?

The authors propose a generalized framework for OOD detection that categorizes five related problems: anomaly detection (AD), novelty detection (ND), open set recognition (OSR), outlier detection (OD), and OOD detection itself. By unifying these areas, the framework helps clarify their relationships and improves the understanding of how to tackle these challenges. The paper also reviews recent advancements in the field, particularly focusing on how large vision-language models have changed the landscape of OOD detection.
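One concrete way vision-language models have reshaped OOD detection is zero-shot scoring: CLIP's text encoder embeds the in-distribution class names, and an image whose embedding matches none of them well is flagged as OOD. The sketch below illustrates this idea in the style of a maximum-softmax score over image-text similarities (as in methods like MCM), using NumPy with hand-made stand-in embeddings; the function name and toy vectors are illustrative assumptions, not the survey's code.

```python
import numpy as np

def zero_shot_ood_score(image_emb, text_embs, temperature=1.0):
    """Softmax over cosine similarities between an image embedding and the
    in-distribution (ID) class text embeddings. A low maximum probability
    suggests the image is out-of-distribution."""
    image_emb = image_emb / np.linalg.norm(image_emb)
    text_embs = text_embs / np.linalg.norm(text_embs, axis=1, keepdims=True)
    logits = text_embs @ image_emb / temperature
    probs = np.exp(logits - logits.max())   # numerically stable softmax
    probs /= probs.sum()
    return probs.max()                      # higher = more ID-like

# Toy demo with hand-made stand-in vectors; a real system would obtain these
# from CLIP's image and text encoders.
text_embs = np.eye(3)                        # 3 ID class "prompt" embeddings
id_image = np.array([1.0, 0.1, 0.0])         # close to class 0
ood_image = np.array([1.0, 1.0, 1.0])        # equally far from every class

id_score = zero_shot_ood_score(id_image, text_embs)
ood_score = zero_shot_ood_score(ood_image, text_embs)
# id_score exceeds ood_score, so a threshold between them flags ood_image as OOD
```

Because no classifier is trained on the ID data, the same scoring rule works for any set of class names, which is precisely why VLMs blurred the boundaries between the five tasks the survey categorizes.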

Why it matters?

This research is important because it provides a clearer understanding of OOD detection and its related challenges, which is crucial for developing safer and more reliable machine learning systems. By addressing these issues, the findings can help improve applications like autonomous driving, where recognizing unfamiliar or unexpected situations is critical for safety.

Abstract

Detecting out-of-distribution (OOD) samples is crucial for ensuring the safety of machine learning systems and has shaped the field of OOD detection. Meanwhile, several other problems are closely related to OOD detection, including anomaly detection (AD), novelty detection (ND), open set recognition (OSR), and outlier detection (OD). To unify these problems, a generalized OOD detection framework was proposed, taxonomically categorizing these five problems. However, Vision Language Models (VLMs) such as CLIP have significantly changed the paradigm and blurred the boundaries between these fields, again confusing researchers. In this survey, we first present a generalized OOD detection v2, encapsulating the evolution of AD, ND, OSR, OOD detection, and OD in the VLM era. Our framework reveals that, with some field inactivity and integration, the demanding challenges have become OOD detection and AD. In addition, we also highlight the significant shift in the definition, problem settings, and benchmarks; we thus feature a comprehensive review of the methodology for OOD detection, including the discussion over other related tasks to clarify their relationship to OOD detection. Finally, we explore the advancements in the emerging Large Vision Language Model (LVLM) era, such as GPT-4V. We conclude this survey with open challenges and future directions.