MOSEv2: A More Challenging Dataset for Video Object Segmentation in Complex Scenes

Henghui Ding, Kaining Ying, Chang Liu, Shuting He, Xudong Jiang, Yu-Gang Jiang, Philip H. S. Torr, Song Bai

2025-08-08

Summary

This paper introduces MOSEv2, a new and more difficult dataset designed to test how well video object segmentation (VOS) methods perform in complex, realistic scenes.

What's the problem?

Current VOS datasets mostly feature clear, easy-to-see objects, so existing methods score well on them but struggle on real-world videos with crowded scenes, small or hidden objects, poor lighting, and other difficulties.

What's the solution?

The authors created MOSEv2, which contains thousands of videos with many objects that face real-world difficulties such as disappearing and reappearing, occlusion, bad weather, nighttime scenes, camouflage, and even non-physical targets like shadows. The dataset is harder for VOS models and better exposes their true limits.

Why does it matter?

MOSEv2 pushes the development of VOS methods that can handle real-life challenges, which benefits applications such as video editing, autonomous driving, and surveillance, where accurate object tracking is crucial.

Abstract

MOSEv2, a more challenging dataset, highlights the limitations of current VOS methods in real-world scenarios with increased complexity and diverse challenges.
