Segment Any Motion in Videos


The pipeline takes 2D tracks and depth maps generated by off-the-shelf models as inputs. A motion encoder processes them to capture motion patterns and produces featured tracks, that is, tracks carrying motion features. A tracks decoder then integrates DINO semantic features and decodes the featured tracks while keeping motion and semantic information decoupled, which yields the set of dynamic trajectories. Finally, SAM2, a promptable segmentation model, groups dynamic tracks that belong to the same object and generates pixel-level moving object masks through an iterative prompting strategy. The resulting masks follow object boundaries and motion across video frames.
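To make the inputs concrete, here is a minimal NumPy sketch of the data flow only. It assumes tracks are stored as an `(N, T, 2)` array of `(x, y)` pixel positions and depth maps as a `(T, H, W)` array; these layouts, the function names, and the threshold are illustrative assumptions, not the project's actual interface. The learned motion encoder and tracks decoder are replaced by a hand-written path-length threshold, which is only meaningful for a static camera and is not the method described above.

```python
import numpy as np

def sample_depth(tracks, depth):
    """Look up depth at every track point.

    tracks: (N, T, 2) float array of (x, y) pixel positions (assumed layout).
    depth:  (T, H, W) float array, one depth map per frame (assumed layout).
    Returns an (N, T) array of depth values.
    """
    T, H, W = depth.shape
    # Round to the nearest pixel and clamp to the image bounds.
    x = np.clip(np.rint(tracks[..., 0]).astype(int), 0, W - 1)
    y = np.clip(np.rint(tracks[..., 1]).astype(int), 0, H - 1)
    t = np.arange(T)[None, :]  # (1, T), broadcasts against (N, T)
    return depth[t, y, x]

def toy_dynamic_labels(tracks, depth, threshold=5.0):
    """Stand-in for the learned encoder/decoder (hypothetical, static camera only).

    Builds per-point inputs (x, y, depth) and flags a track as dynamic when its
    total 2D path length exceeds `threshold` pixels.
    """
    d = sample_depth(tracks, depth)
    feats = np.concatenate([tracks, d[..., None]], axis=-1)  # (N, T, 3)
    steps = np.diff(tracks, axis=1)                          # (N, T-1, 2)
    path_len = np.linalg.norm(steps, axis=-1).sum(axis=1)    # (N,)
    return feats, path_len > threshold
```

In the real system the per-track decision comes from a trained network that also sees DINO features; the sketch only shows what goes in (tracks plus sampled depth) and what comes out (one dynamic/static label per track).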


The method is reported to achieve state-of-the-art performance on diverse datasets, including challenging scenes with multiple moving objects, occlusions, and complex background motion. Moving object segmentation of this kind is useful for applications that need high-level scene understanding, such as autonomous driving, video editing, surveillance, and robotics. The open-source implementation covers preprocessing, training, inference, and evaluation, with instructions for installation and usage on compatible hardware.


Key features include:


  • Combines long-range trajectory motion cues with DINO-based semantic features
  • Applies spatio-temporal trajectory attention and a motion-semantic decoupled embedding
  • Processes 2D tracks and depth maps to capture detailed motion patterns
  • Uses SAM2 for pixel-level mask densification and fine-grained segmentation
  • Iterative prompting strategy for accurate grouping of dynamic tracks
  • State-of-the-art performance on complex, multi-object video datasets
  • Open-source with comprehensive preprocessing, training, and inference pipelines
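
The iterative prompting idea in the list above can be sketched as a simple loop: prompt a segmenter with one unassigned dynamic point, assign every dynamic point that falls inside the returned mask to that object, and repeat until no points remain. The sketch below is an assumption-laden illustration, not the project's code. `segment_fn` is a hypothetical callable standing in for a promptable segmenter such as SAM2 (no SAM2 API is used here), and choosing the point nearest the centroid as the prompt is a heuristic picked for this example.

```python
import numpy as np

def iterative_prompting(points, segment_fn, max_objects=10):
    """Group dynamic track points into objects by repeated point prompts.

    points:     (N, 2) int array of (x, y) positions of dynamic tracks in one frame.
    segment_fn: hypothetical callable (x, y) -> boolean mask of shape (H, W).
    Returns (masks, labels), where labels[i] is the object index of point i,
    or -1 if the point was never covered.
    """
    labels = np.full(len(points), -1)
    masks = []
    for obj_id in range(max_objects):
        remaining = np.flatnonzero(labels == -1)
        if remaining.size == 0:
            break
        # Heuristic prompt choice: the unassigned point nearest the centroid
        # of all unassigned points (an assumption made for this sketch).
        pts = points[remaining]
        dists = np.linalg.norm(pts - pts.mean(axis=0), axis=1)
        seed = remaining[np.argmin(dists)]
        mask = segment_fn(points[seed, 0], points[seed, 1])
        masks.append(mask)
        # Assign every unassigned point that lies inside the new mask.
        inside = mask[pts[:, 1], pts[:, 0]]
        labels[remaining[inside]] = obj_id
        labels[seed] = obj_id  # guarantees progress even if the mask misses the seed
    return masks, labels
```

Each pass removes at least the prompt point from the unassigned set, so the loop terminates after at most `min(N, max_objects)` segmenter calls.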
