Building on VideoMaMa's strong generalization, the framework introduces a scalable pseudo-labeling pipeline that automatically generates high-quality matting annotations from readily available segmentation cues. This pipeline enables construction of the Matting Anything in Video (MA-V) dataset, comprising over 50,000 real-world videos annotated with pixel-accurate alpha mattes and covering a broad spectrum of everyday scenes, dynamic motion, and environmental conditions. By making large-scale matting training data broadly accessible, VideoMaMa paves the way for video editing tools, compositing workflows, and augmented-reality applications that demand clean foreground-background separation.
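As a rough illustration of how such a pseudo-labeling pipeline could be wired together, the sketch below runs a matting model over clips paired with off-the-shelf segmentation masks and writes per-frame alpha mattes to disk. The `pseudo_label_clip` function, the file layout, and the `matting_model` callable are assumptions for illustration only, not the released pipeline or API.

```python
# Hedged sketch of a pseudo-labeling pipeline: segmentation cues in, alpha mattes out.
# The directory layout and model interface here are illustrative assumptions.
from pathlib import Path

import numpy as np
import imageio.v3 as iio


def pseudo_label_clip(matting_model, frame_dir: Path, mask_dir: Path, out_dir: Path) -> None:
    """Generate per-frame alpha mattes from a clip and its coarse segmentation cues."""
    frames = np.stack([iio.imread(p) for p in sorted(frame_dir.glob("*.png"))])  # (T, H, W, 3)
    masks = np.stack([iio.imread(p) for p in sorted(mask_dir.glob("*.png"))])    # (T, H, W)

    # The segmentation mask indicates which object is foreground; the matting model
    # refines it into a soft alpha matte with fine edge and hair detail.
    alphas = matting_model(frames, masks)  # expected shape (T, H, W), values in [0, 1]

    out_dir.mkdir(parents=True, exist_ok=True)
    for t, alpha in enumerate(alphas):
        iio.imwrite(out_dir / f"{t:05d}.png", (alpha * 255).astype(np.uint8))


if __name__ == "__main__":
    # Placeholder model that just rescales the binary mask; a real run would load
    # the pre-trained VideoMaMa generator here instead.
    dummy_model = lambda frames, masks: masks.astype(np.float32) / 255.0
    pseudo_label_clip(dummy_model, Path("clips/0001"), Path("masks/0001"), Path("ma_v/0001/alpha"))
```

Keeping the model behind a plain callable makes it easy to swap the placeholder for the actual pseudo-labeler while leaving the I/O loop unchanged.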
To demonstrate practical impact, the Segment Anything Model 2 (SAM2) is fine-tuned on the MA-V dataset, yielding SAM2-Matte, which is more robust and accurate on unseen in-the-wild videos than models trained on prior matting datasets. The architecture combines mask-guided processing with diffusion-based refinement, preserving temporal consistency and fine-grained detail across video frames. All models, code, and the complete MA-V dataset are slated for public release, enabling researchers and developers to advance generative video processing and scalable annotation strategies.
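The fine-tuning step can be pictured as supervised alpha regression over MA-V-style (frames, mask prompt, ground-truth alpha) triplets. The sketch below is a minimal, hedged version of such a loop in PyTorch: the model wrapper, dataset, tensor shapes, and loss weighting are assumptions, not the released SAM2-Matte training code.

```python
# Hedged sketch of fine-tuning a promptable segmentation backbone (e.g. SAM2)
# into an alpha-matte predictor on (frames, mask prompt, alpha) triplets.
# The model wrapper and dataset are hypothetical stand-ins.
import torch
import torch.nn.functional as F
from torch.utils.data import DataLoader


def matting_loss(pred_alpha: torch.Tensor, gt_alpha: torch.Tensor) -> torch.Tensor:
    """L1 on alpha values plus L1 on spatial gradients, to keep fine edges sharp."""
    l1 = F.l1_loss(pred_alpha, gt_alpha)
    dx_p, dy_p = torch.gradient(pred_alpha, dim=(-1, -2))
    dx_g, dy_g = torch.gradient(gt_alpha, dim=(-1, -2))
    grad = F.l1_loss(dx_p, dx_g) + F.l1_loss(dy_p, dy_g)
    return l1 + grad


def finetune(model: torch.nn.Module, dataset, epochs: int = 5, lr: float = 1e-5) -> None:
    loader = DataLoader(dataset, batch_size=2, shuffle=True, num_workers=4)
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)
    model.train()
    for _ in range(epochs):
        for frames, mask_prompt, gt_alpha in loader:
            # frames: (B, T, 3, H, W); mask_prompt / gt_alpha: (B, T, 1, H, W)
            pred_alpha = model(frames, mask_prompt)  # wrapper returns alpha in [0, 1]
            loss = matting_loss(pred_alpha, gt_alpha)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
```

The gradient term is one common way to reward crisp matte boundaries; the actual SAM2-Matte objective and prompting scheme may differ.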


