Inverse Virtual Try-On: Generating Multi-Category Product-Style Images from Clothed Individuals
Davide Lobba, Fulvio Sanguigni, Bin Ren, Marcella Cornia, Rita Cucchiara, Nicu Sebe
2025-05-28
Summary
This paper introduces TEMU-VTOFF, an AI system that takes a photo of a person wearing clothes and generates standalone product-style images of those clothes, even when garments from different categories are involved.
What's the problem?
It is hard for a computer to take a picture of someone wearing several pieces of clothing and produce a clean, clear image of each item on its own. The task gets harder when the clothes overlap or belong to different categories, such as shirts, pants, and jackets.
What's the solution?
To solve this, the researchers built TEMU-VTOFF, which uses a Diffusion Transformer (DiT) with multimodal attention to disentangle the features of each piece of clothing and to handle multiple garment categories within a single model. This lets the system create high-quality product images for each garment, no matter how many are in the original photo.
Why does it matter?
This matters for online shopping: it makes it easier to show and sell individual clothing items even when the only available image is a photo of someone wearing them. That saves time for stores and makes things simpler for customers.
Abstract
TEMU-VTOFF, a novel multimodal architecture built on a Diffusion Transformer (DiT) with multimodal attention, achieves state-of-the-art performance on virtual try-off tasks by disentangling garment features and handling multiple garment categories.
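The abstract names multimodal attention without spelling out the mechanism. As a rough illustration of the general idea only, and not the TEMU-VTOFF implementation, the sketch below runs a single attention head over image tokens and text tokens concatenated into one sequence, so each token can attend to both modalities. The function names, the shared projection matrices, the token counts, and the random inputs are all assumptions made for this example; the paper's actual attention design may differ.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def joint_attention(image_tokens, text_tokens, w_q, w_k, w_v):
    """Single-head attention over the concatenation of two modalities.

    Assumption for brevity: both modalities share the same projection
    matrices. This is a simplification chosen for the sketch.
    """
    tokens = image_tokens + text_tokens  # concatenate along the sequence axis
    q, k, v = matmul(tokens, w_q), matmul(tokens, w_k), matmul(tokens, w_v)
    scale = math.sqrt(len(q[0]))
    k_t = [list(col) for col in zip(*k)]  # transpose of k
    scores = matmul(q, k_t)
    weights = [softmax([s / scale for s in row]) for row in scores]
    out = matmul(weights, v)
    n_img = len(image_tokens)
    # Split the output back into per-modality streams.
    return out[:n_img], out[n_img:], weights


def random_matrix(rows, cols, rng):
    """Hypothetical helper: a matrix of Gaussian noise standing in for real features."""
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
d = 4  # hypothetical feature dimension
image_tokens = random_matrix(6, d, rng)  # stand-in for image patch features
text_tokens = random_matrix(3, d, rng)   # stand-in for text features
w_q, w_k, w_v = (random_matrix(d, d, rng) for _ in range(3))

img_out, txt_out, weights = joint_attention(image_tokens, text_tokens, w_q, w_k, w_v)
print(len(img_out), len(txt_out), len(weights[0]))
```

Each row of `weights` covers all nine tokens, so an image token's output mixes information from both the image and the text stream. That cross-modal mixing is the property the term "multimodal attention" usually refers to.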