InfiniteYou: Flexible Photo Recrafting While Preserving Your Identity
Liming Jiang, Qing Yan, Yumin Jia, Zichuan Liu, Hao Kang, Xin Lu
2025-03-21
Summary
This paper presents InfiniteYou, a tool that can recraft a photo in many different ways while keeping the person in it recognizable as themselves.
What's the problem?
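It's hard to edit photos flexibly while keeping the person's identity consistent, especially with newer AI image generators built on Diffusion Transformers (DiTs), such as FLUX. Existing methods often lose the person's likeness, follow the text prompt poorly, or produce lower-quality images.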
What's the solution?
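The researchers developed a system called InfiniteYou (InfU). Its core component, InfuseNet, injects identity features into the base image model through residual connections, which improves identity preservation without sacrificing the model's ability to generate flexible edits. A multi-stage training strategy (pretraining followed by supervised fine-tuning on synthetic data) further improves how well the images match the text prompt and reduces face copy-pasting.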
Why does it matter?
This work matters because it can lead to better photo editing tools that allow for more creative control while still ensuring that the person in the photo remains recognizable.
Abstract
Achieving flexible and high-fidelity identity-preserved image generation remains formidable, particularly with advanced Diffusion Transformers (DiTs) like FLUX. We introduce InfiniteYou (InfU), one of the earliest robust frameworks leveraging DiTs for this task. InfU addresses significant issues of existing methods, such as insufficient identity similarity, poor text-image alignment, and low generation quality and aesthetics. Central to InfU is InfuseNet, a component that injects identity features into the DiT base model via residual connections, enhancing identity similarity while maintaining generation capabilities. A multi-stage training strategy, including pretraining and supervised fine-tuning (SFT) with synthetic single-person-multiple-sample (SPMS) data, further improves text-image alignment, ameliorates image quality, and alleviates face copy-pasting. Extensive experiments demonstrate that InfU achieves state-of-the-art performance, surpassing existing baselines. In addition, the plug-and-play design of InfU ensures compatibility with various existing methods, offering a valuable contribution to the broader community.
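The abstract describes InfuseNet as injecting identity features into the base model "via residual connections". The toy sketch below illustrates only that general idea: after each base block runs, an identity-derived residual is added to the hidden state. It is not the authors' implementation. The function names, the `scale` parameter, the use of plain Python lists in place of tensors, and the one-residual-per-block pairing are all assumptions made for illustration.

```python
from typing import Callable, List, Sequence

Vector = List[float]
Block = Callable[[Vector], Vector]


def add_residual(hidden: Vector, residual: Vector, scale: float = 1.0) -> Vector:
    """Return hidden + scale * residual, element-wise (hypothetical helper)."""
    if len(hidden) != len(residual):
        raise ValueError("hidden and residual must have the same length")
    return [h + scale * r for h, r in zip(hidden, residual)]


def run_blocks_with_injection(
    hidden: Vector,
    base_blocks: Sequence[Block],
    identity_residuals: Sequence[Vector],
    scale: float = 1.0,
) -> Vector:
    """Apply each base block, then add the matching identity residual.

    Assumption: one residual per block. The real pairing between InfuseNet
    outputs and DiT blocks is not specified in the abstract.
    """
    if len(base_blocks) != len(identity_residuals):
        raise ValueError("need exactly one residual per base block")
    for block, residual in zip(base_blocks, identity_residuals):
        hidden = block(hidden)  # base model computation, left unchanged
        hidden = add_residual(hidden, residual, scale)  # identity injection
    return hidden


# Toy stand-ins for base model blocks and identity residuals.
toy_blocks: List[Block] = [
    lambda v: [2.0 * x for x in v],
    lambda v: [x + 1.0 for x in v],
]
toy_residuals: List[Vector] = [[0.5, -0.5], [0.0, 1.0]]

with_identity = run_blocks_with_injection([1.0, 2.0], toy_blocks, toy_residuals)
without_identity = run_blocks_with_injection(
    [1.0, 2.0], toy_blocks, toy_residuals, scale=0.0
)
```

With `scale=0.0` the residuals contribute nothing and the base blocks behave exactly as they would on their own. That property is one reason residual injection is a natural fit for a plug-and-play design: the base model's computation is left untouched, and the identity signal is purely additive.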