SwiftEdit: Lightning Fast Text-Guided Image Editing via One-Step Diffusion

Trong-Tung Nguyen, Quang Nguyen, Khoi Nguyen, Anh Tran, Cuong Pham

2024-12-09

Summary

This paper introduces SwiftEdit, a tool that edits images from text instructions and completes an edit in just 0.23 seconds.

What's the problem?

While there have been advancements in tools that let users edit images by typing simple text commands, these tools often take too long to process the edits because they rely on complex multi-step methods. This slow speed makes them impractical for real-time use, especially on devices like smartphones.

What's the solution?

The authors introduce SwiftEdit, which edits images in a single step. Instead of running many inversion and sampling steps, it reconstructs and edits the image in one pass. It also includes a mask-guided editing technique that confines changes to specific areas of the image while keeping the background intact. As a result, SwiftEdit is at least 50 times faster than previous multi-step methods while still producing competitive editing results.
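The general idea of confining an edit to a masked region can be illustrated with a simple per-pixel blend. This is only a minimal sketch of mask-based compositing, not SwiftEdit's actual method, which the abstract describes as relying on an attention rescaling mechanism inside the model. The function name `blend_with_mask` and the toy 2x2 grayscale grids are assumptions made for illustration.

```python
def blend_with_mask(source, edited, mask):
    """Per-pixel blend of two equally sized grayscale images.

    Where mask is 1.0 the edited pixel is used; where mask is 0.0 the
    source pixel is kept, so the background stays untouched.
    Hypothetical helper for illustration only.
    """
    return [
        [m * e + (1.0 - m) * s for s, e, m in zip(src_row, edit_row, mask_row)]
        for src_row, edit_row, mask_row in zip(source, edited, mask)
    ]


# Toy data: a uniform source image, a uniform "edited" image,
# and a mask that selects only the top-left pixel for editing.
source = [[0.2, 0.2], [0.2, 0.2]]
edited = [[0.9, 0.9], [0.9, 0.9]]
mask = [[1.0, 0.0], [0.0, 0.0]]

result = blend_with_mask(source, edited, mask)
print(result)
```

Only the pixel selected by the mask takes the edited value; every other pixel keeps its source value, which is the behavior the summary describes as keeping the background intact.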

Why does it matter?

This research is important because it makes image editing more accessible and efficient. By allowing for instant edits based on text descriptions, SwiftEdit can be used in various applications, such as social media, content creation, and mobile apps, making it easier for anyone to enhance their images quickly.

Abstract

Recent advances in text-guided image editing enable users to perform image edits through simple text inputs, leveraging the extensive priors of multi-step diffusion-based text-to-image models. However, these methods often fall short of the speed demands required for real-world and on-device applications due to the costly multi-step inversion and sampling process involved. In response to this, we introduce SwiftEdit, a simple yet highly efficient editing tool that achieves instant text-guided image editing (in 0.23s). The advancement of SwiftEdit lies in its two novel contributions: a one-step inversion framework that enables one-step image reconstruction via inversion and a mask-guided editing technique with our proposed attention rescaling mechanism to perform localized image editing. Extensive experiments are provided to demonstrate the effectiveness and efficiency of SwiftEdit. In particular, SwiftEdit enables instant text-guided image editing, which is substantially faster than previous multi-step methods (at least 50 times faster) while maintaining competitive performance in editing results. Our project page is at: https://swift-edit.github.io/
