Inpaint4Drag: Repurposing Inpainting Models for Drag-Based Image Editing via Bidirectional Warping
Jingyi Lu, Kai Han
2025-09-09
Summary
This paper introduces Inpaint4Drag, a new way to edit images by dragging parts of them to new positions, making the process more intuitive and much faster than current methods.
What's the problem?
Existing drag-based editing tools typically work by changing how a generative model internally 'understands' the image, which can be slow, imprecise, and tied to the specific model being used. They rarely give instant feedback, and a single change can take minutes to complete.
What's the solution?
The researchers developed a system called Inpaint4Drag that breaks dragging into two steps: first, it warps the pixels of the selected region directly, treating the image like a flexible material being pushed or stretched, and then it fills in any regions uncovered by the move using a technique called inpainting. This happens very quickly, providing a warp preview in roughly 0.01 seconds and completing the full edit in about 0.3 seconds at 512x512 resolution. Importantly, it can work with *any* inpainting model, so it automatically gets better as inpainting improves.
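To make the idea concrete, here is a minimal sketch of the warp-then-inpaint pipeline in Python. It is not the authors' code: it applies a single translation drag to one selected region and uses classical OpenCV inpainting for the instant preview, whereas the paper performs smoother bidirectional warping. The file names, drag vector, and the `drag_warp` helper are illustrative assumptions.

```python
# Sketch of the warp-then-inpaint idea (simplified; not the authors' implementation).
import numpy as np
import cv2

def drag_warp(image, region_mask, drag_vec):
    """Translate the pixels inside `region_mask` by `drag_vec` = (dx, dy).

    Returns the warped image and a hole mask (uint8, 255 = hole) marking
    pixels uncovered by the move that the inpainting stage must fill.
    """
    h, w = region_mask.shape
    dx, dy = drag_vec
    warped = image.copy()
    covered = np.zeros((h, w), dtype=bool)

    ys, xs = np.nonzero(region_mask)          # source pixels to move
    tx, ty = xs + dx, ys + dy                 # where they land
    ok = (tx >= 0) & (tx < w) & (ty >= 0) & (ty < h)
    warped[ty[ok], tx[ok]] = image[ys[ok], xs[ok]]
    covered[ty[ok], tx[ok]] = True

    # Moved-region pixels that received no new content become holes.
    holes = region_mask.astype(bool) & ~covered
    return warped, holes.astype(np.uint8) * 255

# Instant preview with classical inpainting; any learned inpainting model
# can be swapped in here for higher quality.
image = cv2.imread("input.png")                                # hypothetical input
mask = cv2.imread("region.png", cv2.IMREAD_GRAYSCALE) > 0      # user-selected region
warped, hole_mask = drag_warp(image, mask, drag_vec=(40, 0))   # drag 40 px to the right
preview = cv2.inpaint(warped, hole_mask, 3, cv2.INPAINT_TELEA)
cv2.imwrite("preview.png", preview)
```

The key point is that the drag is resolved entirely in pixel space: the warp produces an ordinary image plus a hole mask, which is exactly the input format standard inpainting models expect.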
Why it matters?
This work is important because it makes drag-based image editing much more practical and user-friendly. It's fast enough for real-time interaction, provides precise control over edits, and isn't tied to a specific generative model, meaning it can benefit from future advances in inpainting technology.
Abstract
Drag-based image editing has emerged as a powerful paradigm for intuitive image manipulation. However, existing approaches predominantly rely on manipulating the latent space of generative models, leading to limited precision, delayed feedback, and model-specific constraints. Accordingly, we present Inpaint4Drag, a novel framework that decomposes drag-based editing into pixel-space bidirectional warping and image inpainting. Inspired by elastic object deformation in the physical world, we treat image regions as deformable materials that maintain natural shape under user manipulation. Our method achieves real-time warping previews (0.01s) and efficient inpainting (0.3s) at 512x512 resolution, significantly improving the interaction experience compared to existing methods that require minutes per edit. By transforming drag inputs directly into standard inpainting formats, our approach serves as a universal adapter for any inpainting model without architecture modification, automatically inheriting all future improvements in inpainting technology. Extensive experiments demonstrate that our method achieves superior visual quality and precise control while maintaining real-time performance. Project page: https://visual-ai.github.io/inpaint4drag/
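As a hedged illustration of the "universal adapter" claim, the sketch below feeds a warped image and its hole mask into an off-the-shelf diffusion inpainting pipeline. The model checkpoint, file names, and empty prompt are assumptions for demonstration; the paper's experiments may use different inpainting backbones.

```python
# Sketch: hand the (warped image, hole mask) pair to any standard inpainting model.
import torch
from PIL import Image
from diffusers import StableDiffusionInpaintPipeline

pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-inpainting",  # any inpainting checkpoint works
    torch_dtype=torch.float16,
).to("cuda")

warped = Image.open("warped.png").convert("RGB").resize((512, 512))
holes = Image.open("hole_mask.png").convert("L").resize((512, 512))

# White pixels in the mask are regenerated; everything else keeps the warped content.
result = pipe(prompt="", image=warped, mask_image=holes).images[0]
result.save("edited.png")
```

Because the adapter only produces a standard image-and-mask pair, swapping in a newer or better inpainting model requires no changes to the drag-editing front end.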