ProEdit introduces KV-mix, which mixes the key-value (KV) features of the source and the target in the edited region, mitigating the influence of the source image on that region while maintaining background consistency. It also proposes Latents-Shift, which perturbs the edited region of the source latent, eliminating the influence of the inverted latent on sampling. Together, these two components enable accurate attribute editing and background preservation at the same time.
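As a rough illustration of the KV-mix idea described above, the following sketch blends source and target attention features inside the edited region and keeps the source features elsewhere. It is a minimal NumPy sketch under stated assumptions, not the authors' implementation: the function name `kv_mix`, the linear blend, and the mixing weight `delta` are all hypothetical, and the mask is assumed to be given per token.

```python
import numpy as np

def kv_mix(k_src, v_src, k_tgt, v_tgt, mask, delta=0.75):
    """Blend source/target K and V inside the edited region (illustrative only).

    k_*, v_*: arrays of shape (tokens, dim).
    mask:     boolean array of shape (tokens,), True where the edit applies.
    delta:    hypothetical weight given to the target features in that region.
    """
    m = mask.astype(k_src.dtype)[:, None]  # (tokens, 1), broadcasts over dim

    # Assumed linear blend for the edited region.
    k_edit = delta * k_tgt + (1.0 - delta) * k_src
    v_edit = delta * v_tgt + (1.0 - delta) * v_src

    # Outside the mask the source features are kept unchanged, which is
    # what preserves the background.
    k = m * k_edit + (1.0 - m) * k_src
    v = m * v_edit + (1.0 - m) * v_src
    return k, v
```

With `delta = 1.0` the edited region would use only target features, and with `delta = 0.0` it would fall back to the source; the value that works best in practice is not stated in this summary.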
The ProEdit pipeline includes a mask extraction module that identifies the edited region from the source and target prompts during the first inversion step. After the inverted noise is obtained, Latents-Shift is applied to perturb the initial distribution in the edited region, reducing the source image information it carries. ProEdit achieves state-of-the-art performance on several image and video editing benchmarks and can be seamlessly integrated into existing inversion and editing methods.
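The Latents-Shift step described above can be pictured as a masked perturbation of the inverted noise. The sketch below is one possible reading, not the paper's formula: it assumes the perturbation is a variance-preserving blend with fresh Gaussian noise, and the names `latents_shift`, `beta`, and `seed` are hypothetical. The mask is assumed to be already extracted and given at latent resolution.

```python
import numpy as np

def latents_shift(z_inv, mask, beta=0.5, seed=0):
    """Perturb the inverted latent inside the edited region (illustrative only).

    z_inv: inverted noise, shape (channels, height, width).
    mask:  boolean array, shape (height, width), True where the edit applies.
    beta:  hypothetical perturbation strength in [0, 1].
    """
    rng = np.random.default_rng(seed)
    eps = rng.standard_normal(z_inv.shape)

    # Assumed blend: keeps unit variance when z_inv and eps are independent
    # with unit variance.
    perturbed = np.sqrt(1.0 - beta) * z_inv + np.sqrt(beta) * eps

    # Only the edited region is changed; the background latent is untouched.
    return np.where(mask, perturbed, z_inv)
```

Leaving the latent outside the mask untouched is what keeps the background tied to the source image, while a larger `beta` removes more source information from the edited region.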


