
TraDiffusion: Trajectory-Based Training-Free Image Generation

Mingrui Wu, Oucheng Huang, Jiayi Ji, Jiale Li, Xinyue Cai, Huafeng Kuang, Jianzhuang Liu, Xiaoshuai Sun, Rongrong Ji

2024-08-20

Summary

This paper introduces TraDiffusion, a method that steers text-to-image generation with user-drawn mouse trajectories and requires no additional training of the underlying model.

What's the problem?

Getting a generated image to match a user's intended layout is difficult. Existing layout-controlled methods typically rely on bounding boxes or segmentation masks and often require extra training, while drawing precise masks is slow and unintuitive. Users want to indicate where content should go quickly and naturally, but current systems are cumbersome and unresponsive to that kind of input.

What's the solution?

TraDiffusion lets users steer generation simply by drawing a trajectory with the mouse. A distance-aware energy function guides the model's latent variables during denoising so that generation concentrates within the areas the trajectory defines: a control term pulls the generation toward the trajectory, while a movement term suppresses activity far from it. Because this guidance runs at inference time on a pretrained model, users can reposition salient objects and adjust attributes and relationships without any retraining or detailed mask drawing.
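
The paper's exact energy terms are not spelled out in this summary, so below is a minimal sketch of how such a distance-aware energy could look, assuming the guidance acts on a token's cross-attention map. `distance_map`, `trajectory_energy`, `lam`, and `tau` are illustrative names, and SciPy's `distance_transform_edt` stands in for whatever distance computation the paper actually uses.

```python
import torch
from scipy.ndimage import distance_transform_edt

def distance_map(traj_mask):
    """Normalized distance from every pixel to the nearest trajectory pixel.

    traj_mask: (H, W) numpy array with 1 on pixels the trajectory covers.
    """
    d = distance_transform_edt(1 - traj_mask)  # 0 on the trajectory, growing away from it
    return torch.from_numpy(d / (d.max() + 1e-8)).float()

def trajectory_energy(attn_map, dist, lam=1.0, tau=0.5):
    """Distance-aware energy for one text token's cross-attention map.

    attn_map: (H, W) tensor of non-negative attention values for the token.
    dist:     (H, W) normalized distance map from distance_map().
    The first term plays the role of a control function (pull attention
    mass toward the trajectory); the second plays a movement function
    (damp activations far from it).
    """
    p = attn_map / (attn_map.sum() + 1e-8)               # attention as a distribution
    control = (p * dist).sum()                           # expected distance to the trajectory
    movement = (attn_map * (dist > tau).float()).mean()  # penalize far-away activations
    return control + lam * movement
```

During sampling, the gradient of this energy with respect to the noisy latent would be subtracted from the latent at each denoising step, steering attention onto the trajectory without touching the model weights.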

Why it matters?

This research is important because it simplifies the process of creating images, making it more accessible to everyone. By allowing intuitive control over image generation, TraDiffusion can be useful in fields like graphic design, gaming, and art, where quick and effective visual creation is essential.

Abstract

In this work, we propose a training-free, trajectory-based controllable T2I approach, termed TraDiffusion. This novel method allows users to effortlessly guide image generation via mouse trajectories. To achieve precise control, we design a distance awareness energy function to effectively guide latent variables, ensuring that the focus of generation is within the areas defined by the trajectory. The energy function encompasses a control function to draw the generation closer to the specified trajectory and a movement function to diminish activity in areas distant from the trajectory. Through extensive experiments and qualitative assessments on the COCO dataset, the results reveal that TraDiffusion facilitates simpler, more natural image control. Moreover, it showcases the ability to manipulate salient regions, attributes, and relationships within the generated images, alongside visual input based on arbitrary or enhanced trajectories.
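
The abstract pins down the structure of the energy function but not its exact formulas. One hedged way to write the decomposition it describes, in notation chosen for this sketch (A(u) is a token's cross-attention value at spatial position u, d(u) the distance from u to the nearest trajectory point, and lambda, tau, eta illustrative weights; the paper's own definitions may differ):

```latex
% Illustrative decomposition of the distance-aware energy
% (sketch notation, not necessarily the paper's):
E(A) \;=\;
  \underbrace{\sum_{u} \frac{A(u)}{\sum_{v} A(v)}\, d(u)}_{\text{control: pull generation toward the trajectory}}
  \;+\;
  \lambda\, \underbrace{\sum_{u} A(u)\, \mathbf{1}\!\left[ d(u) > \tau \right]}_{\text{movement: damp activity far from it}}

% Gradient guidance applied to the noisy latent z_t at each denoising step:
z_t \;\leftarrow\; z_t \;-\; \eta\, \nabla_{z_t} E\!\left( A(z_t) \right)
```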