InstanceGen: Image Generation with Instance-level Instructions
Etai Sella, Yanir Kleiman, Hadar Averbuch-Elor
2025-05-19
Summary
This paper introduces InstanceGen, a technique for generating images that follow detailed, instance-level instructions by combining image-based structural guidance with instructions from large language models (LLMs).
What's the problem?
Most image generation tools struggle to produce pictures that match complex or highly detailed text prompts, especially prompts that call for a particular look, specific objects, or a specific layout.
What's the solution?
The researchers designed InstanceGen to combine structural guidance derived from images with instructions produced by an LLM, so that the generated image more closely matches what the user asks for, even when the request is very complex.
Why does it matter?
It lets artists, designers, and anyone else using AI create images that are closer to what they have in mind, making AI-assisted creative work more accurate and flexible.
Abstract
The proposed technique combines image-based structural guidance with LLM-based instructions to align generated images with complex, detailed text prompts.