HiFlow: Training-free High-Resolution Image Generation with Flow-Aligned Guidance
Jiazi Bu, Pengyang Ling, Yujie Zhou, Pan Zhang, Tong Wu, Xiaoyi Dong, Yuhang Zang, Yuhang Cao, Dahua Lin, Jiaqi Wang
2025-04-09
Summary
This paper introduces HiFlow, a method that helps pre-trained AI image models create sharp, detailed high-resolution images from text without any extra training, by using a low-resolution result to guide the structure and details of the larger image.
What's the problem?
Current AI image tools struggle to make big, high-quality images from text because they often lose details or mess up the layout when scaling up.
What's the solution?
HiFlow uses a small version of the image as a guide to help the AI build a larger version step-by-step, keeping the main shapes correct and adding fine details without breaking anything.
Why it matters?
This lets artists and designers create high-quality posters, game graphics, or product images with AI that stay sharp even when zoomed in or printed large.
Abstract
Text-to-image (T2I) diffusion/flow models have drawn considerable attention recently due to their remarkable ability to deliver flexible visual creations. Still, high-resolution image synthesis presents formidable challenges due to the scarcity and complexity of high-resolution content. To this end, we present HiFlow, a training-free and model-agnostic framework to unlock the resolution potential of pre-trained flow models. Specifically, HiFlow establishes a virtual reference flow within the high-resolution space that effectively captures the characteristics of low-resolution flow information, offering guidance for high-resolution generation through three key aspects: initialization alignment for low-frequency consistency, direction alignment for structure preservation, and acceleration alignment for detail fidelity. By leveraging this flow-aligned guidance, HiFlow substantially elevates the quality of high-resolution image synthesis of T2I models and demonstrates versatility across their personalized variants. Extensive experiments validate that HiFlow achieves superior high-resolution image quality compared with current state-of-the-art methods.
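The abstract describes initialization alignment only as a way to keep the high-resolution result consistent with the low-resolution one at low frequencies. A minimal sketch of that general idea, not the authors' implementation, is a frequency-domain swap: upsample a low-resolution array and copy its low-frequency band into a high-resolution starting point. The function names `upsample_nearest` and `low_frequency_blend`, the `cutoff` parameter, the nearest-neighbour upsampling, and the hard rectangular mask are all assumptions made for illustration; the paper may operate in latent space and use a different filter.

```python
import numpy as np


def upsample_nearest(img: np.ndarray, factor: int) -> np.ndarray:
    """Nearest-neighbour upsampling of a 2D array (illustrative choice)."""
    return np.repeat(np.repeat(img, factor, axis=0), factor, axis=1)


def low_frequency_blend(high: np.ndarray, low_up: np.ndarray,
                        cutoff: float = 0.25) -> np.ndarray:
    """Return `high` with its low-frequency band replaced by that of `low_up`.

    `cutoff` is a hypothetical parameter: the fraction of the Nyquist
    frequency (0.5 cycles per pixel) below which `low_up` is used.
    """
    if high.shape != low_up.shape:
        raise ValueError("inputs must have the same shape")
    h, w = high.shape
    # Frequencies in cycles per pixel, in the range [-0.5, 0.5).
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    # Rectangular low-pass mask, symmetric in +/- frequency so the
    # blended spectrum stays Hermitian and the result stays real.
    mask = (np.abs(fy) <= 0.5 * cutoff) & (np.abs(fx) <= 0.5 * cutoff)
    spec_high = np.fft.fft2(high)
    spec_low = np.fft.fft2(low_up)
    blended = np.where(mask, spec_low, spec_high)
    return np.fft.ifft2(blended).real


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    low_res = rng.standard_normal((8, 8))      # stand-in for a low-res result
    noise = rng.standard_normal((16, 16))      # stand-in for a high-res start
    guided = low_frequency_blend(noise, upsample_nearest(low_res, 2))
    print(guided.shape)
```

Because the zero-frequency component always falls inside the mask, the blended array inherits the mean of the upsampled low-resolution input, while everything above the cutoff comes from the high-resolution starting point.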