Latent Space Super-Resolution for Higher-Resolution Image Generation with Diffusion Models

Jinho Jeong, Sangmin Han, Jinwoo Kim, Seon Joo Kim

2025-03-26

Summary

This paper proposes LSRNA, a framework that helps diffusion-based image generators produce sharper, more detailed images at resolutions above 1K, beyond the resolutions they were trained on.

What's the problem?

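Diffusion models struggle to generate images at resolutions beyond those they were trained on, and the results often show structural distortions or repeated content. Existing fixes upsample a low-resolution reference image to guide generation, but upsampling in latent space tends to degrade output quality, while upsampling in RGB space tends to produce overly smooth results.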

What's the solution?

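The researchers developed a framework called LSRNA, which has two parts. Latent space Super-Resolution (LSR) upsamples the low-resolution reference directly in latent space while keeping it aligned with the latent manifold, and Region-wise Noise Addition (RNA) adds noise region by region to enhance high-frequency details. Together they aim to preserve both the overall structure and the sharpness of the image.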

Why does it matter?

This work matters because it can lead to AI-generated images that are more realistic and visually appealing, which is important for applications like creating art, designing products, and generating realistic scenes for video games or movies.

Abstract

In this paper, we propose LSRNA, a novel framework for higher-resolution (exceeding 1K) image generation using diffusion models by leveraging super-resolution directly in the latent space. Existing diffusion models struggle with scaling beyond their training resolutions, often leading to structural distortions or content repetition. Reference-based methods address the issues by upsampling a low-resolution reference to guide higher-resolution generation. However, they face significant challenges: upsampling in latent space often causes manifold deviation, which degrades output quality. On the other hand, upsampling in RGB space tends to produce overly smoothed outputs. To overcome these limitations, LSRNA combines Latent space Super-Resolution (LSR) for manifold alignment and Region-wise Noise Addition (RNA) to enhance high-frequency details. Our extensive experiments demonstrate that integrating LSRNA outperforms state-of-the-art reference-based methods across various resolutions and metrics, while showing the critical role of latent space upsampling in preserving detail and sharpness. The code is available at https://github.com/3587jjh/LSRNA.
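As a rough illustration of the region-wise noise idea described in the abstract, the sketch below adds Gaussian noise to a toy latent with a strength that varies per position. It is not the authors' implementation: the detail measure (a finite-difference gradient magnitude), the function names, and the s_min and s_max parameters are all assumptions made for this example, and the detail map is assumed to already match the latent's resolution.

```python
import random


def detail_map(img):
    """Hypothetical detail score in [0, 1] per pixel (gradient magnitude)."""
    h, w = len(img), len(img[0])
    grad = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Central differences, clamped at the image border.
            dx = img[y][min(x + 1, w - 1)] - img[y][max(x - 1, 0)]
            dy = img[min(y + 1, h - 1)][x] - img[max(y - 1, 0)][x]
            grad[y][x] = (dx * dx + dy * dy) ** 0.5
    peak = max(max(row) for row in grad)
    if peak == 0:
        return grad  # completely flat image: no detail anywhere
    return [[g / peak for g in row] for row in grad]


def region_wise_noise(latent, weights, s_min=0.0, s_max=0.5, seed=0):
    """Add Gaussian noise whose strength grows with the local weight.

    s_min and s_max are assumed noise strengths for the flattest and the
    most detailed regions; they are illustrative values only.
    """
    rng = random.Random(seed)
    out = []
    for lat_row, w_row in zip(latent, weights):
        out.append([
            v + (s_min + (s_max - s_min) * w) * rng.gauss(0.0, 1.0)
            for v, w in zip(lat_row, w_row)
        ])
    return out


# Toy reference with one vertical edge, and an all-zero toy latent.
reference = [[0.0, 0.0, 1.0, 1.0] for _ in range(4)]
latent = [[0.0] * 4 for _ in range(4)]

weights = detail_map(reference)
noised = region_wise_noise(latent, weights)
```

With s_min set to 0, positions in flat regions are left unchanged and only positions near the edge receive noise, which is the intuition behind adding noise where high-frequency detail is expected.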
