DiCo: Revitalizing ConvNets for Scalable and Efficient Diffusion Modeling
Yuang Ai, Qihang Fan, Xuefeng Hu, Zhenheng Yang, Ran He, Huaibo Huang
2025-05-22
Summary
This paper introduces DiCo, a family of diffusion models built from classic convolutional neural networks (ConvNets) instead of transformers, enabling high-quality image generation that is both faster and more efficient.
What's the problem?
Most state-of-the-art diffusion models for image generation rely on transformers, which are powerful but slow and resource-hungry. A major cost is global self-attention, which compares every image location with every other one, yet this expensive operation often turns out to be unnecessary for producing detailed images.
What's the solution?
The researchers designed DiCo by augmenting standard ConvNets with a compact channel attention mechanism, which keeps the model efficient while preserving its ability to produce diverse, realistic images. They showed that DiCo can outperform transformer-based models in both image quality and generation speed.
Why it matters?
Because DiCo generates top-quality images faster and with less computing power, it makes advanced AI image generation more practical and accessible.
Abstract
Diffusion ConvNet (DiCo) builds diffusion models from standard ConvNet modules combined with a compact channel attention mechanism, achieving high image quality and generation speed in visual generation tasks with substantial efficiency gains over Diffusion Transformer (DiT).
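The summary does not spell out DiCo's exact channel attention design, but the general idea behind compact channel attention (as in squeeze-and-excitation blocks) can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: all weight shapes, the bottleneck ratio `r`, and the helper names are assumptions for demonstration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(x, w1, w2):
    """Illustrative SE-style channel attention on a feature map x of shape (C, H, W).

    Unlike global self-attention, which costs O((H*W)^2), this only pools
    spatially and mixes channels, so its cost is independent of image size
    beyond the pooling step.
    """
    # Squeeze: global average pool over spatial dimensions -> per-channel summary (C,)
    s = x.mean(axis=(1, 2))
    # Excite: bottleneck MLP produces a gate in (0, 1) for each channel
    g = sigmoid(w2 @ np.maximum(w1 @ s, 0.0))
    # Reweight: scale each channel of the feature map by its gate
    return x * g[:, None, None]

# Hypothetical shapes: 8 channels, 4x4 spatial map, bottleneck ratio r = 2
rng = np.random.default_rng(0)
C, H, W, r = 8, 4, 4, 2
x = rng.standard_normal((C, H, W))
w1 = rng.standard_normal((C // r, C))   # squeeze projection (assumed shape)
w2 = rng.standard_normal((C, C // r))   # excite projection (assumed shape)
y = channel_attention(x, w1, w2)
```

Each output channel is the corresponding input channel scaled by a single learned gate, which is what makes this form of attention so cheap relative to token-to-token self-attention.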