CGB-DM: Content and Graphic Balance Layout Generation with Transformer-based Diffusion Model
Yu Li, Yifan Chen, Gongye Liu, Jie Wu, Yujiu Yang
2024-07-23

Summary
This paper introduces CGB-DM, a transformer-based diffusion model for generating graphic layouts. Its central idea is to balance what the model learns about the content of the canvas against what it learns about the spatial structure of the layout, so that generated elements are both well placed and visually coherent.
What's the problem?
Generating layouts that look good and communicate information effectively is challenging. Existing methods often produce layouts in which elements block important content, overlap one another, or are spatially misaligned, which makes the final design confusing or unattractive. The authors attribute these problems to models that focus heavily on content information while placing too few constraints on the spatial structure of the layout.
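To make the overlap failure mode concrete, the sketch below scores how much the elements of a layout overlap one another. It is only an illustration under stated assumptions: boxes are `(left, top, right, bottom)` tuples, and the function names and the exact ratio (intersection area divided by the smaller box's area) are hypothetical choices, not the metric used in the paper.

```python
from itertools import combinations


def box_area(box):
    """Area of a (left, top, right, bottom) box; zero if the box is degenerate."""
    left, top, right, bottom = box
    return max(0.0, right - left) * max(0.0, bottom - top)


def intersection_area(a, b):
    """Area shared by two boxes; zero when they do not intersect."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    # Clamp each side separately so two negative extents never multiply
    # into a spurious positive area.
    return max(0.0, width) * max(0.0, height)


def mean_pairwise_overlap(boxes):
    """Average over all box pairs of intersection area / smaller box area.

    Illustrative score only (an assumption, not the paper's definition):
    0.0 means no two elements overlap, 1.0 means every pair is nested.
    """
    ratios = []
    for a, b in combinations(boxes, 2):
        smaller = min(box_area(a), box_area(b))
        if smaller > 0:
            ratios.append(intersection_area(a, b) / smaller)
    return sum(ratios) / len(ratios) if ratios else 0.0


# Hypothetical layout: the first two elements overlap, the third is isolated.
layout = [(0, 0, 4, 4), (2, 2, 6, 6), (10, 10, 12, 12)]
overlap_score = mean_pairwise_overlap(layout)
print(overlap_score)
```

A lower score indicates a cleaner layout under this measure. Alignment and blocking of salient image regions would need separate checks.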
What's the solution?
CGB-DM addresses these issues with three components. First, a regulator balances the predicted content weight against the predicted graphic weight, counteracting the model's tendency to pay more attention to the content on the canvas. Second, a graphic constraint based on a saliency bounding box improves the alignment of geometric features between the layout representation and the image. Third, a transformer-based diffusion model serves as the backbone, providing the generative capacity needed for high-quality layouts.
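As a rough illustration of what a saliency bounding box is, the sketch below finds the tightest box around the salient pixels of a small saliency map. The threshold value, the function name, and the nested-list map format are assumptions made for this example. The paper's actual saliency detector and the way it feeds the box into the model are not described here.

```python
def saliency_bounding_box(saliency, threshold=0.5):
    """Tightest (left, top, right, bottom) box around pixels >= threshold.

    `saliency` is a list of rows of floats in [0, 1]. Right and bottom are
    exclusive. Returns None when no pixel reaches the threshold.
    The 0.5 default is an arbitrary choice for this sketch.
    """
    salient_rows = []
    salient_cols = []
    for y, row in enumerate(saliency):
        for x, value in enumerate(row):
            if value >= threshold:
                salient_rows.append(y)
                salient_cols.append(x)
    if not salient_rows:
        return None
    return (
        min(salient_cols),
        min(salient_rows),
        max(salient_cols) + 1,
        max(salient_rows) + 1,
    )


# Hypothetical 4x4 saliency map with a salient 2x2 region in the middle.
saliency_map = [
    [0.0, 0.1, 0.0, 0.0],
    [0.0, 0.9, 0.8, 0.0],
    [0.0, 0.7, 0.6, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
salient_box = saliency_bounding_box(saliency_map)
print(salient_box)  # (1, 1, 3, 3)
```

A box like this gives a layout model a compact geometric description of where the important image content sits, in the same coordinate form as the layout elements themselves.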
Why does it matter?
Automatic layout generation is a foundational task for intelligent design tools. By reducing blocking, overlap, and misalignment, CGB-DM can help designers produce usable layouts faster, which is valuable in website design, advertising, and other fields where visual communication is key. The authors report state-of-the-art results in both quantitative and qualitative evaluations and note that the framework can be extended to other graphic design fields.
Abstract
Layout generation is the foundational task of intelligent design, which requires the integration of visual aesthetics and harmonious expression of content delivery. However, existing methods still face challenges in generating precise and visually appealing layouts, including blocking, overlap, or spatial misalignment between layouts, which are closely related to the spatial structure of graphic layouts. We find that these methods overly focus on content information and lack constraints on layout spatial structure, resulting in an imbalance in learning content-aware and graphic-aware features. To tackle this issue, we propose Content and Graphic Balance Layout Generation with Transformer-based Diffusion Model (CGB-DM). Specifically, we first design a regulator that balances the predicted content and graphic weight, overcoming the tendency to pay more attention to the content on canvas. Secondly, we introduce a graphic constraint of saliency bounding box to further enhance the alignment of geometric features between layout representations and images. In addition, we adapt a transformer-based diffusion model as the backbone, whose powerful generation capability ensures the quality of layout generation. Extensive experimental results indicate that our method has achieved state-of-the-art performance in both quantitative and qualitative evaluations. Our model framework can also be expanded to other graphic design fields.