Spherical Leech Quantization for Visual Tokenization and Generation

Yue Zhao, Hanwen Jiang, Zhenlin Xu, Chutong Yang, Ehsan Adeli, Philipp Krähenbühl

2025-12-17

Summary

This paper explores a new way to compress visual data, such as images, using a mathematical technique called lattice coding. The goal is to represent information more efficiently without losing much quality.

What's the problem?

When you simplify data to save space (quantization), the auto-encoders that learn these compressed representations can be tricky to train. Some methods, like BSQ (Binary Spherical Quantization), work reasonably well, but they need extra 'helper' (auxiliary) losses during training to do so. The paper investigates *why* these extra losses are needed and looks for a quantization scheme that doesn't rely on them.
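To make the quantization idea concrete, here is a minimal sketch of a BSQ-style, lookup-free quantizer: the latent vector is projected onto the unit hypersphere, then each coordinate is snapped to ±1/√d, the nearest vertex of a hypercube inscribed in the sphere. The function names and shapes are illustrative assumptions, not the paper's actual API.

```python
import numpy as np

def bsq_quantize(z, eps=1e-8):
    """Toy BSQ-style quantizer (illustrative, not the paper's code).

    Projects a latent vector onto the unit hypersphere and binarizes each
    coordinate to +/- 1/sqrt(d). The codebook is implicit ("lookup-free"):
    a d-dimensional latent yields a d-bit code with no stored codebook.
    """
    z = np.asarray(z, dtype=np.float64)
    u = z / (np.linalg.norm(z) + eps)        # project onto the unit sphere
    d = u.shape[-1]
    q = np.sign(u) / np.sqrt(d)              # nearest binary vertex on the sphere
    bits = (u > 0).astype(np.uint8)          # implicit d-bit code
    return q, bits

# a 4-dim latent produces a 4-bit code; the quantized vector has unit norm
q, bits = bsq_quantize(np.array([0.5, -1.2, 0.3, 2.0]))
```

Because the quantized points sit at hypercube vertices rather than being spread evenly over the sphere, training with this kind of quantizer typically needs the auxiliary losses the paper analyzes.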

What's the solution?

The researchers realized that the geometry of the lattice underlying the quantizer is key. They tested several candidates, including random lattices, generalized Fibonacci lattices, and densest sphere packing lattices, and found that a highly symmetric 24-dimensional one, the Leech lattice, worked best; they call the resulting method Spherical Leech Quantization (Λ_{24}-SQ). Because the lattice points are spread evenly over the hypersphere, this method simplifies the training process and provides a better balance between how much the data is compressed and how well it can be reconstructed later.

Why it matters?

This research matters because it offers a more efficient and effective way to compress images and other data. By improving reconstruction quality while using slightly fewer bits than the best prior method, BSQ, it could lead to faster image processing, reduced storage needs, and better performance in image generation tasks.

Abstract

Non-parametric quantization has received much attention due to its efficiency on parameters and scalability to a large codebook. In this paper, we present a unified formulation of different non-parametric quantization methods through the lens of lattice coding. The geometry of lattice codes explains the necessity of auxiliary loss terms when training auto-encoders with certain existing lookup-free quantization variants such as BSQ. As a step forward, we explore a few possible candidates, including random lattices, generalized Fibonacci lattices, and densest sphere packing lattices. Among all, we find the Leech lattice-based quantization method, which is dubbed as Spherical Leech Quantization (Λ_{24}-SQ), leads to both a simplified training recipe and an improved reconstruction-compression tradeoff thanks to its high symmetry and even distribution on the hypersphere. In image tokenization and compression tasks, this quantization approach achieves better reconstruction quality across all metrics than BSQ, the best prior art, while consuming slightly fewer bits. The improvement also extends to state-of-the-art auto-regressive image generation frameworks.