TTS-VAR: A Test-Time Scaling Framework for Visual Auto-Regressive Generation

Zhekai Chen, Ruihang Chu, Yukang Chen, Shiwei Zhang, Yujie Wei, Yingya Zhang, Xihui Liu

2025-07-25

Summary

This paper introduces TTS-VAR, a test-time scaling method for visual auto-regressive models. It improves generated visual content by adjusting how many samples are processed at once and by applying clustering and resampling during generation.

What's the problem?

Generating high-quality visuals with auto-regressive models takes a lot of computation, and the results are sometimes less diverse or lower in quality because it is hard to balance exploration and efficiency.

What's the solution?

The researchers created TTS-VAR to dynamically reduce batch sizes during generation and to select better candidate outputs by grouping similar features and resampling promising ones based on their quality scores. This approach improves the diversity and quality of generated visuals while using resources efficiently.
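The three ideas above (a shrinking batch, feature clustering, and score-based resampling) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. The function names, the linear batch schedule, the plain k-means loop, the "keep the member nearest each cluster center" rule, and the softmax weighting with a temperature are all assumptions made for the example. The summary does not specify how features or quality scores are obtained, so they are passed in as plain arrays.

```python
import numpy as np


def descending_batch_sizes(start, end, steps):
    """Hypothetical schedule: shrink the batch linearly from start to end."""
    return [int(round(x)) for x in np.linspace(start, end, steps)]


def cluster_select(features, k, iters=10, seed=0):
    """Group candidates by feature similarity and keep one per group.

    Assumption: plain k-means, keeping the member closest to each center.
    Returns sorted indices of the kept candidates.
    """
    features = np.asarray(features, dtype=float)
    rng = np.random.default_rng(seed)
    n = len(features)
    k = min(k, n)
    # Initialise centers from k distinct candidates (fancy indexing copies).
    centers = features[rng.choice(n, size=k, replace=False)]
    for _ in range(iters):
        dists = np.linalg.norm(features[:, None, :] - centers[None, :, :], axis=-1)
        labels = dists.argmin(axis=1)
        for c in range(k):
            members = features[labels == c]
            if len(members):
                centers[c] = members.mean(axis=0)
    # Final assignment against the updated centers.
    dists = np.linalg.norm(features[:, None, :] - centers[None, :, :], axis=-1)
    labels = dists.argmin(axis=1)
    keep = []
    for c in range(k):
        idx = np.flatnonzero(labels == c)
        if idx.size:  # skip clusters that ended up empty
            keep.append(int(idx[dists[idx, c].argmin()]))
    return sorted(keep)


def resample_by_score(scores, n_keep, temperature=1.0, seed=0):
    """Draw n_keep candidate indices with probability softmax(score / T).

    Assumption: higher score means a more promising candidate; sampling is
    with replacement, so strong candidates may be duplicated.
    """
    scores = np.asarray(scores, dtype=float)
    rng = np.random.default_rng(seed)
    logits = (scores - scores.max()) / temperature  # subtract max for stability
    probs = np.exp(logits)
    probs /= probs.sum()
    return rng.choice(len(scores), size=n_keep, replace=True, p=probs)


if __name__ == "__main__":
    print(descending_batch_sizes(8, 2, 4))
    feats = [[0.0, 0.0], [0.1, 0.0], [10.0, 10.0], [10.1, 10.0]]
    print(cluster_select(feats, k=2))
    print(resample_by_score([0.1, 0.5, 0.9, 0.2], n_keep=3))
```

In this sketch, clustering favours diversity (one representative per group of similar candidates), while resampling favours quality (promising candidates are more likely to be kept and may be kept more than once). Which stages of generation use which step is not stated in this summary.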

Why does it matter?

This matters because TTS-VAR improves the quality of AI-generated visuals at generation time while using compute efficiently, which is valuable for creative content creation and many practical applications.

Abstract

TTS-VAR is a test-time scaling framework for visual auto-regressive models that improves generation quality by dynamically adjusting batch size and using clustering and resampling techniques.
