UniGame: Turning a Unified Multimodal Model Into Its Own Adversary

Zhaolong Su, Wang Lu, Hao Chen, Sharon Li, Jindong Wang

2025-11-27

Summary

This paper focuses on making unified multimodal models (AI systems that can both understand and generate content across data types such as text and images) more internally consistent. These models are powerful, but a conflict between their understanding and generation components makes them less reliable.

What's the problem?

Unified multimodal models face a built-in trade-off: the part of the model that *understands* information works best with compact, simplified representations, while the part that *generates* new content needs detailed, reconstruction-rich representations. This mismatch leads to inconsistent decisions, weaker coherence across modalities, and models that are easily confused by slightly altered or unusual inputs.

What's the solution?

The researchers developed a technique called UniGame. It attaches a lightweight "perturber" at the token interface shared by the two halves of the model, letting the generation branch actively try to trick the understanding branch, essentially turning the model into its own opponent. By making small, targeted changes to the tokens the model processes, the perturber exposes fragile behavior and forces the understanding component to become more robust and consistent. The addition is small (under 1% extra parameters), works across different architectures, and complements existing post-training methods. A rough sketch of the adversarial loop appears below.
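The paper's exact objective isn't reproduced here, but the min-max idea can be sketched in PyTorch. Everything below is an illustrative assumption: the `TokenPerturber` module, the `model.encode` and `model.understanding_loss` calls, and the update bookkeeping are placeholders standing in for whatever the authors actually use, not their implementation.

```python
import torch
import torch.nn as nn

class TokenPerturber(nn.Module):
    """Hypothetical lightweight perturber acting at the shared token
    interface. A small bottleneck MLP keeps the parameter count tiny
    (the paper reports under 1% extra parameters)."""

    def __init__(self, dim: int, hidden: int = 64, eps: float = 0.1):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, hidden), nn.GELU(), nn.Linear(hidden, dim)
        )
        self.eps = eps  # cap on perturbation magnitude

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        # Bounded perturbation: tanh keeps each delta within +/- eps.
        return tokens + self.eps * torch.tanh(self.net(tokens))

def self_adversarial_step(model, perturber, opt_model, opt_pert, batch):
    """One round of self-play (assumed APIs: model.encode maps a batch
    to shared tokens; model.understanding_loss scores understanding)."""
    # Adversary turn: the perturber *maximizes* the understanding loss,
    # actively seeking token perturbations where understanding is fragile.
    tokens = model.encode(batch).detach()      # freeze the base model
    adv_loss = model.understanding_loss(perturber(tokens), batch)
    opt_pert.zero_grad()
    (-adv_loss).backward()                     # gradient ascent on the loss
    opt_pert.step()

    # Model turn: *minimize* the loss under the frozen perturbation,
    # making understanding robust and consistent.
    tokens = model.encode(batch)
    with torch.no_grad():                      # freeze the perturber
        delta = perturber(tokens) - tokens
    robust_loss = model.understanding_loss(tokens + delta, batch)
    opt_model.zero_grad()
    robust_loss.backward()
    opt_model.step()
    return robust_loss.item()
```

The design point the sketch tries to capture is that the two players pull on the same loss in opposite directions: the perturber ascends it while the base model descends it, so training settles where understanding stays stable even under worst-case small changes at the shared token interface.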

Why it matters?

This work is important because it shows a way to make these powerful multimodal models more reliable and consistent. By improving their internal coherence and ability to handle unexpected data, UniGame helps pave the way for more stable and capable AI systems that can seamlessly process and generate information across different formats, like creating images from text or answering questions about videos.

Abstract

Unified Multimodal Models (UMMs) have shown impressive performance in both understanding and generation with a single architecture. However, UMMs still exhibit a fundamental inconsistency: understanding favors compact embeddings, whereas generation favors reconstruction-rich representations. This structural trade-off produces misaligned decision boundaries, degraded cross-modal coherence, and heightened vulnerability under distributional and adversarial shifts. In this paper, we present UniGame, a self-adversarial post-training framework that directly targets the inconsistencies. By applying a lightweight perturber at the shared token interface, UniGame enables the generation branch to actively seek and challenge fragile understanding, turning the model itself into its own adversary. Experiments demonstrate that UniGame significantly improves the consistency (+4.6%). Moreover, it also achieves substantial improvements in understanding (+3.6%), generation (+0.02), out-of-distribution and adversarial robustness (+4.8% and +6.2% on NaturalBench and AdVQA). The framework is architecture-agnostic, introduces less than 1% additional parameters, and is complementary to existing post-training methods. These results position adversarial self-play as a general and effective principle for enhancing the coherence, stability, and unified competence of future multimodal foundation models. The official code is available at: https://github.com/AIFrontierLab/UniGame