Multi-Agent Game Generation and Evaluation via Audio-Visual Recordings

Alexia Jolicoeur-Martineau

2025-08-04

Summary

This paper presents a system in which multiple AI agents work together to create and evaluate JavaScript games and animations, using audio and visual feedback. The system selects multimedia assets and improves the generated code through repeated testing and evaluation.

What's the problem?

AI can generate text, images, and audio, but creating interactive multimedia content such as games remains difficult. Current models often struggle to produce complex, high-quality games, which require many steps and human-like creativity, especially when they must use custom assets such as sounds or 3D models.

What's the solution?

The paper introduces AVR-Eval, a way to judge the quality of multimedia content by comparing audio-visual recordings, and AVR-Agent, a multi-agent system that generates and improves game code by selecting assets and using feedback from AVR-Eval to choose better versions.
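As a rough illustration of the generate, compare, and improve loop described above, here is a minimal Python sketch. Every name in it (`pick_best`, `generate_and_improve`, and the `generate`, `first_is_better`, `feedback`, and `refine` callables) is hypothetical and not taken from the paper; the candidate count, the number of rounds, and the rule of keeping a revision only when it wins the comparison are likewise assumptions made for the example.

```python
from typing import Callable, List

# Hypothetical callables standing in for the real components (assumptions):
#   generate(prompt)          -> candidate game code
#   first_is_better(a, b)     -> True if a's audio-visual recording is judged better than b's
#   feedback(code)            -> textual feedback derived from a recording of the running code
#   refine(code, feedback)    -> revised game code


def pick_best(candidates: List[str], first_is_better: Callable[[str, str], bool]) -> str:
    """Keep whichever candidate wins each successive pairwise comparison."""
    best = candidates[0]
    for challenger in candidates[1:]:
        if first_is_better(challenger, best):
            best = challenger
    return best


def generate_and_improve(
    prompt: str,
    generate: Callable[[str], str],
    first_is_better: Callable[[str, str], bool],
    feedback: Callable[[str], str],
    refine: Callable[[str, str], str],
    n_candidates: int = 3,
    n_rounds: int = 2,
) -> str:
    """Generate several candidates, select the best, then refine it iteratively."""
    candidates = [generate(prompt) for _ in range(n_candidates)]
    best = pick_best(candidates, first_is_better)
    for _ in range(n_rounds):
        revised = refine(best, feedback(best))
        # Assumption: a revision replaces the current best only if it is judged better.
        if first_is_better(revised, best):
            best = revised
    return best
```

In a real setup, `first_is_better` would wrap a judge that compares recordings of the two running games, which is the role the summary assigns to AVR-Eval; here it is left as a plain callable so the control flow can be tried with toy stand-ins.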

Why does it matter?

This matters because it helps AI systems make better and more complex games and animations automatically, pushing the limits of what AI can create in interactive and multimedia experiences and narrowing the gap between human and machine creativity.

Abstract

A multi-agent system using an omni-modal evaluation metric improves JavaScript game and animation generation but struggles with custom assets and audio-visual feedback.
