Web World Models

Jichen Feng, Yifan Zhang, Chenggong Zhang, Yifu Lu, Shilong Liu, Mengdi Wang

2025-12-30

Summary

This paper introduces a new way to create realistic and interactive environments for AI agents, called the Web World Model (WWM). It combines the reliability of existing web technologies with the creativity of large language models to build worlds where AI can act, remember, and learn.

What's the problem?

Currently, building environments for AI is tricky. Traditional web frameworks are dependable but limit the AI to pre-defined actions and information. On the other hand, fully generative world models can create limitless environments, but they're hard to control and aren't very practical to build. There's a gap between having a stable, predictable world and a completely open-ended one.

What's the solution?

The researchers created WWMs in which the world's state, basic rules, and 'physics' are written in ordinary web code, so the world stays logically consistent. Large language models then generate the details – like stories, descriptions, and higher-level decisions – on top of this solid foundation. They built several example worlds, including a travel atlas grounded in real geography, fictional galaxy explorers, encyclopedic and narrative worlds, and simulation- and game-like environments. From these they drew three design principles: keep the coded rules separate from the model's creative content, represent the world's state through typed web interfaces, and generate content deterministically, so that worlds can be unlimited in size yet remain structured.

Why does it matter?

This work is important because it shows that existing web technologies can be used as a powerful base for building complex and interactive environments for AI. This could lead to more capable AI agents that can learn and operate in more realistic and open-ended scenarios, bridging the gap between controlled simulations and truly dynamic worlds.

Abstract

Language agents increasingly require persistent worlds in which they can act, remember, and learn. Existing approaches sit at two extremes: conventional web frameworks provide reliable but fixed contexts backed by databases, while fully generative world models aim for unlimited environments at the expense of controllability and practical engineering. In this work, we introduce the Web World Model (WWM), a middle ground where world state and "physics" are implemented in ordinary web code to ensure logical consistency, while large language models generate context, narratives, and high-level decisions on top of this structured latent state. We build a suite of WWMs on a realistic web stack, including an infinite travel atlas grounded in real geography, fictional galaxy explorers, web-scale encyclopedic and narrative worlds, and simulation- and game-like environments. Across these systems, we identify practical design principles for WWMs: separating code-defined rules from model-driven imagination, representing latent state as typed web interfaces, and utilizing deterministic generation to achieve unlimited but structured exploration. Our results suggest that web stacks themselves can serve as a scalable substrate for world models, enabling controllable yet open-ended environments. Project Page: https://github.com/Princeton-AI2-Lab/Web-World-Models.
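The three design principles named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the paper's systems run on a web stack, and Python is used here only for brevity. Every name below (`Planet`, `planet_at`, `move`, `describe`, the biome list, the fuel rule) is a hypothetical choice made for illustration, and the language model is replaced by a stub.

```python
import hashlib
import random
from dataclasses import dataclass

# Hypothetical example values; nothing here is taken from the paper.
BIOMES = ("desert", "ocean", "ice", "jungle")


@dataclass(frozen=True)
class Planet:
    """Typed latent state: the fields that the code layer owns."""
    x: int
    y: int
    biome: str
    fuel: int


def planet_at(world_seed: str, x: int, y: int) -> Planet:
    """Deterministic generation: the same seed and coordinates always
    yield the same planet, so nothing has to be stored in advance."""
    digest = hashlib.sha256(f"{world_seed}:{x}:{y}".encode()).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))
    return Planet(x=x, y=y, biome=rng.choice(BIOMES), fuel=rng.randint(0, 5))


def move(fuel: int, src: Planet, dst: Planet) -> int:
    """Code-defined 'physics' (an assumed rule): travel costs the
    Manhattan distance in fuel, and the destination refuels the ship."""
    cost = abs(src.x - dst.x) + abs(src.y - dst.y)
    if cost > fuel:
        raise ValueError("not enough fuel")
    return fuel - cost + dst.fuel


def describe(planet: Planet, llm=None) -> str:
    """Model-driven layer: narrative is produced from the typed state.
    `llm` is any callable taking a prompt string; a stub is used if absent."""
    prompt = f"Describe a {planet.biome} planet at ({planet.x}, {planet.y})."
    if llm is None:
        return f"[stub narrative] {prompt}"
    return llm(prompt)
```

In this sketch the model can only influence the text returned by `describe`; it cannot change a planet's coordinates or the fuel accounting, which is one way to read the separation of code-defined rules from model-driven imagination.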
