Static Sandboxes Are Inadequate: Modeling Societal Complexity Requires Open-Ended Co-Evolution in LLM-Based Multi-Agent Simulations

Jinkun Chen, Sher Badshah, Xuemin Yu, Sijia Han

2025-10-22

Summary

This paper discusses the exciting but currently limited field of using artificial intelligence, specifically large language models, to create simulations with many interacting 'agents' – think of them as characters in a virtual world. It argues that these simulations have the potential to model complex real-world scenarios, but current methods aren't sophisticated enough to truly capture that complexity.

What's the problem?

Right now, most AI simulations are pretty basic. They give the AI agents specific tasks in a controlled environment and then measure how well they do. This is like giving someone a test with pre-defined answers. Real life isn't like that; things change, unexpected events happen, and societies evolve. These static simulations can't show us how AI agents would actually behave in a truly dynamic and unpredictable world, and the way we currently evaluate them doesn't account for surprising or novel behaviors.

What's the solution?

The paper argues for moving away from these rigid, task-focused simulations. It reviews emerging ways to combine large language models with multi-agent systems, acknowledging the challenge of keeping such systems stable while still allowing diverse, evolving behaviors. It proposes a new taxonomy for categorizing these simulations and outlines a research roadmap focused on 'open-endedness' – meaning the simulations can run indefinitely without a pre-defined end goal – and on continuous co-adaptation between the agents and their environment.
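The paper itself is conceptual and does not include code, but the core idea – agents and an environment that continuously adapt to each other, with no terminal goal or fixed task – can be sketched in a toy numeric form. Everything below (the `Agent` class, the payoff, the drift dynamics) is an illustrative assumption, not the authors' method:

```python
import random

random.seed(0)  # for reproducibility of this toy run


class Agent:
    """A toy agent whose entire behavior is one numeric 'strategy'."""

    def __init__(self, strategy: float):
        self.strategy = strategy

    def act(self, env_state: float) -> float:
        # Payoff is highest when the strategy matches the environment.
        return 1.0 - abs(self.strategy - env_state)


def step(agents, env_state, mutation=0.05):
    """One round of co-evolution.

    Agents are selected for fitness against the current environment,
    and the environment itself shifts in response to the agent
    population - so neither side ever faces a fixed, static task.
    """
    scores = [a.act(env_state) for a in agents]
    # Selection: clone a mutated copy of the best agent over the worst.
    best = agents[scores.index(max(scores))]
    worst = scores.index(min(scores))
    agents[worst] = Agent(best.strategy + random.uniform(-mutation, mutation))
    # Co-evolution: the environment drifts toward the population's mean
    # strategy, plus exogenous noise so the system never fully settles.
    mean_strategy = sum(a.strategy for a in agents) / len(agents)
    env_state = 0.9 * env_state + 0.1 * mean_strategy + random.uniform(-0.02, 0.02)
    return agents, env_state


agents = [Agent(random.random()) for _ in range(10)]
env_state = 0.5
for _ in range(200):  # no predefined end condition; run as long as you like
    agents, env_state = step(agents, env_state)
```

In an LLM-based version of this loop, the numeric strategy would be replaced by a prompt or memory that each agent rewrites over time, and the environment update would incorporate the agents' collective outputs – but the structural point is the same: evaluation happens inside an ongoing feedback loop, not against a fixed benchmark.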

Why does it matter?

This research is important because it's about building AI that can not only *do* things but also *adapt* to changing circumstances and interact with the world in a more realistic way. Better simulations could help us understand complex social systems, predict the consequences of different policies, and ultimately create AI that is more robust, reliable, and aligned with human values. It's a call for the AI community to focus on building simulations that are less about achieving specific goals and more about creating evolving, self-regulating ecosystems.

Abstract

What if artificial agents could not just communicate, but also evolve, adapt, and reshape their worlds in ways we cannot fully predict? With LLMs now powering multi-agent systems and social simulations, we are witnessing new possibilities for modeling open-ended, ever-changing environments. Yet, most current simulations remain constrained within static sandboxes, characterized by predefined tasks, limited dynamics, and rigid evaluation criteria. These limitations prevent them from capturing the complexity of real-world societies. In this paper, we argue that static, task-specific benchmarks are fundamentally inadequate and must be rethought. We critically review emerging architectures that blend LLMs with multi-agent dynamics, highlight key hurdles such as balancing stability and diversity, evaluating unexpected behaviors, and scaling to greater complexity, and introduce a fresh taxonomy for this rapidly evolving field. Finally, we present a research roadmap centered on open-endedness, continuous co-evolution, and the development of resilient, socially aligned AI ecosystems. We call on the community to move beyond static paradigms and help shape the next generation of adaptive, socially-aware multi-agent simulations.