SIMA 2's new architecture integrates Gemini's reasoning abilities to help it understand a user's high-level goal, reason through the steps needed to reach it, and execute goal-oriented actions within games. The agent can understand and accomplish long and complex tasks, and its capacity to transfer learned concepts is foundational to achieving the kind of broad generalization seen in human cognition. Across a wide range of tasks, SIMA 2's performance is significantly closer to that of a human player than its predecessor's.
Operating across diverse gaming environments is a crucial proving ground for general intelligence: it lets agents master skills, practice complex reasoning, and learn continuously through self-directed play. In SIMA 2's self-improvement cycle, Gemini first provides an initial task and an estimated reward for the agent's behavior. The resulting experience is then added to a bank of self-generated experience, which is used to train subsequent generations of the agent. This process allows the agent to improve on previously failed tasks entirely independently of human-generated demonstrations and intervention.
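
As a rough illustration of the self-improvement cycle described above, the following Python sketch walks through the loop: a task is proposed, the agent attempts it, a reward is estimated, the result is stored in an experience bank, and the bank is used to train the next generation. This is a toy model under stated assumptions, not SIMA 2's actual implementation. Every name here (`ToyAgent`, `propose_task`, `estimate_reward`, `self_improve`, the scalar `skill`) is hypothetical, and the two stand-in functions only mimic the roles the text assigns to Gemini.

```python
import random
from dataclasses import dataclass
from typing import List


@dataclass
class Experience:
    task: str
    trajectory: List[str]
    reward: float  # estimated reward, not a ground-truth game score


def propose_task(generation: int, index: int) -> str:
    # Hypothetical stand-in for the model that proposes tasks.
    return f"gen{generation}-task{index}"


def estimate_reward(trajectory: List[str]) -> float:
    # Hypothetical stand-in for the model that scores the agent's behavior.
    return 1.0 if trajectory[-1] == "success" else 0.0


@dataclass
class ToyAgent:
    # `skill` is an assumed scalar success probability, not a real policy.
    skill: float = 0.2

    def attempt(self, task: str, rng: random.Random) -> List[str]:
        outcome = "success" if rng.random() < self.skill else "failure"
        return [f"start:{task}", outcome]

    def train(self, bank: List[Experience]) -> None:
        # Toy update: skill rises with the mean estimated reward in the bank.
        if bank:
            mean_reward = sum(e.reward for e in bank) / len(bank)
            self.skill = min(1.0, self.skill + 0.1 * mean_reward)


def self_improve(agent: ToyAgent, generations: int = 3,
                 tasks_per_generation: int = 5, seed: int = 0) -> List[Experience]:
    rng = random.Random(seed)
    bank: List[Experience] = []  # bank of self-generated experience
    for generation in range(generations):
        for index in range(tasks_per_generation):
            task = propose_task(generation, index)
            trajectory = agent.attempt(task, rng)
            bank.append(Experience(task, trajectory, estimate_reward(trajectory)))
        # The next generation is trained on everything collected so far.
        agent.train(bank)
    return bank


if __name__ == "__main__":
    agent = ToyAgent()
    bank = self_improve(agent)
    print(len(bank), "experiences collected; final skill:", round(agent.skill, 3))
```

No human demonstrations enter the loop in this sketch: the only training signal is the estimated reward attached to the agent's own attempts, which mirrors the property the paragraph above highlights.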

