Think in Games: Learning to Reason in Games via Reinforcement Learning with Large Language Models

Yi Liao, Yu Gu, Yuan Sui, Zining Zhu, Yifan Lu, Guohua Tang, Zhongqian Sun, Wei Yang

2025-09-01

Summary

This paper explores why large language models (LLMs), despite excelling at complex tasks like math and coding, often struggle with simple interactive tasks that even young children handle easily, and proposes a new method, Think in Games (TiG), for teaching them these skills by playing.

What's the problem?

Large language models have a lot of knowledge *about* how the world works – what things are, and how they relate to each other. This is called declarative knowledge. However, they struggle with actually *doing* things: interacting with an environment and making decisions based on what happens. That is procedural knowledge. Traditional methods for teaching computers to *do* things, like reinforcement learning, require a ton of training data and aren't very transparent about how they make decisions. LLMs have the knowledge part down, but can't easily translate it into action.

What's the solution?

The researchers developed a framework called 'Think in Games' (TiG), which teaches the LLM to learn by playing games. Instead of bolting a separate decision-making module onto the model, TiG reframes reinforcement-learning-style decision-making as a language modeling task: the LLM writes out its reasoning and a policy (instructions for what to do in the current game state), and then gets feedback from the game environment. That feedback is used to refine the model through online reinforcement learning, so its policies improve over time. It's like the LLM is thinking through the game and explaining its moves, which then helps it learn to play better (a minimal sketch of this loop follows below).
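To make that loop concrete, here is a minimal sketch of the act-observe-refine cycle. It is illustrative only: `ToyGame`, `llm_policy`, and the action names are hypothetical stand-ins, not the paper's actual environment or model. The shape of the loop is the point: the model proposes a move, the environment scores it, and that feedback flows into the next decision.

```python
# A minimal, hypothetical sketch of a TiG-style interaction loop.
# `ToyGame` and `llm_policy` are illustrative stand-ins, not the paper's code.
import random
from dataclasses import dataclass, field

@dataclass
class ToyGame:
    """Stand-in environment: rewards matching a hidden 'correct' move."""
    target: str = field(default_factory=lambda: random.choice(["push", "defend", "retreat"]))

    def describe_state(self) -> str:
        return "Enemy is near your tower; your health is low."

    def step(self, action: str) -> float:
        # Environmental feedback: 1.0 if the chosen move matches, else 0.0.
        return 1.0 if action == self.target else 0.0

def llm_policy(state: str, experience: list[tuple[str, str, float]]) -> str:
    """Hypothetical LLM call: given the state and past feedback, emit an
    action. Here we fake it by reusing actions that earned reward."""
    rewarded = [a for _, a, r in experience if r > 0]
    if rewarded:
        return random.choice(rewarded)                       # exploit what worked
    return random.choice(["push", "defend", "retreat"])      # otherwise explore

# Online refinement loop: act, observe feedback, fold it back into the policy.
experience: list[tuple[str, str, float]] = []
game = ToyGame()
for episode in range(20):
    state = game.describe_state()
    action = llm_policy(state, experience)
    reward = game.step(action)
    experience.append((state, action, reward))

print(f"Hidden target: {game.target}; last 5 actions:",
      [a for _, a, _ in experience[-5:]])
```

In the real framework the "policy" is natural-language reasoning produced by the LLM itself, which is what makes its decisions readable; the toy above only mimics the feedback loop.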

Why it matters?

This work is important because it allows LLMs to learn how to *do* things more efficiently, using less data and computing power than traditional methods. It also makes the LLM’s decision-making process much more understandable, as it can explain its reasoning in plain language. This is a big step towards creating AI that can not only understand the world but also interact with it effectively and transparently.

Abstract

Large language models (LLMs) excel at complex reasoning tasks such as mathematics and coding, yet they frequently struggle with simple interactive tasks that young children perform effortlessly. This discrepancy highlights a critical gap between declarative knowledge (knowing about something) and procedural knowledge (knowing how to do something). Although traditional reinforcement learning (RL) agents can acquire procedural knowledge through environmental interaction, they often operate as black boxes and require substantial training data. In contrast, LLMs possess extensive world knowledge and reasoning capabilities, but are unable to effectively convert this static knowledge into dynamic decision-making in interactive settings. To address this challenge, we propose Think in Games (TiG), a novel framework that empowers LLMs to develop procedural understanding through direct interaction with game environments, while retaining their inherent reasoning and explanatory abilities. Specifically, TiG reformulates RL-based decision-making as a language modeling task: LLMs generate language-guided policies, which are refined iteratively through online reinforcement learning based on environmental feedback. Our experimental results show that TiG successfully bridges the gap between declarative and procedural knowledge, achieving competitive performance with dramatically lower data and computational demands compared to conventional RL methods. Moreover, TiG provides step-by-step natural language explanations for its decisions, greatly improving transparency and interpretability in complex interactive tasks.
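The abstract's key mechanism is refining language-guided policies "through online reinforcement learning based on environmental feedback." The sketch below shows that refinement step in miniature with a REINFORCE-style update. It is a toy, assumed setup: the real TiG policy is an LLM and the paper's exact RL algorithm may differ, so the tiny categorical policy and hard-coded reward here are purely hypothetical stand-ins.

```python
# Toy illustration of online RL driven purely by environmental reward.
# The categorical policy over three macro actions is a hypothetical
# stand-in for the LLM policy; the update rule is plain REINFORCE.
import torch

actions = ["push", "defend", "retreat"]
logits = torch.zeros(len(actions), requires_grad=True)   # stand-in "policy"
optimizer = torch.optim.SGD([logits], lr=0.5)
target = 1  # pretend the environment rewards "defend" in this state

for step in range(100):
    dist = torch.distributions.Categorical(logits=logits)
    action = dist.sample()
    reward = 1.0 if action.item() == target else 0.0     # environment feedback
    loss = -dist.log_prob(action) * reward               # REINFORCE objective
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

print("Learned action preference:",
      dict(zip(actions, torch.softmax(logits, dim=0).tolist())))
```

The only point is the signal flow: a scalar reward from the environment directly reshapes the probability the policy assigns to each action, with no hand-labeled supervision required.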