STATe-of-Thoughts: Structured Action Templates for Tree-of-Thoughts
Zachary Bamberger, Till R. Saenger, Gilad Morad, Ofra Amir, Brandon M. Stewart, Amir Feder
2026-02-17
Summary
This paper introduces a new method called STATe-of-Thoughts for improving how AI models generate text. It aims to make outputs both high quality and varied, while also making it easier to understand how the AI arrived at its answer.
What's the problem?
Current methods for getting AI to explore different ideas when generating text, such as 'Best-of-N' or 'Tree-of-Thoughts', rely on randomness that doesn't actually produce truly diverse outputs. It is also hard to know *why* these systems make the choices they do during generation, which makes the process a bit of a 'black box'.
What's the solution?
The researchers developed STATe, which works by having a 'controller' make specific, understandable decisions about how to approach the problem. This controller chooses 'actions' that guide a 'generator' to create text, and an 'evaluator' then ranks the results. Instead of random sampling, STATe takes these deliberate steps, making the reasoning process clear and controllable. Tested on argument generation, it outperformed randomness-based methods.
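The controller–generator–evaluator loop described above can be sketched as a small search over action sequences. This is only an illustrative toy, not the paper's actual implementation: the action names, the string-based generator, and the length-based evaluator are all hypothetical stand-ins for LLM calls.

```python
# Toy sketch of a STATe-style loop: a controller picks discrete actions,
# a generator conditions on them, and an evaluator guides a beam search.
# All components are hypothetical placeholders, not the paper's prompts.

# The controller's action space: high-level reasoning choices (made up here).
ACTIONS = ["cite_evidence", "counterargument", "analogy"]

def generate(text, action):
    """Toy generator: produces the next 'reasoning step' conditioned on an action."""
    return f"{text} [{action}]"

def evaluate(candidate):
    """Toy evaluator: scores a candidate (here, trivially, by length)."""
    return len(candidate)

def state_search(prompt, depth=2, beam=2):
    """Search over action sequences instead of temperature-based sampling."""
    frontier = [(prompt, [])]  # (generated text, action sequence so far)
    for _ in range(depth):
        expanded = []
        for text, seq in frontier:
            for action in ACTIONS:  # discrete, interpretable interventions
                expanded.append((generate(text, action), seq + [action]))
        # The evaluator guides the search: keep the top-`beam` candidates.
        expanded.sort(key=lambda c: evaluate(c[0]), reverse=True)
        frontier = expanded[:beam]
    return frontier

best_text, best_actions = state_search("Claim: X is true.")[0]
print(best_actions)  # the action sequence doubles as an explanation
```

Because every kept candidate carries its action sequence, the final output comes with an explicit, human-readable trace of the reasoning choices that produced it, which is the interpretability property the method claims.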
Why does it matter?
This research is important because it provides a way to create AI-generated text that isn't just good, but also diverse and, crucially, *explainable*. Being able to understand the AI's reasoning is a big step towards building more trustworthy and useful AI systems, and this method offers a practical way to achieve that.
Abstract
Inference-Time-Compute (ITC) methods like Best-of-N and Tree-of-Thoughts are meant to produce output candidates that are both high-quality and diverse, but their use of high-temperature sampling often fails to achieve meaningful output diversity. Moreover, existing ITC methods offer limited control over how to perform reasoning, which in turn limits their explainability. We present STATe-of-Thoughts (STATe), an interpretable ITC method that searches over high-level reasoning patterns. STATe replaces stochastic sampling with discrete and interpretable textual interventions: a controller selects actions encoding high-level reasoning choices, a generator produces reasoning steps conditioned on those choices, and an evaluator scores candidates to guide search. This structured approach yields three main advantages. First, action-guided textual interventions produce greater response diversity than temperature-based sampling. Second, in a case study on argument generation, STATe's explicit action sequences capture interpretable features that are highly predictive of output quality. Third, estimating the association between performance and action choices allows us to identify promising yet unexplored regions of the action space and steer generation directly toward them. Together, these results establish STATe as a practical framework for generating high-quality, diverse, and interpretable text. Our framework is available at https://github.com/zbambergerNLP/state-of-thoughts.