Sketch-of-Thought: Efficient LLM Reasoning with Adaptive Cognitive-Inspired Sketching

Simon A. Aytes, Jinheon Baek, Sung Ju Hwang

2025-03-09

Summary

This paper introduces Sketch-of-Thought (SoT), a new method that makes AI reasoning more efficient by cutting unnecessary steps while keeping accuracy high.

What's the problem?

Chain-of-Thought (CoT) prompting helps AI reason step by step, but it often uses far more words than necessary, consuming a lot of computing power. This makes it slow and expensive, especially for tasks that don't need so much detail.

What's the solution?

The researchers created SoT, which draws on ideas from cognitive science to streamline how AI models think through problems. It combines three reasoning styles (Conceptual Chaining, Chunked Symbolism, and Expert Lexicons) and picks the best one for each task using a lightweight routing model. SoT cuts the number of tokens needed by 76% on average while keeping accuracy the same, and in areas like mathematical and multi-hop reasoning it even improves accuracy.
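To make the routing idea concrete, here is a minimal Python sketch of how paradigm selection and prompt construction could work. The paper uses a trained lightweight router model; the keyword heuristic below, along with the paradigm instruction strings and function names, are hypothetical stand-ins for illustration only.

```python
# Hypothetical sketch of Sketch-of-Thought-style paradigm routing.
# The real system uses a trained lightweight router model; a simple
# keyword heuristic stands in for it here.

# Illustrative (not official) one-line instructions per paradigm.
PARADIGM_PROMPTS = {
    "chunked_symbolism": (
        "Reason in compact symbolic steps (variables, equations), "
        "not full sentences."
    ),
    "conceptual_chaining": (
        "Link only the key concepts in a short chain: A -> B -> C."
    ),
    "expert_lexicons": (
        "Answer using the terse shorthand a domain expert would use."
    ),
}

def route_paradigm(question: str) -> str:
    """Pick a reasoning paradigm for a question (heuristic stand-in)."""
    q = question.lower()
    if any(tok in q for tok in ("solve", "how many", "+", "*", "/")):
        return "chunked_symbolism"
    if any(tok in q for tok in ("diagnos", "dosage", "voltage")):
        return "expert_lexicons"
    return "conceptual_chaining"

def build_prompt(question: str) -> str:
    """Prepend the selected paradigm's sketching instruction."""
    paradigm = route_paradigm(question)
    return f"{PARADIGM_PROMPTS[paradigm]}\nQuestion: {question}"

print(build_prompt("If apples cost $2 each, how many can I buy with $10?"))
```

The key design point is that the router runs once per query before the main model is called, so the cost of choosing a paradigm stays small relative to the tokens it saves.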

Why it matters?

This matters because it makes AI reasoning faster and cheaper without sacrificing quality. By cutting out unnecessary steps, SoT lets AI handle complex tasks more efficiently, which is useful for things like solving math problems, answering questions in multiple languages, and working with images and video.

Abstract

Recent advances in large language models have demonstrated remarkable reasoning capabilities through Chain of Thought (CoT) prompting, but often at the cost of excessive verbosity in their intermediate outputs, which increases computational overhead. We introduce Sketch-of-Thought (SoT), a novel prompting framework that combines cognitive-inspired reasoning paradigms with linguistic constraints to minimize token usage while preserving reasoning accuracy. SoT is designed as a flexible framework that can incorporate any custom reasoning paradigms based on cognitive science, and we instantiate it with three such paradigms - Conceptual Chaining, Chunked Symbolism, and Expert Lexicons - each tailored to different reasoning tasks and selected dynamically via a lightweight routing model. Through comprehensive evaluation across 15 reasoning datasets with multiple languages and multimodal scenarios, we demonstrate that SoT achieves token reductions of 76% with negligible accuracy impact. In certain domains like mathematical and multi-hop reasoning, it even improves accuracy while using significantly fewer tokens. Our code is publicly available: https://www.github.com/SimonAytes/SoT.