STMA: A Spatio-Temporal Memory Agent for Long-Horizon Embodied Task Planning

Mingcong Lei, Yiming Zhao, Ge Wang, Zhixin Mai, Shuguang Cui, Yatong Han, Jinke Ren

2025-02-17

Summary

This paper introduces STMA (Spatio-Temporal Memory Agent), an AI framework that helps robots and virtual agents complete long, multi-step tasks by remembering and using information about space and time more effectively.

What's the problem?

Current AI agents struggle to perform long, complicated tasks in changing environments because they can't remember and use past information well enough. This makes it hard for them to make good decisions and adapt to new situations.

What's the solution?

The researchers created STMA, which has three main parts: a memory system that remembers both what happened and where things are, a smart map (called a knowledge graph) that updates as the environment changes, and a planning system that proposes strategies and then critiques and refines them. They tested STMA on 32 tasks in TextWorld, a text-based game environment, to see how well it worked.

Why does it matter?

This matters because it could make robots and AI assistants much better at handling real-world tasks that take a long time and have many steps. Compared to the state-of-the-art model, STMA achieved a 31.25% improvement in success rate and a 24.7% increase in average score. This improvement could lead to more capable and reliable AI helpers in various fields, from virtual assistants to real-world robots.

Abstract

A key objective of embodied intelligence is enabling agents to perform long-horizon tasks in dynamic environments while maintaining robust decision-making and adaptability. To achieve this goal, we propose the Spatio-Temporal Memory Agent (STMA), a novel framework designed to enhance task planning and execution by integrating spatio-temporal memory. STMA is built upon three critical components: (1) a spatio-temporal memory module that captures historical and environmental changes in real time, (2) a dynamic knowledge graph that facilitates adaptive spatial reasoning, and (3) a planner-critic mechanism that iteratively refines task strategies. We evaluate STMA in the TextWorld environment on 32 tasks, involving multi-step planning and exploration under varying levels of complexity. Experimental results demonstrate that STMA achieves a 31.25% improvement in success rate and a 24.7% increase in average score compared to the state-of-the-art model. The results highlight the effectiveness of spatio-temporal memory in advancing the memory capabilities of embodied agents.
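
To make the three components concrete, here is a minimal Python sketch of how they might fit together. It is an illustration under stated assumptions, not the authors' implementation: the class `SpatioTemporalMemory`, the function `plan_with_critic`, the triple format, and the toy planner and critic are all hypothetical names invented for this example. In the paper's setting the planner and critic would presumably be language-model calls; here they are plain functions so the loop can be run as is.

```python
from dataclasses import dataclass, field


@dataclass
class SpatioTemporalMemory:
    """Hypothetical memory: a temporal event log plus a spatial knowledge graph."""
    events: list = field(default_factory=list)  # temporal part: (step, action, observation)
    graph: dict = field(default_factory=dict)   # spatial part: (subject, relation) -> object

    def record(self, step, action, observation, triples):
        """Log one interaction and merge the facts extracted from it."""
        self.events.append((step, action, observation))
        for subj, rel, obj in triples:
            # Overwrite stale facts so the graph follows the changing environment.
            self.graph[(subj, rel)] = obj

    def recent(self, k=3):
        """Return the k most recent events."""
        return self.events[-k:]

    def neighbors(self, subj):
        """Return every known relation of `subj` as {relation: object}."""
        return {rel: obj for (s, rel), obj in self.graph.items() if s == subj}


def plan_with_critic(propose, critique, memory, max_rounds=3):
    """Ask the planner for a plan, let the critic accept it or return feedback, repeat."""
    feedback, plan = None, None
    for _ in range(max_rounds):
        plan = propose(memory, feedback)
        accepted, feedback = critique(plan, memory)
        if accepted:
            return plan
    return plan  # no plan was accepted: fall back to the last proposal


# Toy stand-ins for the planner and critic (assumptions, for illustration only).
def toy_propose(memory, feedback):
    return ["go west"] if feedback is None else ["go east"]


def toy_critique(plan, memory):
    direction = plan[0].split()[-1]
    if direction in memory.neighbors("kitchen"):
        return True, None
    return False, f"no known exit {direction} from kitchen"


memory = SpatioTemporalMemory()
memory.record(0, "look", "You are in the kitchen.",
              [("kitchen", "east", "hallway"), ("key", "in", "kitchen")])
memory.record(1, "take key", "You pick up the key.",
              [("key", "in", "inventory")])

plan = plan_with_critic(toy_propose, toy_critique, memory)
print(plan)
```

In this toy run the critic rejects the first proposal because the graph records no western exit from the kitchen, and the planner's second proposal is accepted. The second `record` call also shows the "dynamic" aspect: the key's location is overwritten rather than duplicated.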
