Omni-SimpleMem: Autoresearch-Guided Discovery of Lifelong Multimodal Agent Memory

Jiaqi Liu, Zipeng Ling, Shi Qiu, Yanqing Liu, Siwei Han, Peng Xia, Haoqin Tu, Zeyu Zheng, Cihang Xie, Charles Fleming, Mingyu Ding, Huaxiu Yao

2026-04-03

Summary

This paper focuses on improving the long-term memory of AI agents, specifically their ability to remember and use information from different types of data like images and text over long periods of time.

What's the problem?

AI agents are getting better at complex tasks, but they struggle to remember past experiences effectively, especially when those experiences involve different kinds of information. Building a good long-term memory system is really hard because there are so many different design choices to make – how the memory is structured, how information is retrieved, how prompts are written, and how data is prepared. Trying to figure out the best combination of these things manually or with standard automated methods is too complicated and time-consuming.

What's the solution?

The researchers deployed an automated research pipeline to design and test memory systems for AI agents. Starting from a simple baseline, the pipeline ran roughly 50 experiments on its own, diagnosing failure modes, proposing changes to the memory's architecture, and even repairing bugs in how the data was processed, all without human intervention during the testing and improvement loop. The resulting system, called Omni-SimpleMem, substantially outperformed previous systems on two benchmarks (LoCoMo and Mem-Gallery), multiplying the baseline scores several times over.
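The paper's actual pipeline is not released as part of this summary, but the loop it describes — propose a modification, run the benchmark, keep the change only if the score improves — can be sketched in a few lines. Everything below is a hypothetical illustration: the `evaluate` and `propose` functions, the config keys, and the scores are toy stand-ins, not the paper's code or numbers.

```python
import random

def autoresearch_loop(evaluate, propose, base_config, budget=50, seed=0):
    """Greedy discovery loop (hypothetical sketch, not the paper's code):
    each iteration proposes one modification, scores it on the benchmark,
    and keeps it only if the score improves on the best so far."""
    rng = random.Random(seed)
    best = dict(base_config)
    best_score = evaluate(best)
    history = [("baseline", best_score)]
    for _ in range(budget):
        name, candidate = propose(best, rng)
        score = evaluate(candidate)
        if score > best_score:  # accept only improvements
            best, best_score = candidate, score
            history.append((name, score))
    return best, best_score, history

# Toy benchmark: rewards fixing a (made-up) data-pipeline bug and choosing
# a reasonable retrieval depth. The 0.117 constant echoes the paper's
# naive-baseline F1, but the rest of the arithmetic is invented.
def evaluate(cfg):
    score = 0.117
    if cfg.get("pipeline_bug_fixed"):
        score += 0.3
    score += 0.02 * min(cfg.get("retrieval_top_k", 0), 10)
    return round(score, 3)

# Toy proposal step: randomly pick one kind of change to try.
def propose(cfg, rng):
    candidate = dict(cfg)
    kind = rng.choice(["bug_fix", "hyperparameter"])
    if kind == "bug_fix":
        candidate["pipeline_bug_fixed"] = True
    else:
        candidate["retrieval_top_k"] = rng.randint(1, 12)
    return kind, candidate

best_cfg, best_f1, history = autoresearch_loop(
    evaluate, propose, {"retrieval_top_k": 0}, budget=50
)
print(best_f1, [name for name, _ in history])
```

Even in this toy, the single "bug fix" step moves the score far more than any hyperparameter tweak, mirroring the paper's finding that bug fixes and structural changes dominate tuning.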

Why it matters?

This work is important because it shows that we can use AI to *design* better AI systems, specifically for long-term memory. The biggest improvements came not from simply tweaking settings, but from fixing bugs, changing the memory’s overall structure, and improving how the AI is prompted – things that traditional automated methods often miss. This suggests a new way to build more capable and adaptable AI agents, and provides a guide for applying this automated design process to other areas of AI.

Abstract

AI agents increasingly operate over extended time horizons, yet their ability to retain, organize, and recall multimodal experiences remains a critical bottleneck. Building effective lifelong memory requires navigating a vast design space spanning architecture, retrieval strategies, prompt engineering, and data pipelines; this space is too large and interconnected for manual exploration or traditional AutoML to explore effectively. We deploy an autonomous research pipeline to discover Omni-SimpleMem, a unified multimodal memory framework for lifelong AI agents. Starting from a naïve baseline (F1 = 0.117 on LoCoMo), the pipeline autonomously executes ~50 experiments across two benchmarks, diagnosing failure modes, proposing architectural modifications, and repairing data pipeline bugs, all without human intervention in the inner loop. The resulting system achieves state-of-the-art on both benchmarks, improving F1 by +411% on LoCoMo (0.117 → 0.598) and +214% on Mem-Gallery (0.254 → 0.797) relative to the initial configurations. Critically, the most impactful discoveries are not hyperparameter adjustments: bug fixes (+175%), architectural changes (+44%), and prompt engineering (+188% on specific categories) each individually exceed the cumulative contribution of all hyperparameter tuning, demonstrating capabilities fundamentally beyond the reach of traditional AutoML. We provide a taxonomy of six discovery types and identify four properties that make multimodal memory particularly suited for autoresearch, offering guidance for applying autonomous research pipelines to other AI system domains. Code is available at https://github.com/aiming-lab/SimpleMem.