Mixed-Session Conversation with Egocentric Memory

Jihyoung Jang, Taeyoung Kim, Hyounghun Kim

2024-10-10

Summary

This paper introduces Mixed-Session Conversation, a dialogue system in which a main speaker talks with different partners over several sessions. It also presents MiSC, the dataset built to implement this system, and EMMA, a dialogue model that uses a memory mechanism to maintain context across sessions.

What's the problem?

Current dialogue systems struggle to mimic real-life conversations, especially when they involve multiple speakers and long-term interactions. These systems often fail to remember past interactions or manage conversations with several participants, leading to disjointed and unrealistic dialogues.

What's the solution?

To solve this, the authors propose Mixed-Session Conversation and build the MiSC dataset to implement it. Each MiSC episode has six consecutive sessions and four speakers: one main speaker and three partners. The MiSC dataset consists of 8,500 episodes designed to train and evaluate this system. They also introduce Egocentric Memory, the memory mechanism behind their model EMMA, which collects and retains memories from the main speaker's perspective. This lets the main speaker draw on earlier sessions and keep the conversation coherent, even when the partner changes from session to session.
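The setup described above can be pictured with a small Python sketch. This is only an illustration under stated assumptions, not the authors' implementation: the names `Session`, `EgocentricMemory`, `summarize`, and `run_episode` are hypothetical, and `summarize` is a naive placeholder that stores the partner's utterances verbatim where a real model would produce its own memory entries.

```python
from dataclasses import dataclass, field


@dataclass
class Session:
    """One session: the main speaker talks with a single partner."""
    partner: str      # name of the partner in this session
    utterances: list  # list of (speaker, text) pairs, in order


@dataclass
class EgocentricMemory:
    """Notes kept from the main speaker's point of view (hypothetical design)."""
    entries: list = field(default_factory=list)  # (session_index, partner, note)

    def remember(self, session_index, partner, note):
        self.entries.append((session_index, partner, note))

    def recall(self):
        # Everything is returned regardless of partner, so a fact learned
        # from one partner can be used when talking to another.
        return [f"[session {i}, with {p}] {note}" for i, p, note in self.entries]


def summarize(session):
    # Placeholder assumption: keep what the partner said, verbatim.
    return [text for speaker, text in session.utterances
            if speaker == session.partner]


def run_episode(sessions):
    """Walk through the sessions, recording what memory is available to each."""
    memory = EgocentricMemory()
    contexts = []
    for index, session in enumerate(sessions, start=1):
        contexts.append(memory.recall())  # memory visible before this session
        for note in summarize(session):
            memory.remember(index, session.partner, note)
    return memory, contexts
```

In this sketch the memory is shared across partners on purpose: when the partner changes in the next session, the main speaker still sees what earlier partners said.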

Why does it matter?

This research matters because it moves dialogue systems closer to how real conversations work. Supporting long-term interactions with multiple partners while maintaining context through memory could improve how AI communicates in applications such as virtual assistants, customer service bots, and interactive storytelling.

Abstract

Recently introduced dialogue systems have demonstrated high usability. However, they still fall short of reflecting real-world conversation scenarios. Current dialogue systems exhibit an inability to replicate the dynamic, continuous, long-term interactions involving multiple partners. This shortfall arises because there have been limited efforts to account for both aspects of real-world dialogues: deeply layered interactions over the long-term dialogue and widely expanded conversation networks involving multiple participants. As the effort to incorporate these aspects combined, we introduce Mixed-Session Conversation, a dialogue system designed to construct conversations with various partners in a multi-session dialogue setup. We propose a new dataset called MiSC to implement this system. The dialogue episodes of MiSC consist of 6 consecutive sessions, with four speakers (one main speaker and three partners) appearing in each episode. Also, we propose a new dialogue model with a novel memory management mechanism, called Egocentric Memory Enhanced Mixed-Session Conversation Agent (EMMA). EMMA collects and retains memories from the main speaker's perspective during conversations with partners, enabling seamless continuity in subsequent interactions. Extensive human evaluations validate that the dialogues in MiSC demonstrate a seamless conversational flow, even when conversation partners change in each session. EMMA trained with MiSC is also evaluated to maintain high memorability without contradiction throughout the entire conversation.