
MotionLab: Unified Human Motion Generation and Editing via the Motion-Condition-Motion Paradigm

Ziyan Guo, Zeyu Hu, Na Zhao, De Wen Soh

2025-02-07

Summary

This paper introduces MotionLab, an AI system that can both create and edit human movements within a single framework. It uses a paradigm called Motion-Condition-Motion to make motion generation and editing more flexible and efficient.

What's the problem?

Current tools for generating or editing human motion are often designed for specific tasks, which makes them limited and inefficient. They struggle to handle multiple tasks at once, lack precise control over motion details, and don't share knowledge between different types of motion-related tasks.

What's the solution?

The researchers created MotionLab, which uses the Motion-Condition-Motion paradigm to unify motion generation and editing tasks. It includes several key components: the MotionFlow Transformer for flexible conditional generation and editing, Aligned Rotational Position Encoding to keep source and target motions synchronized in time, Task-Specified Instruction Modulation to distinguish between tasks, and Motion Curriculum Learning to help the system learn multiple tasks effectively. These innovations allow MotionLab to handle diverse motion-related tasks while maintaining high-quality results.
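At its core, MotionLab learns a rectified flow that transports a source motion toward a target motion under a condition. The toy sketch below illustrates that training objective in general terms: interpolate along a straight line between source and target, then regress a velocity field onto the constant displacement. The function and variable names, shapes, and the stand-in `velocity_model` are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
T, D = 16, 6                      # toy sizes: frames, features per frame

x0 = rng.normal(size=(T, D))      # source motion (e.g., the clip to edit)
x1 = rng.normal(size=(T, D))      # target motion (e.g., the edited clip)

def velocity_model(x_t, t, cond):
    # Stand-in for the MotionFlow Transformer: a real model would attend
    # over frames and fuse the condition; here we return a crude linear guess.
    return (cond - x_t) * t

t = rng.uniform()                 # sample a time in [0, 1]
x_t = (1.0 - t) * x0 + t * x1     # point on the straight path from x0 to x1
target_v = x1 - x0                # rectified flow's constant target velocity

pred_v = velocity_model(x_t, t, cond=x1)
loss = np.mean((pred_v - target_v) ** 2)   # velocity regression loss
```

At inference, the learned velocity field would be integrated from the source motion (or noise, for pure generation) to produce the target motion in a few steps, which is what gives rectified flows their inference efficiency.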

Why it matters?

This research matters because it simplifies how human motion is created and edited, making it easier to use in animation, robotics, virtual reality, and other fields. By combining multiple tasks into one framework, MotionLab saves time and resources while delivering realistic and customizable motion outputs. It could lead to more advanced digital tools for creative industries and scientific applications.

Abstract

Human motion generation and editing are key components of computer graphics and vision. However, current approaches in this field tend to offer isolated solutions tailored to specific tasks, which can be inefficient and impractical for real-world applications. While some efforts have aimed to unify motion-related tasks, these methods simply use different modalities as conditions to guide motion generation. Consequently, they lack editing capabilities, fine-grained control, and fail to facilitate knowledge sharing across tasks. To address these limitations and provide a versatile, unified framework capable of handling both human motion generation and editing, we introduce a novel paradigm: Motion-Condition-Motion, which enables the unified formulation of diverse tasks with three concepts: source motion, condition, and target motion. Based on this paradigm, we propose a unified framework, MotionLab, which incorporates rectified flows to learn the mapping from source motion to target motion, guided by the specified conditions. In MotionLab, we introduce the 1) MotionFlow Transformer to enhance conditional generation and editing without task-specific modules; 2) Aligned Rotational Position Encoding to guarantee the time synchronization between source motion and target motion; 3) Task Specified Instruction Modulation; and 4) Motion Curriculum Learning for effective multi-task learning and knowledge sharing across tasks. Notably, our MotionLab demonstrates promising generalization capabilities and inference efficiency across multiple benchmarks for human motion. Our code and additional video results are available at: https://diouo.github.io/motionlab.github.io/.
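The Aligned Rotational Position Encoding mentioned in the abstract builds on the rotary-encoding idea: token features are rotated by an angle determined by their position. The sketch below shows standard rotary encoding and the alignment intuition, where source frame i and target frame i reuse the same position index so their rotations match and attention treats them as time-synchronized. This is a minimal illustration under assumed names and shapes, not the paper's exact formulation.

```python
import numpy as np

def rope(x, positions, base=10000.0):
    # Standard rotary position encoding applied to even/odd feature pairs.
    d = x.shape[-1]
    freqs = base ** (-np.arange(0, d, 2) / d)        # (d/2,) rotation frequencies
    ang = positions[:, None] * freqs[None, :]        # (T, d/2) rotation angles
    cos, sin = np.cos(ang), np.sin(ang)
    x_even, x_odd = x[..., 0::2], x[..., 1::2]
    out = np.empty_like(x)
    out[..., 0::2] = x_even * cos - x_odd * sin      # 2D rotation per pair
    out[..., 1::2] = x_even * sin + x_odd * cos
    return out

T, d = 8, 4
src = np.ones((T, d))                # toy source-motion tokens
tgt = np.ones((T, d))                # toy target-motion tokens
frames = np.arange(T, dtype=float)   # shared frame indices

# "Aligned": both sequences are encoded with the same frame positions,
# so identical features at the same frame get identical rotations.
src_enc = rope(src, frames)
tgt_enc = rope(tgt, frames)
```

Sharing position indices across the two sequences is what lets attention between a source frame and the corresponding target frame behave as if they occur at the same moment in time.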