Go to Zero: Towards Zero-shot Motion Generation with Million-scale Data

Ke Fan, Shunlin Lu, Minyue Dai, Runyi Yu, Lixing Xiao, Zhiyang Dou, Junting Dong, Lizhuang Ma, Jingbo Wang

2025-07-10

Summary

This paper introduces a large-scale dataset and a model framework called Go to Zero, which helps AI generate human motions from text instructions without being trained on those specific motions beforehand. This capability is called zero-shot motion generation.

What's the problem?

The problem is that existing AI models for generating human-like motion from text often fail on new or complex motions they haven't seen during training, because available datasets are limited in size and variety.

What's the solution?

The researchers built a huge dataset called MotionMillion by automatically collecting over 2 million motion sequences from videos and pairing them with text captions. They then designed a scalable model that learns from this dataset to generate diverse, smooth, and physically plausible motions, even for new, unseen instructions, allowing it to generalize to many motion types without extra training.
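At a high level, pipelines like this typically compress motion into a vocabulary of discrete tokens and then generate those tokens autoregressively, conditioned on the text prompt. The toy sketch below illustrates that idea only; the constants, function names, and the stand-in "model" (a seeded random sampler) are all illustrative assumptions, not the authors' actual code or API.

```python
import random

CODEBOOK_SIZE = 512   # size of the discrete motion-token vocabulary (assumed)
SEQ_LEN = 16          # number of motion tokens to generate per prompt (assumed)

def encode_text(prompt: str) -> int:
    """Toy text encoder: reduce the prompt to an integer seed.

    A real system would use a learned text encoder; this just makes the
    sketch deterministic within a single run.
    """
    return abs(hash(prompt)) % (2 ** 32)

def generate_motion_tokens(prompt: str, seq_len: int = SEQ_LEN) -> list[int]:
    """Autoregressively sample motion tokens conditioned on the text.

    A real model would predict a probability distribution over the
    codebook at each step; a seeded random choice stands in for that here.
    """
    rng = random.Random(encode_text(prompt))
    tokens = []
    for _ in range(seq_len):
        tokens.append(rng.randrange(CODEBOOK_SIZE))  # next-token "prediction"
    return tokens

def decode_tokens_to_poses(tokens: list[int]) -> list[list[float]]:
    """Toy decoder: map each discrete token back to a small pose vector."""
    return [[(t % 7) / 7.0, (t % 11) / 11.0, (t % 13) / 13.0] for t in tokens]

poses = decode_tokens_to_poses(generate_motion_tokens("a person does a cartwheel"))
print(len(poses), len(poses[0]))
```

The key point the sketch captures is the separation of concerns: once motion is tokenized, generation reduces to a sequence-modeling problem, which is what lets a model trained on millions of text-motion pairs generalize to prompts it has never seen.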

Why it matters?

This matters because generating realistic human motions from text can improve virtual reality, gaming, robotics, and animation. Zero-shot capability means AI can handle new instructions flexibly and creatively, making these technologies more useful and natural.

Abstract

A new dataset and evaluation framework for text-to-motion generation achieve zero-shot generalization using a large-scale model.