Muses: Designing, Composing, Generating Nonexistent Fantasy 3D Creatures without Training

Hexiao Lu, Xiaokun Sun, Zeyu Cai, Hao Guo, Ying Tai, Jian Yang, Zhenyu Zhang

2026-01-07


Summary

This paper introduces Muses, a training-free method that automatically generates detailed 3D models of imaginary creatures, with no additional model training and no manual assembly.

What's the problem?

Existing methods for making 3D creatures often fall short because they struggle with creating realistic and consistent designs. Some rely on awkwardly combining pre-made parts, while others need a lot of manual tweaking or can’t easily create things that are very different from what they’ve seen before. Basically, it's hard to get a computer to design a believable creature from scratch.

What's the solution?

Muses takes a different approach by starting with a 3D skeleton, like the framework of a real animal. It first designs a skeleton that combines elements of different creatures, using graph-constrained reasoning to keep the layout and scale coherent. Next, it builds the creature's body around this skeleton by assembling regions taken from different 3D objects in a voxel-based latent space. Finally, it generates textures and colors guided by an image and by the skeleton, so that the style stays consistent across the whole model. It's like building with LEGOs, but the computer designs the instructions and finds the right pieces.
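The skeleton-composition idea can be illustrated with a toy example. The sketch below is not the paper's implementation: the `Joint`, `Skeleton`, and `graft` names, the tree representation, and the single uniform scale factor are all assumptions made for illustration. It attaches a limb subtree from one skeleton to a joint of another and rescales it, which is the simplest form of keeping layout and scale coherent.

```python
from dataclasses import dataclass, field


@dataclass
class Joint:
    name: str
    position: tuple  # (x, y, z) in a shared coordinate frame
    source: str      # hypothetical label: which reference creature it came from


@dataclass
class Skeleton:
    joints: dict = field(default_factory=dict)  # joint name -> Joint
    parent: dict = field(default_factory=dict)  # joint name -> parent name (None for root)

    def add(self, joint, parent=None):
        # Enforce a single-rooted tree with unique joint names.
        if joint.name in self.joints:
            raise ValueError(f"duplicate joint: {joint.name}")
        if parent is None and self.joints:
            raise ValueError("skeleton already has a root")
        if parent is not None and parent not in self.joints:
            raise ValueError(f"unknown parent: {parent}")
        self.joints[joint.name] = joint
        self.parent[joint.name] = parent

    def subtree(self, root):
        # Return root and all descendants, parents before children.
        order = [root]
        for name in order:
            order.extend(c for c, p in self.parent.items() if p == name)
        return order


def graft(base, donor, donor_root, attach_to, scale=1.0):
    """Copy donor's subtree under base's `attach_to` joint (mutates base).

    The donor root is moved onto the attachment joint, and every offset
    from the donor root is multiplied by `scale`.
    """
    anchor = base.joints[attach_to].position
    origin = donor.joints[donor_root].position
    for name in donor.subtree(donor_root):
        j = donor.joints[name]
        pos = tuple(a + scale * (p - o) for a, p, o in zip(anchor, j.position, origin))
        parent = attach_to if name == donor_root else donor.parent[name]
        base.add(Joint(name, pos, j.source), parent=parent)
    return base


# Toy data: a two-joint "horse" and a three-joint "eagle".
horse = Skeleton()
horse.add(Joint("spine", (0.0, 1.0, 0.0), "horse"))
horse.add(Joint("neck", (1.0, 1.5, 0.0), "horse"), parent="spine")

eagle = Skeleton()
eagle.add(Joint("body", (0.0, 0.0, 0.0), "eagle"))
eagle.add(Joint("wing_root", (0.0, 0.5, 0.0), "eagle"), parent="body")
eagle.add(Joint("wing_tip", (0.0, 1.0, 1.5), "eagle"), parent="wing_root")

# Attach the eagle's wing to the horse's spine at twice its original size.
fused = graft(horse, eagle, donor_root="wing_root", attach_to="spine", scale=2.0)
```

Each joint keeps a `source` label in this sketch so that a later stage could look up which reference creature a body region should come from.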

Why does it matter?

This work is important because it opens the door to easily generating high-quality 3D creatures for things like video games, movies, or even just for fun. Because it requires no training and no manual assembly, it is more flexible than previous methods, and the authors report state-of-the-art visual quality and agreement with the text description. It also shows potential for editing existing 3D models more easily.

Abstract

We present Muses, the first training-free method for fantastic 3D creature generation in a feed-forward paradigm. Previous methods, which rely on part-aware optimization, manual assembly, or 2D image generation, often produce unrealistic or incoherent 3D assets due to the challenges of intricate part-level manipulation and limited out-of-domain generation. In contrast, Muses leverages the 3D skeleton, a fundamental representation of biological forms, to explicitly and rationally compose diverse elements. This skeletal foundation formalizes 3D content creation as a structure-aware pipeline of design, composition, and generation. Muses begins by constructing a creatively composed 3D skeleton with coherent layout and scale through graph-constrained reasoning. This skeleton then guides a voxel-based assembly process within a structured latent space, integrating regions from different objects. Finally, image-guided appearance modeling under skeletal conditions is applied to generate a style-consistent and harmonious texture for the assembled shape. Extensive experiments establish Muses' state-of-the-art performance in terms of visual fidelity and alignment with textual descriptions, and potential on flexible 3D object editing. Project page: https://luhexiao.github.io/Muses.github.io/.
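To make the idea of skeleton-guided assembly concrete, here is a minimal sketch under strong simplifying assumptions. The abstract describes assembly in a structured latent space; this toy version instead works on raw occupied-voxel sets, and the nearest-joint ownership rule, the `nearest_joint` and `assemble` functions, and the data are all hypothetical and are not taken from the paper. Each voxel is kept only if the skeleton joint closest to it belongs to the same source object.

```python
def nearest_joint(voxel, joints):
    """Return the name of the joint closest to `voxel` (squared Euclidean distance)."""
    return min(
        joints,
        key=lambda name: sum((v - p) ** 2 for v, p in zip(voxel, joints[name][0])),
    )


def assemble(sources, joints):
    """Compose occupied voxels from several source objects.

    sources: source name -> set of occupied (x, y, z) voxels, all in one shared frame
    joints:  joint name -> (position, source name that owns this joint's region)
    Returns a dict mapping each kept voxel to the source it was taken from.
    """
    composite = {}
    for src, voxels in sources.items():
        for voxel in voxels:
            owner = joints[nearest_joint(voxel, joints)][1]
            if owner == src:  # keep a voxel only inside its own source's region
                composite[voxel] = src
    return composite


# Toy data: a horse "body" region and an eagle "wing" region along the z axis.
joints = {
    "body": ((0, 0, 0), "horse"),
    "wing": ((0, 0, 4), "eagle"),
}
sources = {
    "horse": {(0, 0, 0), (0, 0, 1), (0, 0, 3)},
    "eagle": {(0, 0, 1), (0, 0, 3), (0, 0, 4)},
}
composite = assemble(sources, joints)
```

In this toy run, the horse voxel near the wing joint and the eagle voxel near the body joint are both discarded, so each region of the composite comes from exactly one source.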
