Make-It-Poseable: Feed-forward Latent Posing Model for 3D Humanoid Character Animation
Zhiyang Guo, Ori Zhang, Jax Xiang, Alan Zhao, Wengang Zhou, Houqiang Li
2025-12-19
Summary
This paper introduces a new way to pose 3D characters, moving away from traditional methods that directly deform the character's surface.
What's the problem?
Posing 3D characters is hard because existing techniques often produce flawed results: the predicted skinning weights (which govern how the character's 'skin' follows its bones) can be inaccurate, the geometry can become distorted, and the final pose may not conform well to the intended movement. These methods also generalize poorly across different character types, which limits how useful they are.
What's the solution?
The researchers developed a system called 'Make-It-Poseable' that works by transforming a compact, internal (latent) representation of the character – think of it as a set of instructions for building the character – instead of directly deforming the character's surface. A 'latent posing transformer' adjusts these instructions based on how the character's skeleton moves, guided by a dense pose representation for precise control. Two additional components keep the results clean: a latent-space supervision strategy that preserves geometric detail, and an adaptive completion module that handles changes in shape, such as adding or removing parts.
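To make the idea of "posing in latent space" concrete, here is a minimal NumPy sketch of the core pattern: shape tokens (the character's latent "instructions") are updated by cross-attending to pose tokens derived from the skeleton. All names, dimensions, and the single attention layer are illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np

# Hypothetical sketch: a single cross-attention step where latent shape
# tokens query pose tokens (embedded per-joint transforms). The paper's
# real latent posing transformer is a full learned model; this only
# illustrates the data flow of posing-as-latent-transformation.

rng = np.random.default_rng(0)
D = 16            # token dimension (assumed)
N_SHAPE = 32      # number of latent shape tokens (assumed)
N_JOINTS = 24     # number of skeleton joints (assumed)

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def pose_in_latent_space(shape_tokens, pose_tokens, Wq, Wk, Wv):
    """Shape tokens attend to pose tokens; the attended values are added
    back as a residual, producing the posed shape tokens."""
    Q = shape_tokens @ Wq                  # (N_SHAPE, D)
    K = pose_tokens @ Wk                   # (N_JOINTS, D)
    V = pose_tokens @ Wv                   # (N_JOINTS, D)
    attn = softmax(Q @ K.T / np.sqrt(D))   # (N_SHAPE, N_JOINTS)
    return shape_tokens + attn @ V         # residual update in latent space

# Random stand-ins for learned weights and encoded inputs.
Wq, Wk, Wv = (rng.standard_normal((D, D)) * 0.1 for _ in range(3))
shape_tokens = rng.standard_normal((N_SHAPE, D))
pose_tokens = rng.standard_normal((N_JOINTS, D))

posed_tokens = pose_in_latent_space(shape_tokens, pose_tokens, Wq, Wk, Wv)
print(posed_tokens.shape)  # (32, 16)
```

The key point the sketch captures is that no mesh vertex is ever touched: the posed character would be decoded from `posed_tokens`, rather than obtained by deforming the original surface.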
Why it matters?
This approach produces more realistic and accurate poses for 3D characters. Because it operates on the character's latent representation, it also extends naturally to tasks like editing the character's shape or swapping out parts, making it a more versatile tool for 3D content creation.
Abstract
Posing 3D characters is a fundamental task in computer graphics and vision. However, existing methods like auto-rigging and pose-conditioned generation often struggle with challenges such as inaccurate skinning weight prediction, topological imperfections, and poor pose conformance, limiting their robustness and generalizability. To overcome these limitations, we introduce Make-It-Poseable, a novel feed-forward framework that reformulates character posing as a latent-space transformation problem. Instead of deforming mesh vertices as in traditional pipelines, our method reconstructs the character in new poses by directly manipulating its latent representation. At the core of our method is a latent posing transformer that manipulates shape tokens based on skeletal motion. This process is facilitated by a dense pose representation for precise control. To ensure high-fidelity geometry and accommodate topological changes, we also introduce a latent-space supervision strategy and an adaptive completion module. Our method demonstrates superior performance in posing quality. It also naturally extends to 3D editing applications like part replacement and refinement.