MagicFace: High-Fidelity Facial Expression Editing with Action-Unit Control
Mengting Wei, Tuomas Varanka, Xingxun Jiang, Huai-Qian Khor, Guoying Zhao
2025-01-08

Summary
This paper introduces MagicFace, a new AI system that can edit facial expressions in photos while keeping the person's identity and other details the same.
What's the problem?
Changing facial expressions in photos is hard to do naturally. Current methods often alter things they shouldn't, such as the person's identity or the background, and they struggle to make small, precise changes to expressions.
What's the solution?
The researchers created MagicFace, which uses 'action units', the basic facial muscle movements that make up expressions, to control how a face is edited. It has dedicated components that keep the person's identity, background, and pose the same while only the expression changes. It uses advanced AI techniques like 'diffusion models' and 'self-attention' to make the changes look natural.
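To make 'action units' concrete: they are the numbered facial muscle movements of the Facial Action Coding System (FACS), and an edit is described by how much each one should change. The snippet below is a purely illustrative sketch of such a request; the AU codes are standard FACS names, but the dictionary format is an assumption, not MagicFace's actual interface.

```python
# Hypothetical example of an AU-based edit request: each action unit (a FACS
# facial muscle movement) is assigned a relative intensity change. The dict
# format is illustrative only, not the paper's actual API.
au_variation = {
    "AU6":  +2.0,  # cheek raiser: crinkle the eyes, as in a genuine smile
    "AU12": +3.0,  # lip corner puller: raise the mouth corners
    "AU4":  -1.0,  # brow lowerer: relax an existing frown
}
```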
Why does it matter?
This matters because it could be used in many areas like movie special effects, social media filters, or even in psychology research. It allows for more natural and precise editing of facial expressions in photos, which could lead to better visual effects, more realistic avatars in virtual reality, or tools for studying human emotions.
Abstract
We address the problem of facial expression editing by controlling the relative variation of facial action units (AUs) of the same person. This enables us to edit a specific person's expression in a fine-grained, continuous and interpretable manner, while preserving their identity, pose, background and detailed facial attributes. Key to our model, which we dub MagicFace, is a diffusion model conditioned on AU variations and an ID encoder that preserves facial details with high consistency. Specifically, to preserve the facial details of the input identity, we leverage the power of pretrained Stable-Diffusion models and design an ID encoder to merge appearance features through self-attention. To keep the background and pose consistent, we introduce an efficient Attribute Controller that explicitly informs the model of the current background and pose of the target. By injecting AU variations into a denoising UNet, our model can animate arbitrary identities with various AU combinations, yielding superior results in high-fidelity expression editing compared to other facial expression editing works. Code is publicly available at https://github.com/weimengting/MagicFace.
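Below is a minimal, hypothetical PyTorch sketch of the conditioning scheme the abstract describes: an AU-variation vector is embedded and injected into the denoising network, and identity-image features are merged into the target features through self-attention. All class names, dimensions, and the toy denoiser itself are assumptions made for illustration; the actual model builds on a pretrained Stable-Diffusion UNet, which this sketch does not reproduce.

```python
# Illustrative sketch only: AUEmbedder, IDSelfAttention, and ToyDenoiser are
# invented names standing in for the paper's components.
import torch
import torch.nn as nn

class AUEmbedder(nn.Module):
    """Map a vector of relative AU variations to a conditioning embedding."""
    def __init__(self, num_aus: int = 12, dim: int = 256):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(num_aus, dim), nn.SiLU(), nn.Linear(dim, dim)
        )

    def forward(self, au_delta: torch.Tensor) -> torch.Tensor:
        return self.mlp(au_delta)  # (B, dim)

class IDSelfAttention(nn.Module):
    """Merge identity features into target features via self-attention:
    keys/values come from the concatenation of target and identity tokens."""
    def __init__(self, dim: int = 256, heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, target_tokens: torch.Tensor,
                id_tokens: torch.Tensor) -> torch.Tensor:
        kv = torch.cat([target_tokens, id_tokens], dim=1)  # (B, N+M, dim)
        merged, _ = self.attn(target_tokens, kv, kv)
        return target_tokens + merged  # residual merge

class ToyDenoiser(nn.Module):
    """Toy stand-in for a denoising UNet: predicts noise on latent tokens,
    conditioned on the AU embedding and identity features."""
    def __init__(self, dim: int = 256):
        super().__init__()
        self.au_embed = AUEmbedder(dim=dim)
        self.id_merge = IDSelfAttention(dim=dim)
        self.out = nn.Linear(dim, dim)

    def forward(self, noisy_latents, id_features, au_delta):
        cond = self.au_embed(au_delta).unsqueeze(1)  # (B, 1, dim)
        x = noisy_latents + cond                     # inject AU condition
        x = self.id_merge(x, id_features)            # merge identity tokens
        return self.out(x)                           # predicted noise

# Usage: one denoising step with random tensors standing in for latents.
model = ToyDenoiser()
noisy = torch.randn(2, 64, 256)  # (batch, target tokens, dim)
idf = torch.randn(2, 64, 256)    # identity tokens from the ID encoder
dau = torch.randn(2, 12)         # relative AU variation vector
eps_hat = model(noisy, idf, dau)
print(eps_hat.shape)             # torch.Size([2, 64, 256])
```

The idea mirrored here is that identity preservation comes from attending over identity tokens alongside the target tokens, while the expression change enters only through the AU embedding, so the two conditions stay separable.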