StyleID: A Perception-Aware Dataset and Metric for Stylization-Agnostic Facial Identity Recognition
Kwan Yun, Changmin Lee, Ayeong Jeong, Youngseo Kim, Seungmi Lee, Junyong Noh
2026-04-24
Summary
This paper is about stylized portraits, such as a photo turned into a cartoon or painting, and about measuring whether the person in the picture remains recognizable. The main issue is that current computer systems struggle to tell whether a stylized image still shows the same person, because they are easily confused by the artistic changes.
What's the problem?
Existing computer programs that identify people in images are trained on regular photos. When you apply a style such as a cartoon filter, these programs often decide the person has changed simply because the colors or lines are different. They cannot separate changes caused by style from actual changes in identity, and there was no good way to measure how well a system preserves identity *across* different art styles applied at varying strengths.
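To make the failure mode concrete, here is a minimal sketch (not the paper's code) of how standard face verification works: embed two faces, compare them with cosine similarity, and accept if the score clears a threshold calibrated on natural photographs. The embeddings and threshold below are purely illustrative.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two identity embeddings."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def same_person(emb_a: np.ndarray, emb_b: np.ndarray, threshold: float = 0.5) -> bool:
    """Standard verification: accept if similarity exceeds a threshold
    that was calibrated on natural photographs."""
    return cosine_similarity(emb_a, emb_b) >= threshold

# Toy illustration: stylization (texture/color changes) can shift the
# embedding of the *same* person enough that the photo-calibrated
# threshold rejects the match.
photo = np.array([1.0, 0.2, 0.1])
stylized_same_person = np.array([0.3, 1.0, 0.1])
print(same_person(photo, stylized_same_person))  # prints False despite same identity
```

This is exactly the brittleness the paper targets: the threshold and the embedding space are tuned for photos, so style-induced shifts are indistinguishable from identity changes.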
What's the solution?
The researchers created two new datasets, StyleBench-H and StyleBench-S. StyleBench-H is a collection of images used to test how well humans can recognize the same person across different styles. StyleBench-S is a supervision dataset built from experiments in which people compared images and rated how strongly they recognized the person. The researchers then used StyleBench-S to fine-tune existing computer programs so that their judgments better match human perception of identity, even with different art styles applied at varying intensities.
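One way to align an encoder's similarity orderings with human recognition judgments is a pairwise ranking objective. The sketch below is a hypothetical hinge-style loss, not the paper's actual objective: for every pair of stylized images, if humans recognized one more strongly than the other, the encoder's similarity to the source photo should rank it higher by at least a margin. The function names and margin value are assumptions for illustration.

```python
import numpy as np

def perceptual_ranking_loss(sim_pred, human_strength, margin: float = 0.05) -> float:
    """Hypothetical ranking loss (illustrative, not the paper's method).

    sim_pred:       encoder similarities of each stylized image to the
                    source photo, shape (N,)
    human_strength: human-derived recognition strengths (e.g. from 2AFC
                    experiments), shape (N,)
    Penalizes pairs where the model's similarity ordering disagrees with
    the human ordering by less than the margin.
    """
    s = np.asarray(sim_pred, dtype=float)
    h = np.asarray(human_strength, dtype=float)
    diff_h = h[None, :] - h[:, None]   # [i, j] = h[j] - h[i]
    diff_s = s[None, :] - s[:, None]   # [i, j] = s[j] - s[i]
    mask = diff_h > 0                  # humans recognized j more strongly than i
    hinge = np.maximum(0.0, margin - diff_s)[mask]
    return float(hinge.mean()) if hinge.size else 0.0

# Model ordering agrees with humans -> zero loss; disagrees -> positive loss.
print(perceptual_ranking_loss([0.9, 0.2], [1.0, 0.0]))  # 0.0
print(perceptual_ranking_loss([0.2, 0.9], [1.0, 0.0]))  # 0.75
```

In a real training loop, `sim_pred` would come from the encoder being fine-tuned and gradients would flow through it; the key idea is that supervision comes from human recognition strength rather than photo-only identity labels.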
Why it matters?
This work is important because it makes stylized portrait generation more reliable. By providing a way to accurately measure and improve identity preservation, it lets artists and developers create more convincing and recognizable stylized images. The calibrated models also work well on portraits drawn by actual artists, not just computer-generated styles.
Abstract
Creative face stylization aims to render portraits in diverse visual idioms such as cartoons, sketches, and paintings while retaining recognizable identity. However, current identity encoders, which are typically trained and calibrated on natural photographs, exhibit severe brittleness under stylization. They often mistake changes in texture or color palette for identity drift, or fail to detect geometric exaggerations. This reveals the lack of a style-agnostic framework to evaluate and supervise identity consistency across varying styles and strengths. To address this gap, we introduce StyleID, a human perception-aware dataset and evaluation framework for facial identity under stylization. StyleID comprises two datasets: (i) StyleBench-H, a benchmark that captures human same-different verification judgments across diffusion- and flow-matching-based stylization at multiple style strengths, and (ii) StyleBench-S, a supervision set derived from psychometric recognition-strength curves obtained through controlled two-alternative forced-choice (2AFC) experiments. Leveraging StyleBench-S, we fine-tune existing semantic encoders to align their similarity orderings with human perception across styles and strengths. Experiments demonstrate that our calibrated models yield significantly higher correlation with human judgments and enhanced robustness for out-of-domain, artist-drawn portraits. All of our datasets, code, and pretrained models are publicly available at https://kwanyun.github.io/StyleID_page/.