VecGlypher: Unified Vector Glyph Generation with Language Models

Xiaoke Huang, Bhavul Gauri, Kam Woh Ng, Tony Ng, Mengmeng Xu, Zhiheng Liu, Weiming Ren, Zhaochong An, Zijian Zhou, Haonan Qiu, Yuyin Zhou, Sen He, Ziheng Wang, Tao Xiang, Xiao Han

2026-02-26

Summary

This paper introduces VecGlypher, a multimodal language model that generates high-quality, editable vector glyphs (font outlines in SVG) directly from text descriptions or example images.

What's the problem?

Creating digital fonts currently requires skilled designers, and most learning-based pipelines depend on carefully curated example sheets plus raster-to-vector post-processing, which limits how easily the resulting fonts can be edited or customized. Existing AI methods still need many pre-made exemplars and often fail to produce clean, editable outlines.

What's the solution?

VecGlypher is a single AI model that generates the code for vector graphics (specifically, SVG paths) token by token, conditioned on a text description, an optional reference image, and a target character. It was trained in two stages: first, a large-scale pretraining stage on roughly 39,000 noisy fonts taught it SVG syntax and long-range geometry; then it was fine-tuned on about 2,500 expert-annotated fonts with descriptive tags and exemplars, aligning language and imagery with glyph shapes. The researchers also developed data-preparation techniques (normalizing coordinate frames, canonicalizing paths, de-duplicating font families, and quantizing coordinates) so the model could learn more effectively.
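To make "emitting SVG path tokens step by step" concrete, here is a minimal sketch of how a glyph outline could be serialized into a flat token sequence with quantized absolute coordinates, ready to serve as a language-model target. The command names, the 0-255 integer grid, and the tokenization scheme are illustrative assumptions, not the paper's exact recipe.

```python
# Hypothetical sketch: serialize a glyph outline into SVG-path-style tokens
# with quantized absolute coordinates. Grid size and token format are assumed.

def quantize(v, lo=0.0, hi=1.0, bins=256):
    """Snap a coordinate in [lo, hi] onto an integer grid (0..bins-1)."""
    v = min(max(v, lo), hi)
    return round((v - lo) / (hi - lo) * (bins - 1))

def path_to_tokens(commands):
    """Flatten (command, points) pairs into tokens like ['M', '64', '191', ...]."""
    tokens = []
    for cmd, points in commands:
        tokens.append(cmd)  # 'M' move, 'L' line, 'C' cubic Bezier, 'Z' close path
        for x, y in points:
            tokens.append(str(quantize(x)))
            tokens.append(str(quantize(y)))
    return tokens

# A tiny triangle outline in a normalized [0, 1] em box:
glyph = [("M", [(0.25, 0.75)]),
         ("L", [(0.75, 0.75)]),
         ("L", [(0.5, 0.25)]),
         ("Z", [])]
print(path_to_tokens(glyph))
# → ['M', '64', '191', 'L', '191', '191', 'L', '128', '64', 'Z']
```

Because every token is either an SVG command or an absolute grid coordinate, the sequence can be decoded back into a clean, editable path without any raster intermediate, which is the property the model's autoregressive output format is designed to preserve.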

Why it matters?

VecGlypher makes it much easier for anyone to create their own fonts, even without specialized design skills. Instead of needing to be an expert, you can simply describe the font you want or provide an example image, and the AI will generate it for you. This could lead to more diverse and accessible font options and provide a foundation for new tools that help people design all sorts of graphics.

Abstract

Vector glyphs are the atomic units of digital typography, yet most learning-based pipelines still depend on carefully curated exemplar sheets and raster-to-vector postprocessing, which limits accessibility and editability. We introduce VecGlypher, a single multimodal language model that generates high-fidelity vector glyphs directly from text descriptions or image exemplars. Given a style prompt, optional reference glyph images, and a target character, VecGlypher autoregressively emits SVG path tokens, avoiding raster intermediates and producing editable, watertight outlines in one pass. A typography-aware data and training recipe makes this possible: (i) a large-scale continuation stage on 39K noisy Envato fonts to master SVG syntax and long-horizon geometry, followed by (ii) post-training on 2.5K expert-annotated Google Fonts with descriptive tags and exemplars to align language and imagery with geometry; preprocessing normalizes coordinate frames, canonicalizes paths, de-duplicates families, and quantizes coordinates for stable long-sequence decoding. On cross-family OOD evaluation, VecGlypher substantially outperforms both general-purpose LLMs and specialized vector-font baselines for text-only generation, while image-referenced generation reaches a state-of-the-art performance, with marked gains over DeepVecFont-v2 and DualVector. Ablations show that model scale and the two-stage recipe are critical and that absolute-coordinate serialization yields the best geometry. VecGlypher lowers the barrier to font creation by letting users design with words or exemplars, and provides a scalable foundation for future multimodal design tools.
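The abstract's preprocessing step of "normalizing coordinate frames" before quantization can be sketched as rescaling raw font-unit outlines into a shared unit box. The uniform-scale, translate-to-origin choice below is an assumption for illustration; the paper's actual normalization may differ.

```python
# Hypothetical sketch of coordinate-frame normalization: translate a glyph's
# control points to the origin and uniformly rescale them into a [0, 1] box,
# so that every font shares one coordinate frame before quantization.

def normalize_outline(points):
    """Map raw font-unit (x, y) points into a shared [0, 1] coordinate frame."""
    xs, ys = zip(*points)
    min_x, min_y = min(xs), min(ys)
    # Uniform scale (same factor on both axes) preserves the glyph's aspect ratio.
    scale = max(max(xs) - min_x, max(ys) - min_y) or 1.0
    return [((x - min_x) / scale, (y - min_y) / scale) for x, y in points]

# Raw control points in arbitrary font units:
raw = [(100, 200), (300, 200), (200, 600)]
print(normalize_outline(raw))
# → [(0.0, 0.0), (0.5, 0.0), (0.25, 1.0)]
```

Putting every glyph in one canonical frame is what makes a single coordinate vocabulary meaningful across 39K heterogeneous fonts, and it is one reason the quantized, absolute-coordinate serialization the ablations favor can decode stably over long sequences.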