Discrete Flow Matching

Itai Gat, Tal Remez, Neta Shaul, Felix Kreuk, Ricky T. Q. Chen, Gabriel Synnaeve, Yossi Adi, Yaron Lipman

2024-07-23

Summary

This paper introduces Discrete Flow Matching, a new method for generating discrete data, such as language, with flow models. It adapts the flow-matching framework, already successful for continuous data, to discrete sequences, with the goal of producing higher-quality outputs from a non-autoregressive generation process.

What's the problem?

While flow models and diffusion methods have been very successful for continuous data (like images and videos), they have worked far less well for high-dimensional discrete data, such as text. This is a problem because many applications, like chatbots, translation services, and code generation, rely on producing high-quality discrete outputs. Moreover, existing discrete diffusion and flow methods have lagged behind autoregressive language models in generation quality.

What's the solution?

The authors propose Discrete Flow Matching, which works with a general family of probability paths that interpolate between a source (noise) distribution and the target data distribution. They derive a generic formula for sampling along these paths using learned posteriors, such as a probability denoiser (x-prediction) or a noise predictor (epsilon-prediction), and they show that choosing specific paths with well-designed schedulers considerably improves generative perplexity over previous discrete diffusion and flow models. By scaling their models up to 1.7 billion parameters, they achieve strong results on coding benchmarks, substantially narrowing the gap between non-autoregressive discrete flow models and autoregressive models. A minimal sketch of this kind of sampler appears below.
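For intuition, here is a minimal PyTorch sketch of a discrete flow sampler of the kind described above, using the simple masking source distribution and a linear scheduler. The denoiser interface, the mask-token convention, and the per-step update rule are assumptions made for illustration, not the paper's exact algorithm (which covers a broader family of paths, schedulers, and posteriors).

import torch

MASK_ID = 0  # assumed id of the mask token in this illustration

@torch.no_grad()
def sample_dfm(denoiser, seq_len, n_steps=64, device="cpu"):
    """Draw one sequence by gradually unmasking tokens along a masking path.

    `denoiser` is a hypothetical model that, given the current noisy tokens
    x_t and time t, returns per-token logits over the vocabulary for the
    clean data x_1 (the "probability denoiser" / x-prediction posterior).
    """
    # Source distribution: every position starts as the mask token.
    x = torch.full((1, seq_len), MASK_ID, dtype=torch.long, device=device)
    ts = torch.linspace(0.0, 1.0, n_steps + 1, device=device)
    for i in range(n_steps):
        t, t_next = ts[i].item(), ts[i + 1].item()
        logits = denoiser(x, t)              # (1, seq_len, vocab_size)
        logits[..., MASK_ID] = float("-inf") # clean data is never the mask
        x1_probs = logits.softmax(dim=-1)
        # With the linear scheduler kappa_t = t, a still-masked token
        # unmasks during [t, t_next] with probability (t_next - t)/(1 - t).
        unmask_prob = (t_next - t) / max(1.0 - t, 1e-8)
        flip = (x == MASK_ID) & (torch.rand(x.shape, device=device) < unmask_prob)
        proposal = torch.distributions.Categorical(probs=x1_probs).sample()
        x = torch.where(flip, proposal, x)
    return x

# Smoke test with a stand-in "model": uniform logits over a 100-token vocabulary.
if __name__ == "__main__":
    dummy = lambda x, t: torch.zeros(x.shape[0], x.shape[1], 100)
    print(sample_dfm(dummy, seq_len=16, n_steps=32))

Note that all positions are updated in parallel at each step, which is what makes the sampler non-autoregressive; a trained model would replace the uniform stand-in.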

Why it matters?

This research is important because it addresses a significant gap in the ability of flow-based models to generate discrete data effectively. By improving how such models handle language and other discrete information, Discrete Flow Matching could benefit applications such as natural language processing and automated code generation, and it points toward non-autoregressive systems that generate text without producing it strictly one token at a time.

Abstract

Despite Flow Matching and diffusion models having emerged as powerful generative paradigms for continuous variables such as images and videos, their application to high-dimensional discrete data, such as language, is still limited. In this work, we present Discrete Flow Matching, a novel discrete flow paradigm designed specifically for generating discrete data. Discrete Flow Matching offers several key contributions: (i) it works with a general family of probability paths interpolating between source and target distributions; (ii) it allows for a generic formula for sampling from these probability paths using learned posteriors such as the probability denoiser (x-prediction) and noise-prediction (epsilon-prediction); (iii) practically, focusing on specific probability paths defined with different schedulers considerably improves generative perplexity compared to previous discrete diffusion and flow models; and (iv) by scaling Discrete Flow Matching models up to 1.7B parameters, we reach 6.7% Pass@1 and 13.4% Pass@10 on HumanEval and 6.7% Pass@1 and 20.6% Pass@10 on 1-shot MBPP coding benchmarks. Our approach is capable of generating high-quality discrete data in a non-autoregressive fashion, significantly closing the gap between autoregressive models and discrete flow models.
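For readers who want a bit more detail, the per-token probability paths the abstract refers to can be written in the factorized mixture form common to this line of work (the notation here is an assumption for illustration, not quoted from the paper):

\[
p_t\left(x^i \mid x_1\right) = \kappa_t\,\delta_{x_1^i}\!\left(x^i\right) + \left(1 - \kappa_t\right)\,p\!\left(x^i\right),
\]

where \(p\) is the source (noise) distribution over the vocabulary (for example, a point mass on a mask token or a uniform distribution), \(\delta\) is the Kronecker delta, and the scheduler \(\kappa_t\) increases from \(\kappa_0 = 0\) to \(\kappa_1 = 1\), so the path starts at pure noise and ends concentrated on the data token \(x_1^i\). Different choices of \(\kappa_t\) yield the different probability paths whose schedulers the abstract credits for the improvements in generative perplexity.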