arXiv:2603.20155

Beyond Single Tokens: Distilling Discrete Diffusion Models via Discrete MMD

Published on Mar 20 · Submitted by taesiri on Mar 23

Abstract

Discrete Moment Matching Distillation (D-MMD) enables effective distillation of discrete diffusion models by adapting continuous-domain techniques, preserving quality and diversity where previous discrete distillation methods degrade.

AI-generated summary

Distilling discrete diffusion models is currently difficult, whereas the continuous diffusion literature offers many distillation methods that can reduce sampling to a handful of steps. Our method, Discrete Moment Matching Distillation (D-MMD), adapts ideas that have been highly successful in the continuous domain. Where previous discrete distillation methods collapse, D-MMD maintains high quality and diversity (given sufficient sampling steps), as demonstrated on both text and image datasets. Moreover, the distilled generators can outperform their teachers.
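
The summary does not spell out the objective, but a moment-matching distillation loss in the discrete setting might look like the sketch below. This is an illustrative assumption, not the paper's actual formulation: it matches the first moments (per-position categorical distributions) of a few-step student against a frozen multi-step teacher at masked positions. The function name, tensor shapes, and masking convention are hypothetical.

```python
# Hypothetical sketch of a discrete moment-matching distillation loss;
# not the paper's exact objective.
import torch
import torch.nn.functional as F

def moment_matching_loss(student_logits: torch.Tensor,
                         teacher_logits: torch.Tensor,
                         mask: torch.Tensor) -> torch.Tensor:
    """Match first moments (per-token categorical distributions) of a
    few-step student against a frozen multi-step teacher.

    student_logits, teacher_logits: (batch, seq_len, vocab_size)
    mask: (batch, seq_len) bool, True at positions still masked.
    """
    p_student = F.softmax(student_logits, dim=-1)
    # Teacher is frozen during distillation, so detach its distribution.
    p_teacher = F.softmax(teacher_logits, dim=-1).detach()
    # First-moment mismatch: squared distance between the expected
    # one-hot token statistics under student and teacher.
    diff = (p_student - p_teacher).pow(2).sum(dim=-1)
    m = mask.float()
    return (diff * m).sum() / m.sum().clamp(min=1.0)
```

Higher-order moments (e.g., token co-occurrence statistics) could be matched analogously; the sketch above keeps only the first moment for brevity.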

Community

Paper submitter

Discrete Moment Matching Distillation preserves quality and diversity when distilling discrete diffusion models, enabling efficient sampling for text and image tasks and sometimes surpassing teacher models.
