Generative AI

Balancing Understanding and Generation in Discrete Diffusion Models

Yue Liu, Yuzhong Zhao, Zheyong Xie, Qixiang Ye, Jianbin Jiao, Yao Hu, Shaosheng Cao, Yunfan Liu
Published: February 1, 2026
Authors: 8
Word count: 15,349
Code: included

XDLM unifies MDLM and UDLM for superior performance.

Abstract

In discrete generative modeling, two dominant paradigms demonstrate divergent capabilities: Masked Diffusion Language Models (MDLM) excel at semantic understanding and zero-shot generalization, whereas Uniform-noise Diffusion Language Models (UDLM) achieve strong few-step generation quality, yet neither attains balanced performance across both dimensions. To address this, we propose XDLM, which bridges the two paradigms via a stationary noise kernel. XDLM offers two key contributions: (1) it provides a principled theoretical unification of MDLM and UDLM, recovering each paradigm as a special case; and (2) it alleviates the memory bottleneck through an algebraic simplification of the posterior probabilities. Experiments demonstrate that XDLM advances the Pareto frontier between understanding capability and generation quality. Quantitatively, XDLM surpasses UDLM by 5.4 points on zero-shot text benchmarks and outperforms MDLM in few-step image generation (FID 54.1 vs. 80.8). When scaled to tune an 8B-parameter large language model, XDLM achieves a score of 15.0 on MBPP in just 32 steps, effectively doubling the baseline performance. Finally, an analysis of training dynamics reveals XDLM's superior potential for long-term scaling. Code is available at https://github.com/MzeroMiko/XDLM
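The unification the abstract describes can be illustrated with a one-step forward transition matrix that mixes an absorbing ([MASK]) kernel with a uniform-noise kernel. The sketch below is a minimal illustration of that idea, not XDLM's exact parameterization: the mixing weight `lam` and the specific matrix layout are assumptions introduced here for clarity. Setting `lam=1` recovers an MDLM-style absorbing kernel, and `lam=0` a UDLM-style uniform kernel.

```python
import numpy as np

def mixture_kernel(vocab_size: int, alpha: float, lam: float) -> np.ndarray:
    """One-step forward transition matrix Q interpolating between
    masked (absorbing) and uniform discrete-diffusion noise.

    Rows index the current token, columns the next token; index
    `vocab_size` is a dedicated [MASK] token. With probability `alpha`
    a token is kept; otherwise it is corrupted, moving to [MASK] with
    probability `lam` and to a uniform random ordinary token with
    probability `1 - lam`. Note: `lam` is an illustrative mixing
    parameter, not the paper's notation.
    """
    K = vocab_size + 1  # ordinary tokens plus [MASK]
    Q = alpha * np.eye(K)
    # Uniform-noise component spreads mass over ordinary tokens.
    Q[:, :vocab_size] += (1 - alpha) * (1 - lam) / vocab_size
    # Masking component sends mass to the [MASK] column.
    Q[:, vocab_size] += (1 - alpha) * lam
    # [MASK] itself is absorbing: once masked, always masked.
    Q[vocab_size, :] = 0.0
    Q[vocab_size, vocab_size] = 1.0
    return Q
```

With `lam=1`, every corrupted token lands on [MASK] (masked diffusion); with `lam=0`, corrupted tokens are resampled uniformly and the chain never enters the [MASK] state (uniform-noise diffusion). Intermediate values trade off between the two regimes, which is the lever the paper's stationary-kernel construction formalizes.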

Key Takeaways

  1. XDLM balances understanding and generation in discrete models.

  2. A stationary noise kernel enables an efficient, scalable implementation.

  3. XDLM outperforms existing models across multiple benchmarks.

Limitations

  • Requires careful tuning of mixing ratio for optimal results.

  • Computational efficiency gains may vary with different datasets.

Keywords

Masked Diffusion Language Models, Uniform-noise Diffusion Language Models, stationary noise kernel, Pareto frontier, posterior probabilities, algebraic simplification, large language model, FID, MBPP, training dynamics
