ItDPDM: Information-Theoretic Discrete Poisson Diffusion Model

Generative modeling of non-negative, discrete data, such as symbolic music, remains challenging due to two persistent limitations in existing methods. First, many approaches rely on modeling continuous embeddings, which is suboptimal for inherently discrete data distributions. Second, most models optimize variational bounds rather than the exact data likelihood, resulting in inaccurate likelihood estimates and degraded sampling quality. While recent diffusion-based models have addressed these issues separately, we tackle them jointly. In this work, we introduce the Information-Theoretic Discrete Poisson Diffusion Model (ItDPDM), inspired by the photon arrival process, which combines exact likelihood estimation with fully discrete-state modeling. Central to our approach is an information-theoretic Poisson Reconstruction Loss (PRL) that has a provable exact relationship with the true data likelihood. ItDPDM achieves improved likelihood and sampling performance over prior discrete and continuous diffusion models on a variety of synthetic discrete datasets. Furthermore, on real-world datasets such as symbolic music and images, ItDPDM attains superior likelihood estimates and competitive generation quality, demonstrating a proof of concept for distribution-robust discrete generative modeling.
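
The abstract does not spell out the forward noising process, so the following is only an illustrative sketch of what a photon-arrival-style corruption of non-negative data could look like: each value is replaced by a Poisson count whose rate is the value scaled by a signal-to-noise parameter. The function name `poisson_channel`, the parameter `snr`, and the toy data are assumptions made for this example and are not taken from the paper.

```python
import numpy as np

def poisson_channel(x, snr, rng):
    """Hypothetical forward step: draw z ~ Poisson(snr * x) elementwise.

    x   : non-negative data (e.g. note counts or pixel intensities)
    snr : non-negative scale; larger values make z more informative about x
    rng : a numpy.random.Generator
    """
    x = np.asarray(x, dtype=np.float64)
    if np.any(x < 0):
        raise ValueError("x must be non-negative")
    # The output is integer-valued, so the corrupted state stays discrete.
    return rng.poisson(snr * x)

rng = np.random.default_rng(0)
x = np.array([0, 1, 3, 7])  # toy non-negative integer data (assumed)
for snr in (0.1, 1.0, 10.0):
    print(snr, poisson_channel(x, snr, rng))
```

At low `snr` most counts collapse to zero, while at high `snr` the counts concentrate around `snr * x`, which is the sense in which such a channel interpolates between noise and data without leaving the non-negative integers.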
@article{bhattacharya2025_2505.05082,
  title={ItDPDM: Information-Theoretic Discrete Poisson Diffusion Model},
  author={Sagnik Bhattacharya and Abhiram Gorle and Ahsan Bilal and Connor Ding and Amit Kumar Singh Yadav and Tsachy Weissman},
  journal={arXiv preprint arXiv:2505.05082},
  year={2025}
}