Diffsound: Discrete Diffusion Model for Text-to-sound Generation

20 July 2022

Dongchao Yang

Helin Wang

Dong Yu

Papers citing "Diffsound: Discrete Diffusion Model for Text-to-sound Generation"

9 / 59 papers shown

Title | Authors | Tags | Counts | Date
NoreSpeech: Knowledge Distillation based Conditional Diffusion Model for Noise-robust Expressive TTS | Dongchao Yang, Songxiang Liu, Jianwei Yu, Helin Wang, Chao Weng, Yuexian Zou | DiffM, VLM | 38 18 0 | 04 Nov 2022
Full-band General Audio Synthesis with Score-based Diffusion | Santiago Pascual, Gautam Bhattacharya, Chunghsin Yeh, Jordi Pons, Joan Serrà | DiffM | 27 33 0 | 26 Oct 2022
Visual onoma-to-wave: environmental sound synthesis from visual onomatopoeias and sound-source images | Hien Ohnaka, Shinnosuke Takamichi, Keisuke Imoto, Yuki Okamoto, Kazuki Fujii, Hiroshi Saruwatari | DiffM | 19 8 0 | 17 Oct 2022
CLIP-Diffusion-LM: Apply Diffusion Model on Image Captioning | Shi-You Xu | VLM, DiffM | 32 11 0 | 10 Oct 2022
AudioGen: Textually Guided Audio Generation | Felix Kreuk, Gabriel Synnaeve, Adam Polyak, Uriel Singer, Alexandre Défossez, Jade Copet, Devi Parikh, Yaniv Taigman, Yossi Adi | DiffM | 27 289 0 | 30 Sep 2022
Diffusion Models: A Comprehensive Survey of Methods and Applications | Ling Yang, Zhilong Zhang, Yingxia Shao, Shenda Hong, Runsheng Xu, Yue Zhao, Wentao Zhang, Tengjiao Wang, Ming-Hsuan Yang | DiffM, MedIm | 224 1,304 0 | 02 Sep 2022
Zero-Shot Text-to-Image Generation | Aditya A. Ramesh, Mikhail Pavlov, Gabriel Goh, Scott Gray, Chelsea Voss, Alec Radford, Mark Chen, Ilya Sutskever | VLM | 255 4,781 0 | 24 Feb 2021
Argmax Flows and Multinomial Diffusion: Learning Categorical Distributions | Emiel Hoogeboom, Didrik Nielsen, P. Jaini, Patrick Forré, Max Welling | DiffM | 207 394 0 | 10 Feb 2021
Image-to-Image Translation with Conditional Adversarial Networks | Phillip Isola, Jun-Yan Zhu, Tinghui Zhou, Alexei A. Efros | SSeg | 212 19,450 0 | 21 Nov 2016