2212.14518
ResGrad: Residual Denoising Diffusion Probabilistic Models for Text to Speech
30 December 2022
Zehua Chen
Yihan Wu
Yichong Leng
Jiawei Chen
Haohe Liu
Xu Tan
Yang Cui
Ke Wang
Lei He
Sheng Zhao
Jiang Bian
Danilo P. Mandic
DiffM
Papers citing
"ResGrad: Residual Denoising Diffusion Probabilistic Models for Text to Speech"
19 / 19 papers shown
Title
Utilizing Image Transforms and Diffusion Models for Generative Modeling of Short and Long Time Series
Ilan Naiman
Nimrod Berman
Itai Pemper
Idan Arbiv
Gal Fadlon
Omri Azencot
32
11
0
25 Oct 2024
MIMII-Gen: Generative Modeling Approach for Simulated Evaluation of Anomalous Sound Detection System
Harsh Purohit
Tomoya Nishida
Kota Dohi
Takashi Endo
Y. Kawaguchi
DiffM
36
0
0
27 Sep 2024
Improving Robustness of Diffusion-Based Zero-Shot Speech Synthesis via Stable Formant Generation
C. Han
Seokgi Lee
Gyuhyeon Nam
Gyeongsu Chae
DiffM
121
0
0
14 Sep 2024
Bailing-TTS: Chinese Dialectal Speech Synthesis Towards Human-like Spontaneous Representation
Xinhan Di
Jiahao Lu
Yunming Liang
Junjie Zheng
Yihua Wang
Chaofan Ding
ALM
33
1
0
01 Aug 2024
Seed-TTS: A Family of High-Quality Versatile Speech Generation Models
Philip Anastassiou
Jiawei Chen
J. Chen
Yuanzhe Chen
Zhuo Chen
...
Wenjie Zhang
Y. Zhang
Zilin Zhao
Dejian Zhong
Xiaobin Zhuang
49
77
0
04 Jun 2024
SemantiCodec: An Ultra Low Bitrate Semantic Audio Codec for General Sound
Haohe Liu
Xuenan Xu
Yi Yuan
Mengyue Wu
Wenwu Wang
Mark D. Plumbley
35
18
0
30 Apr 2024
SonicVisionLM: Playing Sound with Vision Language Models
Zhifeng Xie
Shengye Yu
Qile He
Mengtian Li
VLM
VGen
28
2
0
09 Jan 2024
Schrodinger Bridges Beat Diffusion Models on Text-to-Speech Synthesis
Zehua Chen
Guande He
Kaiwen Zheng
Xu Tan
Jun Zhu
DiffM
53
21
0
06 Dec 2023
Improving Diffusion Models for ECG Imputation with an Augmented Template Prior
Alexander Jenkins
Zehua Chen
F. Ng
Danilo P. Mandic
DiffM
29
2
0
24 Oct 2023
ReFlow-TTS: A Rectified Flow Model for High-fidelity Text-to-Speech
Wenhao Guan
Qi Su
Haodong Zhou
Shiyu Miao
Xingjia Xie
Lin Li
Q. Hong
DiffM
18
13
0
29 Sep 2023
Generating Visual Scenes from Touch
Fengyu Yang
Jiacheng Zhang
Andrew Owens
DiffM
22
24
0
26 Sep 2023
PromptTTS 2: Describing and Generating Voices with Text Prompt
Yichong Leng
Zhifang Guo
Kai Shen
Xu Tan
Zeqian Ju
...
Lei He
Xiang-Yang Li
Sheng Zhao
Tao Qin
Jiang Bian
VLM
DiffM
39
40
0
05 Sep 2023
AudioLDM 2: Learning Holistic Audio Generation with Self-supervised Pretraining
Haohe Liu
Yi Yuan
Xubo Liu
Xinhao Mei
Qiuqiang Kong
Qiao Tian
Yuping Wang
Wenwu Wang
Yuxuan Wang
Mark D. Plumbley
DiffM
25
222
0
10 Aug 2023
U-DiT TTS: U-Diffusion Vision Transformer for Text-to-Speech
Xin Jing
Yi Chang
Zijiang Yang
Jiang-jian Xie
Andreas Triantafyllopoulos
Bjoern W. Schuller
31
10
0
22 May 2023
CoMoSpeech: One-Step Speech and Singing Voice Synthesis via Consistency Model
Zhen Ye
Wei Xue
Xu Tan
Jie Chen
Qifeng Liu
Yike Guo
DiffM
30
40
0
11 May 2023
AudioLDM: Text-to-Audio Generation with Latent Diffusion Models
Haohe Liu
Zehua Chen
Yi Yuan
Xinhao Mei
Xubo Liu
Danilo P. Mandic
Wenwu Wang
Mark D. Plumbley
DiffM
33
467
0
29 Jan 2023
DiffGAN-TTS: High-Fidelity and Efficient Text-to-Speech with Denoising Diffusion GANs
Songxiang Liu
Dan Su
Dong Yu
DiffM
68
65
0
28 Jan 2022
Decoupling Magnitude and Phase Estimation with Deep ResUNet for Music Source Separation
Qiuqiang Kong
Yin Cao
Haohe Liu
Keunwoo Choi
Yuxuan Wang
118
96
0
12 Sep 2021
High Fidelity Speech Synthesis with Adversarial Networks
Mikolaj Binkowski
Jeff Donahue
Sander Dieleman
Aidan Clark
Erich Elsen
Norman Casagrande
Luis C. Cobo
Karen Simonyan
223
239
0
25 Sep 2019