2105.06337
Grad-TTS: A Diffusion Probabilistic Model for Text-to-Speech
13 May 2021
Vadim Popov
Ivan Vovk
Vladimir Gogoryan
Tasnima Sadekova
Mikhail Kudinov
DiffM
Papers citing
"Grad-TTS: A Diffusion Probabilistic Model for Text-to-Speech"
50 / 352 papers shown
Title
Multi-Speaker Multi-Lingual VQTTS System for LIMMITS 2023 Challenge
Chenpeng Du
Yiwei Guo
Feiyu Shen
Kai Yu
19
5
0
25 Apr 2023
Patch Diffusion: Faster and More Data-Efficient Training of Diffusion Models
Zhendong Wang
Yi Ding
Huangjie Zheng
Peihao Wang
Pengcheng He
Zhangyang Wang
Weizhu Chen
Mingyuan Zhou
38
97
0
25 Apr 2023
Text-to-Audio Generation using Instruction-Tuned LLM and Latent Diffusion Model
Deepanway Ghosal
Navonil Majumder
Ambuj Mehrish
Soujanya Poria
152
144
0
24 Apr 2023
DiffVoice: Text-to-Speech with Latent Diffusion
Zhijun Liu
Yiwei Guo
K. Yu
DiffM
27
22
0
23 Apr 2023
NaturalSpeech 2: Latent Diffusion Models are Natural and Zero-Shot Speech and Singing Synthesizers
Kai Shen
Zeqian Ju
Xu Tan
Yanqing Liu
Yichong Leng
Lei He
Tao Qin
Sheng Zhao
Jiang Bian
DiffM
26
224
0
18 Apr 2023
Hi Sheldon! Creating Deep Personalized Characters from TV Shows
Meidai Xuanyuan
Yuwang Wang
Honglei Guo
Xiao Ma
Yuchen Guo
Tao Yu
Qionghai Dai
VGen
25
0
0
09 Apr 2023
DSVAE: Interpretable Disentangled Representation for Synthetic Speech Detection
Amit Kumar Singh Yadav
Kratika Bhagtani
Ziyue Xiang
Paolo Bestagini
Stefano Tubaro
Edward J. Delp
DRL
34
6
0
06 Apr 2023
DiffTAD: Temporal Action Detection with Proposal Denoising Diffusion
Sauradip Nag
Xiatian Zhu
Jiankang Deng
Yi-Zhe Song
Tao Xiang
DiffM
VGen
41
21
0
27 Mar 2023
A Survey on Audio Diffusion Models: Text To Speech Synthesis and Enhancement in Generative AI
Chenshuang Zhang
Chaoning Zhang
Sheng Zheng
Mengchun Zhang
Maryam Qamar
Sung-Ho Bae
In So Kweon
DiffM
MedIm
54
64
0
23 Mar 2023
DiffusionRet: Generative Text-Video Retrieval with Diffusion Model
Peng Jin
Hao Li
Ze-Long Cheng
Kehan Li
Xiang Ji
Chang-rui Liu
Li-ming Yuan
Jie Chen
DiffM
VGen
28
54
0
17 Mar 2023
Improving 3D Imaging with Pre-Trained Perpendicular 2D Diffusion Models
Suhyeon Lee
Hyungjin Chung
Minyoung Park
Jonghyuk Park
Wi-Sun Ryu
J. C. Ye
DiffM
MedIm
17
44
0
15 Mar 2023
Editing Implicit Assumptions in Text-to-Image Diffusion Models
Hadas Orgad
Bahjat Kawar
Yonatan Belinkov
DiffM
30
86
0
14 Mar 2023
One Transformer Fits All Distributions in Multi-Modal Diffusion at Scale
Fan Bao
Shen Nie
Kaiwen Xue
Chongxuan Li
Shiliang Pu
Yaole Wang
Gang Yue
Yue Cao
Hang Su
Jun Zhu
DiffM
207
149
0
12 Mar 2023
GECCO: Geometrically-Conditioned Point Diffusion Models
M. Tyszkiewicz
Pascal Fua
Eduard Trulls
DiffM
26
21
0
10 Mar 2023
Text-to-ECG: 12-Lead Electrocardiogram Synthesis conditioned on Clinical Text Reports
Hyunseung Chung
Jiho Kim
Joon-Myoung Kwon
K. Jeon
Min Sung Lee
Edward Choi
MedIm
11
13
0
09 Mar 2023
Unifying Layout Generation with a Decoupled Diffusion Model
Mude Hui
Zhizheng Zhang
Xiaoyi Zhang
Wenxuan Xie
Yuwang Wang
Yan Lu
DiffM
18
39
0
09 Mar 2023
TRACT: Denoising Diffusion Models with Transitive Closure Time-Distillation
David Berthelot
Arnaud Autef
Jierui Lin
Dian Ang Yap
Shuangfei Zhai
Siyuan Hu
Daniel Zheng
Walter Talbot
Eric Gu
DiffM
28
80
0
07 Mar 2023
An investigation into the adaptability of a diffusion-based TTS model
Haolin Chen
Philip N. Garner
DiffM
39
1
0
03 Mar 2023
Consistency Models
Yang Song
Prafulla Dhariwal
Mark Chen
Ilya Sutskever
VLM
DiffM
25
865
0
02 Mar 2023
ClArTTS: An Open-Source Classical Arabic Text-to-Speech Corpus
Ajinkya Kulkarni
Atharva Kulkarni
Sara Shatnawi
Hanan Aldarmaki
17
8
0
28 Feb 2023
Reducing the Prior Mismatch of Stochastic Differential Equations for Diffusion-based Speech Enhancement
Bunlong Lay
Simon Welker
Julius Richter
Timo Gerkmann
DiffM
10
24
0
28 Feb 2023
Can We Use Diffusion Probabilistic Models for 3D Motion Prediction?
Hyemin Ahn
Esteve Valls Mascaro
Dongheui Lee
VGen
DiffM
16
22
0
28 Feb 2023
Imaginary Voice: Face-styled Diffusion Model for Text-to-Speech
Jiyoung Lee
Joon Son Chung
Soo-Whan Chung
DiffM
38
27
0
27 Feb 2023
Varianceflow: High-Quality and Controllable Text-to-Speech using Variance Information via Normalizing Flow
Yoonhyung Lee
Jinhyeok Yang
Kyomin Jung
22
6
0
27 Feb 2023
Star-Shaped Denoising Diffusion Probabilistic Models
Andrey Okhotin
Dmitry Molchanov
V. Arkhipkin
Grigory Bartosh
Viktor Ohanesian
Aibek Alanov
Dmitry Vetrov
DiffM
40
12
0
10 Feb 2023
Noise2Music: Text-conditioned Music Generation with Diffusion Models
Qingqing Huang
Daniel S. Park
Tao Wang
Timo I. Denk
Andy Ly
...
Jesse Engel
Quoc V. Le
William Chan
Zhifeng Chen
Wei Han
MGen
DiffM
38
190
0
08 Feb 2023
HumanMAC: Masked Motion Completion for Human Motion Prediction
Ling-Hao Chen
Jiawei Zhang
Ye-rong Li
Yiren Pang
Xiaobo Xia
Tongliang Liu
DiffM
VGen
32
56
0
07 Feb 2023
ShiftDDPMs: Exploring Conditional Diffusion Models by Shifting Diffusion Trajectories
Zijian Zhang
Zhou Zhao
Jun Yu
Qi Tian
DiffM
22
12
0
05 Feb 2023
InstructTTS: Modelling Expressive TTS in Discrete Latent Space with Natural Language Style Prompt
Dongchao Yang
Songxiang Liu
Rongjie Huang
Chao Weng
Helen Meng
DiffM
VLM
31
85
0
31 Jan 2023
AudioLDM: Text-to-Audio Generation with Latent Diffusion Models
Haohe Liu
Zehua Chen
Yi Yuan
Xinhao Mei
Xubo Liu
Danilo Mandic
Wenwu Wang
Mark D. Plumbley
DiffM
38
467
0
29 Jan 2023
Speech Driven Video Editing via an Audio-Conditioned Diffusion Model
Dan Bigioi
Shubhajit Basak
Michał Stypułkowski
Maciej Ziȩba
H. Jordan
R. Mcdonnell
Peter Corcoran
DiffM
VGen
24
34
0
10 Jan 2023
Neural Codec Language Models are Zero-Shot Text to Speech Synthesizers
Chengyi Wang
Sanyuan Chen
Yu-Huan Wu
Zi-Hua Zhang
Long Zhou
...
Huaming Wang
Jinyu Li
Lei He
Sheng Zhao
Furu Wei
48
644
0
05 Jan 2023
ResGrad: Residual Denoising Diffusion Probabilistic Models for Text to Speech
Ze Chen
Yihan Wu
Yichong Leng
Jiawei Chen
Haohe Liu
...
Ke Wang
Lei He
Sheng Zhao
Jiang Bian
Danilo Mandic
DiffM
32
22
0
30 Dec 2022
Text-to-speech synthesis based on latent variable conversion using diffusion probabilistic model and variational autoencoder
Yusuke Yasuda
T. Toda
DiffM
20
7
0
16 Dec 2022
Towards Practical Plug-and-Play Diffusion Models
Hyojun Go
Yunsung Lee
Jin-Young Kim
Seunghyun Lee
Myeongho Jeong
Hyun Seung Lee
Seungtaek Choi
DiffM
35
16
0
12 Dec 2022
How to Backdoor Diffusion Models?
Sheng-Yen Chou
Pin-Yu Chen
Tsung-Yi Ho
DiffM
SILM
19
95
0
11 Dec 2022
MoFusion: A Framework for Denoising-Diffusion-based Motion Synthesis
Rishabh Dabral
Muhammad Hamza Mughal
Vladislav Golyanik
Christian Theobalt
DiffM
VGen
32
172
0
08 Dec 2022
Analysis and Utilization of Entrainment on Acoustic and Emotion Features in User-agent Dialogue
Daxin Tan
Nikos Kargas
David McHardy
C. Papayiannis
A. Bonafonte
Marek Střelec
Jonas Rohnke
A. Filandras
Trevor Wood
6
0
0
07 Dec 2022
Denoising diffusion probabilistic models for probabilistic energy forecasting
Esteban Hernandez Capel
Jonathan Dumas
DiffM
19
15
0
06 Dec 2022
Fast Sampling of Diffusion Models via Operator Learning
Hongkai Zheng
Weili Nie
Arash Vahdat
Kamyar Azizzadenesheli
Anima Anandkumar
DiffM
65
131
0
24 Nov 2022
DiffusionDet: Diffusion Model for Object Detection
Shoufa Chen
Pei Sun
Yibing Song
Ping Luo
63
443
0
17 Nov 2022
Towards Building Text-To-Speech Systems for the Next Billion Users
Gokul Karthik Kumar
Praveen S V
Pratyush Kumar
Mitesh M. Khapra
Karthik Nandakumar
38
18
0
17 Nov 2022
EmoDiff: Intensity Controllable Emotional Text-to-Speech with Soft-Label Guidance
Yiwei Guo
Chenpeng Du
Xie Chen
K. Yu
DiffM
52
40
0
17 Nov 2022
Grad-StyleSpeech: Any-speaker Adaptive Text-to-Speech Synthesis with Diffusion Models
Minki Kang
Dong Min
Sung Ju Hwang
DiffM
25
48
0
17 Nov 2022
OverFlow: Putting flows on top of neural transducers for better TTS
Shivam Mehta
Ambika Kirkland
Harm Lameris
Jonas Beskow
Éva Székely
G. Henter
AI4TS
39
12
0
13 Nov 2022
DiffPhase: Generative Diffusion-based STFT Phase Retrieval
Tal Peer
Simon Welker
Timo Gerkmann
DiffM
22
7
0
08 Nov 2022
Guided Conditional Diffusion for Controllable Traffic Simulation
Ziyuan Zhong
Davis Rempe
Danfei Xu
Yuxiao Chen
Sushant Veer
Tong Che
Baishakhi Ray
Marco Pavone
24
147
0
31 Oct 2022
Imagic: Text-Based Real Image Editing with Diffusion Models
Bahjat Kawar
Shiran Zada
Oran Lang
Omer Tov
Hui-Tang Chang
Tali Dekel
Inbar Mosseri
Michal Irani
11
1,050
0
17 Oct 2022
Transformer-Based Speech Synthesizer Attribution in an Open Set Scenario
Emily R. Bartusiak
Edward J. Delp
19
12
0
14 Oct 2022
LION: Latent Point Diffusion Models for 3D Shape Generation
Fangyin Wei
Arash Vahdat
Francis Williams
Zan Gojcic
Or Litany
Sanja Fidler
Karsten Kreis
DiffM
70
485
0
12 Oct 2022