Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2111.11755
Cited By
v1
v2
v3
v4 (latest)
Guided-TTS: A Diffusion Model for Text-to-Speech via Classifier Guidance
23 November 2021
Heeseung Kim
Sungwon Kim
Sungroh Yoon
DiffM
BDL
Re-assign community
ArXiv (abs)
PDF
HTML
Papers citing
"Guided-TTS: A Diffusion Model for Text-to-Speech via Classifier Guidance"
50 / 50 papers shown
Title
On Memorization in Diffusion Models
Xiangming Gu
Chao Du
Tianyu Pang
Chongxuan Li
Min Lin
Ye Wang
DiffM
TDI
332
55
0
21 Feb 2025
Text2Data: Low-Resource Data Generation with Textual Control
Shiyu Wang
Yihao Feng
Tian Lan
Ning Yu
Yu Bai
Ran Xu
Han Wang
Caiming Xiong
Siyang Song
DiffM
128
0
0
03 Jan 2025
UIBDiffusion: Universal Imperceptible Backdoor Attack for Diffusion Models
Yuning Han
Bingyin Zhao
Rui Chu
Feng Luo
Biplab Sikdar
Yingjie Lao
DiffM
AAML
166
1
0
16 Dec 2024
CHAMP: Conformalized 3D Human Multi-Hypothesis Pose Estimators
Harry Zhang
Luca Carlone
3DH
213
1
0
27 May 2024
Generative Adversarial Networks
Gilad Cohen
Raja Giryes
GAN
280
30,103
0
01 Mar 2022
Text-Free Prosody-Aware Generative Spoken Language Modeling
Eugene Kharitonov
Ann Lee
Adam Polyak
Yossi Adi
Jade Copet
...
Tu Nguyen
M. Rivière
Abdel-rahman Mohamed
Emmanuel Dupoux
Wei-Ning Hsu
68
121
0
07 Sep 2021
WaveGrad 2: Iterative Refinement for Text-to-Speech Synthesis
Nanxin Chen
Yu Zhang
Heiga Zen
Ron J. Weiss
Mohammad Norouzi
Najim Dehak
William Chan
DiffM
51
88
0
17 Jun 2021
Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech
Jaehyeon Kim
Jungil Kong
Juhee Son
DRL
128
894
0
11 Jun 2021
Grad-TTS: A Diffusion Probabilistic Model for Text-to-Speech
Vadim Popov
Ivan Vovk
Vladimir Gogoryan
Tasnima Sadekova
Mikhail Kudinov
DiffM
104
537
0
13 May 2021
Diffusion Models Beat GANs on Image Synthesis
Prafulla Dhariwal
Alex Nichol
241
7,933
0
11 May 2021
AdaSpeech 2: Adaptive Text to Speech with Untranscribed Data
Yuzi Yan
Xu Tan
Bohan Li
Tao Qin
Sheng Zhao
Yuan-Chung Shen
Tie-Yan Liu
37
46
0
20 Apr 2021
Diff-TTS: A Denoising Diffusion Model for Text-to-Speech
Myeonghun Jeong
Hyeongju Kim
Sung Jun Cheon
Byoung Jin Choi
N. Kim
DiffM
59
196
0
03 Apr 2021
SC-GlowTTS: an Efficient Zero-Shot Multi-Speaker Text-To-Speech Model
Edresson Casanova
C. Shulby
Eren Golge
Nicolas Müller
F. S. Oliveira
Arnaldo Cândido Júnior
A. S. Soares
S. Aluísio
M. Ponti
54
100
0
02 Apr 2021
Diffusion Probabilistic Models for 3D Point Cloud Generation
Shitong Luo
Wei Hu
3DPC
257
747
0
02 Mar 2021
Improved Denoising Diffusion Probabilistic Models
Alex Nichol
Prafulla Dhariwal
DiffM
352
3,702
0
18 Feb 2021
VARA-TTS: Non-Autoregressive Text-to-Speech Synthesis based on Very Deep VAE with Residual Attention
Peng Liu
Yuewen Cao
Songxiang Liu
Na Hu
Guangzhi Li
Chao Weng
Dan Su
72
22
0
12 Feb 2021
Generative Spoken Language Modeling from Raw Audio
Kushal Lakhotia
Evgeny Kharitonov
Wei-Ning Hsu
Yossi Adi
Adam Polyak
...
Tu Nguyen
Jade Copet
Alexei Baevski
A. Mohamed
Emmanuel Dupoux
AuLLM
254
364
0
01 Feb 2021
Score-Based Generative Modeling through Stochastic Differential Equations
Yang Song
Jascha Narain Sohl-Dickstein
Diederik P. Kingma
Abhishek Kumar
Stefano Ermon
Ben Poole
DiffM
SyDa
344
6,551
0
26 Nov 2020
Wave-Tacotron: Spectrogram-free end-to-end text-to-speech synthesis
Ron J. Weiss
RJ Skerry-Ryan
Eric Battenberg
Soroosh Mariooryad
Diederik P. Kingma
69
101
0
06 Nov 2020
HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis
Jungil Kong
Jaehyeon Kim
Jaekyoung Bae
177
1,936
0
12 Oct 2020
Denoising Diffusion Implicit Models
Jiaming Song
Chenlin Meng
Stefano Ermon
VLM
DiffM
286
7,454
0
06 Oct 2020
DiffWave: A Versatile Diffusion Model for Audio Synthesis
Zhifeng Kong
Ming-Yu Liu
Jiaji Huang
Kexin Zhao
Bryan Catanzaro
DiffM
BDL
155
1,466
0
21 Sep 2020
WaveGrad: Estimating Gradients for Waveform Generation
Nanxin Chen
Yu Zhang
Heiga Zen
Ron J. Weiss
Mohammad Norouzi
William Chan
DiffM
BDL
84
793
0
02 Sep 2020
Unsupervised Learning For Sequence-to-sequence Text-to-speech For Low-resource Languages
Haitong Zhang
Yue Lin
45
30
0
11 Aug 2020
Denoising Diffusion Probabilistic Models
Jonathan Ho
Ajay Jain
Pieter Abbeel
DiffM
658
18,276
0
19 Jun 2020
FastSpeech 2: Fast and High-Quality End-to-End Text to Speech
Yi Ren
Chenxu Hu
Xu Tan
Tao Qin
Sheng Zhao
Zhou Zhao
Tie-Yan Liu
105
1,401
0
08 Jun 2020
End-to-End Adversarial Text-to-Speech
Jeff Donahue
Sander Dieleman
Mikolaj Binkowski
Erich Elsen
Karen Simonyan
70
186
0
05 Jun 2020
Glow-TTS: A Generative Flow for Text-to-Speech via Monotonic Alignment Search
Jaehyeon Kim
Sungwon Kim
Jungil Kong
Sungroh Yoon
100
492
0
22 May 2020
Conformer: Convolution-augmented Transformer for Speech Recognition
Anmol Gulati
James Qin
Chung-Cheng Chiu
Niki Parmar
Yu Zhang
...
Wei Han
Shibo Wang
Zhengdong Zhang
Yonghui Wu
Ruoming Pang
224
3,153
0
16 May 2020
MelGAN: Generative Adversarial Networks for Conditional Waveform Synthesis
Kundan Kumar
Rithesh Kumar
T. Boissière
L. Gestin
Wei Zhen Teoh
Jose M. R. Sotelo
A. D. Brébisson
Yoshua Bengio
Aaron Courville
GAN
165
954
0
08 Oct 2019
High Fidelity Speech Synthesis with Adversarial Networks
Mikolaj Binkowski
Jeff Donahue
Sander Dieleman
Aidan Clark
Erich Elsen
Norman Casagrande
Luis C. Cobo
Karen Simonyan
280
240
0
25 Sep 2019
NeMo: a toolkit for building AI applications using Neural Modules
Oleksii Kuchaiev
Jason Chun Lok Li
Huyen Nguyen
Oleksii Hrinchuk
Ryan Leary
...
Jack Cook
P. Castonguay
Mariya Popova
Jocelyn Huang
Jonathan M. Cohen
255
307
0
14 Sep 2019
MelNet: A Generative Model for Audio in the Frequency Domain
Sean Vasquez
M. Lewis
DiffM
69
131
0
04 Jun 2019
FloWaveNet : A Generative Flow for Raw Audio
Sungwon Kim
Sang-gil Lee
Jongyoon Song
Jaehyeon Kim
Sungroh Yoon
69
169
0
06 Nov 2018
WaveGlow: A Flow-based Generative Network for Speech Synthesis
R. Prenger
Rafael Valle
Bryan Catanzaro
153
1,035
0
31 Oct 2018
Glow: Generative Flow with Invertible 1x1 Convolutions
Diederik P. Kingma
Prafulla Dhariwal
BDL
DRL
297
3,138
0
09 Jul 2018
VoxCeleb2: Deep Speaker Recognition
Joon Son Chung
Arsha Nagrani
Andrew Zisserman
356
2,285
0
14 Jun 2018
Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis
Ye Jia
Yu Zhang
Ron J. Weiss
Quan Wang
Jonathan Shen
...
Zhiwen Chen
Patrick Nguyen
Ruoming Pang
Ignacio López Moreno
Yonghui Wu
256
834
0
12 Jun 2018
Efficient Neural Audio Synthesis
Nal Kalchbrenner
Erich Elsen
Karen Simonyan
Seb Noury
Norman Casagrande
Edward Lockhart
Florian Stimberg
Aaron van den Oord
Sander Dieleman
Koray Kavukcuoglu
91
867
0
23 Feb 2018
Neural Voice Cloning with a Few Samples
Sercan O. Arik
Jitong Chen
Kainan Peng
Ming-Yu Liu
Yanqi Zhou
63
387
0
14 Feb 2018
Adversarial Audio Synthesis
Chris Donahue
Julian McAuley
M. Puckette
GAN
143
614
0
12 Feb 2018
Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions
Jonathan Shen
Ruoming Pang
Ron J. Weiss
M. Schuster
Navdeep Jaitly
...
Yuxuan Wang
RJ Skerry-Ryan
Rif A. Saurous
Yannis Agiomyrgiannakis
Yonghui Wu
79
2,701
0
16 Dec 2017
Parallel WaveNet: Fast High-Fidelity Speech Synthesis
Aaron van den Oord
Yazhe Li
Igor Babuschkin
Karen Simonyan
Oriol Vinyals
...
Alex Graves
Helen King
T. Walters
Dan Belov
Demis Hassabis
218
859
0
28 Nov 2017
Neural Discrete Representation Learning
Aaron van den Oord
Oriol Vinyals
Koray Kavukcuoglu
BDL
SSL
OCL
228
5,061
0
02 Nov 2017
Generalized End-to-End Loss for Speaker Verification
Li Wan
Quan Wang
Alan Papir
Ignacio López Moreno
VLM
68
930
0
28 Oct 2017
Tacotron: Towards End-to-End Speech Synthesis
Yuxuan Wang
RJ Skerry-Ryan
Daisy Stanton
Yonghui Wu
Ron J. Weiss
...
Samy Bengio
Quoc V. Le
Yannis Agiomyrgiannakis
R. Clark
Rif A. Saurous
160
1,826
0
29 Mar 2017
WaveNet: A Generative Model for Raw Audio
Aaron van den Oord
Sander Dieleman
Heiga Zen
Karen Simonyan
Oriol Vinyals
Alex Graves
Nal Kalchbrenner
A. Senior
Koray Kavukcuoglu
DiffM
406
7,405
0
12 Sep 2016
U-Net: Convolutional Networks for Biomedical Image Segmentation
Olaf Ronneberger
Philipp Fischer
Thomas Brox
SSeg
3DV
1.8K
77,341
0
18 May 2015
Deep Unsupervised Learning using Nonequilibrium Thermodynamics
Jascha Narain Sohl-Dickstein
Eric A. Weiss
Niru Maheswaranathan
Surya Ganguli
SyDa
DiffM
306
7,005
0
12 Mar 2015
Auto-Encoding Variational Bayes
Diederik P. Kingma
Max Welling
BDL
452
16,923
0
20 Dec 2013
1