Guided-TTS: A Diffusion Model for Text-to-Speech via Classifier Guidance

23 November 2021
Heeseung Kim
Sungwon Kim
Sungroh Yoon
DiffM, BDL

Papers citing "Guided-TTS: A Diffusion Model for Text-to-Speech via Classifier Guidance"

50 / 50 papers shown
On Memorization in Diffusion Models
Xiangming Gu
Chao Du
Tianyu Pang
Chongxuan Li
Min Lin
Ye Wang
DiffM, TDI
332
55
0
21 Feb 2025
Text2Data: Low-Resource Data Generation with Textual Control
Shiyu Wang
Yihao Feng
Tian Lan
Ning Yu
Yu Bai
Ran Xu
Han Wang
Caiming Xiong
Siyang Song
DiffM
128
0
0
03 Jan 2025
UIBDiffusion: Universal Imperceptible Backdoor Attack for Diffusion Models
Yuning Han
Bingyin Zhao
Rui Chu
Feng Luo
Biplab Sikdar
Yingjie Lao
DiffM, AAML
166
1
0
16 Dec 2024
CHAMP: Conformalized 3D Human Multi-Hypothesis Pose Estimators
Harry Zhang
Luca Carlone
3DH
213
1
0
27 May 2024
Generative Adversarial Networks
Gilad Cohen
Raja Giryes
GAN
283
30,103
0
01 Mar 2022
Text-Free Prosody-Aware Generative Spoken Language Modeling
Eugene Kharitonov
Ann Lee
Adam Polyak
Yossi Adi
Jade Copet
...
Tu Nguyen
M. Rivière
Abdel-rahman Mohamed
Emmanuel Dupoux
Wei-Ning Hsu
68
121
0
07 Sep 2021
WaveGrad 2: Iterative Refinement for Text-to-Speech Synthesis
Nanxin Chen
Yu Zhang
Heiga Zen
Ron J. Weiss
Mohammad Norouzi
Najim Dehak
William Chan
DiffM
51
88
0
17 Jun 2021
Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech
Jaehyeon Kim
Jungil Kong
Juhee Son
DRL
128
894
0
11 Jun 2021
Grad-TTS: A Diffusion Probabilistic Model for Text-to-Speech
Vadim Popov
Ivan Vovk
Vladimir Gogoryan
Tasnima Sadekova
Mikhail Kudinov
DiffM
104
537
0
13 May 2021
Diffusion Models Beat GANs on Image Synthesis
Prafulla Dhariwal
Alex Nichol
241
7,933
0
11 May 2021
AdaSpeech 2: Adaptive Text to Speech with Untranscribed Data
Yuzi Yan
Xu Tan
Bohan Li
Tao Qin
Sheng Zhao
Yuan-Chung Shen
Tie-Yan Liu
37
46
0
20 Apr 2021
Diff-TTS: A Denoising Diffusion Model for Text-to-Speech
Myeonghun Jeong
Hyeongju Kim
Sung Jun Cheon
Byoung Jin Choi
N. Kim
DiffM
59
196
0
03 Apr 2021
SC-GlowTTS: an Efficient Zero-Shot Multi-Speaker Text-To-Speech Model
Edresson Casanova
C. Shulby
Eren Golge
Nicolas Müller
F. S. Oliveira
Arnaldo Cândido Júnior
A. S. Soares
S. Aluísio
M. Ponti
54
100
0
02 Apr 2021
Diffusion Probabilistic Models for 3D Point Cloud Generation
Shitong Luo
Wei Hu
3DPC
260
747
0
02 Mar 2021
Improved Denoising Diffusion Probabilistic Models
Alex Nichol
Prafulla Dhariwal
DiffM
352
3,702
0
18 Feb 2021
VARA-TTS: Non-Autoregressive Text-to-Speech Synthesis based on Very Deep VAE with Residual Attention
Peng Liu
Yuewen Cao
Songxiang Liu
Na Hu
Guangzhi Li
Chao Weng
Dan Su
72
22
0
12 Feb 2021
Generative Spoken Language Modeling from Raw Audio
Kushal Lakhotia
Evgeny Kharitonov
Wei-Ning Hsu
Yossi Adi
Adam Polyak
...
Tu Nguyen
Jade Copet
Alexei Baevski
A. Mohamed
Emmanuel Dupoux
AuLLM
254
364
0
01 Feb 2021
Score-Based Generative Modeling through Stochastic Differential Equations
Yang Song
Jascha Narain Sohl-Dickstein
Diederik P. Kingma
Abhishek Kumar
Stefano Ermon
Ben Poole
DiffM, SyDa
347
6,551
0
26 Nov 2020
Wave-Tacotron: Spectrogram-free end-to-end text-to-speech synthesis
Ron J. Weiss
RJ Skerry-Ryan
Eric Battenberg
Soroosh Mariooryad
Diederik P. Kingma
69
101
0
06 Nov 2020
HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis
Jungil Kong
Jaehyeon Kim
Jaekyoung Bae
179
1,936
0
12 Oct 2020
Denoising Diffusion Implicit Models
Jiaming Song
Chenlin Meng
Stefano Ermon
VLM, DiffM
286
7,454
0
06 Oct 2020
DiffWave: A Versatile Diffusion Model for Audio Synthesis
Zhifeng Kong
Ming-Yu Liu
Jiaji Huang
Kexin Zhao
Bryan Catanzaro
DiffM, BDL
155
1,466
0
21 Sep 2020
WaveGrad: Estimating Gradients for Waveform Generation
Nanxin Chen
Yu Zhang
Heiga Zen
Ron J. Weiss
Mohammad Norouzi
William Chan
DiffM, BDL
84
793
0
02 Sep 2020
Unsupervised Learning For Sequence-to-sequence Text-to-speech For Low-resource Languages
Haitong Zhang
Yue Lin
45
30
0
11 Aug 2020
Denoising Diffusion Probabilistic Models
Jonathan Ho
Ajay Jain
Pieter Abbeel
DiffM
658
18,276
0
19 Jun 2020
FastSpeech 2: Fast and High-Quality End-to-End Text to Speech
Yi Ren
Chenxu Hu
Xu Tan
Tao Qin
Sheng Zhao
Zhou Zhao
Tie-Yan Liu
105
1,401
0
08 Jun 2020
End-to-End Adversarial Text-to-Speech
Jeff Donahue
Sander Dieleman
Mikolaj Binkowski
Erich Elsen
Karen Simonyan
70
186
0
05 Jun 2020
Glow-TTS: A Generative Flow for Text-to-Speech via Monotonic Alignment Search
Jaehyeon Kim
Sungwon Kim
Jungil Kong
Sungroh Yoon
100
492
0
22 May 2020
Conformer: Convolution-augmented Transformer for Speech Recognition
Anmol Gulati
James Qin
Chung-Cheng Chiu
Niki Parmar
Yu Zhang
...
Wei Han
Shibo Wang
Zhengdong Zhang
Yonghui Wu
Ruoming Pang
227
3,153
0
16 May 2020
MelGAN: Generative Adversarial Networks for Conditional Waveform Synthesis
Kundan Kumar
Rithesh Kumar
T. Boissière
L. Gestin
Wei Zhen Teoh
Jose M. R. Sotelo
A. D. Brébisson
Yoshua Bengio
Aaron Courville
GAN
165
954
0
08 Oct 2019
High Fidelity Speech Synthesis with Adversarial Networks
Mikolaj Binkowski
Jeff Donahue
Sander Dieleman
Aidan Clark
Erich Elsen
Norman Casagrande
Luis C. Cobo
Karen Simonyan
283
240
0
25 Sep 2019
NeMo: a toolkit for building AI applications using Neural Modules
Oleksii Kuchaiev
Jason Chun Lok Li
Huyen Nguyen
Oleksii Hrinchuk
Ryan Leary
...
Jack Cook
P. Castonguay
Mariya Popova
Jocelyn Huang
Jonathan M. Cohen
255
307
0
14 Sep 2019
MelNet: A Generative Model for Audio in the Frequency Domain
Sean Vasquez
M. Lewis
DiffM
69
131
0
04 Jun 2019
FloWaveNet : A Generative Flow for Raw Audio
Sungwon Kim
Sang-gil Lee
Jongyoon Song
Jaehyeon Kim
Sungroh Yoon
69
169
0
06 Nov 2018
WaveGlow: A Flow-based Generative Network for Speech Synthesis
R. Prenger
Rafael Valle
Bryan Catanzaro
153
1,035
0
31 Oct 2018
Glow: Generative Flow with Invertible 1x1 Convolutions
Diederik P. Kingma
Prafulla Dhariwal
BDL, DRL
297
3,138
0
09 Jul 2018
VoxCeleb2: Deep Speaker Recognition
Joon Son Chung
Arsha Nagrani
Andrew Zisserman
356
2,285
0
14 Jun 2018
Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis
Ye Jia
Yu Zhang
Ron J. Weiss
Quan Wang
Jonathan Shen
...
Zhiwen Chen
Patrick Nguyen
Ruoming Pang
Ignacio López Moreno
Yonghui Wu
256
834
0
12 Jun 2018
Efficient Neural Audio Synthesis
Nal Kalchbrenner
Erich Elsen
Karen Simonyan
Seb Noury
Norman Casagrande
Edward Lockhart
Florian Stimberg
Aaron van den Oord
Sander Dieleman
Koray Kavukcuoglu
91
867
0
23 Feb 2018
Neural Voice Cloning with a Few Samples
Sercan O. Arik
Jitong Chen
Kainan Peng
Ming-Yu Liu
Yanqi Zhou
63
387
0
14 Feb 2018
Adversarial Audio Synthesis
Chris Donahue
Julian McAuley
M. Puckette
GAN
143
614
0
12 Feb 2018
Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions
Jonathan Shen
Ruoming Pang
Ron J. Weiss
M. Schuster
Navdeep Jaitly
...
Yuxuan Wang
RJ Skerry-Ryan
Rif A. Saurous
Yannis Agiomyrgiannakis
Yonghui Wu
79
2,701
0
16 Dec 2017
Parallel WaveNet: Fast High-Fidelity Speech Synthesis
Aaron van den Oord
Yazhe Li
Igor Babuschkin
Karen Simonyan
Oriol Vinyals
...
Alex Graves
Helen King
T. Walters
Dan Belov
Demis Hassabis
218
859
0
28 Nov 2017
Neural Discrete Representation Learning
Aaron van den Oord
Oriol Vinyals
Koray Kavukcuoglu
BDL, SSL, OCL
228
5,061
0
02 Nov 2017
Generalized End-to-End Loss for Speaker Verification
Li Wan
Quan Wang
Alan Papir
Ignacio López Moreno
VLM
68
930
0
28 Oct 2017
Tacotron: Towards End-to-End Speech Synthesis
Yuxuan Wang
RJ Skerry-Ryan
Daisy Stanton
Yonghui Wu
Ron J. Weiss
...
Samy Bengio
Quoc V. Le
Yannis Agiomyrgiannakis
R. Clark
Rif A. Saurous
160
1,826
0
29 Mar 2017
WaveNet: A Generative Model for Raw Audio
Aaron van den Oord
Sander Dieleman
Heiga Zen
Karen Simonyan
Oriol Vinyals
Alex Graves
Nal Kalchbrenner
A. Senior
Koray Kavukcuoglu
DiffM
406
7,405
0
12 Sep 2016
U-Net: Convolutional Networks for Biomedical Image Segmentation
Olaf Ronneberger
Philipp Fischer
Thomas Brox
SSeg, 3DV
1.8K
77,341
0
18 May 2015
Deep Unsupervised Learning using Nonequilibrium Thermodynamics
Jascha Narain Sohl-Dickstein
Eric A. Weiss
Niru Maheswaranathan
Surya Ganguli
SyDa, DiffM
306
7,005
0
12 Mar 2015
Auto-Encoding Variational Bayes
Diederik P. Kingma
Max Welling
BDL
452
16,923
0
20 Dec 2013