Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2307.10550
Cited By
SC VALL-E: Style-Controllable Zero-Shot Text to Speech Synthesizer
20 July 2023
Daegyeom Kim
Seong-soo Hong
Yong-Hoon Choi
Re-assign community
ArXiv (abs)
PDF
HTML
Papers citing
"SC VALL-E: Style-Controllable Zero-Shot Text to Speech Synthesizer"
16 / 16 papers shown
Title
FoundationTTS: Text-to-Speech for ASR Customization with Generative Language Model
Rui Xue
Yanqing Liu
Lei He
Xuejiao Tan
Linquan Liu
Ed Lin
Sheng Zhao
100
7
0
06 Mar 2023
InstructTTS: Modelling Expressive TTS in Discrete Latent Space with Natural Language Style Prompt
Dongchao Yang
Songxiang Liu
Rongjie Huang
Chao Weng
Helen Meng
DiffM
VLM
89
102
0
31 Jan 2023
Unsupervised word-level prosody tagging for controllable speech synthesis
Yiwei Guo
Chenpeng Du
Kai Yu
55
15
0
15 Feb 2022
YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for everyone
Edresson Casanova
Julian Weber
C. Shulby
Arnaldo Cândido Júnior
Eren Golge
M. Ponti
234
415
0
04 Dec 2021
Controllable Emotion Transfer For End-to-End Speech Synthesis
Tao Li
Shan Yang
Liumeng Xue
Lei Xie
77
74
0
17 Nov 2020
HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis
Jungil Kong
Jaehyeon Kim
Jaekyoung Bae
179
1,952
0
12 Oct 2020
FastPitch: Parallel Text-to-speech with Pitch Prediction
Adrian Lañcucki
96
342
0
11 Jun 2020
FastSpeech 2: Fast and High-Quality End-to-End Text to Speech
Yi Ren
Chenxu Hu
Xu Tan
Tao Qin
Sheng Zhao
Zhou Zhao
Tie-Yan Liu
105
1,411
0
08 Jun 2020
Understanding and Improving Layer Normalization
Jingjing Xu
Xu Sun
Zhiyuan Zhang
Guangxiang Zhao
Junyang Lin
FAtt
107
357
0
16 Nov 2019
Parallel WaveGAN: A fast waveform generation model based on generative adversarial networks with multi-resolution spectrogram
Ryuichi Yamamoto
Eunwoo Song
Jae-Min Kim
62
820
0
25 Oct 2019
Adversarially Trained End-to-end Korean Singing Voice Synthesis System
Juheon Lee
Hyeong-Seok Choi
Chang-Bin Jeon
Junghyun Koo
Kyogu Lee
75
78
0
06 Aug 2019
Learning latent representations for style control and transfer in end-to-end speech synthesis
Ya-Jie Zhang
Shifeng Pan
Lei He
Zhenhua Ling
BDL
SSL
DRL
66
229
0
11 Dec 2018
Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions
Jonathan Shen
Ruoming Pang
Ron J. Weiss
M. Schuster
Navdeep Jaitly
...
Yuxuan Wang
RJ Skerry-Ryan
Rif A. Saurous
Yannis Agiomyrgiannakis
Yonghui Wu
85
2,704
0
16 Dec 2017
Tacotron: Towards End-to-End Speech Synthesis
Yuxuan Wang
RJ Skerry-Ryan
Daisy Stanton
Yonghui Wu
Ron J. Weiss
...
Samy Bengio
Quoc V. Le
Yannis Agiomyrgiannakis
R. Clark
Rif A. Saurous
166
1,831
0
29 Mar 2017
WaveNet: A Generative Model for Raw Audio
Aaron van den Oord
Sander Dieleman
Heiga Zen
Karen Simonyan
Oriol Vinyals
Alex Graves
Nal Kalchbrenner
A. Senior
Koray Kavukcuoglu
DiffM
406
7,421
0
12 Sep 2016
Sequence to Sequence Learning with Neural Networks
Ilya Sutskever
Oriol Vinyals
Quoc V. Le
AIMat
448
20,606
0
10 Sep 2014
1