Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1710.07654
Cited By
Deep Voice 3: Scaling Text-to-Speech with Convolutional Sequence Learning
20 October 2017
Ming-Yu Liu
Kainan Peng
Andrew Gibiansky
Sercan Ö. Arik
Ajay Kannan
Sharan Narang
Jonathan Raiman
John Miller
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Deep Voice 3: Scaling Text-to-Speech with Convolutional Sequence Learning"
28 / 78 papers shown
Title
AlignTTS: Efficient Feed-Forward Text-to-Speech System without Explicit Alignment
Zhen Zeng
Jianzong Wang
Ning Cheng
Tian Xia
Jing Xiao
VLM
33
56
0
04 Mar 2020
Semi-Supervised Neural Architecture Search
Renqian Luo
Xu Tan
Rui Wang
Tao Qin
Enhong Chen
Tie-Yan Liu
13
88
0
24 Feb 2020
Fully-hierarchical fine-grained prosody modeling for interpretable speech synthesis
Guangzhi Sun
Yu Zhang
Ron J. Weiss
Yuanbin Cao
Heiga Zen
Yonghui Wu
16
130
0
06 Feb 2020
Generating diverse and natural text-to-speech samples using a quantized fine-grained VAE and auto-regressive prosody prior
Guangzhi Sun
Yu Zhang
Ron J. Weiss
Yuan Cao
Heiga Zen
Andrew Rosenberg
Bhuvana Ramabhadran
Yonghui Wu
DiffM
36
92
0
06 Feb 2020
Generating Synthetic Audio Data for Attention-Based Speech Recognition Systems
Nick Rossenbach
Albert Zeyer
Ralf Schluter
Hermann Ney
18
83
0
19 Dec 2019
Vision-Infused Deep Audio Inpainting
Hang Zhou
Ziwei Liu
Lingfeng Guo
Ping Luo
Dahua Lin
35
88
0
24 Oct 2019
High Fidelity Speech Synthesis with Adversarial Networks
Mikolaj Binkowski
Jeff Donahue
Sander Dieleman
Aidan Clark
Erich Elsen
Norman Casagrande
Luis C. Cobo
Karen Simonyan
243
239
0
25 Sep 2019
Unpaired Image-to-Speech Synthesis with Multimodal Information Bottleneck
Shuang Ma
Daniel J. McDuff
Yale Song
25
22
0
19 Aug 2019
Polyphone Disambiguation for Mandarin Chinese Using Conditional Neural Network with Multi-level Embedding Features
Zexin Cai
Yaogen Yang
Chuxiong Zhang
Xiaoyi Qin
Ming Li
32
26
0
03 Jul 2019
CHiVE: Varying Prosody in Speech Synthesis with a Linguistically Driven Dynamic Hierarchical Conditional Variational Network
V. Wan
Chun-an Chan
Tom Kenter
Jakub Vít
R. Clark
24
75
0
17 May 2019
Almost Unsupervised Text to Speech and Automatic Speech Recognition
Yi Ren
Xu Tan
Tao Qin
Sheng Zhao
Zhou Zhao
Tie-Yan Liu
44
101
0
13 May 2019
Probability density distillation with generative adversarial networks for high-quality parallel waveform generation
Ryuichi Yamamoto
Eunwoo Song
Jae-Min Kim
19
55
0
09 Apr 2019
Feature reinforcement with word embedding and parsing information in neural TTS
Huaiping Ming
Lei He
Haohan Guo
Frank Soong
74
15
0
03 Jan 2019
FPETS : Fully Parallel End-to-End Text-to-Speech System
Dabiao Ma
Zhiba Su
Wenxuan Wang
Yuhao Lu
24
6
0
12 Dec 2018
Activation Functions: Comparison of trends in Practice and Research for Deep Learning
S. Bodenstedt
Dominik Rivoir
A. Gachagan
S. T. Mees
22
1,269
0
08 Nov 2018
Leveraging Weakly Supervised Data to Improve End-to-End Speech-to-Text Translation
Ye Jia
Melvin Johnson
Wolfgang Macherey
Ron J. Weiss
Yuan Cao
Chung-Cheng Chiu
Naveen Ari
Stella Laurenzo
Yonghui Wu
31
159
0
05 Nov 2018
Speaking style adaptation in Text-To-Speech synthesis using Sequence-to-sequence models with attention
Bajibabu Bollepalli
Lauri Juvela
P. Alku
17
4
0
29 Oct 2018
Sequence-to-Sequence Acoustic Modeling for Voice Conversion
Jing-Xuan Zhang
Zhenhua Ling
Li-Juan Liu
Yuan Jiang
Lirong Dai
16
129
0
16 Oct 2018
Sample Efficient Adaptive Text-to-Speech
Yutian Chen
Yannis Assael
Brendan Shillingford
David Budden
Scott E. Reed
...
Ben Laurie
Çağlar Gülçehre
Aaron van den Oord
Oriol Vinyals
Nando de Freitas
35
149
0
27 Sep 2018
Fast Spectrogram Inversion using Multi-head Convolutional Neural Networks
Sercan Ö. Arik
Heewoo Jun
G. Diamos
14
107
0
20 Aug 2018
Multi-task WaveNet: A Multi-task Generative Model for Statistical Parametric Speech Synthesis without Fundamental Frequency Conditions
Yu Gu
Yongguo Kang
12
17
0
22 Jun 2018
Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis
Ye Jia
Yu Zhang
Ron J. Weiss
Quan Wang
Jonathan Shen
...
Z. Chen
Patrick Nguyen
Ruoming Pang
Ignacio López Moreno
Yonghui Wu
207
820
0
12 Jun 2018
Voice Imitating Text-to-Speech Neural Networks
Younggun Lee
Taesu Kim
Soo-Young Lee
29
11
0
04 Jun 2018
A Universal Music Translation Network
Noam Mor
Lior Wolf
Adam Polyak
Yaniv Taigman
22
110
0
21 May 2018
Collapsed speech segment detection and suppression for WaveNet vocoder
Yi-Chiao Wu
Kazuhiro Kobayashi
Tomoki Hayashi
Patrick Lumban Tobing
T. Toda
12
25
0
30 Apr 2018
Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis
Yuxuan Wang
Daisy Stanton
Yu Zhang
RJ Skerry-Ryan
Eric Battenberg
Joel Shor
Y. Xiao
Fei Ren
Ye Jia
Rif A. Saurous
26
815
0
23 Mar 2018
Fitting New Speakers Based on a Short Untranscribed Sample
Eliya Nachmani
Adam Polyak
Yaniv Taigman
Lior Wolf
24
84
0
20 Feb 2018
Adversarial Audio Synthesis
Chris Donahue
Julian McAuley
M. Puckette
GAN
45
604
0
12 Feb 2018
Previous
1
2