arXiv:1703.10135
Tacotron: Towards End-to-End Speech Synthesis
29 March 2017
Yuxuan Wang
RJ Skerry-Ryan
Daisy Stanton
Yonghui Wu
Ron J. Weiss
Navdeep Jaitly
Zongheng Yang
Y. Xiao
Z. Chen
Samy Bengio
Quoc V. Le
Yannis Agiomyrgiannakis
R. Clark
Rif A. Saurous
Papers citing
"Tacotron: Towards End-to-End Speech Synthesis"
50 / 817 papers shown
Title
MaSS: A Large and Clean Multilingual Corpus of Sentence-aligned Spoken Utterances Extracted from the Bible
Marcely Zanon Boito
William N. Havard
Mahault Garnerin
Éric Le Ferrand
Laurent Besacier
32
47
0
30 Jul 2019
DNN-based Speaker Embedding Using Subjective Inter-speaker Similarity for Multi-speaker Modeling in Speech Synthesis
Yuki Saito
Shinnosuke Takamichi
Hiroshi Saruwatari
8
10
0
19 Jul 2019
Forward-Backward Decoding for Regularizing End-to-End TTS
Yibin Zheng
Xi Wang
Lei He
Shifeng Pan
Frank Soong
Zhengqi Wen
J. Tao
17
13
0
18 Jul 2019
Hierarchical Sequence to Sequence Voice Conversion with Limited Data
P. Narayanan
Punarjay Chakravarty
F. Charette
G. Puskorius
23
3
0
15 Jul 2019
Multi-Speaker End-to-End Speech Synthesis
Jihyun Park
Kexin Zhao
Kainan Peng
Wei Ping
SyDa
14
19
0
09 Jul 2019
A Methodology for Controlling the Emotional Expressiveness in Synthetic Speech -- a Deep Learning approach
Noé Tits
16
10
0
05 Jul 2019
Fine-grained robust prosody transfer for single-speaker neural text-to-speech
V. Klimkov
S. Ronanki
Jonas Rohnke
Thomas Drugman
AI4TS
16
82
0
04 Jul 2019
Quasi-Periodic WaveNet Vocoder: A Pitch Dependent Dilated Convolution Model for Parametric Speech Generation
Yi-Chiao Wu
Tomoki Hayashi
Patrick Lumban Tobing
Kazuhiro Kobayashi
T. Toda
21
16
0
01 Jul 2019
RUSLAN: Russian Spoken Language Corpus for Speech Synthesis
Lenar Gabdrakhmanov
Rustem Garaev
E. Razinkov
23
9
0
26 Jun 2019
End-to-End Emotional Speech Synthesis Using Style Tokens and Semi-Supervised Training
Peng Wu
Zhenhua Ling
Li-Juan Liu
Yuan Jiang
Hong-Chuan Wu
Lirong Dai
8
72
0
26 Jun 2019
Non-Parallel Sequence-to-Sequence Voice Conversion with Disentangled Linguistic and Speaker Representations
Jing-Xuan Zhang
Zhenhua Ling
Lirong Dai
22
99
0
25 Jun 2019
Singing Voice Synthesis Using Deep Autoregressive Neural Networks for Acoustic Modeling
Yuanhao Yi
Yang Ai
Zhenhua Ling
Lirong Dai
13
33
0
21 Jun 2019
A Unified Speaker Adaptation Method for Speech Synthesis using Transcribed and Untranscribed Speech with Backpropagation
Hieu-Thi Luong
Junichi Yamagishi
40
10
0
18 Jun 2019
Towards Transfer Learning for End-to-End Speech Synthesis from Deep Pre-Trained Language Models
Wei Fang
Yu-An Chung
James R. Glass
15
27
0
17 Jun 2019
Parametric Resynthesis with neural vocoders
Soumi Maiti
Michael I. Mandel
14
19
0
16 Jun 2019
Telephonetic: Making Neural Language Models Robust to ASR and Semantic Noise
Christopher Larson
Tarek Lahlou
Diana Mingels
Zachary Kulis
Erik T. Mueller
14
2
0
13 Jun 2019
Using generative modelling to produce varied intonation for speech synthesis
Zack Hodari
O. Watts
Simon King
29
29
0
10 Jun 2019
Effective Use of Variational Embedding Capacity in Expressive End-to-End Speech Synthesis
Eric Battenberg
Soroosh Mariooryad
Daisy Stanton
RJ Skerry-Ryan
Matt Shannon
David Kao
Tom Bagby
BDL
19
45
0
08 Jun 2019
Survey on Publicly Available Sinhala Natural Language Processing Tools and Research
Nisansa de Silva
30
43
0
05 Jun 2019
KERMIT: Generative Insertion-Based Modeling for Sequences
William Chan
Nikita Kitaev
Kelvin Guu
Mitchell Stern
Jakob Uszkoreit
VLM
23
65
0
04 Jun 2019
MelNet: A Generative Model for Audio in the Frequency Domain
Sean Vasquez
M. Lewis
DiffM
24
131
0
04 Jun 2019
Problem-Agnostic Speech Embeddings for Multi-Speaker Text-to-Speech with SampleRNN
David Álvarez
Santiago Pascual
Antonio Bonafonte
16
12
0
03 Jun 2019
Robust Sequence-to-Sequence Acoustic Modeling with Stepwise Monotonic Attention for Neural TTS
Mutian He
Yan Deng
Lei He
12
81
0
03 Jun 2019
Listening while Speaking and Visualizing: Improving ASR through Multimodal Chain
Johanes Effendi
Andros Tjandra
S. Sakti
Satoshi Nakamura
24
3
0
03 Jun 2019
SignalTrain: Profiling Audio Compressors with Deep Neural Networks
Scott H. Hawley
Benjamin Colburn
S. I. Mimilakis
14
12
0
28 May 2019
Unsupervised End-to-End Learning of Discrete Linguistic Units for Voice Conversion
Andy T. Liu
Po-Chun Hsu
Hung-yi Lee
SSL
25
29
0
28 May 2019
Effective parameter estimation methods for an ExcitNet model in generative text-to-speech systems
Ohsung Kwon
Eunwoo Song
Jae-Min Kim
Hong-Goo Kang
11
4
0
21 May 2019
Non-Autoregressive Neural Text-to-Speech
Kainan Peng
Wei Ping
Z. Song
Kexin Zhao
29
39
0
21 May 2019
CHiVE: Varying Prosody in Speech Synthesis with a Linguistically Driven Dynamic Hierarchical Conditional Variational Network
V. Wan
Chun-an Chan
Tom Kenter
Jakub Vít
R. Clark
21
75
0
17 May 2019
MoGlow: Probabilistic and controllable motion synthesis using normalising flows
G. Henter
Simon Alexanderson
Jonas Beskow
39
97
0
16 May 2019
Almost Unsupervised Text to Speech and Automatic Speech Recognition
Yi Ren
Xu Tan
Tao Qin
Sheng Zhao
Zhou Zhao
Tie-Yan Liu
44
101
0
13 May 2019
Adversarially Trained Autoencoders for Parallel-Data-Free Voice Conversion
Orhan Ocal
Oguz H. Elibol
Gokce Keskin
Cory Stephenson
Anil Thomas
Kannan Ramchandran
26
10
0
09 May 2019
Deep Learning for Audio Signal Processing
Hendrik Purwins
Bo-wen Li
Tuomas Virtanen
Jan Schlüter
Shuo-yiin Chang
Tara N. Sainath
VLM
24
586
0
30 Apr 2019
End-to-End Spoken Language Translation
Michelle Guo
Albert Haque
Prateek Verma
14
8
0
23 Apr 2019
Expediting TTS Synthesis with Adversarial Vocoding
Paarth Neekhara
Chris Donahue
M. Puckette
Shlomo Dubnov
Julian McAuley
6
20
0
16 Apr 2019
End-to-end Text-to-speech for Low-resource Languages by Cross-Lingual Transfer Learning
Tao Tu
Yuan-Jui Chen
Cheng-chieh Yeh
Hung-yi Lee
14
87
0
13 Apr 2019
RNN-based speech synthesis using a continuous sinusoidal model
M. S. Al-Radhi
T. Csapó
Géza Németh
14
4
0
12 Apr 2019
Building a mixed-lingual neural TTS system with only monolingual data
Liumeng Xue
Wei Song
Guanghui Xu
Lei Xie
Zhizheng Wu
17
30
0
12 Apr 2019
Direct speech-to-speech translation with a sequence-to-sequence model
Ye Jia
Ron J. Weiss
Fadi Biadsy
Wolfgang Macherey
Melvin Johnson
Z. Chen
Yonghui Wu
21
223
0
12 Apr 2019
A New GAN-based End-to-End TTS Training Algorithm
Haohan Guo
Frank Soong
Lei He
Lei Xie
26
47
0
09 Apr 2019
Exploiting Syntactic Features in a Parsed Tree to Improve End-to-End TTS
Haohan Guo
Frank Soong
Lei He
Lei Xie
16
30
0
09 Apr 2019
Parrotron: An End-to-End Speech-to-Speech Conversion Model and its Applications to Hearing-Impaired Speech and Speech Separation
Fadi Biadsy
Ron J. Weiss
Pedro J. Moreno
D. Kanevsky
Ye Jia
21
112
0
08 Apr 2019
GELP: GAN-Excited Linear Prediction for Speech Synthesis from Mel-spectrogram
Lauri Juvela
Bajibabu Bollepalli
Junichi Yamagishi
P. Alku
11
18
0
08 Apr 2019
Taco-VC: A Single Speaker Tacotron based Voice Conversion with Limited Data
Roee Levy Leshem
Raja Giryes
8
8
0
06 Apr 2019
An Unsupervised Autoregressive Model for Speech Representation Learning
Yu-An Chung
Wei-Ning Hsu
Hao Tang
James R. Glass
SSL
24
407
0
05 Apr 2019
Attention-Augmented End-to-End Multi-Task Learning for Emotion Prediction from Speech
Zixing Zhang
Bingwen Wu
Bjoern Schuller
19
83
0
29 Mar 2019
Joint training framework for text-to-speech and voice conversion using multi-source Tacotron and WaveNet
Mingyang Zhang
Xin Wang
Fuming Fang
Haizhou Li
Junichi Yamagishi
6
49
0
29 Mar 2019
Visualization and Interpretation of Latent Spaces for Controlling Expressive Speech Synthesis through Audio Analysis
Noé Tits
Fengna Wang
Kevin El Haddad
Vincent Pagel
Thierry Dutoit
DiffM
15
39
0
27 Mar 2019
CSS10: A Collection of Single Speaker Speech Datasets for 10 Languages
Kyubyong Park
Thomas Mulc
14
100
0
27 Mar 2019
GANSynth: Adversarial Neural Audio Synthesis
Jesse Engel
Kumar Krishna Agrawal
Shuo Chen
Ishaan Gulrajani
Chris Donahue
Adam Roberts
49
385
0
23 Feb 2019