Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1703.10135
Cited By
Tacotron: Towards End-to-End Speech Synthesis
29 March 2017
Yuxuan Wang
RJ Skerry-Ryan
Daisy Stanton
Yonghui Wu
Ron J. Weiss
Navdeep Jaitly
Zongheng Yang
Y. Xiao
Zhehuai Chen
Samy Bengio
Quoc V. Le
Yannis Agiomyrgiannakis
R. Clark
Rif A. Saurous
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Tacotron: Towards End-to-End Speech Synthesis"
50 / 817 papers shown
Title
Pretraining Techniques for Sequence-to-Sequence Voice Conversion
Wen-Chin Huang
Tomoki Hayashi
Yi-Chiao Wu
Hirokazu Kameoka
Tomoki Toda
27
38
0
07 Aug 2020
Peking Opera Synthesis via Duration Informed Attention Network
Yusong Wu
Shengchen Li
Chengzhu Yu
Heng Lu
Chao Weng
Liqiang Zhang
Dong Yu
16
10
0
07 Aug 2020
DurIAN-SC: Duration Informed Attention Network based Singing Voice Conversion System
Liqiang Zhang
Chengzhu Yu
Heng Lu
Chao Weng
Chunlei Zhang
Yusong Wu
Xiang Xie
Zijin Li
Dong Yu
30
34
0
07 Aug 2020
HooliGAN: Robust, High Quality Neural Vocoding
Ollie McCarthy
Zo Ahmed
24
14
0
06 Aug 2020
PPSpeech: Phrase based Parallel End-to-End TTS System
Yahuan Cong
Ran Zhang
Jian Luan
34
3
0
06 Aug 2020
Recognition-Synthesis Based Non-Parallel Voice Conversion with Adversarial Learning
Jing-Xuan Zhang
Zhenhua Ling
Lirong Dai
15
6
0
05 Aug 2020
Audiovisual Speech Synthesis using Tacotron2
Ahmed Hussen Abdelaziz
Anushree Prasanna Kumar
Chloe Seivwright
Gabriele Fanelli
Justin Binder
Y. Stylianou
S. Kajarekar
20
15
0
03 Aug 2020
Exploiting Deep Sentential Context for Expressive End-to-End Speech Synthesis
Fengyu Yang
Shan Yang
Qinghua Wu
Yujun Wang
Lei Xie
39
5
0
03 Aug 2020
Speaking Speed Control of End-to-End Speech Synthesis using Sentence-Level Conditioning
Jaesung Bae
Hanbin Bae
Young-Sun Joo
Junmo Lee
Gyeong-Hoon Lee
Hoon-Young Cho
16
17
0
30 Jul 2020
Xiaomingbot: A Multilingual Robot News Reporter
Runxin Xu
Jun Cao
Mingxuan Wang
Jiaze Chen
Hao Zhou
...
Xiang Yin
Xijin Zhang
Songcheng Jiang
Yuxuan Wang
Lei Li
23
11
0
12 Jul 2020
LSTM and GPT-2 Synthetic Speech Transfer Learning for Speaker Recognition to Overcome Data Scarcity
Jordan J. Bird
Diego Resende Faria
Anikó Ekárt
C. Premebida
Pedro P. S. Ayrosa
25
5
0
01 Jul 2020
Prosodic Prominence and Boundaries in Sequence-to-Sequence Speech Synthesis
Antti Suni
Sofoklis Kakouros
M. Vainio
J. Šimko
19
17
0
29 Jun 2020
Gamma Boltzmann Machine for Simultaneously Modeling Linear- and Log-amplitude Spectra
Toru Nakashika
Kohei Yatabe
13
0
0
24 Jun 2020
Audeo: Audio Generation for a Silent Performance Video
Kun Su
Xiulong Liu
Eli Shlizerman
VGen
33
67
0
23 Jun 2020
Articulatory-WaveNet: Autoregressive Model For Acoustic-to-Articulatory Inversion
Narjes Bozorg
Michael T.Johnson
18
1
0
22 Jun 2020
Embodied Self-supervised Learning by Coordinated Sampling and Training
Yifan Sun
Xihong Wu
SSL
30
7
0
20 Jun 2020
SE-MelGAN -- Speaker Agnostic Rapid Speech Enhancement
Luka Chkhetiani
Levan Bejanidze
25
1
0
13 Jun 2020
Neural voice cloning with a few low-quality samples
Sunghee Jung
Hoi-Rim Kim
33
2
0
12 Jun 2020
Deep generative models for musical audio synthesis
M. Huzaifah
L. Wyse
32
20
0
10 Jun 2020
MultiSpeech: Multi-Speaker Text to Speech with Transformer
Mingjian Chen
Xu Tan
Yi Ren
Jin Xu
Hao Sun
Sheng Zhao
Tao Qin
Tie-Yan Liu
29
109
0
08 Jun 2020
FastSpeech 2: Fast and High-Quality End-to-End Text to Speech
Yi Ren
Chenxu Hu
Xu Tan
Tao Qin
Sheng Zhao
Zhou Zhao
Tie-Yan Liu
60
1,362
0
08 Jun 2020
End-to-End Adversarial Text-to-Speech
Jeff Donahue
Sander Dieleman
Mikolaj Binkowski
Erich Elsen
Karen Simonyan
22
186
0
05 Jun 2020
PJS: phoneme-balanced Japanese singing voice corpus
Junya Koguchi
Shinnosuke Takamichi
20
22
0
04 Jun 2020
An ASR Guided Speech Intelligibility Measure for TTS Model Selection
Arun Baby
Saranya Vinnaitherthan
Nagaraj Adiga
Pranav Jawale
Sumukh Badam
Sharath Adavanne
Srikanth Konjeti
9
7
0
02 Jun 2020
High-Fidelity Audio Generation and Representation Learning with Guided Adversarial Autoencoder
Kazi Nazmul Haque
R. Rana
Björn W Schuller
DRL
31
12
0
01 Jun 2020
DeepSonar: Towards Effective and Robust Detection of AI-Synthesized Fake Voices
Run Wang
Felix Juefei Xu
Yihao Huang
Qing Guo
Xiaofei Xie
Lei Ma
Yang Liu
AAML
32
105
0
28 May 2020
NAUTILUS: a Versatile Voice Cloning System
Hieu-Thi Luong
Junichi Yamagishi
28
51
0
22 May 2020
Conversational End-to-End TTS for Voice Agent
Haohan Guo
Shaofei Zhang
Frank Soong
Lei He
Lei Xie
34
67
0
21 May 2020
Investigation of learning abilities on linguistic features in sequence-to-sequence text-to-speech synthesis
Yusuke Yasuda
Xin Wang
Junichi Yamagishi
AI4TS
22
31
0
20 May 2020
Improving Accent Conversion with Reference Encoder and End-To-End Text-To-Speech
Wenjie Li
Benlai Tang
Xiang Yin
Yushi Zhao
Wei Li
Kang Wang
Hao Huang
Yuxuan Wang
Zejun Ma
14
13
0
19 May 2020
Deep Architecture Enhancing Robustness to Noise, Adversarial Attacks, and Cross-corpus Setting for Speech Emotion Recognition
S. Latif
R. Rana
Sara Khalifa
Raja Jurdak
Björn W. Schuller
46
28
0
18 May 2020
Many-to-Many Voice Transformer Network
Hirokazu Kameoka
Wen-Chin Huang
Kou Tanaka
Takuhiro Kaneko
Nobukatsu Hojo
Tomoki Toda
ViT
30
30
0
18 May 2020
Semi-supervised Learning for Multi-speaker Text-to-speech Synthesis Using Discrete Speech Representation
Tao Tu
Yuan-Jui Chen
Alexander H. Liu
Hung-yi Lee
33
7
0
16 May 2020
JDI-T: Jointly trained Duration Informed Transformer for Text-To-Speech without Explicit Alignment
D. Lim
Won Jang
Gyeonghwan O
Heayoung Park
Bongwan Kim
Jaesam Yoon
27
36
0
15 May 2020
You Do Not Need More Data: Improving End-To-End Speech Recognition by Text-To-Speech Data Augmentation
A. Laptev
Roman Korostik
A. Svischev
A. Andrusenko
Ivan Medennikov
S. Rybin
16
61
0
14 May 2020
Flowtron: an Autoregressive Flow-based Generative Network for Text-to-Speech Synthesis
Rafael Valle
Kevin J. Shih
R. Prenger
Bryan Catanzaro
28
119
0
12 May 2020
AdaDurIAN: Few-shot Adaptation for Neural Text-to-Speech with DurIAN
Zewang Zhang
Qiao Tian
Heng Lu
Ling-Hao Chen
Shan Liu
9
27
0
12 May 2020
DiscreTalk: Text-to-Speech as a Machine Translation Problem
Tomoki Hayashi
Shinji Watanabe
27
32
0
12 May 2020
CopyCat: Many-to-Many Fine-Grained Prosody Transfer for Neural Text-to-Speech
S. Karlapati
Alexis Moinet
Arnaud Joly
V. Klimkov
Daniel Sáez-Trigueros
Thomas Drugman
19
67
0
30 Apr 2020
Adversarial Feature Learning and Unsupervised Clustering based Speech Synthesis for Found Data with Acoustic and Textual Noise
Shan Yang
Yuxuan Wang
Lei Xie
21
9
0
28 Apr 2020
ByteSing: A Chinese Singing Voice Synthesis System Using Duration Allocated Encoder-Decoder Acoustic Models and WaveRNN Vocoders
Yu Gu
Xiang Yin
Yonghui Rao
Yuan Wan
Benlai Tang
Yang Zhang
Jitong Chen
Yuxuan Wang
Zejun Ma
28
70
0
23 Apr 2020
Utterance-level Sequential Modeling For Deep Gaussian Process Based Speech Synthesis Using Simple Recurrent Unit
Tomoki Koriyama
Hiroshi Saruwatari
BDL
26
5
0
22 Apr 2020
Transformer based Grapheme-to-Phoneme Conversion
Sevinj Yolchuyeva
Géza Németh
Bálint Gyires-Tóth
35
63
0
14 Apr 2020
Generating Multilingual Voices Using Speaker Space Translation Based on Bilingual Speaker Data
Soumi Maiti
Erik Marchi
Alistair Conkie
19
17
0
10 Apr 2020
Advancing Speech Synthesis using EEG
G. Krishna
Co Tran
Mason Carnahan
Ahmed H. Tewfik
27
11
0
09 Apr 2020
Emotional Video to Audio Transformation Using Deep Recurrent Neural Networks and a Neuro-Fuzzy System
Gwenaelle Cunha Sergio
Minho Lee
9
8
0
05 Apr 2020
Caption Generation of Robot Behaviors based on Unsupervised Learning of Action Segments
Koichiro Yoshino
Kohei Wakimoto
Yuta Nishimura
Satoshi Nakamura
14
8
0
23 Mar 2020
AlignTTS: Efficient Feed-Forward Text-to-Speech System without Explicit Alignment
Zhen Zeng
Jianzong Wang
Ning Cheng
Tian Xia
Jing Xiao
VLM
33
56
0
04 Mar 2020
GraphTTS: graph-to-sequence modelling in neural text-to-speech
Aolan Sun
Jianzong Wang
Ning Cheng
Huayi Peng
Zhen Zeng
Jing Xiao
24
21
0
04 Mar 2020
Semi-Supervised Neural Architecture Search
Renqian Luo
Xu Tan
Rui Wang
Tao Qin
Enhong Chen
Tie-Yan Liu
13
88
0
24 Feb 2020
Previous
1
2
3
...
12
13
14
15
16
17
Next