Speaking Speed Control of End-to-End Speech Synthesis using
Sentence-Level Conditioning

v1v2 (latest)

Speaking Speed Control of End-to-End Speech Synthesis using Sentence-Level Conditioning

30 July 2020

Gyeong-Hoon Lee

ArXiv (abs)PDF HTML

Papers citing "Speaking Speed Control of End-to-End Speech Synthesis using Sentence-Level Conditioning"

12 / 12 papers shown

Title
Fine-grained robust prosody transfer for single-speaker neural text-to-speech V. Klimkov S. Ronanki Jonas Rohnke Thomas Drugman AI4TS 69 82 0 04 Jul 2019
Robust and fine-grained prosody control of end-to-end speech synthesis Younggun Lee Jonathan Le Roux 59 147 0 06 Nov 2018
WaveGlow: A Flow-based Generative Network for Speech Synthesis R. Prenger Rafael Valle Bryan Catanzaro 155 1,036 0 31 Oct 2018
ESPnet: End-to-End Speech Processing Toolkit Shinji Watanabe Takaaki Hori Shigeki Karita Tomoki Hayashi Jiro Nishitoba ... Jahn Heymann Sanjeev Khudanpur Nanxin Chen Adithya Renduchintala Tsubasa Ochiai VLM 112 1,513 0 30 Mar 2018
Towards End-to-End Prosody Transfer for Expressive Speech Synthesis with Tacotron RJ Skerry-Ryan Eric Battenberg Y. Xiao Yuxuan Wang Daisy Stanton Joel Shor Ron J. Weiss R. Clark Rif A. Saurous 56 555 0 24 Mar 2018
Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis Yuxuan Wang Daisy Stanton Yu Zhang RJ Skerry-Ryan Eric Battenberg Joel Shor Y. Xiao Fei Ren Ye Jia Rif A. Saurous 66 827 0 23 Mar 2018
Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions Jonathan Shen Ruoming Pang Ron J. Weiss M. Schuster Navdeep Jaitly ... Yuxuan Wang RJ Skerry-Ryan Rif A. Saurous Yannis Agiomyrgiannakis Yonghui Wu 79 2,701 0 16 Dec 2017
Efficiently Trainable Text-to-Speech System Based on Deep Convolutional Networks with Guided Attention Hideyuki Tachibana Katsuya Uenoyama Shunsuke Aihara 55 266 0 24 Oct 2017
Deep Voice 3: Scaling Text-to-Speech with Convolutional Sequence Learning Ming-Yu Liu Kainan Peng Andrew Gibiansky Sercan O. Arik Ajay Kannan Sharan Narang Jonathan Raiman John Miller 72 308 0 20 Oct 2017
Tacotron: Towards End-to-End Speech Synthesis Yuxuan Wang RJ Skerry-Ryan Daisy Stanton Yonghui Wu Ron J. Weiss ... Samy Bengio Quoc V. Le Yannis Agiomyrgiannakis R. Clark Rif A. Saurous 163 1,828 0 29 Mar 2017
Deep Voice: Real-time Neural Text-to-Speech Sercan O. Arik Mike Chrzanowski Adam Coates G. Diamos Andrew Gibiansky ... John Miller Andrew Ng Jonathan Raiman Shubho Sengupta Mohammad Shoeybi 90 616 0 25 Feb 2017
Adam: A Method for Stochastic Optimization Diederik P. Kingma Jimmy Ba ODL 2.0K 150,312 0 22 Dec 2014