Initial investigation of an encoder-decoder end-to-end TTS framework
using marginalization of monotonic hard latent alignments

30 August 2019

Xin Wang

Junichi Yamagishi

Papers citing "Initial investigation of an encoder-decoder end-to-end TTS framework using marginalization of monotonic hard latent alignments"

Title | Authors | Tag and counts (as listed) | Date
Investigation of enhanced Tacotron text-to-speech synthesis systems with self-attention for pitch accent language | Yusuke Yasuda, Xin Wang, Shinji Takaki, Junichi Yamagishi | 55 87 0 | 29 Oct 2018
Neural Speech Synthesis with Transformer Network | Naihan Li, Shujie Liu, Yanqing Liu, Sheng Zhao, Ming-Yuan Liu, M. Zhou | 48 102 0 | 19 Sep 2018
Investigating accuracy of pitch-accent annotations in neural network-based speech synthesis and denoising effects | Hieu-Thi Luong, Xin Wang, Junichi Yamagishi, Nobuyuki Nishizawa | 42 16 0 | 02 Aug 2018
Forward Attention in Sequence-to-sequence Acoustic Modelling for Speech Synthesis | Jing-Xuan Zhang, Zhenhua Ling, Lirong Dai | 48 83 0 | 18 Jul 2018
Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions | Jonathan Shen, Ruoming Pang, Ron J. Weiss, M. Schuster, Navdeep Jaitly, ..., Yuxuan Wang, RJ Skerry-Ryan, Rif A. Saurous, Yannis Agiomyrgiannakis, Yonghui Wu | 79 2,701 0 | 16 Dec 2017
Efficiently Trainable Text-to-Speech System Based on Deep Convolutional Networks with Guided Attention | Hideyuki Tachibana, Katsuya Uenoyama, Shunsuke Aihara | 55 266 0 | 24 Oct 2017
Attention Is All You Need | Ashish Vaswani, Noam M. Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan Gomez, Lukasz Kaiser, Illia Polosukhin | 3DV 716 132,199 0 | 12 Jun 2017
Online and Linear-Time Attention by Enforcing Monotonic Alignments | Colin Raffel, Minh-Thang Luong, Peter J. Liu, Ron J. Weiss, Douglas Eck | 78 261 0 | 03 Apr 2017
Tacotron: Towards End-to-End Speech Synthesis | Yuxuan Wang, RJ Skerry-Ryan, Daisy Stanton, Yonghui Wu, Ron J. Weiss, ..., Samy Bengio, Quoc V. Le, Yannis Agiomyrgiannakis, R. Clark, Rif A. Saurous | 160 1,826 0 | 29 Mar 2017
The Neural Noisy Channel | Lei Yu, Phil Blunsom, Chris Dyer, Edward Grefenstette, Tomás Kociský | 60 67 0 | 08 Nov 2016
Online Segment to Segment Neural Transduction | Lei Yu, Jan Buys, Phil Blunsom | 105 82 0 | 26 Sep 2016
WaveNet: A Generative Model for Raw Audio | Aaron van den Oord, Sander Dieleman, Heiga Zen, Karen Simonyan, Oriol Vinyals, Alex Graves, Nal Kalchbrenner, A. Senior, Koray Kavukcuoglu | DiffM 406 7,405 0 | 12 Sep 2016
Zoneout: Regularizing RNNs by Randomly Preserving Hidden Activations | David M. Krueger, Tegan Maharaj, János Kramár, Mohammad Pezeshki, Nicolas Ballas, Nan Rosemary Ke, Anirudh Goyal, Yoshua Bengio, Aaron Courville, C. Pal | 82 317 0 | 03 Jun 2016
Effective Approaches to Attention-based Neural Machine Translation | Thang Luong, Hieu H. Pham, Christopher D. Manning | 385 7,964 0 | 17 Aug 2015
Attention-Based Models for Speech Recognition | J. Chorowski, Dzmitry Bahdanau, Dmitriy Serdyuk, Kyunghyun Cho, Yoshua Bengio | 127 2,607 0 | 24 Jun 2015
Adam: A Method for Stochastic Optimization | Diederik P. Kingma, Jimmy Ba | ODL 1.9K 150,260 0 | 22 Dec 2014
Neural Machine Translation by Jointly Learning to Align and Translate | Dzmitry Bahdanau, Kyunghyun Cho, Yoshua Bengio | AIMat 573 27,311 0 | 01 Sep 2014
Generating Sequences With Recurrent Neural Networks | Alex Graves | GAN 155 4,039 0 | 04 Aug 2013
Sequence Transduction with Recurrent Neural Networks | Alex Graves | 191 1,870 0 | 14 Nov 2012