Investigation of enhanced Tacotron text-to-speech synthesis systems with
self-attention for pitch accent language

29 October 2018

Xin Wang

Junichi Yamagishi

Papers citing "Investigation of enhanced Tacotron text-to-speech synthesis systems with self-attention for pitch accent language"

APNet: An All-Frame-Level Neural Vocoder Incorporating Direct Prediction of Amplitude and Phase Spectra. Yang Ai, Zhenhua Ling. 13 May 2023.
UniFLG: Unified Facial Landmark Generator from Text or Speech. Kentaro Mitsui, Yukiya Hono, Kei Sawada. 28 Feb 2023.
Investigation of Japanese PnG BERT language model in text-to-speech synthesis for pitch accent language. Yusuke Yasuda, T. Toda. 16 Dec 2022.
Can Knowledge of End-to-End Text-to-Speech Models Improve Neural MIDI-to-Audio Synthesis Systems? Xuan Shi, Erica Cooper, Xin Wang, Junichi Yamagishi, Shrikanth Narayanan. 25 Nov 2022.
VisualTTS: TTS with Accurate Lip-Speech Synchronization for Automatic Voice Over. Junchen Lu, Berrak Sisman, Rui Liu, Mingyang Zhang, Haizhou Li. 07 Oct 2021.
Flavored Tacotron: Conditional Learning for Prosodic-linguistic Features. Mahsa Elyasi, Gaurav Bharaj. 08 Apr 2021.
Parallel waveform synthesis based on generative adversarial networks with voicing-aware conditional discriminators. Ryuichi Yamamoto, Eunwoo Song, Min-Jae Hwang, Jae-Min Kim. 27 Oct 2020.
Expressive TTS Training with Frame and Style Reconstruction Loss. Rui Liu, Berrak Sisman, Guanglai Gao, Haizhou Li. 04 Aug 2020.
Exploiting Deep Sentential Context for Expressive End-to-End Speech Synthesis. Fengyu Yang, Shan Yang, Qinghua Wu, Yujun Wang, Lei Xie. 03 Aug 2020.
Prosodic Prominence and Boundaries in Sequence-to-Sequence Speech Synthesis. Antti Suni, Sofoklis Kakouros, M. Vainio, J. Šimko. 29 Jun 2020.
Pitchtron: Towards audiobook generation from ordinary people's voices. Sunghee Jung, Hoi-Rim Kim. 21 May 2020.
Initial investigation of an encoder-decoder end-to-end TTS framework using marginalization of monotonic hard latent alignments. Yusuke Yasuda, Xin Wang, Junichi Yamagishi. 30 Aug 2019.
Neural source-filter waveform models for statistical parametric speech synthesis. Xin Wang, Shinji Takaki, Junichi Yamagishi. 27 Apr 2019.