Location-Relative Attention Mechanisms For Robust Long-Form Speech Synthesis

23 October 2019

Papers citing "Location-Relative Attention Mechanisms For Robust Long-Form Speech Synthesis"

26 / 76 papers shown

Title | Authors | Tags | Counts | Date
Fine-grained Emotion Strength Transfer, Control and Prediction for Emotional Speech Synthesis | Yinjiao Lei, Shan Yang, Lei Xie | | 25 55 0 | 17 Nov 2020
Pretraining Strategies, Waveform Model Choice, and Acoustic Configurations for Multi-Speaker End-to-End Speech Synthesis | Erica Cooper, Xin Wang, Yi Zhao, Yusuke Yasuda, Junichi Yamagishi | SyDa | 6 3 0 | 10 Nov 2020
Wave-Tacotron: Spectrogram-free end-to-end text-to-speech synthesis | Ron J. Weiss, RJ Skerry-Ryan, Eric Battenberg, Soroosh Mariooryad, Diederik P. Kingma | | 24 97 0 | 06 Nov 2020
FeatherTTS: Robust and Efficient attention based Neural TTS | Qiao Tian, Zewang Zhang, Chao-Jung Liu, Heng Lu, Linghui Chen, Bin Wei, P. He, Shan Liu | | 15 4 0 | 02 Nov 2020
PPG-based singing voice conversion with adversarial representation learning | Zhonghao Li, Benlai Tang, Xiang Yin, Yuan Wan, Linjia Xu, Chen Shen, Zejun Ma | | 16 37 0 | 28 Oct 2020
AISHELL-3: A Multi-speaker Mandarin TTS Corpus and the Baselines | Yao Shi, Hui Bu, Xin Xu, Shaojing Zhang, Ming Li | | 30 219 0 | 22 Oct 2020
Parallel Tacotron: Non-Autoregressive and Controllable TTS | Isaac Elias, Heiga Zen, Jonathan Shen, Yu Zhang, Ye Jia, Ron J. Weiss, Yonghui Wu | DRL | 19 102 0 | 22 Oct 2020
End-to-End Text-to-Speech using Latent Duration based on VQ-VAE | Yusuke Yasuda, Xin Wang, Junichi Yamagishi | | 23 16 0 | 19 Oct 2020
Non-Attentive Tacotron: Robust and Controllable Neural TTS Synthesis Including Unsupervised Duration Modeling | Jonathan Shen, Ye Jia, Mike Chrzanowski, Yu Zhang, Isaac Elias, Heiga Zen, Yonghui Wu | | 19 112 0 | 08 Oct 2020
Any-to-Many Voice Conversion with Location-Relative Sequence-to-Sequence Modeling | Songxiang Liu, Yuewen Cao, Disong Wang, Xixin Wu, Xunying Liu, Helen Meng | BDL | 26 88 0 | 06 Sep 2020
WaveGrad: Estimating Gradients for Waveform Generation | Nanxin Chen, Yu Zhang, Heiga Zen, Ron J. Weiss, Mohammad Norouzi, William Chan | DiffM, BDL | 14 771 0 | 02 Sep 2020
Prosody Learning Mechanism for Speech Synthesis System Without Text Length Limit | Zhen Zeng, Jianzong Wang, Ning Cheng, Jing Xiao | | 13 8 0 | 13 Aug 2020
Data Efficient Voice Cloning from Noisy Samples with Domain Adversarial Training | Jian Cong, Shan Yang, Lei Xie, Guoqiao Yu, Guanglu Wan | | 24 30 0 | 10 Aug 2020
Incremental Text to Speech for Neural Sequence-to-Sequence Models using Reinforcement Learning | D. Mohan, R. Lenain, Lorenzo Foglianti, Tian Huey Teh, Marlene Staib, Alexandra Torresquintero, Jiameng Gao | AI4TS | 9 11 0 | 07 Aug 2020
Exploiting Deep Sentential Context for Expressive End-to-End Speech Synthesis | Fengyu Yang, Shan Yang, Qinghua Wu, Yujun Wang, Lei Xie | | 31 5 0 | 03 Aug 2020
End-to-End Adversarial Text-to-Speech | Jeff Donahue, Sander Dieleman, Mikolaj Binkowski, Erich Elsen, Karen Simonyan | | 17 185 0 | 05 Jun 2020
Glow-TTS: A Generative Flow for Text-to-Speech via Monotonic Alignment Search | Jaehyeon Kim, Sungwon Kim, Jungil Kong, Sungroh Yoon | | 54 475 0 | 22 May 2020
Investigation of learning abilities on linguistic features in sequence-to-sequence text-to-speech synthesis | Yusuke Yasuda, Xin Wang, Junichi Yamagishi | AI4TS | 14 31 0 | 20 May 2020
Improving Accent Conversion with Reference Encoder and End-To-End Text-To-Speech | Wenjie Li, Benlai Tang, Xiang Yin, Yushi Zhao, Wei Li, Kang Wang, Hao Huang, Yuxuan Wang, Zejun Ma | | 6 13 0 | 19 May 2020
JDI-T: Jointly trained Duration Informed Transformer for Text-To-Speech without Explicit Alignment | D. Lim, Won Jang, Gyeonghwan O, Heayoung Park, Bongwan Kim, Jaesam Yoon | | 11 36 0 | 15 May 2020
You Do Not Need More Data: Improving End-To-End Speech Recognition by Text-To-Speech Data Augmentation | A. Laptev, Roman Korostik, A. Svischev, A. Andrusenko, Ivan Medennikov, S. Rybin | | 16 61 0 | 14 May 2020
Cotatron: Transcription-Guided Speech Encoder for Any-to-Many Voice Conversion without Parallel Data | Seung-won Park, Doo-young Kim, Myun-chul Joe | | 10 40 0 | 07 May 2020
Adversarial Feature Learning and Unsupervised Clustering based Speech Synthesis for Found Data with Acoustic and Textual Noise | Shan Yang, Yuxuan Wang, Lei Xie | | 6 9 0 | 28 Apr 2020
ByteSing: A Chinese Singing Voice Synthesis System Using Duration Allocated Encoder-Decoder Acoustic Models and WaveRNN Vocoders | Yu Gu, Xiang Yin, Yonghui Rao, Yuan Wan, Benlai Tang, Yang Zhang, Jitong Chen, Yuxuan Wang, Zejun Ma | | 12 70 0 | 23 Apr 2020
Semi-Supervised Generative Modeling for Controllable Speech Synthesis | Raza Habib, Soroosh Mariooryad, Matt Shannon, Eric Battenberg, RJ Skerry-Ryan, Daisy Stanton, David Kao, Tom Bagby | BDL | 13 48 0 | 03 Oct 2019
Effective Use of Variational Embedding Capacity in Expressive End-to-End Speech Synthesis | Eric Battenberg, Soroosh Mariooryad, Daisy Stanton, RJ Skerry-Ryan, Matt Shannon, David Kao, Tom Bagby | BDL | 14 45 0 | 08 Jun 2019