Effective Use of Variational Embedding Capacity in Expressive End-to-End Speech Synthesis

8 June 2019

Eric Battenberg

Soroosh Mariooryad

Papers citing "Effective Use of Variational Embedding Capacity in Expressive End-to-End Speech Synthesis"

| Title | Authors | Tag | | Date |
| --- | --- | --- | --- | --- |
| Controllable Speaking Styles Using a Large Language Model | A. Sigurgeirsson, Simon King | | 25 2 0 | 17 May 2023 |
| Do Prosody Transfer Models Transfer Prosody? | A. Sigurgeirsson, Simon King | DiffM | 12 7 0 | 07 Mar 2023 |
| Prosodic Clustering for Phoneme-level Prosody Control in End-to-End Speech Synthesis | Alexandra Vioni, Myrsini Christidou, Nikolaos Ellinas, G. Vamvoukakis, Panos Kakoulidis, Taehoon Kim, June Sig Sung, Hyoungmin Park, Aimilios Chalamandaris, Pirros Tsiakoulis | | 16 11 0 | 19 Nov 2021 |
| Emotional Prosody Control for Speech Generation | S. Sivaprasad, Saiteja Kosgi, Vineet Gandhi | | 12 17 0 | 07 Nov 2021 |
| GANtron: Emotional Speech Synthesis with Generative Adversarial Networks | E. Hortal, Rodrigo Brechard Alarcia | GAN | 26 2 0 | 06 Oct 2021 |
| Learning De-identified Representations of Prosody from Raw Audio | J. Weston, R. Lenain, U. Meepegama, E. Fristed | SSL | 24 15 0 | 17 Jul 2021 |
| Multi-speaker Emotion Conversion via Latent Variable Regularization and a Chained Encoder-Decoder-Predictor Network | Ravi Shankar, Hsi-Wei Hsieh, N. Charon, A. Venkataraman | | 40 11 0 | 25 Jul 2020 |
| Attentron: Few-Shot Text-to-Speech Utilizing Attention-Based Variable-Length Embedding | Seungwoo Choi, Seungju Han, Dongyoung Kim, S. Ha | | 37 65 0 | 18 May 2020 |
| You Do Not Need More Data: Improving End-To-End Speech Recognition by Text-To-Speech Data Augmentation | A. Laptev, Roman Korostik, A. Svischev, A. Andrusenko, Ivan Medennikov, S. Rybin | | 16 61 0 | 14 May 2020 |
| Fully-hierarchical fine-grained prosody modeling for interpretable speech synthesis | Guangzhi Sun, Yu Zhang, Ron J. Weiss, Yuanbin Cao, Heiga Zen, Yonghui Wu | | 16 130 0 | 06 Feb 2020 |
| Generating diverse and natural text-to-speech samples using a quantized fine-grained VAE and auto-regressive prosody prior | Guangzhi Sun, Yu Zhang, Ron J. Weiss, Yuan Cao, Heiga Zen, Andrew Rosenberg, Bhuvana Ramabhadran, Yonghui Wu | DiffM | 36 92 0 | 06 Feb 2020 |
| A unified sequence-to-sequence front-end model for Mandarin text-to-speech synthesis | Junjie Pan, Xiang Yin, Zhiling Zhang, Shichao Liu, Yang Zhang, Zejun Ma, Yuxuan Wang | | 9 26 0 | 11 Nov 2019 |
| Location-Relative Attention Mechanisms For Robust Long-Form Speech Synthesis | Eric Battenberg, RJ Skerry-Ryan, Soroosh Mariooryad, Daisy Stanton, David Kao, Matt Shannon, Tom Bagby | | 33 113 0 | 23 Oct 2019 |