ResearchTrend.AI
arXiv: 1906.03402
Effective Use of Variational Embedding Capacity in Expressive End-to-End Speech Synthesis

8 June 2019
Eric Battenberg
Soroosh Mariooryad
Daisy Stanton
RJ Skerry-Ryan
Matt Shannon
David Kao
Tom Bagby
    BDL

Papers citing "Effective Use of Variational Embedding Capacity in Expressive End-to-End Speech Synthesis"

13 / 13 papers shown
Controllable Speaking Styles Using a Large Language Model
A. Sigurgeirsson
Simon King
25
2
0
17 May 2023
Do Prosody Transfer Models Transfer Prosody?
A. Sigurgeirsson
Simon King
DiffM
12
7
0
07 Mar 2023
Prosodic Clustering for Phoneme-level Prosody Control in End-to-End Speech Synthesis
Alexandra Vioni
Myrsini Christidou
Nikolaos Ellinas
G. Vamvoukakis
Panos Kakoulidis
Taehoon Kim
June Sig Sung
Hyoungmin Park
Aimilios Chalamandaris
Pirros Tsiakoulis
16
11
0
19 Nov 2021
Emotional Prosody Control for Speech Generation
S. Sivaprasad
Saiteja Kosgi
Vineet Gandhi
12
17
0
07 Nov 2021
GANtron: Emotional Speech Synthesis with Generative Adversarial Networks
E. Hortal
Rodrigo Brechard Alarcia
GAN
26
2
0
06 Oct 2021
Learning De-identified Representations of Prosody from Raw Audio
J. Weston
R. Lenain
U. Meepegama
E. Fristed
SSL
24
15
0
17 Jul 2021
Multi-speaker Emotion Conversion via Latent Variable Regularization and a Chained Encoder-Decoder-Predictor Network
Ravi Shankar
Hsi-Wei Hsieh
N. Charon
A. Venkataraman
40
11
0
25 Jul 2020
Attentron: Few-Shot Text-to-Speech Utilizing Attention-Based Variable-Length Embedding
Seungwoo Choi
Seungju Han
Dongyoung Kim
S. Ha
37
65
0
18 May 2020
You Do Not Need More Data: Improving End-To-End Speech Recognition by Text-To-Speech Data Augmentation
A. Laptev
Roman Korostik
A. Svischev
A. Andrusenko
Ivan Medennikov
S. Rybin
16
61
0
14 May 2020
Fully-hierarchical fine-grained prosody modeling for interpretable speech synthesis
Guangzhi Sun
Yu Zhang
Ron J. Weiss
Yuan Cao
Heiga Zen
Yonghui Wu
16
130
0
06 Feb 2020
Generating diverse and natural text-to-speech samples using a quantized fine-grained VAE and auto-regressive prosody prior
Guangzhi Sun
Yu Zhang
Ron J. Weiss
Yuan Cao
Heiga Zen
Andrew Rosenberg
Bhuvana Ramabhadran
Yonghui Wu
DiffM
36
92
0
06 Feb 2020
A unified sequence-to-sequence front-end model for Mandarin text-to-speech synthesis
Junjie Pan
Xiang Yin
Zhiling Zhang
Shichao Liu
Yang Zhang
Zejun Ma
Yuxuan Wang
9
26
0
11 Nov 2019
Location-Relative Attention Mechanisms For Robust Long-Form Speech Synthesis
Eric Battenberg
RJ Skerry-Ryan
Soroosh Mariooryad
Daisy Stanton
David Kao
Matt Shannon
Tom Bagby
33
113
0
23 Oct 2019