Daisy-TTS: Simulating Wider Spectrum of Emotions via Prosody Embedding Decomposition

22 February 2024

Papers cited by "Daisy-TTS: Simulating Wider Spectrum of Emotions via Prosody Embedding Decomposition"

16 / 16 papers shown

Title | Authors | Tags | Counts | Date
EmoMix: Emotion Mixing via Diffusion Models for Emotional Speech Synthesis | Haobin Tang, Xulong Zhang, Jianzong Wang, Ning Cheng, Jing Xiao | DiffM | 58 24 0 | 01 Jun 2023
Daft-Exprt: Cross-Speaker Prosody Transfer on Any Text for Expressive Speech Synthesis | Julian Zaïdi, Hugo Seuté, Benjamin van Niekerk, M. Carbonneau | - | 34 21 0 | 04 Aug 2021
Grad-TTS: A Diffusion Probabilistic Model for Text-to-Speech | Vadim Popov, Ivan Vovk, Vladimir Gogoryan, Tasnima Sadekova, Mikhail Kudinov | DiffM | 70 526 0 | 13 May 2021
Exploring emotional prototypes in a high dimensional TTS latent space | Pol van Rijn, Silvan Mertes, Dominik Schiller, Peter M. C. Harrison, P. Larrouy-Maestri, Elisabeth André, Nori Jacoby | - | 33 12 0 | 05 May 2021
Emotion Ratings: How Intensity, Annotation Confidence and Agreements are Entangled | Enrica Troiano, Sebastian Padó, Roman Klinger | - | 24 19 0 | 02 Mar 2021
Seen and Unseen emotional style transfer for voice conversion with a new emotional speech dataset | Kun Zhou, Berrak Sisman, Rui Liu, Haizhou Li | - | 42 190 0 | 28 Oct 2020
HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis | Jungil Kong, Jaehyeon Kim, Jaekyoung Bae | - | 89 1,891 0 | 12 Oct 2020
Denoising Diffusion Probabilistic Models | Jonathan Ho, Ajay Jain, Pieter Abbeel | DiffM | 213 17,550 0 | 19 Jun 2020
Glow-TTS: A Generative Flow for Text-to-Speech via Monotonic Alignment Search | Jaehyeon Kim, Sungwon Kim, Jungil Kong, Sungroh Yoon | - | 59 482 0 | 22 May 2020
Non-Parallel Sequence-to-Sequence Voice Conversion with Disentangled Linguistic and Speaker Representations | Jing-Xuan Zhang, Zhenhua Ling, Lirong Dai | - | 31 99 0 | 25 Jun 2019
Towards End-to-End Prosody Transfer for Expressive Speech Synthesis with Tacotron | RJ Skerry-Ryan, Eric Battenberg, Y. Xiao, Yuxuan Wang, Daisy Stanton, Joel Shor, Ron J. Weiss, R. Clark, Rif A. Saurous | - | 37 550 0 | 24 Mar 2018
Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis | Yuxuan Wang, Daisy Stanton, Yu Zhang, RJ Skerry-Ryan, Eric Battenberg, Joel Shor, Y. Xiao, Fei Ren, Ye Jia, Rif A. Saurous | - | 52 822 0 | 23 Mar 2018
Emotional End-to-End Neural Speech Synthesizer | Younggun Lee, Azam Rabiee, Soo-Young Lee | - | 39 105 0 | 15 Nov 2017
FiLM: Visual Reasoning with a General Conditioning Layer | Ethan Perez, Florian Strub, H. D. Vries, Vincent Dumoulin, Aaron Courville | FAtt, AIMat, OffRL, AI4CE | 210 2,178 0 | 22 Sep 2017
U-Net: Convolutional Networks for Biomedical Image Segmentation | Olaf Ronneberger, Philipp Fischer, Thomas Brox | SSeg, 3DV | 865 76,547 0 | 18 May 2015
Unsupervised Domain Adaptation by Backpropagation | Yaroslav Ganin, Victor Lempitsky | OOD | 181 5,972 0 | 26 Sep 2014