ResearchTrend.AI

ZET-Speech: Zero-shot adaptive Emotion-controllable Text-to-Speech Synthesis with Diffusion and Style-based Models

23 May 2023
Minki Kang
Wooseok Han
Sung Ju Hwang
Eunho Yang
    DiffM

Papers citing "ZET-Speech: Zero-shot adaptive Emotion-controllable Text-to-Speech Synthesis with Diffusion and Style-based Models"

15 / 15 papers shown
OZSpeech: One-step Zero-shot Speech Synthesis with Learned-Prior-Conditioned Flow Matching
Hieu-Nghia Huynh-Nguyen
Ngoc Son Nguyen
Huynh Nguyen Dang
Thieu Vo
Truong-Son Hy
Van Nguyen
14
0
0
19 May 2025
Voice Cloning: Comprehensive Survey
Hussam Azzuni
Abdulmotaleb El Saddik
VLM
44
0
0
01 May 2025
EmoVoice: LLM-based Emotional Text-To-Speech Model with Freestyle Text Prompting
Guanrou Yang
Chen Yang
Qian Chen
Ziyang Ma
Wenxi Chen
...
Fan Yu
Zhihao Du
Zhifu Gao
Shiliang Zhang
Xie Chen
AuLLM
57
0
0
17 Apr 2025
A Review of Human Emotion Synthesis Based on Generative Technology
Fei Ma
Yong Li
Yifan Xie
Y. He
Yujie Zhang
...
Z. Liu
Wei Yao
Fuji Ren
Fei Richard Yu
Shiguang Ni
78
1
0
10 Dec 2024
EmoKnob: Enhance Voice Cloning with Fine-Grained Emotion Control
Haozhe Chen
Run Chen
Julia Hirschberg
26
3
0
01 Oct 2024
Exploring synthetic data for cross-speaker style transfer in style representation based TTS
Lucas Ueda
Leonardo B. de M. M. Marques
Flávio O. Simões
Mário Uliani Neto
Fernando Runstein
Bianca Dal Bó
Paula D. P. Costa
33
0
0
25 Sep 2024
StyleFusion TTS: Multimodal Style-control and Enhanced Feature Fusion for Zero-shot Text-to-speech Synthesis
Zhiyong Chen
Xinnuo Li
Zhiqi Ai
Shugong Xu
DiffM
36
1
0
24 Sep 2024
Enhancing Emotional Text-to-Speech Controllability with Natural Language Guidance through Contrastive Learning and Diffusion Models
Xin Jing
Kun Zhou
Andreas Triantafyllopoulos
Björn W. Schuller
DiffM
42
3
0
10 Sep 2024
EmoSphere-TTS: Emotional Style and Intensity Modeling via Spherical Emotion Vector for Controllable Emotional Text-to-Speech
Deok-Hyeon Cho
Hyung-Seok Oh
Seung-Bin Kim
Sang-Hoon Lee
Seong-Whan Lee
45
7
0
12 Jun 2024
U-Style: Cascading U-nets with Multi-level Speaker and Style Modeling for Zero-Shot Voice Cloning
Tao Li
Zhichao Wang
Xinfa Zhu
Jian Cong
Qiao Tian
Yuping Wang
Lei Xie
DiffM
35
3
0
06 Oct 2023
On the Design Fundamentals of Diffusion Models: A Survey
Ziyi Chang
George Alex Koulieris
Hubert P. H. Shum
DiffM
29
54
0
07 Jun 2023
Cross-speaker Emotion Transfer Based On Prosody Compensation for End-to-End Speech Synthesis
Tao Li
Xinsheng Wang
Qicong Xie
Zhichao Wang
Ming Jiang
Linfu Xie
35
15
0
04 Jul 2022
Guided-TTS 2: A Diffusion Model for High-quality Adaptive Text-to-Speech with Untranscribed Data
Sungwon Kim
Heeseung Kim
Sung-Hoon Yoon
DiffM
204
52
0
30 May 2022
A Style-Based Generator Architecture for Generative Adversarial Networks
Tero Karras
S. Laine
Timo Aila
306
10,378
0
12 Dec 2018
Domain-Adversarial Training of Neural Networks
Yaroslav Ganin
E. Ustinova
Hana Ajakan
Pascal Germain
Hugo Larochelle
François Laviolette
M. Marchand
Victor Lempitsky
GAN
OOD
179
9,342
0
28 May 2015