ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2011.08679
  4. Cited By
Controllable Emotion Transfer For End-to-End Speech Synthesis

Controllable Emotion Transfer For End-to-End Speech Synthesis

17 November 2020
Tao Li
Shan Yang
Liumeng Xue
Lei Xie
ArXivPDFHTML

Papers citing "Controllable Emotion Transfer For End-to-End Speech Synthesis"

48 / 48 papers shown
Title
A Review of Human Emotion Synthesis Based on Generative Technology
A Review of Human Emotion Synthesis Based on Generative Technology
Fei Ma
Yong Li
Yifan Xie
Y. He
Yujie Zhang
...
Z. Liu
Wei Yao
Fuji Ren
Fei Richard Yu
Shiguang Ni
78
1
0
10 Dec 2024
Emo-DPO: Controllable Emotional Speech Synthesis through Direct
  Preference Optimization
Emo-DPO: Controllable Emotional Speech Synthesis through Direct Preference Optimization
Xiaoxue Gao
Chen Zhang
Yiming Chen
Huayun Zhang
Nancy F. Chen
47
6
0
16 Sep 2024
Laugh Now Cry Later: Controlling Time-Varying Emotional States of
  Flow-Matching-Based Zero-Shot Text-to-Speech
Laugh Now Cry Later: Controlling Time-Varying Emotional States of Flow-Matching-Based Zero-Shot Text-to-Speech
Haibin Wu
Xiaofei Wang
Sefik Emre Eskimez
Manthan Thakker
Daniel Tompkins
...
Canrun Li
Zhen Xiao
Sheng Zhao
Jinyu Li
Naoyuki Kanda
28
6
0
17 Jul 2024
DEX-TTS: Diffusion-based EXpressive Text-to-Speech with Style Modeling
  on Time Variability
DEX-TTS: Diffusion-based EXpressive Text-to-Speech with Style Modeling on Time Variability
Hyun Joon Park
Jin Sob Kim
Wooseok Shin
Sung Won Han
DiffM
41
2
0
27 Jun 2024
RSET: Remapping-based Sorting Method for Emotion Transfer Speech
  Synthesis
RSET: Remapping-based Sorting Method for Emotion Transfer Speech Synthesis
Haoxiang Shi
Jianzong Wang
Xulong Zhang
Ning Cheng
Jun Yu
Jing Xiao
44
2
0
27 May 2024
Fine-Grained Quantitative Emotion Editing for Speech Generation
Fine-Grained Quantitative Emotion Editing for Speech Generation
Sho Inoue
Kun Zhou
Shuai Wang
Haizhou Li
43
2
0
04 Mar 2024
Boosting Multi-Speaker Expressive Speech Synthesis with Semi-supervised
  Contrastive Learning
Boosting Multi-Speaker Expressive Speech Synthesis with Semi-supervised Contrastive Learning
Xinfa Zhu
Yuke Li
Yinjiao Lei
Ning Jiang
Guoqing Zhao
Lei Xie
28
0
0
26 Oct 2023
U-Style: Cascading U-nets with Multi-level Speaker and Style Modeling
  for Zero-Shot Voice Cloning
U-Style: Cascading U-nets with Multi-level Speaker and Style Modeling for Zero-Shot Voice Cloning
Tao Li
Zhichao Wang
Xinfa Zhu
Jian Cong
Qiao Tian
Yuping Wang
Lei Xie
DiffM
35
3
0
06 Oct 2023
The Importance of Multimodal Emotion Conditioning and Affect Consistency
  for Embodied Conversational Agents
The Importance of Multimodal Emotion Conditioning and Affect Consistency for Embodied Conversational Agents
Che-Jui Chang
Samuel S. Sohn
Sen Zhang
R. Jayashankar
Muhammad Usman
Mubbasir Kapadia
38
7
0
26 Sep 2023
HiGNN-TTS: Hierarchical Prosody Modeling with Graph Neural Networks for
  Expressive Long-form TTS
HiGNN-TTS: Hierarchical Prosody Modeling with Graph Neural Networks for Expressive Long-form TTS
Dake Guo
Xinfa Zhu
Liumeng Xue
Tao Li
Yuanjun Lv
Yuepeng Jiang
Linfu Xie
22
1
0
25 Sep 2023
Diversity-based core-set selection for text-to-speech with linguistic
  and acoustic features
Diversity-based core-set selection for text-to-speech with linguistic and acoustic features
Kentaro Seki
Shinnosuke Takamichi
Takaaki Saeki
Hiroshi Saruwatari
26
3
0
15 Sep 2023
DiCLET-TTS: Diffusion Model based Cross-lingual Emotion Transfer for
  Text-to-Speech -- A Study between English and Mandarin
DiCLET-TTS: Diffusion Model based Cross-lingual Emotion Transfer for Text-to-Speech -- A Study between English and Mandarin
Tao Li
Chenxu Hu
Jian Cong
Xinfa Zhu
Jingbei Li
Qiao Tian
Yuping Wang
Linfu Xie
DiffM
43
8
0
02 Sep 2023
AffectEcho: Speaker Independent and Language-Agnostic Emotion and Affect
  Transfer for Speech Synthesis
AffectEcho: Speaker Independent and Language-Agnostic Emotion and Affect Transfer for Speech Synthesis
Hrishikesh Viswanath
Aneesh Bhattacharya
Pascal Jutras-Dubé
Prerit Gupta
Mridu Prashanth
Yashvardhan Khaitan
Aniket Bera
32
2
0
16 Aug 2023
SC VALL-E: Style-Controllable Zero-Shot Text to Speech Synthesizer
SC VALL-E: Style-Controllable Zero-Shot Text to Speech Synthesizer
Daegyeom Kim
Seong-soo Hong
Yong-Hoon Choi
25
2
0
20 Jul 2023
EmoSpeech: Guiding FastSpeech2 Towards Emotional Text to Speech
EmoSpeech: Guiding FastSpeech2 Towards Emotional Text to Speech
Daria Diatlova
V. Shutov
34
8
0
28 Jun 2023
EmoMix: Emotion Mixing via Diffusion Models for Emotional Speech
  Synthesis
EmoMix: Emotion Mixing via Diffusion Models for Emotional Speech Synthesis
Haobin Tang
Xulong Zhang
Jianzong Wang
Ning Cheng
Jing Xiao
DiffM
19
24
0
01 Jun 2023
PromptStyle: Controllable Style Transfer for Text-to-Speech with Natural
  Language Descriptions
PromptStyle: Controllable Style Transfer for Text-to-Speech with Natural Language Descriptions
Guanghou Liu
Yongmao Zhang
Yinjiao Lei
Yunlin Chen
Rui Wang
Zhifei Li
Linfu Xie
39
37
0
31 May 2023
Accented Text-to-Speech Synthesis with Limited Data
Accented Text-to-Speech Synthesis with Limited Data
Xuehao Zhou
Mingyang Zhang
Yi Zhou
Zhizheng Wu
Haizhou Li
34
12
0
08 May 2023
Style-Label-Free: Cross-Speaker Style Transfer by Quantized VAE and
  Speaker-wise Normalization in Speech Synthesis
Style-Label-Free: Cross-Speaker Style Transfer by Quantized VAE and Speaker-wise Normalization in Speech Synthesis
Chunyu Qiang
Peng Yang
Hao Che
Xiaorui Wang
Zhongyuan Wang
BDL
34
6
0
13 Dec 2022
Multi-Speaker Expressive Speech Synthesis via Multiple Factors
  Decoupling
Multi-Speaker Expressive Speech Synthesis via Multiple Factors Decoupling
Xinfa Zhu
Yinjiao Lei
Kun Song
Yongmao Zhang
Tao Li
Linfu Xie
21
17
0
19 Nov 2022
Semi-supervised learning for continuous emotional intensity controllable
  speech synthesis with disentangled representations
Semi-supervised learning for continuous emotional intensity controllable speech synthesis with disentangled representations
Yoorim Oh
Juheon Lee
Yoseob Han
Kyogu Lee
28
3
0
11 Nov 2022
Multi-Speaker Multi-Style Speech Synthesis with Timbre and Style
  Disentanglement
Multi-Speaker Multi-Style Speech Synthesis with Timbre and Style Disentanglement
Wei Song
Ya Yue
Ya-Jie Zhang
Zhengchen Zhang
Youzheng Wu
Xiaodong He
32
4
0
02 Nov 2022
AccentSpeech: Learning Accent from Crowd-sourced Data for Target Speaker
  TTS with Accents
AccentSpeech: Learning Accent from Crowd-sourced Data for Target Speaker TTS with Accents
Yongmao Zhang
Zhichao Wang
Pei-Yin Yang
Hongshen Sun
Zhisheng Wang
Linfu Xie
28
6
0
31 Oct 2022
Read it to me: An emotionally aware Speech Narration Application
Read it to me: An emotionally aware Speech Narration Application
Rishibha Bansal
16
0
0
06 Sep 2022
Speech Synthesis with Mixed Emotions
Speech Synthesis with Mixed Emotions
Kun Zhou
Berrak Sisman
R. Rana
B.W.Schuller
Haizhou Li
27
44
0
11 Aug 2022
Controllable Data Generation by Deep Learning: A Review
Controllable Data Generation by Deep Learning: A Review
Shiyu Wang
Yuanqi Du
Xiaojie Guo
Bo Pan
Zhaohui Qin
Liang Zhao
33
28
0
19 Jul 2022
Text-driven Emotional Style Control and Cross-speaker Style Transfer in
  Neural TTS
Text-driven Emotional Style Control and Cross-speaker Style Transfer in Neural TTS
Yookyung Shin
Younggun Lee
Suhee Jo
Yeongtae Hwang
Taesu Kim
25
14
0
13 Jul 2022
Cross-speaker Emotion Transfer Based On Prosody Compensation for
  End-to-End Speech Synthesis
Cross-speaker Emotion Transfer Based On Prosody Compensation for End-to-End Speech Synthesis
Tao Li
Xinsheng Wang
Qicong Xie
Zhichao Wang
Ming Jiang
Linfu Xie
35
15
0
04 Jul 2022
iEmoTTS: Toward Robust Cross-Speaker Emotion Transfer and Control for
  Speech Synthesis based on Disentanglement between Prosody and Timbre
iEmoTTS: Toward Robust Cross-Speaker Emotion Transfer and Control for Speech Synthesis based on Disentanglement between Prosody and Timbre
Guangyan Zhang
Ying Qin
Wenbo Zhang
Jialun Wu
Mei Li
Yu Gai
Feijun Jiang
Tan Lee
50
26
0
29 Jun 2022
Accurate Emotion Strength Assessment for Seen and Unseen Speech Based on
  Data-Driven Deep Learning
Accurate Emotion Strength Assessment for Seen and Unseen Speech Based on Data-Driven Deep Learning
Rui Liu
Berrak Sisman
Björn Schuller
Guanglai Gao
Haizhou Li
27
11
0
15 Jun 2022
MuSE-SVS: Multi-Singer Emotional Singing Voice Synthesizer that Controls
  Emotional Intensity
MuSE-SVS: Multi-Singer Emotional Singing Voice Synthesizer that Controls Emotional Intensity
Sungjae Kim
Y.E. Kim
Jewoo Jun
Injung Kim
31
13
0
02 Mar 2022
Disentangling Style and Speaker Attributes for TTS Style Transfer
Disentangling Style and Speaker Attributes for TTS Style Transfer
Xiaochun An
Frank Soong
Lei Xie
68
18
0
24 Jan 2022
MsEmoTTS: Multi-scale emotion transfer, prediction, and control for
  emotional speech synthesis
MsEmoTTS: Multi-scale emotion transfer, prediction, and control for emotional speech synthesis
Yinjiao Lei
Shan Yang
Xinsheng Wang
Lei Xie
27
73
0
17 Jan 2022
Emotion Intensity and its Control for Emotional Voice Conversion
Emotion Intensity and its Control for Emotional Voice Conversion
Kun Zhou
Berrak Sisman
R. Rana
Björn W. Schuller
Haizhou Li
65
54
0
10 Jan 2022
Multi-speaker Multi-style Text-to-speech Synthesis With Single-speaker
  Single-style Training Data Scenarios
Multi-speaker Multi-style Text-to-speech Synthesis With Single-speaker Single-style Training Data Scenarios
Qicong Xie
Tao Li
Xinsheng Wang
Zhichao Wang
Lei Xie
Guoqiao Yu
Guanglu Wan
32
11
0
23 Dec 2021
Fine-grained style control in Transformer-based Text-to-speech Synthesis
Fine-grained style control in Transformer-based Text-to-speech Synthesis
Li-Wei Chen
Alexander I. Rudnicky
88
29
0
12 Oct 2021
Environment Aware Text-to-Speech Synthesis
Environment Aware Text-to-Speech Synthesis
Daxin Tan
Guangyan Zhang
Tan Lee
13
3
0
08 Oct 2021
StrengthNet: Deep Learning-based Emotion Strength Assessment for
  Emotional Speech Synthesis
StrengthNet: Deep Learning-based Emotion Strength Assessment for Emotional Speech Synthesis
Rui Liu
Berrak Sisman
Haizhou Li
29
2
0
07 Oct 2021
Emotional Speech Synthesis for Companion Robot to Imitate Professional
  Caregiver Speech
Emotional Speech Synthesis for Companion Robot to Imitate Professional Caregiver Speech
Takeshi Homma
Qinghua Sun
Takuya Fujioka
R. Takawaki
Eriko Ankyu
Kenji Nagamatsu
Daichi Sugawara
E. Harada
17
1
0
27 Sep 2021
Cross-speaker emotion disentangling and transfer for end-to-end speech
  synthesis
Cross-speaker emotion disentangling and transfer for end-to-end speech synthesis
Tao Li
Xinsheng Wang
Qicong Xie
Zhichao Wang
Linfu Xie
26
42
0
14 Sep 2021
A Survey on Neural Speech Synthesis
A Survey on Neural Speech Synthesis
Xu Tan
Tao Qin
Frank Soong
Tie-Yan Liu
AI4TS
18
352
0
29 Jun 2021
Improving Performance of Seen and Unseen Speech Style Transfer in
  End-to-end Neural TTS
Improving Performance of Seen and Unseen Speech Style Transfer in End-to-end Neural TTS
Xiaochun An
Frank Soong
Lei Xie
42
9
0
18 Jun 2021
EMOVIE: A Mandarin Emotion Speech Dataset with a Simple Emotional
  Text-to-Speech Model
EMOVIE: A Mandarin Emotion Speech Dataset with a Simple Emotional Text-to-Speech Model
Chenye Cui
Yi Ren
Jinglin Liu
Feiyang Chen
Rongjie Huang
Ming Lei
Zhou Zhao
24
35
0
17 Jun 2021
Emotional Voice Conversion: Theory, Databases and ESD
Emotional Voice Conversion: Theory, Databases and ESD
Kun Zhou
Berrak Sisman
Rui Liu
Haizhou Li
33
168
0
31 May 2021
Review of end-to-end speech synthesis technology based on deep learning
Review of end-to-end speech synthesis technology based on deep learning
Zhaoxi Mu
Xinyu Yang
Yizhuo Dong
AuLLM
ALM
26
24
0
20 Apr 2021
Reinforcement Learning for Emotional Text-to-Speech Synthesis with
  Improved Emotion Discriminability
Reinforcement Learning for Emotional Text-to-Speech Synthesis with Improved Emotion Discriminability
Rui Liu
Berrak Sisman
Haizhou Li
34
32
0
03 Apr 2021
Investigating on Incorporating Pretrained and Learnable Speaker
  Representations for Multi-Speaker Multi-Style Text-to-Speech
Investigating on Incorporating Pretrained and Learnable Speaker Representations for Multi-Speaker Multi-Style Text-to-Speech
C. Chien
Jheng-hao Lin
Chien-yu Huang
Po-Chun Hsu
Hung-yi Lee
27
68
0
06 Mar 2021
Fine-grained Style Modeling, Transfer and Prediction in Text-to-Speech
  Synthesis via Phone-Level Content-Style Disentanglement
Fine-grained Style Modeling, Transfer and Prediction in Text-to-Speech Synthesis via Phone-Level Content-Style Disentanglement
Daxin Tan
Tan Lee
29
21
0
08 Nov 2020
1