Towards Multi-Scale Style Control for Expressive Speech Synthesis
arXiv: 2104.03521

8 April 2021
Xiang Li
Changhe Song
Jingbei Li
Zhiyong Wu
Jia Jia
Helen Meng

Papers citing "Towards Multi-Scale Style Control for Expressive Speech Synthesis"

33 / 33 papers shown
ISDrama: Immersive Spatial Drama Generation through Multimodal Prompting
Wenjie Qu
Wenxiang Guo
Changhao Pan
Zehan Zhu
Tao Jin
Zhou Zhao
VGen
54
1
0
29 Apr 2025
HELPNet: Hierarchical Perturbations Consistency and Entropy-guided Ensemble for Scribble Supervised Medical Image Segmentation
Xiao Zhang
Shaoxuan Wu
Peilin Zhang
Zhuo Jin
Xiaosong Xiong
Qirong Bu
Jingkun Chen
Jun Feng
94
0
0
25 Dec 2024
A Review of Human Emotion Synthesis Based on Generative Technology
Fei Ma
Yong Li
Yifan Xie
Y. He
Yuyao Zhang
...
Z. Liu
Wei Yao
Fuji Ren
Fei Richard Yu
Shiguang Ni
78
1
0
10 Dec 2024
Content and Style Aware Audio-Driven Facial Animation
Qingju Liu
Hyeongwoo Kim
Gaurav Bharaj
DiffM
43
1
0
13 Aug 2024
StyleDubber: Towards Multi-Scale Style Learning for Movie Dubbing
Gaoxiang Cong
Yuankai Qi
Liang-Sheng Li
Amin Beheshti
Zhedong Zhang
Anton Van Den Hengel
Ming-Hsuan Yang
Chenggang Yan
Qingming Huang
46
12
0
20 Feb 2024
MM-TTS: Multi-modal Prompt based Style Transfer for Expressive Text-to-Speech Synthesis
Wenhao Guan
Yishuang Li
Tao Li
Hukai Huang
Feng Wang
Jiayan Lin
Lingyan Huang
Lin Li
Q. Hong
36
9
0
17 Dec 2023
StyleSinger: Style Transfer for Out-of-Domain Singing Voice Synthesis
Yu Zhang
Rongjie Huang
Ruiqi Li
Jinzheng He
Yan Xia
Feiyang Chen
Xinyu Duan
Baoxing Huai
Zhou Zhao
VLM
31
18
0
17 Dec 2023
CLN-VC: Text-Free Voice Conversion Based on Fine-Grained Style Control and Contrastive Learning with Negative Samples Augmentation
Yimin Deng
Xulong Zhang
Jianzong Wang
Ning Cheng
Jing Xiao
35
3
0
15 Nov 2023
Zero-Shot Emotion Transfer For Cross-Lingual Speech Synthesis
Yuke Li
Xinfa Zhu
Yinjiao Lei
Hai Li
Junhui Liu
Danming Xie
Lei Xie
46
3
0
06 Oct 2023
Improving Language Model-Based Zero-Shot Text-to-Speech Synthesis with Multi-Scale Acoustic Prompts
Shunwei Lei
Yixuan Zhou
Liyang Chen
Dan Luo
Zhiyong Wu
...
Shiyin Kang
Tao Jiang
Yahui Zhou
Yuxing Han
Helen M. Meng
VLM
46
2
0
21 Sep 2023
A Discourse-level Multi-scale Prosodic Model for Fine-grained Emotion Analysis
X. Wei
Jia Jia
Xiang Li
Zhiyong Wu
Ziyi Wang
23
1
0
21 Sep 2023
MSM-VC: High-fidelity Source Style Transfer for Non-Parallel Voice Conversion by Multi-scale Style Modeling
Zhichao Wang
Xinsheng Wang
Qicong Xie
Tao Li
Linfu Xie
Qiao Tian
Yuping Wang
31
4
0
03 Sep 2023
MSStyleTTS: Multi-Scale Style Modeling with Hierarchical Context Information for Expressive Speech Synthesis
Shunwei Lei
Yixuan Zhou
Liyang Chen
Zhiyong Wu
Xixin Wu
Shiyin Kang
Helen Meng
35
7
0
29 Jul 2023
CASEIN: Cascading Explicit and Implicit Control for Fine-grained Emotion Intensity Regulation
Yuhao Cui
Xiongwei Wang
Zhongzhou Zhao
Wei Zhou
Haiqing Chen
36
1
0
27 Jun 2023
Interpretable Style Transfer for Text-to-Speech with ControlVAE and Diffusion Bridge
Wenhao Guan
Tao Li
Yishuang Li
Hukai Huang
Q. Hong
Lin Li
DiffM
32
6
0
07 Jun 2023
Diverse and Expressive Speech Prosody Prediction with Denoising Diffusion Probabilistic Model
Xiang Li
Songxiang Liu
Max W. Y. Lam
Zhiyong Wu
Chao Weng
Helen Meng
DiffM
29
5
0
26 May 2023
Joint Multi-scale Cross-lingual Speaking Style Transfer with Bidirectional Attention Mechanism for Automatic Dubbing
Jingbei Li
Sipan Li
Ping Chen
Lu Zhang
Yi Meng
Zhiyong Wu
Helen Meng
Qiao Tian
Yuping Wang
Yuxuan Wang
40
3
0
09 May 2023
InstructTTS: Modelling Expressive TTS in Discrete Latent Space with Natural Language Style Prompt
Dongchao Yang
Songxiang Liu
Rongjie Huang
Chao Weng
Helen Meng
DiffM
VLM
31
85
0
31 Jan 2023
Delivering Speaking Style in Low-resource Voice Conversion with Multi-factor Constraints
Zhichao Wang
Xinsheng Wang
Linfu Xie
Yuan-Jui Chen
Qiao Tian
Yuping Wang
30
5
0
16 Nov 2022
NoreSpeech: Knowledge Distillation based Conditional Diffusion Model for Noise-robust Expressive TTS
Dongchao Yang
Songxiang Liu
Jianwei Yu
Helin Wang
Chao Weng
Yuexian Zou
DiffM
VLM
43
18
0
04 Nov 2022
MAST: Multiscale Audio Spectrogram Transformers
Sreyan Ghosh
Ashish Seth
S. Umesh
Tianyi Zhou
22
3
0
02 Nov 2022
Speech Synthesis with Mixed Emotions
Kun Zhou
Berrak Sisman
R. Rana
B. W. Schuller
Haizhou Li
27
44
0
11 Aug 2022
Towards Cross-speaker Reading Style Transfer on Audiobook Dataset
Xiang Li
Changhe Song
X. Wei
Zhiyong Wu
Jia Jia
Helen Meng
29
4
0
10 Aug 2022
GenerSpeech: Towards Style Transfer for Generalizable Out-Of-Domain Text-to-Speech
Rongjie Huang
Yi Ren
Jinglin Liu
Chenye Cui
Zhou Zhao
OODD
VLM
117
34
0
15 May 2022
Hierarchical and Multi-Scale Variational Autoencoder for Diverse and Natural Non-Autoregressive Text-to-Speech
Jaesung Bae
Jinhyeok Yang
Taejun Bak
Young-Sun Joo
DiffM
27
6
0
08 Apr 2022
Towards Multi-Scale Speaking Style Modelling with Hierarchical Context Information for Mandarin Speech Synthesis
Shunwei Lei
Yixuan Zhou
Liyang Chen
Jiankun Hu
Zhiyong Wu
Shiyin Kang
Helen Meng
27
10
0
06 Apr 2022
Content-Dependent Fine-Grained Speaker Embedding for Zero-Shot Speaker Adaptation in Text-to-Speech Synthesis
Yixuan Zhou
Changhe Song
Xiang Li
Lu Zhang
Zhiyong Wu
Yanyao Bian
Dan Su
Helen Meng
28
22
0
03 Apr 2022
STUDIES: Corpus of Japanese Empathetic Dialogue Speech Towards Friendly Voice Agent
Yuki Saito
Yuto Nishimura
Shinnosuke Takamichi
Kentaro Tachibana
Hiroshi Saruwatari
19
12
0
28 Mar 2022
MsEmoTTS: Multi-scale emotion transfer, prediction, and control for emotional speech synthesis
Yinjiao Lei
Shan Yang
Xinsheng Wang
Lei Xie
27
73
0
17 Jan 2022
Emotion Intensity and its Control for Emotional Voice Conversion
Kun Zhou
Berrak Sisman
R. Rana
Björn W. Schuller
Haizhou Li
65
54
0
10 Jan 2022
Cross-speaker emotion disentangling and transfer for end-to-end speech synthesis
Tao Li
Xinsheng Wang
Qicong Xie
Zhichao Wang
Linfu Xie
29
42
0
14 Sep 2021
A Survey on Neural Speech Synthesis
Xu Tan
Tao Qin
Frank Soong
Tie-Yan Liu
AI4TS
20
353
0
29 Jun 2021
Enhancing Speaking Styles in Conversational Text-to-Speech Synthesis with Graph-based Multi-modal Context Modeling
Jingbei Li
Yi Meng
Chenyi Li
Zhiyong Wu
Helen Meng
Chao Weng
Dan Su
31
24
0
11 Jun 2021