ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2010.14794
  4. Cited By
Seen and Unseen emotional style transfer for voice conversion with a new
  emotional speech dataset

Seen and Unseen emotional style transfer for voice conversion with a new emotional speech dataset

28 October 2020
Kun Zhou
Berrak Sisman
Rui Liu
Haizhou Li
ArXivPDFHTML

Papers citing "Seen and Unseen emotional style transfer for voice conversion with a new emotional speech dataset"

35 / 35 papers shown
Title
Generative Adversarial Network based Voice Conversion: Techniques, Challenges, and Recent Advancements
Generative Adversarial Network based Voice Conversion: Techniques, Challenges, and Recent Advancements
Sandipan Dhar
N. D. Jana
Swagatam Das
48
0
0
27 Apr 2025
EmoReg: Directional Latent Vector Modeling for Emotional Intensity Regularization in Diffusion-based Voice Conversion
EmoReg: Directional Latent Vector Modeling for Emotional Intensity Regularization in Diffusion-based Voice Conversion
Ashishkumar Gudmalwar
Ishan D. Biyani
Nirmesh J. Shah
Pankaj Wasnik
R. Shah
DiffM
26
0
0
31 Dec 2024
Emotional Dimension Control in Language Model-Based Text-to-Speech: Spanning a Broad Spectrum of Human Emotions
Emotional Dimension Control in Language Model-Based Text-to-Speech: Spanning a Broad Spectrum of Human Emotions
Kun Zhou
You Zhang
Shengkui Zhao
Hao Wang
Zexu Pan
...
Chongjia Ni
Yukun Ma
Trung Hieu Nguyen
J. Yip
Bin Ma
59
5
0
25 Sep 2024
Leveraging AI-Generated Emotional Self-Voice to Nudge People towards
  their Ideal Selves
Leveraging AI-Generated Emotional Self-Voice to Nudge People towards their Ideal Selves
Cathy Mengying Fang
Phoebe Chua
Samantha Chan
J. Leong
Andria Bao
Pattie Maes
28
1
0
17 Sep 2024
Personalized Speech Emotion Recognition in Human-Robot Interaction using Vision Transformers
Personalized Speech Emotion Recognition in Human-Robot Interaction using Vision Transformers
Ruchik Mishra
Andrew Frye
M. M. Rayguru
Dan O. Popa
39
1
0
16 Sep 2024
Adapting General Disentanglement-Based Speaker Anonymization for Enhanced Emotion Preservation
Adapting General Disentanglement-Based Speaker Anonymization for Enhanced Emotion Preservation
Xiaoxiao Miao
Yuxiang Zhang
Xin Wang
N. Tomashenko
D. Soh
Ian Mcloughlin
42
1
0
12 Aug 2024
Controlling Emotion in Text-to-Speech with Natural Language Prompts
Controlling Emotion in Text-to-Speech with Natural Language Prompts
Thomas Bott
Florian Lux
Ngoc Thang Vu
38
6
0
10 Jun 2024
Hierarchical Emotion Prediction and Control in Text-to-Speech Synthesis
Hierarchical Emotion Prediction and Control in Text-to-Speech Synthesis
Sho Inoue
Kun Zhou
Shuai Wang
Haizhou Li
34
8
0
15 May 2024
The VoicePrivacy 2024 Challenge Evaluation Plan
The VoicePrivacy 2024 Challenge Evaluation Plan
N. Tomashenko
Xiaoxiao Miao
Pierre Champion
Sarina Meyer
Xin Wang
Emmanuel Vincent
Michele Panariello
Nicholas W. D. Evans
Junichi Yamagishi
Massimiliano Todisco
38
21
0
03 Apr 2024
Daisy-TTS: Simulating Wider Spectrum of Emotions via Prosody Embedding
  Decomposition
Daisy-TTS: Simulating Wider Spectrum of Emotions via Prosody Embedding Decomposition
Rendi Chevi
Alham Fikri Aji
25
2
0
22 Feb 2024
Cross-speaker Emotion Transfer by Manipulating Speech Style Latents
Cross-speaker Emotion Transfer by Manipulating Speech Style Latents
Suhee Jo
Younggun Lee
Yookyung Shin
Yeongtae Hwang
Taesu Kim
13
3
0
15 Mar 2023
Improving Prosody for Cross-Speaker Style Transfer by Semi-Supervised
  Style Extractor and Hierarchical Modeling in Speech Synthesis
Improving Prosody for Cross-Speaker Style Transfer by Semi-Supervised Style Extractor and Hierarchical Modeling in Speech Synthesis
Chunyu Qiang
Peng Yang
Hao Che
Ying Zhang
Xiaorui Wang
Zhong-ming Wang
46
9
0
14 Mar 2023
Robust Vocal Quality Feature Embeddings for Dysphonic Voice Detection
Robust Vocal Quality Feature Embeddings for Dysphonic Voice Detection
Jianwei Zhang
J. Liss
Suren Jayasuriya
Visar Berisha
36
6
0
17 Nov 2022
EmoDiff: Intensity Controllable Emotional Text-to-Speech with Soft-Label
  Guidance
EmoDiff: Intensity Controllable Emotional Text-to-Speech with Soft-Label Guidance
Yiwei Guo
Chenpeng Du
Xie Chen
K. Yu
DiffM
54
40
0
17 Nov 2022
Delivering Speaking Style in Low-resource Voice Conversion with
  Multi-factor Constraints
Delivering Speaking Style in Low-resource Voice Conversion with Multi-factor Constraints
Zhichao Wang
Xinsheng Wang
Linfu Xie
Yuan-Jui Chen
Qiao Tian
Yuping Wang
25
5
0
16 Nov 2022
A unified one-shot prosody and speaker conversion system with
  self-supervised discrete speech units
A unified one-shot prosody and speaker conversion system with self-supervised discrete speech units
Li-Wei Chen
Shinji Watanabe
Alexander I. Rudnicky
25
6
0
12 Nov 2022
EmoFake: An Initial Dataset for Emotion Fake Audio Detection
EmoFake: An Initial Dataset for Emotion Fake Audio Detection
Yan Zhao
Jiangyan Yi
J. Tao
Chenglong Wang
Xiaohui Zhang
Yongfeng Dong
24
10
0
10 Nov 2022
An Overview of Affective Speech Synthesis and Conversion in the Deep
  Learning Era
An Overview of Affective Speech Synthesis and Conversion in the Deep Learning Era
Andreas Triantafyllopoulos
Björn W. Schuller
Gokcce .Iymen
M. Sezgin
Xiangheng He
...
Shuo Liu
Silvan Mertes
Elisabeth André
Ruibo Fu
Jianhua Tao
20
53
0
06 Oct 2022
Speech Synthesis with Mixed Emotions
Speech Synthesis with Mixed Emotions
Kun Zhou
Berrak Sisman
R. Rana
B.W.Schuller
Haizhou Li
14
44
0
11 Aug 2022
Controllable Data Generation by Deep Learning: A Review
Controllable Data Generation by Deep Learning: A Review
Shiyu Wang
Yuanqi Du
Xiaojie Guo
Bo Pan
Zhaohui Qin
Liang Zhao
33
28
0
19 Jul 2022
CTL-MTNet: A Novel CapsNet and Transfer Learning-Based Mixed Task Net
  for the Single-Corpus and Cross-Corpus Speech Emotion Recognition
CTL-MTNet: A Novel CapsNet and Transfer Learning-Based Mixed Task Net for the Single-Corpus and Cross-Corpus Speech Emotion Recognition
Xin-Cheng Wen
Jiaxin Ye
Yan Luo
Yong-mei Xu
Xinyu Wang
Changqing Wu
Kun Liu
29
30
0
18 Jul 2022
Accurate Emotion Strength Assessment for Seen and Unseen Speech Based on
  Data-Driven Deep Learning
Accurate Emotion Strength Assessment for Seen and Unseen Speech Based on Data-Driven Deep Learning
Rui Liu
Berrak Sisman
Björn Schuller
Guanglai Gao
Haizhou Li
22
11
0
15 Jun 2022
StyleTTS: A Style-Based Generative Model for Natural and Diverse
  Text-to-Speech Synthesis
StyleTTS: A Style-Based Generative Model for Natural and Diverse Text-to-Speech Synthesis
Yinghao Aaron Li
Cong Han
N. Mesgarani
36
38
0
30 May 2022
GenerSpeech: Towards Style Transfer for Generalizable Out-Of-Domain
  Text-to-Speech
GenerSpeech: Towards Style Transfer for Generalizable Out-Of-Domain Text-to-Speech
Rongjie Huang
Yi Ren
Jinglin Liu
Chenye Cui
Zhou Zhao
OODD
VLM
115
34
0
15 May 2022
An Overview of Recent Work in Media Forensics: Methods and Threats
An Overview of Recent Work in Media Forensics: Methods and Threats
Kratika Bhagtani
A. Yadav
Emily R. Bartusiak
Ziyue Xiang
Ruiting Shao
Sriram Baireddy
Edward J. Delp
AAML
52
25
0
26 Apr 2022
Textless Speech Emotion Conversion using Discrete and Decomposed
  Representations
Textless Speech Emotion Conversion using Discrete and Decomposed Representations
Felix Kreuk
Adam Polyak
Jade Copet
Eugene Kharitonov
Tu Nguyen
M. Rivière
Wei-Ning Hsu
Abdel-rahman Mohamed
Emmanuel Dupoux
Yossi Adi
25
29
0
14 Nov 2021
Disentanglement of Emotional Style and Speaker Identity for Expressive
  Voice Conversion
Disentanglement of Emotional Style and Speaker Identity for Expressive Voice Conversion
Zongyang Du
Berrak Sisman
Kun Zhou
Haizhou Li
11
24
0
20 Oct 2021
Fine-grained style control in Transformer-based Text-to-speech Synthesis
Fine-grained style control in Transformer-based Text-to-speech Synthesis
Li-Wei Chen
Alexander I. Rudnicky
88
29
0
12 Oct 2021
StarGANv2-VC: A Diverse, Unsupervised, Non-parallel Framework for
  Natural-Sounding Voice Conversion
StarGANv2-VC: A Diverse, Unsupervised, Non-parallel Framework for Natural-Sounding Voice Conversion
Yinghao Aaron Li
A. Zare
N. Mesgarani
30
98
0
21 Jul 2021
EMOVIE: A Mandarin Emotion Speech Dataset with a Simple Emotional
  Text-to-Speech Model
EMOVIE: A Mandarin Emotion Speech Dataset with a Simple Emotional Text-to-Speech Model
Chenye Cui
Yi Ren
Jinglin Liu
Feiyang Chen
Rongjie Huang
Ming Lei
Zhou Zhao
24
35
0
17 Jun 2021
Emotional Voice Conversion: Theory, Databases and ESD
Emotional Voice Conversion: Theory, Databases and ESD
Kun Zhou
Berrak Sisman
Rui Liu
Haizhou Li
25
168
0
31 May 2021
Limited Data Emotional Voice Conversion Leveraging Text-to-Speech:
  Two-stage Sequence-to-Sequence Training
Limited Data Emotional Voice Conversion Leveraging Text-to-Speech: Two-stage Sequence-to-Sequence Training
Kun Zhou
Berrak Sisman
Haizhou Li
18
27
0
31 Mar 2021
GraphSpeech: Syntax-Aware Graph Attention Network For Neural Speech
  Synthesis
GraphSpeech: Syntax-Aware Graph Attention Network For Neural Speech Synthesis
Rui Liu
Berrak Sisman
Haizhou Li
18
24
0
23 Oct 2020
Expressive TTS Training with Frame and Style Reconstruction Loss
Expressive TTS Training with Frame and Style Reconstruction Loss
Rui Liu
Berrak Sisman
Guanglai Gao
Haizhou Li
32
73
0
04 Aug 2020
Non-parallel Emotion Conversion using a Deep-Generative Hybrid Network
  and an Adversarial Pair Discriminator
Non-parallel Emotion Conversion using a Deep-Generative Hybrid Network and an Adversarial Pair Discriminator
Ravi Shankar
Jacob Sager
A. Venkataraman
GAN
35
18
0
25 Jul 2020
1