Emotional End-to-End Neural Speech Synthesizer

15 November 2017

Papers citing "Emotional End-to-End Neural Speech Synthesizer"

50 / 55 papers shown

Title
Prompt-Unseen-Emotion: Zero-shot Expressive Speech Synthesis with Prompt-LLM Contextual Knowledge for Mixed Emotions Xiaoxue Gao Huayun Zhang Nancy F. Chen 56 0 0 03 Jun 2025
Making Social Platforms Accessible: Emotion-Aware Speech Generation with Integrated Text Analysis Suparna De Ionut Bostan Nishanth Sastry 122 0 0 24 Oct 2024
Emotional Dimension Control in Language Model-Based Text-to-Speech: Spanning a Broad Spectrum of Human Emotions Kun Zhou You Zhang Shengkui Zhao Hao Wang Zexu Pan ... Chongjia Ni Yukun Ma Trung Hieu Nguyen J. Yip Bin Ma 127 7 0 25 Sep 2024
Emo-DPO: Controllable Emotional Speech Synthesis through Direct Preference Optimization Xiaoxue Gao Chen Zhang Yiming Chen Huayun Zhang Nancy F. Chen 112 11 0 16 Sep 2024
Laugh Now Cry Later: Controlling Time-Varying Emotional States of Flow-Matching-Based Zero-Shot Text-to-Speech Haibin Wu Xiaofei Wang Sefik Emre Eskimez Manthan Thakker Daniel Tompkins ... Canrun Li Zhen Xiao Sheng Zhao Jinyu Li Naoyuki Kanda 118 9 0 17 Jul 2024
DEX-TTS: Diffusion-based EXpressive Text-to-Speech with Style Modeling on Time Variability Hyun Joon Park Jin Sob Kim Wooseok Shin Sung Won Han DiffM 70 3 0 27 Jun 2024
Style Mixture of Experts for Expressive Text-To-Speech Synthesis Ahad Jawaid Shreeram Suresh Chandra Junchen Lu Berrak Sisman MoE 102 1 0 05 Jun 2024
RSET: Remapping-based Sorting Method for Emotion Transfer Speech Synthesis Haoxiang Shi Jianzong Wang Xulong Zhang Ning Cheng Jun Yu Jing Xiao 73 2 0 27 May 2024
Exploring speech style spaces with language models: Emotional TTS without emotion labels Shreeram Suresh Chandra Zongyang Du Berrak Sisman 76 2 0 18 May 2024
Humane Speech Synthesis through Zero-Shot Emotion and Disfluency Generation Rohan Chaudhury Mihir Godbole Aakash Garg Jinsil Hwaryoung Seo 77 0 0 31 Mar 2024
Daisy-TTS: Simulating Wider Spectrum of Emotions via Prosody Embedding Decomposition Rendi Chevi Alham Fikri Aji 108 3 0 22 Feb 2024
MM-TTS: Multi-modal Prompt based Style Transfer for Expressive Text-to-Speech Synthesis Wenhao Guan Yishuang Li Tao Li Hukai Huang Feng Wang Jiayan Lin Lingyan Huang Lin Li Q. Hong 85 14 0 17 Dec 2023
MSM-VC: High-fidelity Source Style Transfer for Non-Parallel Voice Conversion by Multi-scale Style Modeling Zhichao Wang Xinsheng Wang Qicong Xie Tao Li Linfu Xie Qiao Tian Yuping Wang 114 4 0 03 Sep 2023
EmoSpeech: Guiding FastSpeech2 Towards Emotional Text to Speech Daria Diatlova V. Shutov 93 9 0 28 Jun 2023
CASEIN: Cascading Explicit and Implicit Control for Fine-grained Emotion Intensity Regulation Yuhao Cui Xiongwei Wang Zhongzhou Zhao Wei Zhou Haiqing Chen 64 1 0 27 Jun 2023
Rhythm-controllable Attention with High Robustness for Long Sentence Speech Synthesis Dengfeng Ke Yayue Deng Yukang Jia Jinlong Xue Qi Luo Ya Li Jianqing Sun Jiaen Liang Binghuai Lin 39 0 0 05 Jun 2023
Affective social anthropomorphic intelligent system Md. Adyelullahil Mamun Hasnat Md. Abdullah Md. Golam Rabiul Alam Muhammad Mehedi Hassan Md. Zia Uddin 52 1 0 19 Apr 2023
Fine-grained Emotional Control of Text-To-Speech: Learning To Rank Inter- And Intra-Class Emotion Intensities Shijun Wang Jón Guðnason Damian Borth 83 10 0 02 Mar 2023
Generative Emotional AI for Speech Emotion Recognition: The Case for Synthetic Emotional Speech Augmentation Abdullah Shahid S. Latif Junaid Qadir 64 23 0 10 Jan 2023
Emotion Selectable End-to-End Text-based Speech Editing Tao Wang Jiangyan Yi Ruibo Fu J. Tao Zhengqi Wen Chu Yuan Zhang 76 2 0 20 Dec 2022
Contextual Expressive Text-to-Speech Jianhong Tu Zeyu Cui Xiaohuan Zhou Siqi Zheng Kaiqin Hu Ju Fan Chang Zhou 51 3 0 26 Nov 2022
Multi-Speaker Expressive Speech Synthesis via Multiple Factors Decoupling Xinfa Zhu Yinjiao Lei Kun Song Yongmao Zhang Tao Li Linfu Xie 75 17 0 19 Nov 2022
EmoDiff: Intensity Controllable Emotional Text-to-Speech with Soft-Label Guidance Yiwei Guo Chenpeng Du Xie Chen K. Yu DiffM 132 44 0 17 Nov 2022
Semi-supervised learning for continuous emotional intensity controllable speech synthesis with disentangled representations Yoorim Oh Juheon Lee Yoseob Han Kyogu Lee 67 3 0 11 Nov 2022
An Overview of Affective Speech Synthesis and Conversion in the Deep Learning Era Andreas Triantafyllopoulos Björn W. Schuller Gökçe İymen M. Sezgin Xiangheng He ... Shuo Liu Silvan Mertes Elisabeth André Ruibo Fu Jianhua Tao 115 57 0 06 Oct 2022
Speech Synthesis with Mixed Emotions Kun Zhou Berrak Sisman R. Rana B. W. Schuller Haizhou Li 87 47 0 11 Aug 2022
Text-driven Emotional Style Control and Cross-speaker Style Transfer in Neural TTS Yookyung Shin Younggun Lee Suhee Jo Yeongtae Hwang Taesu Kim 100 14 0 13 Jul 2022
Cross-speaker Emotion Transfer Based On Prosody Compensation for End-to-End Speech Synthesis Tao Li Xinsheng Wang Qicong Xie Zhichao Wang Ming Jiang Linfu Xie 101 16 0 04 Jul 2022
iEmoTTS: Toward Robust Cross-Speaker Emotion Transfer and Control for Speech Synthesis based on Disentanglement between Prosody and Timbre Guangyan Zhang Ying Qin Weinan Zhang Jialun Wu Mei Li Yu Gai Feijun Jiang Tan Lee 108 27 0 29 Jun 2022
Self-supervised Context-aware Style Representation for Expressive Speech Synthesis Yihan Wu Xi Wang S. Zhang Lei He Ruihua Song J. Nie 102 15 0 25 Jun 2022
ReCAB-VAE: Gumbel-Softmax Variational Inference Based on Analytic Divergence Sangshin Oh Seyun Um Hong-Goo Kang BDL DRL 43 2 0 09 May 2022
STUDIES: Corpus of Japanese Empathetic Dialogue Speech Towards Friendly Voice Agent Yuki Saito Yuto Nishimura Shinnosuke Takamichi Kentaro Tachibana Hiroshi Saruwatari 126 12 0 28 Mar 2022
Robotic Speech Synthesis: Perspectives on Interactions, Scenarios, and Ethics Yuanchao Li Catherine Lai 25 5 0 17 Mar 2022
A Review of Affective Generation Models Guangtao Nie Yibing Zhan 68 2 0 22 Feb 2022
Opencpop: A High-Quality Open Source Chinese Popular Song Corpus for Singing Voice Synthesis Yu Wang Xinsheng Wang Pengcheng Zhu Jie Wu Hanzhao Li Heyang Xue Yongmao Zhang Lei Xie Mengxiao Bi 109 103 0 19 Jan 2022
MsEmoTTS: Multi-scale emotion transfer, prediction, and control for emotional speech synthesis Yinjiao Lei Shan Yang Xinsheng Wang Lei Xie 81 75 0 17 Jan 2022
Multi-speaker Multi-style Text-to-speech Synthesis With Single-speaker Single-style Training Data Scenarios Qicong Xie Tao Li Xinsheng Wang Zhichao Wang Lei Xie Guoqiao Yu Guanglu Wan 86 11 0 23 Dec 2021
Multi-speaker Emotional Text-to-speech Synthesizer Sungjae Cho Soo-Young Lee 41 1 0 07 Dec 2021
Improving Emotional Speech Synthesis by Using SUS-Constrained VAE and Text Encoder Aggregation Fengyu Yang Jian Luan Yujun Wang 137 5 0 19 Oct 2021
Cross-speaker Emotion Transfer Based on Speaker Condition Layer Normalization and Semi-Supervised Training in Text-To-Speech Pengfei Wu Junjie Pan Chenchang Xu Junhui Zhang Lin Wu Xiang Yin Zejun Ma 64 16 0 08 Oct 2021
MASS: Multi-task Anthropomorphic Speech Synthesis Framework Jinyin Chen Linhui Ye Zhaoyan Ming 65 7 0 10 May 2021
Controllable Emotion Transfer For End-to-End Speech Synthesis Tao Li Shan Yang Liumeng Xue Lei Xie 79 74 0 17 Nov 2020
Fine-grained Emotion Strength Transfer, Control and Prediction for Emotional Speech Synthesis Yinjiao Lei Shan Yang Lei Xie 88 56 0 17 Nov 2020
Emotion controllable speech synthesis using emotion-unlabeled dataset with the assistance of cross-domain speech emotion recognition Xiong Cai Dongyang Dai Zhiyong Wu Xiang Li Jingbei Li Helen Meng 94 67 0 26 Oct 2020
Emotional Voice Conversion using Multitask Learning with Text-to-speech Tae-Ho Kim Sungjae Cho Shinkook Choi Sejik Park Soo-Young Lee 92 40 0 11 Nov 2019
Emotional speech synthesis with rich and granularized control Seyun Um Sangshin Oh Kyungguen Byun Inseon Jang C. Ahn Hong-Goo Kang 85 90 0 05 Nov 2019
Sequence to Sequence Neural Speech Synthesis with Prosody Modification Capabilities Slava Shechtman A. Sorin 59 33 0 23 Sep 2019
End-to-End Emotional Speech Synthesis Using Style Tokens and Semi-Supervised Training Peng Wu Zhenhua Ling Li-Juan Liu Yuan Jiang Hong-Chuan Wu Lirong Dai 98 72 0 26 Jun 2019
ET-GAN: Cross-Language Emotion Transfer Based on Cycle-Consistent Generative Adversarial Networks Xiaoqi Jia Jianwei Tai Hang Zhou Yakai Li Weijuan Zhang Haichao Du Qingjia Huang GAN 43 6 0 27 May 2019
Visualization and Interpretation of Latent Spaces for Controlling Expressive Speech Synthesis through Audio Analysis Noé Tits Fengna Wang Kevin El Haddad Vincent Pagel Thierry Dutoit DiffM 91 39 0 27 Mar 2019