Predicting Expressive Speaking Style From Text In End-To-End Speech Synthesis

4 August 2018

Yuxuan Wang

Papers citing "Predicting Expressive Speaking Style From Text In End-To-End Speech Synthesis"

32 / 32 papers shown

Title
On the Cost and Benefits of Training Context with Utterance or Full Conversation Training: A Comparative Study Hyouin Liu Zhikuan Zhang 34 0 0 12 May 2025
Emotional Dimension Control in Language Model-Based Text-to-Speech: Spanning a Broad Spectrum of Human Emotions Kun Zhou You Zhang Shengkui Zhao Hao Wang Zexu Pan ... Chongjia Ni Yukun Ma Trung Hieu Nguyen J. Yip Bin Ma 61 5 0 25 Sep 2024
Hierarchical Emotion Prediction and Control in Text-to-Speech Synthesis Sho Inoue Kun Zhou Shuai Wang Haizhou Li 39 8 0 15 May 2024
CALM: Contrastive Cross-modal Speaking Style Modeling for Expressive Text-to-Speech Synthesis Yi Meng Xiang Li Zhiyong Wu Tingtian Li Zixun Sun Xinyu Xiao Chi Sun Hui Zhan Helen Meng 14 0 0 30 Aug 2023
The DeepZen Speech Synthesis System for Blizzard Challenge 2023 C. Veaux R. Maia Spyridoula Papendreou 25 1 0 30 Aug 2023
MSStyleTTS: Multi-Scale Style Modeling with Hierarchical Context Information for Expressive Speech Synthesis Shunwei Lei Yixuan Zhou Liyang Chen Zhiyong Wu Xixin Wu Shiyin Kang Helen Meng 35 7 0 29 Jul 2023
Going Retro: Astonishingly Simple Yet Effective Rule-based Prosody Modelling for Speech Synthesis Simulating Emotion Dimensions Felix Burkhardt U. Reichel F. Eyben Björn Schuller 23 0 0 05 Jul 2023
ChatGPT-EDSS: Empathetic Dialogue Speech Synthesis Trained from ChatGPT-derived Context Word Embeddings Yuki Saito Shinnosuke Takamichi Eiji Iimori Kentaro Tachibana Hiroshi Saruwatari 51 11 0 23 May 2023
Context-aware Coherent Speaking Style Prediction with Hierarchical Transformers for Audiobook Speech Synthesis Shunwei Lei Yixuan Zhou Liyang Chen Zhiyong Wu Shiyin Kang Helen Meng 38 6 0 13 Apr 2023
An Overview of Affective Speech Synthesis and Conversion in the Deep Learning Era Andreas Triantafyllopoulos Björn W. Schuller Gökçe İymen M. Sezgin Xiangheng He ... Shuo Liu Silvan Mertes Elisabeth André Ruibo Fu Jianhua Tao 20 53 0 06 Oct 2022
Speech Synthesis with Mixed Emotions Kun Zhou Berrak Sisman R. Rana B. W. Schuller Haizhou Li 27 44 0 11 Aug 2022
BERT, can HE predict contrastive focus? Predicting and controlling prominence in neural TTS using a language model Brooke Stephenson Laurent Besacier Laurent Girin Thomas Hueber 25 8 0 04 Jul 2022
Self-supervised Context-aware Style Representation for Expressive Speech Synthesis Yihan Wu Xi Wang S. Zhang Lei He Ruihua Song J. Nie 42 15 0 25 Jun 2022
Towards Multi-Scale Speaking Style Modelling with Hierarchical Context Information for Mandarin Speech Synthesis Shunwei Lei Yixuan Zhou Liyang Chen Jiankun Hu Zhiyong Wu Shiyin Kang Helen Meng 27 10 0 06 Apr 2022
Towards Expressive Speaking Style Modelling with Hierarchical Context Information for Mandarin Speech Synthesis Shunwei Lei Yixuan Zhou Liyang Chen Zhiyong Wu Shiyin Kang Helen Meng 28 12 0 23 Mar 2022
Distribution augmentation for low-resource expressive text-to-speech Mateusz Lajszczak Animesh Prasad Arent van Korlaar Bajibabu Bollepalli Antonio Bonafonte ... M. Nicolis Alexis Moinet Thomas Drugman Trevor Wood Elena Sokolova 33 7 0 13 Feb 2022
Disentangling Style and Speaker Attributes for TTS Style Transfer Xiaochun An Frank Soong Lei Xie 68 18 0 24 Jan 2022
MsEmoTTS: Multi-scale emotion transfer, prediction, and control for emotional speech synthesis Yinjiao Lei Shan Yang Xinsheng Wang Lei Xie 27 73 0 17 Jan 2022
V2C: Visual Voice Cloning Qi Chen Yuanqing Li Yuankai Qi Jiaqiu Zhou Mingkui Tan Qi Wu VGen 33 23 0 25 Nov 2021
DelightfulTTS: The Microsoft Speech Synthesis System for Blizzard Challenge 2021 Yanqing Liu Rui Shao G. Wang Kuan Chen Bohan Li Pong C. Yuen Jinzhu Li Lei He Sheng Zhao 39 55 0 25 Oct 2021
A Survey on Neural Speech Synthesis Xu Tan Tao Qin Frank Soong Tie-Yan Liu AI4TS 18 352 0 29 Jun 2021
Improving Performance of Seen and Unseen Speech Style Transfer in End-to-end Neural TTS Xiaochun An Frank Soong Lei Xie 42 9 0 18 Jun 2021
EMOVIE: A Mandarin Emotion Speech Dataset with a Simple Emotional Text-to-Speech Model Chenye Cui Yi Ren Jinglin Liu Feiyang Chen Rongjie Huang Ming Lei Zhou Zhao 24 35 0 17 Jun 2021
GraphPB: Graphical Representations of Prosody Boundary in Speech Synthesis Aolan Sun Jianzong Wang Ning Cheng Huayi Peng Zhen Zeng Lingwei Kong Jing Xiao 16 9 0 03 Dec 2020
Fine-grained Emotion Strength Transfer, Control and Prediction for Emotional Speech Synthesis Yinjiao Lei Shan Yang Lei Xie 27 55 0 17 Nov 2020
Hierarchical Prosody Modeling for Non-Autoregressive Speech Synthesis C. Chien Hung-yi Lee 29 36 0 12 Nov 2020
Controllable neural text-to-speech synthesis using intuitive prosodic features T. Raitio Ramya Rasipuram D. Castellani 36 66 0 14 Sep 2020
Expressive TTS Training with Frame and Style Reconstruction Loss Rui Liu Berrak Sisman Guanglai Gao Haizhou Li 39 73 0 04 Aug 2020
Attentron: Few-Shot Text-to-Speech Utilizing Attention-Based Variable-Length Embedding Seungwoo Choi Seungju Han Dongyoung Kim S. Ha 37 65 0 18 May 2020
Dynamic Prosody Generation for Speech Synthesis using Linguistics-Driven Acoustic Embedding Selection Shubhi Tyagi M. Nicolis Jonas Rohnke Thomas Drugman Jaime Lorenzo-Trueba 32 32 0 02 Dec 2019
Using generative modelling to produce varied intonation for speech synthesis Zack Hodari O. Watts Simon King 29 29 0 10 Jun 2019
Robust and fine-grained prosody control of end-to-end speech synthesis Younggun Lee Jonathan Le Roux 9 147 0 06 Nov 2018