Enhancing Speaking Styles in Conversational Text-to-Speech Synthesis
with Graph-based Multi-modal Context Modeling

11 June 2021

Zhiyong Wu

Papers citing "Enhancing Speaking Styles in Conversational Text-to-Speech Synthesis with Graph-based Multi-modal Context Modeling"

Title | Authors | Tags | Counts | Date
SpeechDialogueFactory: Generating High-Quality Speech Dialogue Data to Accelerate Your Speech-LLM Development | Minghan Wang, Ye Bai, Yanjie Wang, Thuy-Trang Vu, Ehsan Shareghi, Gholamreza Haffari | - | 90 0 0 | 31 Mar 2025
Controllable Context-aware Conversational Speech Synthesis | Jian Cong, Shan Yang, Na Hu, Guangzhi Li, Lei Xie, Dan Su | - | 45 30 0 | 21 Jun 2021
Towards Multi-Scale Style Control for Expressive Speech Synthesis | Xiang Li, Changhe Song, Jingbei Li, Zhiyong Wu, Jia Jia, Helen Meng | - | 39 47 0 | 08 Apr 2021
Fine-grained Emotion Strength Transfer, Control and Prediction for Emotional Speech Synthesis | Yinjiao Lei, Shan Yang, Lei Xie | - | 55 56 0 | 17 Nov 2020
HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis | Jungil Kong, Jaehyeon Kim, Jaekyoung Bae | - | 164 1,928 0 | 12 Oct 2020
FastSpeech 2: Fast and High-Quality End-to-End Text to Speech | Yi Ren, Chenxu Hu, Xu Tan, Tao Qin, Sheng Zhao, Zhou Zhao, Tie-Yan Liu | - | 105 1,396 0 | 08 Jun 2020
Conversational End-to-End TTS for Voice Agent | Haohan Guo, Shaofei Zhang, Frank Soong, Lei He, Lei Xie | - | 39 68 0 | 21 May 2020
Revisiting Pre-Trained Models for Chinese Natural Language Processing | Yiming Cui, Wanxiang Che, Ting Liu, Bing Qin, Shijin Wang, Guoping Hu | - | 77 697 0 | 29 Apr 2020
PyTorch: An Imperative Style, High-Performance Deep Learning Library | Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, ..., Sasank Chilamkurthy, Benoit Steiner, Lu Fang, Junjie Bai, Soumith Chintala | ODL | 396 42,299 0 | 03 Dec 2019
DialogueGCN: A Graph Convolutional Neural Network for Emotion Recognition in Conversation | Deepanway Ghosal, Navonil Majumder, Soujanya Poria, Niyati Chhaya, Alexander Gelbukh | - | 77 512 0 | 30 Aug 2019
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding | Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova | VLM, SSL, SSeg | 1.6K 94,511 0 | 11 Oct 2018
Towards End-to-End Prosody Transfer for Expressive Speech Synthesis with Tacotron | RJ Skerry-Ryan, Eric Battenberg, Y. Xiao, Yuxuan Wang, Daisy Stanton, Joel Shor, Ron J. Weiss, R. Clark, Rif A. Saurous | - | 54 554 0 | 24 Mar 2018
Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis | Yuxuan Wang, Daisy Stanton, Yu Zhang, RJ Skerry-Ryan, Eric Battenberg, Joel Shor, Y. Xiao, Fei Ren, Ye Jia, Rif A. Saurous | - | 64 825 0 | 23 Mar 2018
Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions | Jonathan Shen, Ruoming Pang, Ron J. Weiss, M. Schuster, Navdeep Jaitly, ..., Yuxuan Wang, RJ Skerry-Ryan, Rif A. Saurous, Yannis Agiomyrgiannakis, Yonghui Wu | - | 77 2,694 0 | 16 Dec 2017
Attention Is All You Need | Ashish Vaswani, Noam M. Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan Gomez, Lukasz Kaiser, Illia Polosukhin | 3DV | 654 130,942 0 | 12 Jun 2017
Tacotron: Towards End-to-End Speech Synthesis | Yuxuan Wang, RJ Skerry-Ryan, Daisy Stanton, Yonghui Wu, Ron J. Weiss, ..., Samy Bengio, Quoc V. Le, Yannis Agiomyrgiannakis, R. Clark, Rif A. Saurous | - | 155 1,819 0 | 29 Mar 2017
Quasi-Recurrent Neural Networks | James Bradbury, Stephen Merity, Caiming Xiong, R. Socher | - | 138 441 0 | 05 Nov 2016
Domain-Adversarial Training of Neural Networks | Yaroslav Ganin, E. Ustinova, Hana Ajakan, Pascal Germain, Hugo Larochelle, François Laviolette, M. Marchand, Victor Lempitsky | GAN, OOD | 366 9,467 0 | 28 May 2015
Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation | Kyunghyun Cho, B. V. Merrienboer, Çağlar Gülçehre, Dzmitry Bahdanau, Fethi Bougares, Holger Schwenk, Yoshua Bengio | AIMat | 951 23,310 0 | 03 Jun 2014