Improved Prosodic Clustering for Multispeaker and Speaker-independent
Phoneme-level Prosody Control

Improved Prosodic Clustering for Multispeaker and Speaker-independent Phoneme-level Prosody Control

19 November 2021

Myrsini Christidou

Alexandra Vioni

Nikolaos Ellinas

Panos Kakoulidis

Aimilios Chalamandaris

Pirros Tsiakoulis

ArXiv (abs)PDF HTML

Papers citing "Improved Prosodic Clustering for Multispeaker and Speaker-independent Phoneme-level Prosody Control"

14 / 14 papers shown

Title
Prosodic Clustering for Phoneme-level Prosody Control in End-to-End Speech Synthesis Alexandra Vioni Myrsini Christidou Nikolaos Ellinas G. Vamvoukakis Panos Kakoulidis Taehoon Kim June Sig Sung Hyoungmin Park Aimilios Chalamandaris Pirros Tsiakoulis 45 11 0 19 Nov 2021
Expressive Neural Voice Cloning Paarth Neekhara Shehzeen Samarah Hussain Shlomo Dubnov F. Koushanfar Julian McAuley DiffM 51 30 0 30 Jan 2021
Controllable neural text-to-speech synthesis using intuitive prosodic features T. Raitio Ramya Rasipuram D. Castellani 61 66 0 14 Sep 2020
Bunched LPCNet : Vocoder for Low-cost Neural Text-To-Speech Systems Ravichander Vipperla Sangjun Park Kihyun Choo Samin S. Ishtiaq Kyoungbo Min S. Bhattacharya Abhinav Mehrotra Alberto Gil C. P. Ramos Nicholas D. Lane 58 26 0 11 Aug 2020
Fully-hierarchical fine-grained prosody modeling for interpretable speech synthesis Guangzhi Sun Yu Zhang Ron J. Weiss Yuanbin Cao Heiga Zen Yonghui Wu 51 130 0 06 Feb 2020
Generating diverse and natural text-to-speech samples using a quantized fine-grained VAE and auto-regressive prosody prior Guangzhi Sun Yu Zhang Ron J. Weiss Yuan Cao Heiga Zen Andrew Rosenberg Bhuvana Ramabhadran Yonghui Wu DiffM 79 93 0 06 Feb 2020
Mellotron: Multispeaker expressive voice synthesis by conditioning on rhythm, pitch and global style tokens Rafael Valle Jason Chun Lok Li R. Prenger Bryan Catanzaro 72 149 0 26 Oct 2019
Learning to Speak Fluently in a Foreign Language: Multilingual Speech Synthesis and Cross-Language Voice Cloning Yu Zhang Ron J. Weiss Heiga Zen Yonghui Wu Zhiwen Chen RJ Skerry-Ryan Ye Jia Andrew Rosenberg Bhuvana Ramabhadran 47 188 0 09 Jul 2019
Fine-grained robust prosody transfer for single-speaker neural text-to-speech V. Klimkov S. Ronanki Jonas Rohnke Thomas Drugman AI4TS 69 82 0 04 Jul 2019
Learning latent representations for style control and transfer in end-to-end speech synthesis Ya-Jie Zhang Shifeng Pan Lei He Zhenhua Ling BDL SSL DRL 53 229 0 11 Dec 2018
Robust and fine-grained prosody control of end-to-end speech synthesis Younggun Lee Jonathan Le Roux 56 147 0 06 Nov 2018
Hierarchical Generative Modeling for Controllable Speech Synthesis Wei-Ning Hsu Yu Zhang Ron J. Weiss Heiga Zen Yonghui Wu ... Ye Jia Zhiwen Chen Jonathan Shen Patrick Nguyen Ruoming Pang BDL 72 275 0 16 Oct 2018
Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions Jonathan Shen Ruoming Pang Ron J. Weiss M. Schuster Navdeep Jaitly ... Yuxuan Wang RJ Skerry-Ryan Rif A. Saurous Yannis Agiomyrgiannakis Yonghui Wu 79 2,698 0 16 Dec 2017
Tacotron: Towards End-to-End Speech Synthesis Yuxuan Wang RJ Skerry-Ryan Daisy Stanton Yonghui Wu Ron J. Weiss ... Samy Bengio Quoc V. Le Yannis Agiomyrgiannakis R. Clark Rif A. Saurous 160 1,825 0 29 Mar 2017