Mellotron: Multispeaker expressive voice synthesis by conditioning on rhythm, pitch and global style tokens

26 October 2019

Papers citing "Mellotron: Multispeaker expressive voice synthesis by conditioning on rhythm, pitch and global style tokens"

28 / 78 papers shown

A Survey on Neural Speech Synthesis Xu Tan Tao Qin Frank Soong Tie-Yan Liu AI4TS 18 352 0 29 Jun 2021
Synchronising speech segments with musical beats in Mandarin and English singing Cong Zhang Jian Zhu 6 0 0 18 Jun 2021
Global Rhythm Style Transfer Without Text Transcriptions Kaizhi Qian Yang Zhang Shiyu Chang Jinjun Xiong Chuang Gan David D. Cox M. Hasegawa-Johnson 30 20 0 16 Jun 2021
An objective evaluation of the effects of recording conditions and speaker characteristics in multi-speaker deep neural speech synthesis Beáta Lőrincz Adriana Stan M. Giurgiu 13 2 0 03 Jun 2021
MASS: Multi-task Anthropomorphic Speech Synthesis Framework Jinyin Chen Linhui Ye Zhaoyan Ming 23 6 0 10 May 2021
Exploring emotional prototypes in a high dimensional TTS latent space Pol van Rijn Silvan Mertes Dominik Schiller Peter M. C. Harrison P. Larrouy-Maestri Elisabeth André Nori Jacoby 28 12 0 05 May 2021
Review of end-to-end speech synthesis technology based on deep learning Zhaoxi Mu Xinyu Yang Yizhuo Dong AuLLM ALM 26 24 0 20 Apr 2021
Assem-VC: Realistic Voice Conversion by Assembling Modern Speech Synthesis Techniques Kang-Wook Kim Seung-won Park Junhyeok Lee Myun-chul Joe 11 28 0 02 Apr 2021
STYLER: Style Factor Modeling with Rapidity and Robustness via Speech Decomposition for Expressive and Controllable Neural Text to Speech Keon Lee Kyumin Park Daeyoung Kim 24 30 0 17 Mar 2021
Adversarially learning disentangled speech representations for robust multi-factor voice conversion Jie Wang Jingbei Li Xintao Zhao Zhiyong Wu Shiyin Kang Helen Meng DRL 36 29 0 30 Jan 2021
Expressive Neural Voice Cloning Paarth Neekhara Shehzeen Samarah Hussain Shlomo Dubnov F. Koushanfar Julian McAuley DiffM 19 30 0 30 Jan 2021
Using previous acoustic context to improve Text-to-Speech synthesis Pilar Oplustil Gallegos Simon King 29 11 0 07 Dec 2020
Learn2Sing: Target Speaker Singing Voice Synthesis by learning from a Singing Teacher Heyang Xue Shan Yang Yinjiao Lei Lei Xie Xiulin Li 6 10 0 17 Nov 2020
Speech Synthesis and Control Using Differentiable DSP Giorgio Fabbro Vladimir Golkov Thomas Kemp Daniel Cremers 23 12 0 28 Oct 2020
Sequence-to-sequence Singing Voice Synthesis with Perceptual Entropy Loss Jiatong Shi Shuai Guo Nan Huo Yuekai Zhang Qin Jin 26 27 0 22 Oct 2020
AISHELL-3: A Multi-speaker Mandarin TTS Corpus and the Baselines Yao Shi Hui Bu Xin Xu Shaojing Zhang Ming Li 35 219 0 22 Oct 2020
Efficient neural speech synthesis for low-resource languages through multilingual modeling M. D. Korte Jaebok Kim E. Klabbers 8 19 0 20 Aug 2020
Unsupervised Cross-Domain Singing Voice Conversion Adam Polyak Lior Wolf Yossi Adi Yaniv Taigman 20 44 0 06 Aug 2020
Multi-speaker Emotion Conversion via Latent Variable Regularization and a Chained Encoder-Decoder-Predictor Network Ravi Shankar Hsi-Wei Hsieh N. Charon A. Venkataraman 40 11 0 25 Jul 2020
Adversarially Trained Multi-Singer Sequence-To-Sequence Singing Synthesizer Jie Wu Jian Luan 25 26 0 18 Jun 2020
FastPitch: Parallel Text-to-speech with Pitch Prediction Adrian Łańcucki 42 332 0 11 Jun 2020
PJS: phoneme-balanced Japanese singing voice corpus Junya Koguchi Shinnosuke Takamichi 12 22 0 04 Jun 2020
Speech-to-Singing Conversion based on Boundary Equilibrium GAN Da-Yi Wu Yi-Hsuan Yang GAN 14 8 0 28 May 2020
Pitchtron: Towards audiobook generation from ordinary people's voices Sunghee Jung Hoi-Rim Kim 16 5 0 21 May 2020
Flowtron: an Autoregressive Flow-based Generative Network for Text-to-Speech Synthesis Rafael Valle Kevin J. Shih R. Prenger Bryan Catanzaro 21 119 0 12 May 2020
Unsupervised Speech Decomposition via Triple Information Bottleneck Kaizhi Qian Yang Zhang Shiyu Chang David D. Cox M. Hasegawa-Johnson 17 177 0 23 Apr 2020
Singing Synthesis: with a little help from my attention Orazio Angelini Alexis Moinet K. Yanagisawa Thomas Drugman 17 17 0 12 Dec 2019
Prosody Transfer in Neural Text to Speech Using Global Pitch and Loudness Features Francesco Ferroni Kilol Gupta D. Shah Z. Shakeri Jervis Pinto 17 15 0 21 Nov 2019