Mellotron: Multispeaker expressive voice synthesis by conditioning on rhythm, pitch and global style tokens

26 October 2019
Rafael Valle
Jason Chun Lok Li
R. Prenger
Bryan Catanzaro

Papers citing "Mellotron: Multispeaker expressive voice synthesis by conditioning on rhythm, pitch and global style tokens"

28 / 78 papers shown
A Survey on Neural Speech Synthesis
Xu Tan
Tao Qin
Frank Soong
Tie-Yan Liu
AI4TS
18
352
0
29 Jun 2021
Synchronising speech segments with musical beats in Mandarin and English singing
Cong Zhang
Jian Zhu
6
0
0
18 Jun 2021
Global Rhythm Style Transfer Without Text Transcriptions
Kaizhi Qian
Yang Zhang
Shiyu Chang
Jinjun Xiong
Chuang Gan
David D. Cox
M. Hasegawa-Johnson
30
20
0
16 Jun 2021
An objective evaluation of the effects of recording conditions and speaker characteristics in multi-speaker deep neural speech synthesis
Beáta Lőrincz
Adriana Stan
M. Giurgiu
13
2
0
03 Jun 2021
MASS: Multi-task Anthropomorphic Speech Synthesis Framework
Jinyin Chen
Linhui Ye
Zhaoyan Ming
23
6
0
10 May 2021
Exploring emotional prototypes in a high dimensional TTS latent space
Pol van Rijn
Silvan Mertes
Dominik Schiller
Peter M. C. Harrison
P. Larrouy-Maestri
Elisabeth André
Nori Jacoby
28
12
0
05 May 2021
Review of end-to-end speech synthesis technology based on deep learning
Zhaoxi Mu
Xinyu Yang
Yizhuo Dong
AuLLM
ALM
26
24
0
20 Apr 2021
Assem-VC: Realistic Voice Conversion by Assembling Modern Speech Synthesis Techniques
Kang-Wook Kim
Seung-won Park
Junhyeok Lee
Myun-chul Joe
11
28
0
02 Apr 2021
STYLER: Style Factor Modeling with Rapidity and Robustness via Speech Decomposition for Expressive and Controllable Neural Text to Speech
Keon Lee
Kyumin Park
Daeyoung Kim
24
30
0
17 Mar 2021
Adversarially learning disentangled speech representations for robust multi-factor voice conversion
Jie Wang
Jingbei Li
Xintao Zhao
Zhiyong Wu
Shiyin Kang
Helen Meng
DRL
36
29
0
30 Jan 2021
Expressive Neural Voice Cloning
Paarth Neekhara
Shehzeen Samarah Hussain
Shlomo Dubnov
F. Koushanfar
Julian McAuley
DiffM
19
30
0
30 Jan 2021
Using previous acoustic context to improve Text-to-Speech synthesis
Pilar Oplustil Gallegos
Simon King
29
11
0
07 Dec 2020
Learn2Sing: Target Speaker Singing Voice Synthesis by learning from a Singing Teacher
Heyang Xue
Shan Yang
Yinjiao Lei
Lei Xie
Xiulin Li
6
10
0
17 Nov 2020
Speech Synthesis and Control Using Differentiable DSP
Giorgio Fabbro
Vladimir Golkov
Thomas Kemp
Daniel Cremers
23
12
0
28 Oct 2020
Sequence-to-sequence Singing Voice Synthesis with Perceptual Entropy Loss
Jiatong Shi
Shuai Guo
Nan Huo
Yuekai Zhang
Qin Jin
26
27
0
22 Oct 2020
AISHELL-3: A Multi-speaker Mandarin TTS Corpus and the Baselines
Yao Shi
Hui Bu
Xin Xu
Shaojing Zhang
Ming Li
35
219
0
22 Oct 2020
Efficient neural speech synthesis for low-resource languages through multilingual modeling
M. D. Korte
Jaebok Kim
E. Klabbers
8
19
0
20 Aug 2020
Unsupervised Cross-Domain Singing Voice Conversion
Adam Polyak
Lior Wolf
Yossi Adi
Yaniv Taigman
20
44
0
06 Aug 2020
Multi-speaker Emotion Conversion via Latent Variable Regularization and a Chained Encoder-Decoder-Predictor Network
Ravi Shankar
Hsi-Wei Hsieh
N. Charon
A. Venkataraman
40
11
0
25 Jul 2020
Adversarially Trained Multi-Singer Sequence-To-Sequence Singing Synthesizer
Jie Wu
Jian Luan
25
26
0
18 Jun 2020
FastPitch: Parallel Text-to-speech with Pitch Prediction
Adrian Łańcucki
42
332
0
11 Jun 2020
PJS: phoneme-balanced Japanese singing voice corpus
Junya Koguchi
Shinnosuke Takamichi
12
22
0
04 Jun 2020
Speech-to-Singing Conversion based on Boundary Equilibrium GAN
Da-Yi Wu
Yi-Hsuan Yang
GAN
14
8
0
28 May 2020
Pitchtron: Towards audiobook generation from ordinary people's voices
Sunghee Jung
Hoi-Rim Kim
16
5
0
21 May 2020
Flowtron: an Autoregressive Flow-based Generative Network for Text-to-Speech Synthesis
Rafael Valle
Kevin J. Shih
R. Prenger
Bryan Catanzaro
21
119
0
12 May 2020
Unsupervised Speech Decomposition via Triple Information Bottleneck
Kaizhi Qian
Yang Zhang
Shiyu Chang
David D. Cox
M. Hasegawa-Johnson
17
177
0
23 Apr 2020
Singing Synthesis: with a little help from my attention
Orazio Angelini
Alexis Moinet
K. Yanagisawa
Thomas Drugman
17
17
0
12 Dec 2019
Prosody Transfer in Neural Text to Speech Using Global Pitch and Loudness Features
Francesco Ferroni
Kilol Gupta
D. Shah
Z. Shakeri
Jervis Pinto
17
15
0
21 Nov 2019