MultiSpeech: Multi-Speaker Text to Speech with Transformer

8 June 2020
Mingjian Chen
Xu Tan
Yi Ren
Jin Xu
Hao Sun
Sheng Zhao
Tao Qin
Tie-Yan Liu

Papers citing "MultiSpeech: Multi-Speaker Text to Speech with Transformer"

30 / 30 papers shown
CrossSpeech++: Cross-lingual Speech Synthesis with Decoupled Language and Speaker Generation
Ji-Hoon Kim
Hong-Sun Yang
Yoon-Cheol Ju
Il-Hwan Kim
Byeong-Yeol Kim
Joon Son Chung
BDL
54
0
0
31 Dec 2024
SongTrans: An unified song transcription and alignment method for lyrics and notes
Siwei Wu
Jinzheng He
Ruibin Yuan
Haojie Wei
Xipin Wei
Chenghua Lin
Jin Xu
Junyang Lin
53
1
0
22 Sep 2024
SelectTTS: Synthesizing Anyone's Voice via Discrete Unit-Based Frame Selection
Ismail Rasim Ulgen
Shreeram Suresh Chandra
Junchen Lu
Berrak Sisman
227
1
0
30 Aug 2024
Pruning Self-Attention for Zero-Shot Multi-Speaker Text-to-Speech
Hyungchan Yoon
Changhwan Kim
Eunwoo Song
Hyun-Wook Yoon
Hong-Goo Kang
42
1
0
28 Aug 2023
Multi-GradSpeech: Towards Diffusion-based Multi-Speaker Text-to-speech Using Consistent Diffusion Models
Heyang Xue
Shuai Guo
Pengcheng Zhu
Mengxiao Bi
DiffM
43
1
0
21 Aug 2023
Learn to Sing by Listening: Building Controllable Virtual Singer by Unsupervised Learning from Voice Recordings
Wei Xue
Yiwen Wang
Qi-fei Liu
Yi-Ting Guo
39
1
0
09 May 2023
AI-Synthesized Voice Detection Using Neural Vocoder Artifacts
Chengzhe Sun
Shan Jia
Shuwei Hou
Siwei Lyu
38
40
0
25 Apr 2023
Transformers in Speech Processing: A Survey
S. Latif
Aun Zaidi
Heriberto Cuayáhuitl
Fahad Shamshad
Moazzam Shoukat
Junaid Qadir
46
47
0
21 Mar 2023
Exposing AI-Synthesized Human Voices Using Neural Vocoder Artifacts
Chengzhe Sun
Shan Jia
Shuwei Hou
Ehab AlBadawy
Siwei Lyu
135
3
0
18 Feb 2023
Grad-StyleSpeech: Any-speaker Adaptive Text-to-Speech Synthesis with Diffusion Models
Minki Kang
Dong Min
Sung Ju Hwang
DiffM
25
48
0
17 Nov 2022
Stutter-TTS: Controlled Synthesis and Improved Recognition of Stuttered Speech
Xin Zhang
Iván Vallés-Pérez
A. Stolcke
Chengzhu Yu
J. Droppo
Olabanji Shonibare
Roberto Barra-Chicote
Venkatesh Ravichandran
33
6
0
04 Nov 2022
Controllable Accented Text-to-Speech Synthesis
Rui Liu
Berrak Sisman
Guanglai Gao
Haizhou Li
42
6
0
22 Sep 2022
Simple and Effective Multi-sentence TTS with Expressive and Coherent Prosody
Peter Makarov
Ammar Abbas
Mateusz Lajszczak
Arnaud Joly
S. Karlapati
Alexis Moinet
Thomas Drugman
Penny Karanasou
23
16
0
29 Jun 2022
Multimodal Learning with Transformers: A Survey
Peng Xu
Xiatian Zhu
David Clifton
ViT
79
530
0
13 Jun 2022
StyleTTS: A Style-Based Generative Model for Natural and Diverse Text-to-Speech Synthesis
Yinghao Aaron Li
Cong Han
N. Mesgarani
45
38
0
30 May 2022
GenerSpeech: Towards Style Transfer for Generalizable Out-Of-Domain Text-to-Speech
Rongjie Huang
Yi Ren
Jinglin Liu
Chenye Cui
Zhou Zhao
OODD
VLM
117
34
0
15 May 2022
Self-supervised learning for robust voice cloning
Konstantinos Klapsas
Nikolaos Ellinas
Karolos Nikitaras
G. Vamvoukakis
Panos Kakoulidis
...
S. Raptis
June Sig Sung
Gunu Jho
Aimilios Chalamandaris
Pirros Tsiakoulis
SSL
32
6
0
07 Apr 2022
AdaSpeech 4: Adaptive Text to Speech in Zero-Shot Scenarios
Yihan Wu
Xu Tan
Bohan Li
Lei He
Sheng Zhao
Ruihua Song
Tao Qin
Tie-Yan Liu
VLM
DiffM
19
67
0
01 Apr 2022
Multi-Singer: Fast Multi-Singer Singing Voice Vocoder With A Large-Scale Corpus
Rongjie Huang
Feiyang Chen
Yi Ren
Jinglin Liu
Chenye Cui
Zhou Zhao
36
100
0
20 Dec 2021
Neural Dubber: Dubbing for Videos According to Scripts
Chenxu Hu
Qiao Tian
Tingle Li
Yuping Wang
Yuxuan Wang
Hang Zhao
DiffM
VGen
36
39
0
15 Oct 2021
ESPnet2-TTS: Extending the Edge of TTS Research
Tomoki Hayashi
Ryuichi Yamamoto
Takenori Yoshimura
Peter Wu
Jiatong Shi
Takaaki Saeki
Yooncheol Ju
Yusuke Yasuda
Shinnosuke Takamichi
Shinji Watanabe
VLM
55
60
0
15 Oct 2021
Fine-grained style control in Transformer-based Text-to-speech Synthesis
Li-Wei Chen
Alexander I. Rudnicky
88
30
0
12 Oct 2021
A Survey on Neural Speech Synthesis
Xu Tan
Tao Qin
Frank Soong
Tie-Yan Liu
AI4TS
20
353
0
29 Jun 2021
GANSpeech: Adversarial Training for High-Fidelity Multi-Speaker Speech Synthesis
Jinhyeok Yang
Jaesung Bae
Taejun Bak
Young-Ik Kim
Hoon-Young Cho
34
36
0
29 Jun 2021
EMOVIE: A Mandarin Emotion Speech Dataset with a Simple Emotional Text-to-Speech Model
Chenye Cui
Yi Ren
Jinglin Liu
Feiyang Chen
Rongjie Huang
Ming Lei
Zhou Zhao
24
35
0
17 Jun 2021
Meta-StyleSpeech : Multi-Speaker Adaptive Text-to-Speech Generation
Dong Min
Dong Bok Lee
Eunho Yang
Sung Ju Hwang
25
160
0
06 Jun 2021
AdaSpeech: Adaptive Text to Speech for Custom Voice
Mingjian Chen
Xu Tan
Bohan Li
Yanqing Liu
Tao Qin
Sheng Zhao
Tie-Yan Liu
VLM
DiffM
37
188
0
01 Mar 2021
Few Shot Adaptive Normalization Driven Multi-Speaker Speech Synthesis
Neeraj Kumar
Srishti Goel
Ankur Narang
Brejesh Lall
29
5
0
14 Dec 2020
FastSpeech 2: Fast and High-Quality End-to-End Text to Speech
Yi Ren
Chenxu Hu
Xu Tan
Tao Qin
Sheng Zhao
Zhou Zhao
Tie-Yan Liu
60
1,362
0
08 Jun 2020
Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis
Ye Jia
Yu Zhang
Ron J. Weiss
Quan Wang
Jonathan Shen
...
Zhehuai Chen
Patrick Nguyen
Ruoming Pang
Ignacio López Moreno
Yonghui Wu
207
821
0
12 Jun 2018