Learning to Speak Fluently in a Foreign Language: Multilingual Speech Synthesis and Cross-Language Voice Cloning

9 July 2019
Yu Zhang
Ron J. Weiss
Heiga Zen
Yonghui Wu
Z. Chen
RJ Skerry-Ryan
Ye Jia
Andrew Rosenberg
Bhuvana Ramabhadran
arXiv: 1907.04448

Papers citing "Learning to Speak Fluently in a Foreign Language: Multilingual Speech Synthesis and Cross-Language Voice Cloning"

39 / 39 papers shown
CrossSpeech++: Cross-lingual Speech Synthesis with Decoupled Language and Speaker Generation
Ji-Hoon Kim
Hong-Sun Yang
Yoon-Cheol Ju
Il-Hwan Kim
Byeong-Yeol Kim
Joon Son Chung
BDL
54
0
0
31 Dec 2024
VECL-TTS: Voice identity and Emotional style controllable Cross-Lingual Text-to-Speech
Ashishkumar Gudmalwar
Nirmesh Shah
Sai Akarsh
Pankaj Wasnik
R. Shah
36
1
0
12 Jun 2024
Multi-Level Attention Aggregation for Language-Agnostic Speaker Replication
Yejin Jeon
Gary Geunbae Lee
31
2
0
06 Mar 2024
DiCLET-TTS: Diffusion Model based Cross-lingual Emotion Transfer for Text-to-Speech -- A Study between English and Mandarin
Tao Li
Chenxu Hu
Jian Cong
Xinfa Zhu
Jingbei Li
Qiao Tian
Yuping Wang
Linfu Xie
DiffM
43
8
0
02 Sep 2023
MParrotTTS: Multilingual Multi-speaker Text to Speech Synthesis in Low Resource Setting
Neil Shah
Vishal Tambrahalli
Saiteja Kosgi
N. Pedanekar
Vineet Gandhi
44
0
0
19 May 2023
Joint Multi-scale Cross-lingual Speaking Style Transfer with Bidirectional Attention Mechanism for Automatic Dubbing
Jingbei Li
Sipan Li
Ping Chen
Lu Zhang
Yi Meng
Zhiyong Wu
Helen Meng
Qiao Tian
Yuping Wang
Yuxuan Wang
40
3
0
09 May 2023
Cross-speaker Emotion Transfer by Manipulating Speech Style Latents
Suhee Jo
Younggun Lee
Yookyung Shin
Yeongtae Hwang
Taesu Kim
13
3
0
15 Mar 2023
A Comprehensive Survey of AI-Generated Content (AIGC): A History of Generative AI from GAN to ChatGPT
Yihan Cao
Siyu Li
Yixin Liu
Zhiling Yan
Yutong Dai
Philip S. Yu
Lichao Sun
38
508
0
07 Mar 2023
Speak Foreign Languages with Your Own Voice: Cross-Lingual Neural Codec Language Modeling
Zi-Hua Zhang
Long Zhou
Chengyi Wang
Sanyuan Chen
Yu Wu
...
Huaming Wang
Jinyu Li
Lei He
Sheng Zhao
Furu Wei
VLM
36
171
0
07 Mar 2023
Modelling low-resource accents without accent-specific TTS frontend
Georgi Tinchev
Marta Czarnowska
Kamil Deja
K. Yanagisawa
Marius Cotescu
31
4
0
11 Jan 2023
Improve Bilingual TTS Using Dynamic Language and Phonology Embedding
Fengyu Yang
Jian Luan
Yujun Wang
21
1
0
07 Dec 2022
Voice-preserving Zero-shot Multiple Accent Conversion
Mumin Jin
Prashant Serai
Jilong Wu
Andros Tjandra
Vimal Manohar
Qing He
19
12
0
23 Nov 2022
An Empirical Study on L2 Accents of Cross-lingual Text-to-Speech Systems via Vowel Space
Jihwan Lee
Jaesung Bae
Seongkyu Mun
Heejin Choi
Joun Yeop Lee
Hoon-Young Cho
Chanwoo Kim
32
2
0
06 Nov 2022
Explicit Intensity Control for Accented Text-to-speech
Rui Liu
Haolin Zuo
De Hu
Guanglai Gao
Haizhou Li
21
6
0
27 Oct 2022
SQuId: Measuring Speech Naturalness in Many Languages
Thibault Sellam
Ankur Bapna
Joshua Camp
Diana Mackinnon
Ankur P. Parikh
Jason Riesa
35
17
0
12 Oct 2022
Controllable Accented Text-to-Speech Synthesis
Rui Liu
Berrak Sisman
Guanglai Gao
Haizhou Li
39
6
0
22 Sep 2022
Unify and Conquer: How Phonetic Feature Representation Affects Polyglot Text-To-Speech (TTS)
Ariadna Sánchez
Alessio Falai
Ziyao Zhang
Orazio Angelini
K. Yanagisawa
38
7
0
04 Jul 2022
Mix and Match: An Empirical Study on Training Corpus Composition for Polyglot Text-To-Speech (TTS)
Ziyao Zhang
Alessio Falai
Ariadna Sánchez
Orazio Angelini
K. Yanagisawa
29
4
0
04 Jul 2022
Heterogeneous Target Speech Separation
Hyunjae Cho
Wonbin Jung
Junhyeok Lee
Paris Smaragdis
Sanghyun Woo
48
26
0
07 Apr 2022
Self-supervised learning for robust voice cloning
Konstantinos Klapsas
Nikolaos Ellinas
Karolos Nikitaras
G. Vamvoukakis
Panos Kakoulidis
...
S. Raptis
June Sig Sung
Gunu Jho
Aimilios Chalamandaris
Pirros Tsiakoulis
SSL
32
6
0
07 Apr 2022
WeSinger: Data-augmented Singing Voice Synthesis with Auxiliary Losses
Zewang Zhang
Yibin Zheng
Xinhui Li
Li Lu
26
16
0
21 Mar 2022
Cross-Lingual Text-to-Speech Using Multi-Task Learning and Speaker Classifier Joint Training
J. Yang
Lei He
36
11
0
20 Jan 2022
Improve Cross-lingual Voice Cloning Using Low-quality Code-switched Data
Haitong Zhang
Yue Lin
23
0
0
14 Oct 2021
Towards Lifelong Learning of Multilingual Text-To-Speech Synthesis
Mu Yang
Shaojin Ding
Tianlong Chen
Tong Wang
Zhangyang Wang
CLL
30
5
0
09 Oct 2021
Combining speakers of multiple languages to improve quality of neural voices
Javier Latorre
Charlotte Bailleul
Tuuli H. Morrill
Alistair Conkie
Y. Stylianou
38
8
0
17 Aug 2021
Translatotron 2: High-quality direct speech-to-speech translation with voice preservation
Ye Jia
Michelle Tadmor Ramanovich
Tal Remez
Roi Pomerantz
26
67
0
19 Jul 2021
A Survey on Neural Speech Synthesis
Xu Tan
Tao Qin
Frank Soong
Tie-Yan Liu
AI4TS
18
352
0
29 Jun 2021
Ctrl-P: Temporal Control of Prosodic Variation for Speech Synthesis
D. Mohan
Qinmin Hu
Tian Huey Teh
Alexandra Torresquintero
C. Wallis
Marlene Staib
Lorenzo Foglianti
Jiameng Gao
Simon King
25
16
0
15 Jun 2021
Crossing the Conversational Chasm: A Primer on Natural Language Processing for Multilingual Task-Oriented Dialogue Systems
E. Razumovskaia
Goran Glavaš
Olga Majewska
Edoardo Ponti
Anna Korhonen
Ivan Vulić
30
32
0
17 Apr 2021
What all do audio transformer models hear? Probing Acoustic Representations for Language Delivery and its Structure
Jui Shah
Yaman Kumar Singla
Changyou Chen
R. Shah
27
81
0
02 Jan 2021
Parallel Tacotron: Non-Autoregressive and Controllable TTS
Isaac Elias
Heiga Zen
Jonathan Shen
Yu Zhang
Ye Jia
Ron J. Weiss
Yonghui Wu
DRL
24
102
0
22 Oct 2020
Unsupervised Representation Learning for Speaker Recognition via Contrastive Equilibrium Learning
Sung Hwan Mun
Woohyun Kang
Min Hyun Han
N. Kim
SSL
49
21
0
22 Oct 2020
Towards Natural Bilingual and Code-Switched Speech Synthesis Based on Mix of Monolingual Recordings and Cross-Lingual Voice Conversion
Shengkui Zhao
Trung Hieu Nguyen
Hao Wang
B. Ma
16
25
0
16 Oct 2020
The Sequence-to-Sequence Baseline for the Voice Conversion Challenge 2020: Cascading ASR and TTS
Wen-Chin Huang
Tomoki Hayashi
Shinji Watanabe
T. Toda
DRL
13
39
0
06 Oct 2020
Adversarially Trained Multi-Singer Sequence-To-Sequence Singing Synthesizer
Jie Wu
Jian Luan
25
26
0
18 Jun 2020
Pitchtron: Towards audiobook generation from ordinary people's voices
Sunghee Jung
Hoi-Rim Kim
16
5
0
21 May 2020
Investigation of learning abilities on linguistic features in sequence-to-sequence text-to-speech synthesis
Yusuke Yasuda
Xin Wang
Junichi Yamagishi
AI4TS
19
31
0
20 May 2020
From Speaker Verification to Multispeaker Speech Synthesis, Deep Transfer with Feedback Constraint
Zexin Cai
Chuxiong Zhang
Ming Li
24
41
0
10 May 2020
Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis
Ye Jia
Yu Zhang
Ron J. Weiss
Quan Wang
Jonathan Shen
...
Z. Chen
Patrick Nguyen
Ruoming Pang
Ignacio López Moreno
Yonghui Wu
207
820
0
12 Jun 2018