ResearchTrend.AI
NAUTILUS: a Versatile Voice Cloning System

22 May 2020
Hieu-Thi Luong
Junichi Yamagishi

Papers citing "NAUTILUS: a Versatile Voice Cloning System"

39 / 39 papers shown
Pretraining Techniques for Sequence-to-Sequence Voice Conversion
Wen-Chin Huang
Tomoki Hayashi
Yi-Chiao Wu
Hirokazu Kameoka
Tomoki Toda
97
40
0
07 Aug 2020
Vocoder-Based Speech Synthesis from Silent Videos
Daniel Michelsanti
Olga Slizovskaia
G. Haro
Emilia Gómez
Zheng-Hua Tan
Jesper Jensen
60
31
0
06 Apr 2020
Speech Synthesis using EEG
G. Krishna
Co Tran
Yan Han
Mason Carnahan
38
48
0
22 Feb 2020
Decision-Making with Auto-Encoding Variational Bayes
Romain Lopez
Pierre Boyeau
Nir Yosef
Michael I. Jordan
Jeffrey Regier
BDL
346
10,591
0
17 Feb 2020
Voice Transformer Network: Sequence-to-Sequence Voice Conversion Using Transformer with Text-to-Speech Pretraining
Wen-Chin Huang
Tomoki Hayashi
Yi-Chiao Wu
Hirokazu Kameoka
Tomoki Toda
54
98
0
14 Dec 2019
ESPnet-TTS: Unified, Reproducible, and Integratable Open Source End-to-End Text-to-Speech Toolkit
Tomoki Hayashi
Ryuichi Yamamoto
Katsuki Inoue
Takenori Yoshimura
Shinji Watanabe
Tomoki Toda
K. Takeda
Yu Zhang
Xu Tan
VLM
85
205
0
24 Oct 2019
Bootstrapping non-parallel voice conversion from speaker-adaptive text-to-speech
Hieu-Thi Luong
Junichi Yamagishi
46
17
0
14 Sep 2019
Learning to Speak Fluently in a Foreign Language: Multilingual Speech Synthesis and Cross-Language Voice Cloning
Yu Zhang
Ron J. Weiss
Heiga Zen
Yonghui Wu
Zhifeng Chen
RJ Skerry-Ryan
Ye Jia
Andrew Rosenberg
Bhuvana Ramabhadran
45
188
0
09 Jul 2019
Non-Parallel Sequence-to-Sequence Voice Conversion with Disentangled Linguistic and Speaker Representations
Jing-Xuan Zhang
Zhenhua Ling
Lirong Dai
59
99
0
25 Jun 2019
A Unified Speaker Adaptation Method for Speech Synthesis using Transcribed and Untranscribed Speech with Backpropagation
Hieu-Thi Luong
Junichi Yamagishi
59
10
0
18 Jun 2019
Neural source-filter waveform models for statistical parametric speech synthesis
Xin Wang
Shinji Takaki
Junichi Yamagishi
66
118
0
27 Apr 2019
LibriTTS: A Corpus Derived from LibriSpeech for Text-to-Speech
Heiga Zen
Viet Dang
R. Clark
Yu Zhang
Ron J. Weiss
Ye Jia
Zhifeng Chen
Yonghui Wu
96
947
0
05 Apr 2019
Joint training framework for text-to-speech and voice conversion using multi-source Tacotron and WaveNet
Mingyang Zhang
Xin Wang
Fuming Fang
Haizhou Li
Junichi Yamagishi
33
50
0
29 Mar 2019
Refined WaveNet Vocoder for Variational Autoencoder Based Voice Conversion
Wen-Chin Huang
Yi-Chiao Wu
Hsin-Te Hwang
Patrick Lumban Tobing
Tomoki Hayashi
Kazuhiro Kobayashi
Tomoki Toda
Yu Tsao
H. Wang
39
20
0
27 Nov 2018
AttS2S-VC: Sequence-to-Sequence Voice Conversion with Attention and Context Preservation Mechanisms
Kou Tanaka
Hirokazu Kameoka
Takuhiro Kaneko
Nobukatsu Hojo
56
112
0
09 Nov 2018
WaveGlow: A Flow-based Generative Network for Speech Synthesis
R. Prenger
Rafael Valle
Bryan Catanzaro
151
1,029
0
31 Oct 2018
Sample Efficient Adaptive Text-to-Speech
Yutian Chen
Yannis Assael
Brendan Shillingford
David Budden
Scott E. Reed
...
Ben Laurie
Çağlar Gülçehre
Aaron van den Oord
Oriol Vinyals
Nando de Freitas
76
149
0
27 Sep 2018
Multimodal speech synthesis architecture for unsupervised speaker adaptation
Hieu-Thi Luong
Junichi Yamagishi
39
10
0
20 Aug 2018
Wasserstein GAN and Waveform Loss-based Acoustic Model Training for Multi-speaker Text-to-Speech Synthesis Systems Using a WaveNet Vocoder
Yi Zhao
Shinji Takaki
Hieu-Thi Luong
Junichi Yamagishi
Daisuke Saito
Nobuaki Minematsu
44
63
0
31 Jul 2018
Scaling and bias codes for modeling speaker-adaptive DNN-based speech synthesis systems
Hieu-Thi Luong
Junichi Yamagishi
69
7
0
31 Jul 2018
ClariNet: Parallel Wave Generation in End-to-End Text-to-Speech
Wei Ping
Kainan Peng
Jitong Chen
53
346
0
19 Jul 2018
Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis
Ye Jia
Yu Zhang
Ron J. Weiss
Quan Wang
Jonathan Shen
...
Zhifeng Chen
Patrick Nguyen
Ruoming Pang
Ignacio López Moreno
Yonghui Wu
251
828
0
12 Jun 2018
StarGAN-VC: Non-parallel many-to-many voice conversion with star generative adversarial networks
Hirokazu Kameoka
Takuhiro Kaneko
Kou Tanaka
Nobukatsu Hojo
62
372
0
06 Jun 2018
The Voice Conversion Challenge 2018: Promoting Development of Parallel and Nonparallel Methods
Jaime Lorenzo-Trueba
Junichi Yamagishi
Tomoki Toda
Daisuke Saito
F. Villavicencio
Tomi Kinnunen
Zhenhua Ling
50
320
0
12 Apr 2018
ESPnet: End-to-End Speech Processing Toolkit
Shinji Watanabe
Takaaki Hori
Shigeki Karita
Tomoki Hayashi
Jiro Nishitoba
...
Jahn Heymann
Sanjeev Khudanpur
Nanxin Chen
Adithya Renduchintala
Tsubasa Ochiai
VLM
93
1,501
0
30 Mar 2018
Linear networks based speaker adaptation for speech synthesis
Zhiying Huang
Heng Lu
Ming Lei
Zhijie Yan
30
14
0
05 Mar 2018
Can we steal your vocal identity from the Internet?: Initial investigation of cloning Obama's voice using GAN, WaveNet and low-quality found data
Jaime Lorenzo-Trueba
Fuming Fang
Xin Wang
Isao Echizen
Junichi Yamagishi
Tomi Kinnunen
35
73
0
02 Mar 2018
Neural Voice Cloning with a Few Samples
Sercan O. Arik
Jitong Chen
Kainan Peng
Wei Ping
Yanqi Zhou
58
386
0
14 Feb 2018
Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions
Jonathan Shen
Ruoming Pang
Ron J. Weiss
M. Schuster
Navdeep Jaitly
...
Yuxuan Wang
RJ Skerry-Ryan
Rif A. Saurous
Yannis Agiomyrgiannakis
Yonghui Wu
77
2,694
0
16 Dec 2017
Efficiently Trainable Text-to-Speech System Based on Deep Convolutional Networks with Guided Attention
Hideyuki Tachibana
Katsuya Uenoyama
Shunsuke Aihara
52
266
0
24 Oct 2017
Audio-Visual Speech Enhancement Using Multimodal Deep Convolutional Neural Networks
Jen-Cheng Hou
Syu-Siang Wang
Ying-Hui Lai
Yu Tsao
Hsiu-Wen Chang
H. Wang
77
198
0
01 Sep 2017
Listening while Speaking: Speech Chain by Deep Learning
Andros Tjandra
S. Sakti
Satoshi Nakamura
AuLLM
147
166
0
16 Jul 2017
Voice Conversion Using Sequence-to-Sequence Learning of Context Posterior Probabilities
Hiroyuki Miyoshi
Yuki Saito
Shinnosuke Takamichi
Hiroshi Saruwatari
53
60
0
10 Apr 2017
Tacotron: Towards End-to-End Speech Synthesis
Yuxuan Wang
RJ Skerry-Ryan
Daisy Stanton
Yonghui Wu
Ron J. Weiss
...
Samy Bengio
Quoc V. Le
Yannis Agiomyrgiannakis
R. Clark
Rif A. Saurous
155
1,819
0
29 Mar 2017
SampleRNN: An Unconditional End-to-End Neural Audio Generation Model
Soroush Mehri
Kundan Kumar
Ishaan Gulrajani
Rithesh Kumar
Shubham Jain
Jose M. R. Sotelo
Aaron Courville
Yoshua Bengio
100
598
0
22 Dec 2016
Quasi-Recurrent Neural Networks
James Bradbury
Stephen Merity
Caiming Xiong
R. Socher
136
441
0
05 Nov 2016
Voice Conversion from Non-parallel Corpora Using Variational Auto-encoder
Chin-Cheng Hsu
Hsin-Te Hwang
Yi-Chiao Wu
Yu Tsao
H. Wang
85
303
0
13 Oct 2016
WaveNet: A Generative Model for Raw Audio
Aaron van den Oord
Sander Dieleman
Heiga Zen
Karen Simonyan
Oriol Vinyals
Alex Graves
Nal Kalchbrenner
A. Senior
Koray Kavukcuoglu
DiffM
368
7,381
0
12 Sep 2016
Autoencoding beyond pixels using a learned similarity metric
Anders Boesen Lindbo Larsen
Søren Kaae Sønderby
Hugo Larochelle
Ole Winther
GAN
163
2,066
0
31 Dec 2015