arXiv:2005.11004
NAUTILUS: a Versatile Voice Cloning System
22 May 2020
Hieu-Thi Luong
Junichi Yamagishi
Papers citing
"NAUTILUS: a Versatile Voice Cloning System"
39 / 39 papers shown
Title
Pretraining Techniques for Sequence-to-Sequence Voice Conversion
Wen-Chin Huang
Tomoki Hayashi
Yi-Chiao Wu
Hirokazu Kameoka
Tomoki Toda
97
40
0
07 Aug 2020
Vocoder-Based Speech Synthesis from Silent Videos
Daniel Michelsanti
Olga Slizovskaia
G. Haro
Emilia Gómez
Zheng-Hua Tan
Jesper Jensen
60
31
0
06 Apr 2020
Speech Synthesis using EEG
G. Krishna
Co Tran
Yan Han
Mason Carnahan
38
48
0
22 Feb 2020
Decision-Making with Auto-Encoding Variational Bayes
Romain Lopez
Pierre Boyeau
Nir Yosef
Michael I. Jordan
Jeffrey Regier
BDL
346
10,591
0
17 Feb 2020
Voice Transformer Network: Sequence-to-Sequence Voice Conversion Using Transformer with Text-to-Speech Pretraining
Wen-Chin Huang
Tomoki Hayashi
Yi-Chiao Wu
Hirokazu Kameoka
Tomoki Toda
54
98
0
14 Dec 2019
ESPnet-TTS: Unified, Reproducible, and Integratable Open Source End-to-End Text-to-Speech Toolkit
Tomoki Hayashi
Ryuichi Yamamoto
Katsuki Inoue
Takenori Yoshimura
Shinji Watanabe
Tomoki Toda
K. Takeda
Yu Zhang
Xu Tan
VLM
85
205
0
24 Oct 2019
Bootstrapping non-parallel voice conversion from speaker-adaptive text-to-speech
Hieu-Thi Luong
Junichi Yamagishi
46
17
0
14 Sep 2019
Learning to Speak Fluently in a Foreign Language: Multilingual Speech Synthesis and Cross-Language Voice Cloning
Yu Zhang
Ron J. Weiss
Heiga Zen
Yonghui Wu
Zhifeng Chen
RJ Skerry-Ryan
Ye Jia
Andrew Rosenberg
Bhuvana Ramabhadran
45
188
0
09 Jul 2019
Non-Parallel Sequence-to-Sequence Voice Conversion with Disentangled Linguistic and Speaker Representations
Jing-Xuan Zhang
Zhenhua Ling
Lirong Dai
59
99
0
25 Jun 2019
A Unified Speaker Adaptation Method for Speech Synthesis using Transcribed and Untranscribed Speech with Backpropagation
Hieu-Thi Luong
Junichi Yamagishi
59
10
0
18 Jun 2019
Neural source-filter waveform models for statistical parametric speech synthesis
Xin Wang
Shinji Takaki
Junichi Yamagishi
66
118
0
27 Apr 2019
LibriTTS: A Corpus Derived from LibriSpeech for Text-to-Speech
Heiga Zen
Viet Dang
R. Clark
Yu Zhang
Ron J. Weiss
Ye Jia
Zhifeng Chen
Yonghui Wu
96
947
0
05 Apr 2019
Joint training framework for text-to-speech and voice conversion using multi-source Tacotron and WaveNet
Mingyang Zhang
Xin Wang
Fuming Fang
Haizhou Li
Junichi Yamagishi
33
50
0
29 Mar 2019
Refined WaveNet Vocoder for Variational Autoencoder Based Voice Conversion
Wen-Chin Huang
Yi-Chiao Wu
Hsin-Te Hwang
Patrick Lumban Tobing
Tomoki Hayashi
Kazuhiro Kobayashi
Tomoki Toda
Yu Tsao
H. Wang
39
20
0
27 Nov 2018
AttS2S-VC: Sequence-to-Sequence Voice Conversion with Attention and Context Preservation Mechanisms
Kou Tanaka
Hirokazu Kameoka
Takuhiro Kaneko
Nobukatsu Hojo
56
112
0
09 Nov 2018
WaveGlow: A Flow-based Generative Network for Speech Synthesis
R. Prenger
Rafael Valle
Bryan Catanzaro
151
1,029
0
31 Oct 2018
Sample Efficient Adaptive Text-to-Speech
Yutian Chen
Yannis Assael
Brendan Shillingford
David Budden
Scott E. Reed
...
Ben Laurie
Çağlar Gülçehre
Aaron van den Oord
Oriol Vinyals
Nando de Freitas
76
149
0
27 Sep 2018
Multimodal speech synthesis architecture for unsupervised speaker adaptation
Hieu-Thi Luong
Junichi Yamagishi
39
10
0
20 Aug 2018
Wasserstein GAN and Waveform Loss-based Acoustic Model Training for Multi-speaker Text-to-Speech Synthesis Systems Using a WaveNet Vocoder
Yi Zhao
Shinji Takaki
Hieu-Thi Luong
Junichi Yamagishi
Daisuke Saito
Nobuaki Minematsu
44
63
0
31 Jul 2018
Scaling and bias codes for modeling speaker-adaptive DNN-based speech synthesis systems
Hieu-Thi Luong
Junichi Yamagishi
69
7
0
31 Jul 2018
ClariNet: Parallel Wave Generation in End-to-End Text-to-Speech
Wei Ping
Kainan Peng
Jitong Chen
53
346
0
19 Jul 2018
Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis
Ye Jia
Yu Zhang
Ron J. Weiss
Quan Wang
Jonathan Shen
...
Zhifeng Chen
Patrick Nguyen
Ruoming Pang
Ignacio López Moreno
Yonghui Wu
251
828
0
12 Jun 2018
StarGAN-VC: Non-parallel many-to-many voice conversion with star generative adversarial networks
Hirokazu Kameoka
Takuhiro Kaneko
Kou Tanaka
Nobukatsu Hojo
62
372
0
06 Jun 2018
The Voice Conversion Challenge 2018: Promoting Development of Parallel and Nonparallel Methods
Jaime Lorenzo-Trueba
Junichi Yamagishi
Tomoki Toda
Daisuke Saito
F. Villavicencio
Tomi Kinnunen
Zhenhua Ling
50
320
0
12 Apr 2018
ESPnet: End-to-End Speech Processing Toolkit
Shinji Watanabe
Takaaki Hori
Shigeki Karita
Tomoki Hayashi
Jiro Nishitoba
...
Jahn Heymann
Sanjeev Khudanpur
Nanxin Chen
Adithya Renduchintala
Tsubasa Ochiai
VLM
93
1,501
0
30 Mar 2018
Linear networks based speaker adaptation for speech synthesis
Zhiying Huang
Heng Lu
Ming Lei
Zhijie Yan
30
14
0
05 Mar 2018
Can we steal your vocal identity from the Internet?: Initial investigation of cloning Obama's voice using GAN, WaveNet and low-quality found data
Jaime Lorenzo-Trueba
Fuming Fang
Xin Wang
Isao Echizen
Junichi Yamagishi
Tomi Kinnunen
35
73
0
02 Mar 2018
Neural Voice Cloning with a Few Samples
Sercan O. Arik
Jitong Chen
Kainan Peng
Wei Ping
Yanqi Zhou
58
386
0
14 Feb 2018
Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions
Jonathan Shen
Ruoming Pang
Ron J. Weiss
M. Schuster
Navdeep Jaitly
...
Yuxuan Wang
RJ Skerry-Ryan
Rif A. Saurous
Yannis Agiomyrgiannakis
Yonghui Wu
77
2,694
0
16 Dec 2017
Efficiently Trainable Text-to-Speech System Based on Deep Convolutional Networks with Guided Attention
Hideyuki Tachibana
Katsuya Uenoyama
Shunsuke Aihara
52
266
0
24 Oct 2017
Audio-Visual Speech Enhancement Using Multimodal Deep Convolutional Neural Networks
Jen-Cheng Hou
Syu-Siang Wang
Ying-Hui Lai
Yu Tsao
Hsiu-Wen Chang
H. Wang
77
198
0
01 Sep 2017
Listening while Speaking: Speech Chain by Deep Learning
Andros Tjandra
S. Sakti
Satoshi Nakamura
AuLLM
147
166
0
16 Jul 2017
Voice Conversion Using Sequence-to-Sequence Learning of Context Posterior Probabilities
Hiroyuki Miyoshi
Yuki Saito
Shinnosuke Takamichi
Hiroshi Saruwatari
53
60
0
10 Apr 2017
Tacotron: Towards End-to-End Speech Synthesis
Yuxuan Wang
RJ Skerry-Ryan
Daisy Stanton
Yonghui Wu
Ron J. Weiss
...
Samy Bengio
Quoc V. Le
Yannis Agiomyrgiannakis
R. Clark
Rif A. Saurous
155
1,819
0
29 Mar 2017
SampleRNN: An Unconditional End-to-End Neural Audio Generation Model
Soroush Mehri
Kundan Kumar
Ishaan Gulrajani
Rithesh Kumar
Shubham Jain
Jose M. R. Sotelo
Aaron Courville
Yoshua Bengio
100
598
0
22 Dec 2016
Quasi-Recurrent Neural Networks
James Bradbury
Stephen Merity
Caiming Xiong
R. Socher
136
441
0
05 Nov 2016
Voice Conversion from Non-parallel Corpora Using Variational Auto-encoder
Chin-Cheng Hsu
Hsin-Te Hwang
Yi-Chiao Wu
Yu Tsao
H. Wang
85
303
0
13 Oct 2016
WaveNet: A Generative Model for Raw Audio
Aaron van den Oord
Sander Dieleman
Heiga Zen
Karen Simonyan
Oriol Vinyals
Alex Graves
Nal Kalchbrenner
A. Senior
Koray Kavukcuoglu
DiffM
368
7,381
0
12 Sep 2016
Autoencoding beyond pixels using a learned similarity metric
Anders Boesen Lindbo Larsen
Søren Kaae Sønderby
Hugo Larochelle
Ole Winther
GAN
163
2,066
0
31 Dec 2015