arXiv:2301.02111
Neural Codec Language Models are Zero-Shot Text to Speech Synthesizers
5 January 2023
Chengyi Wang
Sanyuan Chen
Yu-Huan Wu
Zi-Hua Zhang
Long Zhou
Shujie Liu
Zhuo Chen
Yanqing Liu
Huaming Wang
Jinyu Li
Lei He
Sheng Zhao
Furu Wei
Papers citing
"Neural Codec Language Models are Zero-Shot Text to Speech Synthesizers"
14 / 114 papers shown
Title
RoBERTa: A Robustly Optimized BERT Pretraining Approach
Yinhan Liu
Myle Ott
Naman Goyal
Jingfei Du
Mandar Joshi
Danqi Chen
Omer Levy
M. Lewis
Luke Zettlemoyer
Veselin Stoyanov
AIMat
408
24,160
0
26 Jul 2019
VQVAE Unsupervised Unit Discovery and Multi-scale Code2Spec Inverter for Zerospeech Challenge 2019
Andros Tjandra
Berrak Sisman
Mingyang Zhang
S. Sakti
Haizhou Li
Satoshi Nakamura
53
71
0
27 May 2019
LibriTTS: A Corpus Derived from LibriSpeech for Text-to-Speech
Heiga Zen
Viet Dang
R. Clark
Yu Zhang
Ron J. Weiss
Ye Jia
Zhiwen Chen
Yonghui Wu
68
933
0
05 Apr 2019
WaveGlow: A Flow-based Generative Network for Speech Synthesis
R. Prenger
Rafael Valle
Bryan Catanzaro
129
1,024
0
31 Oct 2018
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM
SSL
SSeg
966
93,936
0
11 Oct 2018
Sample Efficient Adaptive Text-to-Speech
Yutian Chen
Yannis Assael
Brendan Shillingford
David Budden
Scott E. Reed
...
Ben Laurie
Çağlar Gülçehre
Aaron van den Oord
Oriol Vinyals
Nando de Freitas
66
149
0
27 Sep 2018
Semi-Supervised Training for Improving Data Efficiency in End-to-End Speech Synthesis
Yu-An Chung
Yuxuan Wang
Wei-Ning Hsu
Yu Zhang
RJ Skerry-Ryan
51
117
0
30 Aug 2018
The Emotional Voices Database: Towards Controlling the Emotion Dimension in Voice Generation Systems
Adaeze Adigwe
Noé Tits
Kevin El Haddad
Sarah Ostadabbas
Thierry Dutoit
14
79
0
25 Jun 2018
Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis
Ye Jia
Yu Zhang
Ron J. Weiss
Quan Wang
Jonathan Shen
...
Zhiwen Chen
Patrick Nguyen
Ruoming Pang
Ignacio López Moreno
Yonghui Wu
240
826
0
12 Jun 2018
Exploring the Encoding Layer and Loss Function in End-to-End Speaker and Language Recognition System
Weicheng Cai
Jinkun Chen
Ming Li
46
331
0
14 Apr 2018
Neural Voice Cloning with a Few Samples
Sercan O. Arik
Jitong Chen
Kainan Peng
Ming-Yu Liu
Yanqi Zhou
46
384
0
14 Feb 2018
Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions
Jonathan Shen
Ruoming Pang
Ron J. Weiss
M. Schuster
Navdeep Jaitly
...
Yuxuan Wang
RJ Skerry-Ryan
Rif A. Saurous
Yannis Agiomyrgiannakis
Yonghui Wu
68
2,684
0
16 Dec 2017
Neural Discrete Representation Learning
Aaron van den Oord
Oriol Vinyals
Koray Kavukcuoglu
BDL
SSL
OCL
161
4,928
0
02 Nov 2017
WaveNet: A Generative Model for Raw Audio
Aaron van den Oord
Sander Dieleman
Heiga Zen
Karen Simonyan
Oriol Vinyals
Alex Graves
Nal Kalchbrenner
A. Senior
Koray Kavukcuoglu
DiffM
303
7,361
0
12 Sep 2016