WaveNet: A Generative Model for Raw Audio

12 September 2016
Aaron van den Oord
Sander Dieleman
Heiga Zen
Karen Simonyan
Oriol Vinyals
Alex Graves
Nal Kalchbrenner
A. Senior
Koray Kavukcuoglu
    DiffM

Papers citing "WaveNet: A Generative Model for Raw Audio"

50 / 3,082 papers shown
GANs & Reels: Creating Irish Music using a Generative Adversarial Network
A. Kolokolova
M. Billard
Robert Bishop
Moustafa Elsisy
Zachary Northcott
Laura Graves
Vineel Nagisetty
Heather Patey
GAN
34
8
0
29 Oct 2020
The IQIYI System for Voice Conversion Challenge 2020
Wendong Gan
Haitao Chen
Yin Yan
Jianwei Li
Bolong Wen
Xueping Xu
Hai Li
26
0
0
29 Oct 2020
Speech Synthesis and Control Using Differentiable DSP
Giorgio Fabbro
Vladimir Golkov
Thomas Kemp
Zorah Lähner
78
12
0
28 Oct 2020
PPG-based singing voice conversion with adversarial representation learning
Zhonghao Li
Benlai Tang
Xiang Yin
Yuan Wan
Linjia Xu
Chen Shen
Zejun Ma
59
37
0
28 Oct 2020
Upsampling artifacts in neural audio synthesis
Jordi Pons
Santiago Pascual
Giulio Cengarle
Joan Serrà
95
64
0
27 Oct 2020
Parallel waveform synthesis based on generative adversarial networks with voicing-aware conditional discriminators
Ryuichi Yamamoto
Eunwoo Song
Min-Jae Hwang
Jae-Min Kim
76
18
0
27 Oct 2020
Benchmarking Deep Learning Interpretability in Time Series Predictions
Aya Abdelsalam Ismail
Mohamed K. Gunady
H. C. Bravo
Soheil Feizi
XAI, AI4TS, FAtt
72
174
0
26 Oct 2020
Shimon the Robot Film Composer and DeepScore: An LSTM for Generation of Film Scores based on Visual Analysis
Richard J. Savery
Gil Weinberg
29
7
0
26 Oct 2020
TTS-by-TTS: TTS-driven Data Augmentation for Fast and High-Quality Speech Synthesis
Min-Jae Hwang
Ryuichi Yamamoto
Eunwoo Song
Jae-Min Kim
44
32
0
26 Oct 2020
LagNetViP: A Lagrangian Neural Network for Video Prediction
Christine Allen-Blanchette
Sushant Veer
Anirudha Majumdar
Naomi Ehrich Leonard
112
31
0
24 Oct 2020
Autoregressive Score Matching
Chenlin Meng
Lantao Yu
Yang Song
Jiaming Song
Stefano Ermon
DiffM
241
14
0
24 Oct 2020
A Comparison of Discrete Latent Variable Models for Speech Representation Learning
Henry Zhou
Alexei Baevski
Michael Auli
DRL
67
10
0
24 Oct 2020
Show and Speak: Directly Synthesize Spoken Description of Images
Xinsheng Wang
Siyuan Feng
Jihua Zhu
M. Hasegawa-Johnson
O. Scharenborg
154
4
0
23 Oct 2020
Listening to Sounds of Silence for Speech Denoising
Ruilin Xu
Rundi Wu
Y. Ishiwaka
Carl Vondrick
Changxi Zheng
66
33
0
22 Oct 2020
Limitations of Autoregressive Models and Their Alternatives
Chu-cheng Lin
Aaron Jaech
Xin Li
Matthew R. Gormley
Jason Eisner
89
64
0
22 Oct 2020
CryptoGRU: Low Latency Privacy-Preserving Text Analysis With GRU
Bo Feng
Qian Lou
Lei Jiang
Geoffrey C. Fox
68
15
0
22 Oct 2020
AISHELL-3: A Multi-speaker Mandarin TTS Corpus and the Baselines
Yao Shi
Hui Bu
Xin Xu
Shaojing Zhang
Ming Li
119
223
0
22 Oct 2020
How Similar or Different Is Rakugo Speech Synthesizer to Professional Performers?
Shuhei Kato
Yusuke Yasuda
Xin Wang
Erica Cooper
Junichi Yamagishi
21
0
0
22 Oct 2020
Convolutional Autoencoders for Human Motion Infilling
Manuel Kaufmann
Emre Aksan
Mingli Song
Fabrizio Pece
R. Ziegler
Otmar Hilliges
3DH
54
102
0
22 Oct 2020
The NTU-AISG Text-to-speech System for Blizzard Challenge 2020
Haobo Zhang
Tingzhi Mao
Haihua Xu
Hao-Ming Huang
92
1
0
22 Oct 2020
Parallel Tacotron: Non-Autoregressive and Controllable TTS
Isaac Elias
Heiga Zen
Jonathan Shen
Yu Zhang
Ye Jia
Ron J. Weiss
Yonghui Wu
DRL
78
103
0
22 Oct 2020
NU-GAN: High resolution neural upsampling with GAN
Rithesh Kumar
Kundan Kumar
Vicki Anand
Yoshua Bengio
Aaron Courville
65
26
0
22 Oct 2020
Learning to Summarize Long Texts with Memory Compression and Transfer
Jaehong Park
Jonathan Pilault
C. Pal
44
0
0
21 Oct 2020
Transferable Graph Optimizers for ML Compilers
Yanqi Zhou
Sudip Roy
AmirAli Abdolrashidi
Daniel Wong
Peter C. Ma
...
Mangpo Phitchaya Phothilimtha
Shen Wang
Anna Goldie
Azalia Mirhoseini
James Laudon
GNN
73
55
0
21 Oct 2020
Improving Audio Anomalies Recognition Using Temporal Convolutional Attention Network
Qiang Huang
Thomas Hain
42
10
0
21 Oct 2020
WaveTransformer: A Novel Architecture for Audio Captioning Based on Learning Temporal and Time-Frequency Information
An Tran
Konstantinos Drossos
Tuomas Virtanen
106
19
0
21 Oct 2020
End-to-End Text-to-Speech using Latent Duration based on VQ-VAE
Yusuke Yasuda
Xin Wang
Junichi Yamagishi
68
17
0
19 Oct 2020
A combined full-reference image quality assessment approach based on convolutional activation maps
D. Varga
57
7
0
19 Oct 2020
Melody Classifier with Stacked-LSTM
You Li
Zhuowen Lin
15
1
0
16 Oct 2020
Sobolev training of thermodynamic-informed neural networks for smoothed elasto-plasticity models with level set hardening
Nikolaos N. Vlassis
WaiChing Sun
AI4CE
41
2
0
15 Oct 2020
The NeteaseGames System for Voice Conversion Challenge 2020 with Vector-quantization Variational Autoencoder and WaveNet
Haitong Zhang
DRL
31
4
0
15 Oct 2020
Unsupervised Video Anomaly Detection via Normalizing Flows with Implicit Latent Features
Myeongah Cho
Taeoh Kim
Woojin Kim
Suhwan Cho
Sangyoun Lee
98
95
0
15 Oct 2020
Medical Code Assignment with Gated Convolution and Note-Code Interaction
Shaoxiong Ji
Shirui Pan
Pekka Marttinen
MedIm
103
18
0
14 Oct 2020
Dual-mode ASR: Unify and Improve Streaming ASR with Full-context Modeling
Jiahui Yu
Wei Han
Anmol Gulati
Chung-Cheng Chiu
Yue Liu
Tara N. Sainath
Yonghui Wu
Ruoming Pang
125
19
0
12 Oct 2020
The Cone of Silence: Speech Separation by Localization
Teerapat Jenrungrot
V. Jayaram
S. M. Seitz
Ira Kemelmacher-Shlizerman
83
56
0
12 Oct 2020
HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis
Jungil Kong
Jaehyeon Kim
Jaekyoung Bae
183
1,958
0
12 Oct 2020
Enhancement Of Coded Speech Using a Mask-Based Post-Filter
Srikanth Korse
Kishan Gupta
Guillaume Fuchs
36
14
0
12 Oct 2020
AI Song Contest: Human-AI Co-Creation in Songwriting
Cheng-Zhi Anna Huang
Hendrik Vincent Koops
Ed Newton-Rex
Monica Dinculescu
Carrie J. Cai
57
92
0
12 Oct 2020
SJTU-NICT's Supervised and Unsupervised Neural Machine Translation Systems for the WMT20 News Translation Task
Z. Li
Hai Zhao
Rui Wang
Kehai Chen
Masao Utiyama
Eiichiro Sumita
66
15
0
11 Oct 2020
The NU Voice Conversion System for the Voice Conversion Challenge 2020: On the Effectiveness of Sequence-to-sequence Models and Autoregressive Neural Vocoders
Wen-Chin Huang
Patrick Lumban Tobing
Yi-Chiao Wu
Kazuhiro Kobayashi
Tomoki Toda
86
8
0
09 Oct 2020
Baseline System of Voice Conversion Challenge 2020 with Cyclic Variational Autoencoder and Parallel WaveGAN
Patrick Lumban Tobing
Yi-Chiao Wu
Tomoki Toda
DRL
60
14
0
09 Oct 2020
Non-Attentive Tacotron: Robust and Controllable Neural TTS Synthesis Including Unsupervised Duration Modeling
Jonathan Shen
Ye Jia
Mike Chrzanowski
Yu Zhang
Isaac Elias
Heiga Zen
Yonghui Wu
106
112
0
08 Oct 2020
Randomized Overdrive Neural Networks
C. Steinmetz
Joshua D. Reiss
50
4
0
08 Oct 2020
FastVC: Fast Voice Conversion with non-parallel data
Oriol Barbany
Milos Cernak
43
7
0
08 Oct 2020
Automating Inference of Binary Microlensing Events with Neural Density Estimation
Keming Zhang (张可名)
J. Bloom
B. Gaudi
F. Lanusse
C. Lam
Jessica R. Lu
21
1
0
08 Oct 2020
A Survey of Deep Meta-Learning
Mike Huisman
Jan N. van Rijn
Aske Plaat
201
335
0
07 Oct 2020
Improving Sequential Latent Variable Models with Autoregressive Flows
Joseph Marino
Lei Chen
Jiawei He
Stephan Mandt
BDL, AI4TS
127
12
0
07 Oct 2020
VoiceGrad: Non-Parallel Any-to-Many Voice Conversion with Annealed Langevin Dynamics
Hirokazu Kameoka
Takuhiro Kaneko
Kou Tanaka
Nobukatsu Hojo
Shogo Seki
DiffM
124
21
0
06 Oct 2020
Digital Voicing of Silent Speech
David Gaddy
Dana Klein
64
56
0
06 Oct 2020
A Contrastive Learning Approach for Training Variational Autoencoder Priors
J. Aneja
Alex Schwing
Jan Kautz
Arash Vahdat
DRL
126
83
0
06 Oct 2020