ResearchTrend.AI
WaveNet: A Generative Model for Raw Audio

12 September 2016
Aaron van den Oord
Sander Dieleman
Heiga Zen
Karen Simonyan
Oriol Vinyals
Alex Graves
Nal Kalchbrenner
A. Senior
Koray Kavukcuoglu
    DiffM

Papers citing "WaveNet: A Generative Model for Raw Audio"

50 / 3,082 papers shown
The Effectiveness of Discretization in Forecasting: An Empirical Study on Neural Time Series Models
Stephan Rabanser
Tim Januschowski
Valentin Flunkert
David Salinas
Jan Gasthaus
BDL, AI4TS
80
20
0
20 May 2020
Deep learning approaches for neural decoding: from CNNs to LSTMs and spikes to fMRI
J. Livezey
Joshua I. Glaser
AI4CE
100
9
0
19 May 2020
Toward Automated Classroom Observation: Multimodal Machine Learning to Estimate CLASS Positive Climate and Negative Climate
Anand Ramakrishnan
Brian Zylich
Erin Ottmar
Jennifer LoCasale-Crouch
Jacob Whitehill
36
27
0
19 May 2020
Vector-quantized neural networks for acoustic unit discovery in the ZeroSpeech 2020 challenge
Benjamin van Niekerk
Leanne Nortje
Herman Kamper
120
117
0
19 May 2020
Improving Accent Conversion with Reference Encoder and End-To-End Text-To-Speech
Wenjie Li
Benlai Tang
Xiang Yin
Yushi Zhao
Wei Li
Kang Wang
Hao Huang
Yuxuan Wang
Zejun Ma
70
13
0
19 May 2020
Defending Your Voice: Adversarial Attack on Voice Conversion
Chien-yu Huang
Yist Y. Lin
Hung-yi Lee
Lin-Shan Lee
AAML
87
52
0
18 May 2020
A Cyclical Post-filtering Approach to Mismatch Refinement of Neural Vocoder for Text-to-speech Systems
Yi-Chiao Wu
Patrick Lumban Tobing
Kazuki Yasuhara
Noriyuki Matsunaga
Yamato Ohtani
Tomoki Toda
50
5
0
18 May 2020
Quasi-Periodic Parallel WaveGAN Vocoder: A Non-autoregressive Pitch-dependent Dilated Convolution Model for Parametric Speech Generation
Yi-Chiao Wu
Tomoki Hayashi
T. Okamoto
Hisashi Kawai
Tomoki Toda
73
4
0
18 May 2020
MoBoAligner: a Neural Alignment Model for Non-autoregressive TTS with Monotonic Boundary Search
Naihan Li
Shujie Liu
Yanqing Liu
Sheng Zhao
Ming-Yuan Liu
Ming Zhou
50
6
0
18 May 2020
Many-to-Many Voice Transformer Network
Hirokazu Kameoka
Wen-Chin Huang
Kou Tanaka
Takuhiro Kaneko
Nobukatsu Hojo
Tomoki Toda
ViT
94
30
0
18 May 2020
Learning Individual Speaking Styles for Accurate Lip to Speech Synthesis
Prajwal K R
Rudrabha Mukhopadhyay
Vinay P. Namboodiri
C. V. Jawahar
71
113
0
17 May 2020
Universal Adversarial Perturbations: A Survey
Ashutosh Chaubey
Nikhil Agrawal
Kavya Barnwal
K. K. Guliani
Pramod Mehta
OOD, AAML
110
47
0
16 May 2020
Semi-supervised Learning for Multi-speaker Text-to-speech Synthesis Using Discrete Speech Representation
Tao Tu
Yuan-Jui Chen
Alexander H. Liu
Hung-yi Lee
54
7
0
16 May 2020
DAMIA: Leveraging Domain Adaptation as a Defense against Membership Inference Attacks
Hongwei Huang
Weiqi Luo
Guoqiang Zeng
J. Weng
Yue Zhang
Anjia Yang
AAML
35
26
0
16 May 2020
Improved Prosody from Learned F0 Codebook Representations for VQ-VAE Speech Waveform Reconstruction
Yi Zhao
Haoyu Li
Cheng-I Jeff Lai
Jennifer Williams
Erica Cooper
Junichi Yamagishi
84
18
0
16 May 2020
Unsupervised Cross-Domain Speech-to-Speech Conversion with Time-Frequency Consistency
M. A. Khan
Fabien Cardinaux
Stefan Uhlich
Marc Ferras
Asja Fischer
26
0
0
15 May 2020
JDI-T: Jointly trained Duration Informed Transformer for Text-To-Speech without Explicit Alignment
D. Lim
Won Jang
Gyeonghwan O
Heayoung Park
Bongwan Kim
Jaesam Yoon
71
37
0
15 May 2020
WG-WaveNet: Real-Time High-Fidelity Speech Synthesis without GPU
Po-Chun Hsu
Hung-yi Lee
44
16
0
15 May 2020
Reverberation Modeling for Source-Filter-based Neural Vocoder
Yang Ai
Xin Wang
Junichi Yamagishi
Zhenhua Ling
59
3
0
15 May 2020
OctSqueeze: Octree-Structured Entropy Model for LiDAR Compression
Lila Huang
Shenlong Wang
K. Wong
Jerry Liu
R. Urtasun
3DPC
68
146
0
14 May 2020
Neural Networks Versus Conventional Filters for Inertial-Sensor-based Attitude Estimation
Daniel Weber
C. Gühmann
Thomas Seel
37
35
0
14 May 2020
Foundations and modelling of dynamic networks using Dynamic Graph Neural Networks: A survey
Joakim Skarding
Bogdan Gabrys
Katarzyna Musial
AI4CE
122
240
0
13 May 2020
AdaDurIAN: Few-shot Adaptation for Neural Text-to-Speech with DurIAN
Zewang Zhang
Qiao Tian
Heng Lu
Ling-Hao Chen
Shan Liu
62
27
0
12 May 2020
FeatherWave: An efficient high-fidelity neural vocoder with multi-band linear prediction
Qiao Tian
Zewang Zhang
Heng Lu
Linghui Chen
Shan Liu
69
22
0
12 May 2020
DiscreTalk: Text-to-Speech as a Machine Translation Problem
Tomoki Hayashi
Shinji Watanabe
70
32
0
12 May 2020
TTS-Portuguese Corpus: a corpus for speech synthesis in Brazilian Portuguese
Edresson Casanova
A. Júnior
C. Shulby
F. S. Oliveira
João Paulo Teixeira
M. Ponti
S. Aluísio
75
24
0
11 May 2020
Multi-band MelGAN: Faster Waveform Generation for High-Quality Text-to-Speech
Geng Yang
Shan Yang
Kai-Chun Liu
Peng Fang
Wei Chen
Lei Xie
153
200
0
11 May 2020
GACELA -- A generative adversarial context encoder for long audio inpainting
Andrés Marafioti
P. Majdak
Nicki Holighaus
Nathanael Perraudin
100
46
0
11 May 2020
A review of radar-based nowcasting of precipitation and applicable machine learning techniques
R. Prudden
Samantha V. Adams
D. Kangin
Nial H. Robinson
Suman V. Ravuri
S. Mohamed
A. Arribas
AI4Cl, OffRL
94
45
0
11 May 2020
From Speaker Verification to Multispeaker Speech Synthesis, Deep Transfer with Feedback Constraint
Zexin Cai
Chuxiong Zhang
Ming Li
73
42
0
10 May 2020
Temporal-Framing Adaptive Network for Heart Sound Segmentation without Prior Knowledge of State Duration
Xingyao Wang
Chengyu Liu
Yuwen Li
Xianghong Cheng
Jianqing Li
Gari D. Clifford
MedIm
32
24
0
09 May 2020
Learning to Understand Child-directed and Adult-directed Speech
Lieke Gelderloos
Grzegorz Chrupała
Afra Alishahi
61
6
0
06 May 2020
Neural Networks and Value at Risk
Alexander Arimond
Damian Borth
Andreas G. F. Hoepner
M. Klawunn
S. Weisheit
37
8
0
04 May 2020
Hard-Coded Gaussian Attention for Neural Machine Translation
Weiqiu You
Simeng Sun
Mohit Iyyer
103
67
0
02 May 2020
Generative Adversarial Networks (GANs Survey): Challenges, Solutions, and Future Directions
Divya Saxena
Jiannong Cao
AAML, AI4CE
158
307
0
30 Apr 2020
Jukebox: A Generative Model for Music
Prafulla Dhariwal
Heewoo Jun
Christine Payne
Jong Wook Kim
Alec Radford
Ilya Sutskever
VLM
214
758
0
30 Apr 2020
CopyCat: Many-to-Many Fine-Grained Prosody Transfer for Neural Text-to-Speech
S. Karlapati
Alexis Moinet
Arnaud Joly
V. Klimkov
Daniel Sáez-Trigueros
Thomas Drugman
52
67
0
30 Apr 2020
Detecting Deep-Fake Videos from Appearance and Behavior
S. Agarwal
Tarek El-Gaaly
Hany Farid
Ser-Nam Lim
PICV
64
169
0
29 Apr 2020
Conditional Spoken Digit Generation with StyleGAN
Kasperi Palkama
Lauri Juvela
Alexander Ilin
GAN
61
10
0
28 Apr 2020
Time Series Forecasting With Deep Learning: A Survey
Bryan Lim
S. Zohren
AI4TS, AI4CE
128
1,257
0
28 Apr 2020
A Summary of the First Workshop on Language Technology for Language Documentation and Revitalization
Graham Neubig
Shruti Rijhwani
Alexis Palmer
Jordan MacKenzie
Hilaria Cruz
...
Yiyuan Li
S. Zink
Mengzhou Xia
Roshan S. Sharma
Patrick Littell
30
8
0
27 Apr 2020
Autoencoding Neural Networks as Musical Audio Synthesizers
Joseph T Colonel
C. Curro
S. Keene
MGen
16
2
0
27 Apr 2020
Interpretation of Deep Temporal Representations by Selective Visualization of Internally Activated Nodes
Sohee Cho
Ginkyeng Lee
Wonjoon Chang
Jaesik Choi
73
16
0
27 Apr 2020
Low-latency hand gesture recognition with a low resolution thermal imager
Maarten Vandersteegen
Wouter Reusen
Kristof Van Beeck
38
17
0
24 Apr 2020
ByteSing: A Chinese Singing Voice Synthesis System Using Duration Allocated Encoder-Decoder Acoustic Models and WaveRNN Vocoders
Yu Gu
Xiang Yin
Yonghui Rao
Yuan Wan
Benlai Tang
Yang Zhang
Jitong Chen
Yuxuan Wang
Zejun Ma
91
70
0
23 Apr 2020
Group Activity Detection from Trajectory and Video Data in Soccer
Ryan Sanford
Siavash Gorji
L. G. Hafemann
B. Pourbabaee
Mehrsan Javan
61
34
0
21 Apr 2020
Deep Learning for Time Series Forecasting: Tutorial and Literature Survey
Konstantinos Benidis
Syama Sundar Rangapuram
Valentin Flunkert
Bernie Wang
Danielle C. Maddix
...
David Salinas
Lorenzo Stella
François-Xavier Aubet
Laurent Callot
Tim Januschowski
AI4TS
99
202
0
21 Apr 2020
ESPnet-ST: All-in-One Speech Translation Toolkit
Hirofumi Inaguma
Shun Kiyono
Kevin Duh
Shigeki Karita
Nelson Yalta
Tomoki Hayashi
Shinji Watanabe
118
166
0
21 Apr 2020
Data Processing for Optimizing Naturalness of Vietnamese Text-to-speech System
V. Phung
Phan Huy Kinh
Anh-Tuan Dinh
Quoc Bao Nguyen
35
5
0
20 Apr 2020
Speech Paralinguistic Approach for Detecting Dementia Using Gated Convolutional Neural Network
M. R. Makiuchi
Tifani Warnita
Nakamasa Inoue
Koichi Shinoda
M. Yoshimura
Momoko Kitazawa
K. Funaki
Yoko Eguchi
T. Kishimoto
56
11
0
16 Apr 2020