ResearchTrend.AI
WaveNet: A Generative Model for Raw Audio

12 September 2016
Aaron van den Oord
Sander Dieleman
Heiga Zen
Karen Simonyan
Oriol Vinyals
Alex Graves
Nal Kalchbrenner
A. Senior
Koray Kavukcuoglu
    DiffM

Papers citing "WaveNet: A Generative Model for Raw Audio"

50 / 3,082 papers shown
Title
A Differentiable Perceptual Audio Metric Learned from Just Noticeable Differences
Pranay Manocha
Adam Finkelstein
Richard Y. Zhang
Nicholas J. Bryan
G. J. Mysore
Zeyu Jin
89
69
0
13 Jan 2020
Advbox: a toolbox to generate adversarial examples that fool neural networks
Dou Goodman
Xin Hao
Yang Wang
Yuesheng Wu
Junfeng Xiong
Huan Zhang
AAML
138
55
0
13 Jan 2020
Deep Learning based Pedestrian Inertial Navigation: Methods, Dataset and On-Device Inference
Changhao Chen
Peijun Zhao
Chris Xiaoxuan Lu
Wei Wang
Andrew Markham
A. Trigoni
71
114
0
13 Jan 2020
Temporally Folded Convolutional Neural Networks for Sequence Forecasting
Matthias Weissenbacher
AI4TS
41
0
0
10 Jan 2020
Mel-spectrogram augmentation for sequence to sequence voice conversion
Yeongtae Hwang
Hyemin Cho
Hongsun Yang
Dong-Ok Won
Insoo Oh
Seong-Whan Lee
65
15
0
06 Jan 2020
Temporal Tensor Transformation Network for Multivariate Time Series Prediction
Yuya Jeremy Ong
Mu Qiao
D. Jadav
AI4TS
20
4
0
04 Jan 2020
Deep Representation Learning in Speech Processing: Challenges, Recent Advances, and Future Trends
S. Latif
R. Rana
Sara Khalifa
Raja Jurdak
Junaid Qadir
Björn W. Schuller
AI4TS
96
82
0
02 Jan 2020
Prediction of wall-bounded turbulence from wall quantities using convolutional neural networks
L. Guastoni
M. P. Encinar
P. Schlatter
Hossein Azizpour
R. Vinuesa
MDE
37
33
0
30 Dec 2019
nnAudio: An on-the-fly GPU Audio to Spectrogram Conversion Toolbox Using 1D Convolution Neural Networks
K. Cheuk
Hans Anderson
Kat R. Agres
Dorien Herremans
248
5
0
27 Dec 2019
Synthesising Expressiveness in Peking Opera via Duration Informed Attention Network
Yusong Wu
Shengchen Li
Chengzhu Yu
Heng Lu
Chao Weng
Liqiang Zhang
Dong Yu
59
5
0
27 Dec 2019
Score and Lyrics-Free Singing Voice Generation
Jen-Yu Liu
Yu-Hua Chen
Yin-Cheng Yeh
Yi-Hsuan Yang
70
22
0
26 Dec 2019
Deep Learning-based Vehicle Behaviour Prediction For Autonomous Driving Applications: A Review
Sajjad Mozaffari
Omar Y. Al-Jarrah
M. Dianati
P. Jennings
A. Mouzakitis
93
135
0
25 Dec 2019
Multi-level Convolutional Autoencoder Networks for Parametric Prediction of Spatio-temporal Dynamics
Jiayang Xu
Karthik Duraisamy
AI4CE
95
143
0
23 Dec 2019
Probing the phonetic and phonological knowledge of tones in Mandarin TTS models
Jian Zhu
62
8
0
23 Dec 2019
Learning Singing From Speech
Liqiang Zhang
Chengzhu Yu
Heng Lu
Chao Weng
Yusong Wu
Xiang Xie
Zijin Li
Dong Yu
53
8
0
20 Dec 2019
HiLLoC: Lossless Image Compression with Hierarchical Latent Variable Models
James Townsend
Thomas Bird
Julius Kunze
David Barber
BDL, VLM
151
56
0
20 Dec 2019
Vertex Feature Encoding and Hierarchical Temporal Modeling in a Spatial-Temporal Graph Convolutional Network for Action Recognition
Konstantinos Papadopoulos
Enjie Ghorbel
Djamila Aouada
Björn E. Ottersten
GNN
162
42
0
20 Dec 2019
Smart Home Appliances: Chat with Your Fridge
Denis A. Gudovskiy
Gyuri Han
Takuya Yamaguchi
Sotaro Tsukizawa
LRM
26
4
0
19 Dec 2019
Learning a Spatio-Temporal Embedding for Video Instance Segmentation
Anthony Hu
Alex Kendall
R. Cipolla
123
19
0
19 Dec 2019
Water Supply Prediction Based on Initialized Attention Residual Network
Yu Long
Jingcheng Wang
Jingyi Wang
28
1
0
17 Dec 2019
A Transferable Adaptive Domain Adversarial Neural Network for Virtual Reality Augmented EMG-Based Gesture Recognition
Ulysse Côté-Allard
Gabriel Gagnon-Turcotte
A. Phinyomark
K. Glette
Erik J. Scheme
François Laviolette
Benoit Gosselin
41
7
0
16 Dec 2019
Efficient Convolutional Neural Networks for Diacritic Restoration
Sawsan Alqahtani
Ajay K. Mishra
Mona T. Diab
46
24
0
14 Dec 2019
Voice Transformer Network: Sequence-to-Sequence Voice Conversion Using Transformer with Text-to-Speech Pretraining
Wen-Chin Huang
Tomoki Hayashi
Yi-Chiao Wu
Hirokazu Kameoka
Tomoki Toda
80
99
0
14 Dec 2019
SPIN: A High Speed, High Resolution Vision Dataset for Tracking and Action Recognition in Ping Pong
S. Schwarcz
Peng Xu
Davide D'Ambrosio
Juhana Kangaspunta
A. Angelova
Huong Phan
Navdeep Jaitly
65
7
0
13 Dec 2019
Human Motion Anticipation with Symbolic Label
Julian Tanke
A. Weber
Juergen Gall
76
6
0
12 Dec 2019
Singing Synthesis: with a little help from my attention
Orazio Angelini
Alexis Moinet
K. Yanagisawa
Thomas Drugman
61
17
0
12 Dec 2019
Learning to Model Aspects of Hearing Perception Using Neural Loss Functions
Prateek Verma
J. Berger
AAML
31
3
0
11 Dec 2019
Incrementally Improving Graph WaveNet Performance on Traffic Prediction
Sam Shleifer
Clara H. McCreery
Vamsi Chitters
GNN, AI4TS
74
21
0
11 Dec 2019
Neural Voice Puppetry: Audio-driven Facial Reenactment
Justus Thies
Mohamed A. Elgharib
A. Tewari
Christian Theobalt
Matthias Nießner
VGen, 3DH
88
375
0
11 Dec 2019
Encoding Musical Style with Transformer Autoencoders
Kristy Choi
Curtis Hawthorne
Ian Simon
Monica Dinculescu
Jesse Engel
95
90
0
10 Dec 2019
Deep symbolic regression: Recovering mathematical expressions from data via risk-seeking policy gradients
Brenden K. Petersen
Mikel Landajuela
T. Nathan Mundhenk
Claudio Santiago
Soo K. Kim
Joanne T. Kim
68
320
0
10 Dec 2019
MITAS: A Compressed Time-Domain Audio Separation Network with Parameter Sharing
Chao-I Tuan
Yuan-Kuei Wu
Hung-yi Lee
Yu Tsao
30
2
0
09 Dec 2019
Connecting Vision and Language with Localized Narratives
Jordi Pont-Tuset
J. Uijlings
Soravit Changpinyo
Radu Soricut
V. Ferrari
ObjD
143
252
0
06 Dec 2019
Normalizing Flows for Probabilistic Modeling and Inference
George Papamakarios
Eric T. Nalisnick
Danilo Jimenez Rezende
S. Mohamed
Balaji Lakshminarayanan
TPM, AI4CE
219
1,725
0
05 Dec 2019
Towards Robust Neural Vocoding for Speech Generation: A Survey
Po-Chun Hsu
Chun-hsuan Wang
Andy T. Liu
Hung-yi Lee
OOD
78
25
0
05 Dec 2019
PitchNet: Unsupervised Singing Voice Conversion with Pitch Adversarial Network
Chen Deng
Chengzhu Yu
Heng Lu
Chao Weng
Dong Yu
SSL
71
40
0
04 Dec 2019
WaveFlow: A Compact Flow-based Model for Raw Audio
Wei Ping
Kainan Peng
Kexin Zhao
Z. Song
104
117
0
03 Dec 2019
High-quality Speech Synthesis Using Super-resolution Mel-Spectrogram
Leyuan Sheng
Dong-Yan Huang
Evgeny Nikolaevich Pavlovskiy
82
15
0
03 Dec 2019
Bayesian-Deep-Learning Estimation of Earthquake Location from Single-Station Observations
S. Mousavi
G. Beroza
BDL
26
104
0
03 Dec 2019
Long Distance Relationships without Time Travel: Boosting the Performance of a Sparse Predictive Autoencoder in Sequence Modeling
J. Gordon
D. Rawlinson
Subutai Ahmad
61
5
0
02 Dec 2019
TeaNet: universal neural network interatomic potential inspired by iterative electronic relaxations
So Takamoto
S. Izumi
Ju Li
GNN
65
80
0
02 Dec 2019
Flow Contrastive Estimation of Energy-Based Models
Ruiqi Gao
Erik Nijkamp
Diederik P. Kingma
Zhen Xu
Andrew M. Dai
Ying Nian Wu
GAN
101
115
0
02 Dec 2019
STConvS2S: Spatiotemporal Convolutional Sequence to Sequence Network for Weather Forecasting
Rafaela C. Nascimento
Y. M. Souto
Eduardo S. Ogasawara
Fábio Porto
Eduardo Bezerra
AI4TS
98
89
0
30 Nov 2019
Using VAEs and Normalizing Flows for One-shot Text-To-Speech Synthesis of Expressive Speech
Vatsal Aggarwal
Marius Cotescu
N. Prateek
Jaime Lorenzo-Trueba
Roberto Barra-Chicote
98
31
0
28 Nov 2019
Designing the Next Generation of Intelligent Personal Robotic Assistants for the Physically Impaired
Basit Ayantunde
Jane Odum
Fadlullah Olawumi
Joshua Olalekan
23
1
0
28 Nov 2019
AR-Net: A simple Auto-Regressive Neural Network for time-series
Oskar Triebe
N. Laptev
Ram Rajagopal
AI4TS, AI4CE
102
59
0
27 Nov 2019
ConCare: Personalized Clinical Feature Embedding via Capturing the Healthcare Context
Liantao Ma
Chaohe Zhang
Yasha Wang
Wenjie Ruan
Jiangtao Wang
Wen Tang
Xinyu Ma
Xin Gao
Junyi Gao
66
159
0
27 Nov 2019
AdaCare: Explainable Clinical Health Status Representation Learning via Scale-Adaptive Feature Extraction and Recalibration
Liantao Ma
Junyi Gao
Yasha Wang
Chaohe Zhang
Jiangtao Wang
Wenjie Ruan
Wen Tang
Xin Gao
Xinyu Ma
OOD, AI4TS
79
121
0
27 Nov 2019
Jejueo Datasets for Machine Translation and Speech Synthesis
Kyubyong Park
Yo Joong Choe
Jiyeon Ham
26
5
0
27 Nov 2019
SchrödingeRNN: Generative Modeling of Raw Audio as a Continuously Observed Quantum State
Beñat Mencia Uranga
A. Lamacraft
96
3
0
26 Nov 2019