ResearchTrend.AI
WaveNet: A Generative Model for Raw Audio

12 September 2016
Aaron van den Oord
Sander Dieleman
Heiga Zen
Karen Simonyan
Oriol Vinyals
Alex Graves
Nal Kalchbrenner
A. Senior
Koray Kavukcuoglu
    DiffM

Papers citing "WaveNet: A Generative Model for Raw Audio"

50 / 3,082 papers shown
Mel Spectrogram Inversion with Stable Pitch
Bruno Di Giorgi
M. Levy
Richard Sharp
95
6
0
26 Aug 2022
Ab-initio quantum chemistry with neural-network wavefunctions
J. Hermann
J. Spencer
Kenny Choo
Antonio Mezzacapo
W. Foulkes
David Pfau
Giuseppe Carleo
Frank Noé
AI4CE
83
86
0
26 Aug 2022
Interpretable Multimodal Emotion Recognition using Hybrid Fusion of Speech and Image Data
Puneet Kumar
Sarthak Malik
Balasubramanian Raman
CVBM
60
24
0
25 Aug 2022
VISTANet: VIsual Spoken Textual Additive Net for Interpretable Multimodal Emotion Recognition
Puneet Kumar
Sarthak Malik
Balasubramanian Raman
Xiaobai Li
166
2
0
24 Aug 2022
Deepfake: Definitions, Performance Metrics and Standards, Datasets and Benchmarks, and a Meta-Review
Enes ALTUNCU
V. N. Franqueira
Shujun Li
147
13
0
21 Aug 2022
Visualising Model Training via Vowel Space for Text-To-Speech Systems
Binu Abeysinghe
Jesin James
C. Watson
Felix Marattukalam
54
2
0
21 Aug 2022
An Initial Investigation for Detecting Vocoder Fingerprints of Fake Audio
Xin Yan
Jiangyan Yi
J. Tao
Chenglong Wang
Haoxin Ma
Tao Wang
Shiming Wang
Ruibo Fu
76
34
0
20 Aug 2022
Expressing Multivariate Time Series as Graphs with Time Series Attention Transformer
W. Ng
K. Siu
Albert C. Cheung
Michael K. Ng
AI4TS
55
7
0
19 Aug 2022
Sequence Prediction Under Missing Data : An RNN Approach Without Imputation
Soumen Pachal
Avinash Achar
AI4TS
36
4
0
18 Aug 2022
Pathway to Future Symbiotic Creativity
Yi-Ting Guo
Qi-fei Liu
Jie Chen
Wei Xue
Jie Fu
...
Fernando Rosas
Jeffrey Shaw
Xing Wu
Jiji Zhang
Jianliang Xu
66
0
0
18 Aug 2022
Speech Representation Disentanglement with Adversarial Mutual Information Learning for One-shot Voice Conversion
Sicheng Yang
Methawee Tantrawenith
Hao-Wen Zhuang
Zhiyong Wu
Aolan Sun
...
Ning Cheng
Huaizhen Tang
Xintao Zhao
Jie Wang
Helen Meng
DRL
56
39
0
18 Aug 2022
Deep Neural Network Approximation of Invariant Functions through Dynamical Systems
Qianxiao Li
T. Lin
Zuowei Shen
73
6
0
18 Aug 2022
Musika! Fast Infinite Waveform Music Generation
Marco Pasini
Jan Schluter
MGen
46
31
0
18 Aug 2022
Differentiable WORLD Synthesizer-based Neural Vocoder With Application To End-To-End Audio Style Transfer
S. Nercessian
107
9
0
15 Aug 2022
Towards Parametric Speech Synthesis Using Gaussian-Markov Model of Spectral Envelope and Wavelet-Based Decomposition of F0
M. S. Al-Radhi
Tamás Gábor Csapó
Csaba Zainkó
Géza Németh
50
1
0
15 Aug 2022
DDX7: Differentiable FM Synthesis of Musical Instrument Sounds
Franco Caspe
Andrew Mcpherson
Mark Sandler
72
30
0
12 Aug 2022
Uncertainty Quantification for Traffic Forecasting: A Unified Approach
Weizhu Qian
Dalin Zhang
Yan Zhao
Kai Zheng
James Jianqiao Yu
BDL AI4TS
69
23
0
11 Aug 2022
DDSP-based Singing Vocoders: A New Subtractive-based Synthesizer and A Comprehensive Evaluation
Da-Yi Wu
Wen-Yi Hsiao
Fu-Rong Yang
Oscar D. Friedman
Warren Jackson
Scott Bruzenak
Yi-Wen Liu
Yi-Hsuan Yang
DiffM
115
24
0
09 Aug 2022
Vision-Based Activity Recognition in Children with Autism-Related Behaviors
P. Wei
David Ahmedt-Aristizabal
Harshala Gammulle
Simon Denman
M. Armin
95
33
0
08 Aug 2022
fMRI-S4: learning short- and long-range dynamic fMRI dependencies using 1D Convolutions and State Space Models
A. E. Gazzar
R. Thomas
G. Wingen
69
3
0
08 Aug 2022
Mining Reaction and Diffusion Dynamics in Social Activities
Taichi Murayama
Yasuko Matsubara
Yasushi Sakurai
50
1
0
07 Aug 2022
SSDPT: Self-Supervised Dual-Path Transformer for Anomalous Sound Detection in Machine Condition Monitoring
Jisheng Bai
Jianfeng Chen
Mou Wang
Muhammad Saad Ayub
Qingli Yan
88
16
0
06 Aug 2022
Model Blending for Text Classification
Ramit Pahwa
35
0
0
05 Aug 2022
AdaCat: Adaptive Categorical Discretization for Autoregressive Models
Qiyang Li
Ajay Jain
Pieter Abbeel
OffRL
84
4
0
03 Aug 2022
A Study of Modeling Rising Intonation in Cantonese Neural Speech Synthesis
Qibing Bai
Tom Ko
Yu Zhang
92
4
0
03 Aug 2022
Conv-NILM-Net, a causal and multi-appliance model for energy source separation
Mohamed Alami Chehboune
Jérémie Decock
Rim Kaddah
Jesse Read
44
1
0
03 Aug 2022
Neuro-Symbolic Learning: Principles and Applications in Ophthalmology
Muhammad Hassan
Haifei Guan
Aikaterini Melliou
Yuqi Wang
Qianhui Sun
...
Qi Huang
Jiefu Tan
Qinwang Xing
Peiwu Qin
Dongmei Yu
NAI
109
15
0
31 Jul 2022
Geometric deep learning for computational mechanics Part II: Graph embedding for interpretable multiscale plasticity
Nikolaos N. Vlassis
WaiChing Sun
AI4CE
74
34
0
30 Jul 2022
Low-data? No problem: low-resource, language-agnostic conversational text-to-speech via F0-conditioned data augmentation
Giulia Comini
Goeric Huybrechts
M. Ribeiro
Adam Gabryś
Jaime Lorenzo-Trueba
67
5
0
29 Jul 2022
Generative Extraction of Audio Classifiers for Speaker Identification
Tejumade Afonja
Lucas Bourtoule
Varun Chandrasekaran
Sageev Oore
Nicolas Papernot
AAML
61
1
0
26 Jul 2022
Dive into Big Model Training
Qinghua Liu
Yuxiang Jiang
MoMe AI4CE LRM
41
3
0
25 Jul 2022
A Proposal for Foley Sound Synthesis Challenge
Keunwoo Choi
Sangshin Oh
Minsung Kang
Brian McFee
55
11
0
21 Jul 2022
Diffsound: Discrete Diffusion Model for Text-to-sound Generation
Dongchao Yang
Jianwei Yu
Helin Wang
Wen Wang
Chao Weng
Yuexian Zou
Dong Yu
DiffM
111
306
0
20 Jul 2022
Robust Multivariate Time-Series Forecasting: Adversarial Attacks and Defense Mechanisms
Linbo Liu
Youngsuk Park
T. Hoang
Hilaf Hasson
Jun Huan
AAML
99
8
0
19 Jul 2022
GAFX: A General Audio Feature eXtractor
Zhaoyang Bu
Han Zhang
Xiaohu Zhu
60
0
0
19 Jul 2022
Latent-Domain Predictive Neural Speech Coding
Xue Jiang
Xiulian Peng
Huaying Xue
Yuan Zhang
Yan Lu
81
18
0
18 Jul 2022
Toward reliable signals decoding for electroencephalogram: A benchmark study to EEGNeX
Xia Chen
Xiangbin Teng
Hannah S. Chen
Yafeng Pan
Philipp Geyer
84
46
0
15 Jul 2022
ProDiff: Progressive Fast Diffusion Model For High-Quality Text-to-Speech
Rongjie Huang
Zhou Zhao
Huadai Liu
Jinglin Liu
Chenye Cui
Yi Ren
DiffM
126
201
0
13 Jul 2022
Controllable and Lossless Non-Autoregressive End-to-End Text-to-Speech
Zhengxi Liu
Qiao Tian
Chenxu Hu
Xudong Liu
Meng-Che Wu
Yuping Wang
Hang Zhao
Yuxuan Wang
87
10
0
13 Jul 2022
SATTS: Speaker Attractor Text to Speech, Learning to Speak by Learning to Separate
Nabarun Goswami
Tatsuya Harada
78
5
0
13 Jul 2022
A Cyclical Approach to Synthetic and Natural Speech Mismatch Refinement of Neural Post-filter for Low-cost Text-to-speech System
Yi-Chiao Wu
Patrick Lumban Tobing
Kazuki Yasuhara
Noriyuki Matsunaga
Yamato Ohtani
Tomoki Toda
69
0
0
13 Jul 2022
CFAD: A Chinese Dataset for Fake Audio Detection
Haoxin Ma
Jiangyan Yi
Chenglong Wang
Xin Yan
J. Tao
Tao Wang
Shiming Wang
Ruibo Fu
95
30
0
12 Jul 2022
Multi-task Envisioning Transformer-based Autoencoder for Corporate Credit Rating Migration Early Prediction
Han Yue
Steve Q. Xia
Hongfu Liu
40
1
0
10 Jul 2022
Seasonal Encoder-Decoder Architecture for Forecasting
Avinash Achar
Soumen Pachal
BDL AI4TS
23
0
0
08 Jul 2022
End-to-End Binaural Speech Synthesis
Wen-Chin Huang
Dejan Marković
Alexander Richard
I. D. Gebru
Anjali Menon
65
9
0
08 Jul 2022
Cross-Scale Vector Quantization for Scalable Neural Speech Coding
Xue Jiang
Xiulian Peng
Huaying Xue
Yuan Zhang
Yan Lu
MQ
117
10
0
07 Jul 2022
Ultra-Low-Bitrate Speech Coding with Pretrained Transformers
Ali Siahkoohi
Michael Chinen
Tom Denton
W. Kleijn
Jan Skoglund
58
9
0
05 Jul 2022
A survey of multimodal deep generative models
Masahiro Suzuki
Y. Matsuo
SyDa DRL
84
82
0
05 Jul 2022
Towards trustworthy Energy Disaggregation: A review of challenges, methods and perspectives for Non-Intrusive Load Monitoring
Maria Kaselimi
Eftychios E. Protopapadakis
A. Voulodimos
N. Doulamis
Anastasios Doulamis
61
70
0
05 Jul 2022
Glow-WaveGAN 2: High-quality Zero-shot Text-to-speech Synthesis and Any-to-any Voice Conversion
Yinjiao Lei
Shan Yang
Jian Cong
Linfu Xie
Jane Polak Scowcroft
DiffM
92
12
0
05 Jul 2022