ResearchTrend.AI
WaveNet: A Generative Model for Raw Audio


12 September 2016
Aaron van den Oord
Sander Dieleman
Heiga Zen
Karen Simonyan
Oriol Vinyals
Alex Graves
Nal Kalchbrenner
A. Senior
Koray Kavukcuoglu
    DiffM

Papers citing "WaveNet: A Generative Model for Raw Audio"

50 / 3,039 papers shown
Diffusion Models: A Comprehensive Survey of Methods and Applications
Ling Yang
Zhilong Zhang
Yingxia Shao
Shenda Hong
Runsheng Xu
Yue Zhao
Wentao Zhang
Bin Cui
Ming-Hsuan Yang
DiffM
MedIm
226
1,320
0
02 Sep 2022
Evaluating generative audio systems and their metrics
Ashvala Vinay
Alexander Lerch
35
19
0
31 Aug 2022
A Circular Window-based Cascade Transformer for Online Action Detection
Shuyuan Cao
Weihua Luo
Bairui Wang
Wei Emma Zhang
Lin Ma
57
6
0
30 Aug 2022
Spatio-Temporal Wind Speed Forecasting using Graph Networks and Novel Transformer Architectures
Lars Odegaard Bentsen
N. Warakagoda
R. Stenbro
P. Engelstad
AI4TS
29
101
0
29 Aug 2022
Training Text-To-Speech Systems From Synthetic Data: A Practical Approach For Accent Transfer Tasks
L. Finkelstein
Heiga Zen
Norman Casagrande
Chun-an Chan
Ye Jia
...
Jonathan Shen
V. Wan
Yu Zhang
Yonghui Wu
R. Clark
33
9
0
28 Aug 2022
Mel Spectrogram Inversion with Stable Pitch
Bruno Di Giorgi
M. Levy
Richard Sharp
28
6
0
26 Aug 2022
Ab-initio quantum chemistry with neural-network wavefunctions
J. Hermann
J. Spencer
Kenny Choo
Antonio Mezzacapo
W. Foulkes
David Pfau
Giuseppe Carleo
Frank Noé
AI4CE
42
73
0
26 Aug 2022
Interpretable Multimodal Emotion Recognition using Hybrid Fusion of Speech and Image Data
Puneet Kumar
Sarthak Malik
Balasubramanian Raman
CVBM
33
22
0
25 Aug 2022
VISTANet: VIsual Spoken Textual Additive Net for Interpretable Multimodal Emotion Recognition
Puneet Kumar
Sarthak Malik
Balasubramanian Raman
Xiaobai Li
31
2
0
24 Aug 2022
Deepfake: Definitions, Performance Metrics and Standards, Datasets and Benchmarks, and a Meta-Review
Enes Altuncu
V. N. Franqueira
Shujun Li
40
11
0
21 Aug 2022
Visualising Model Training via Vowel Space for Text-To-Speech Systems
Binu Abeysinghe
Jesin James
C. Watson
Felix Marattukalam
32
2
0
21 Aug 2022
An Initial Investigation for Detecting Vocoder Fingerprints of Fake Audio
Xin Yan
Jiangyan Yi
J. Tao
Chenglong Wang
Haoxin Ma
Tao Wang
Shiming Wang
Ruibo Fu
30
27
0
20 Aug 2022
Expressing Multivariate Time Series as Graphs with Time Series Attention Transformer
W. Ng
K. Siu
Albert C. Cheung
Michael K. Ng
AI4TS
24
7
0
19 Aug 2022
Sequence Prediction Under Missing Data : An RNN Approach Without Imputation
Soumen Pachal
Avinash Achar
AI4TS
14
4
0
18 Aug 2022
Pathway to Future Symbiotic Creativity
Yi-Ting Guo
Qi-fei Liu
Jie Chen
Wei Xue
Jie Fu
...
Fernando Rosas
Jeffrey Shaw
Xing Wu
Jiji Zhang
Jianliang Xu
39
0
0
18 Aug 2022
Speech Representation Disentanglement with Adversarial Mutual Information Learning for One-shot Voice Conversion
Sicheng Yang
Methawee Tantrawenith
Hao-Wen Zhuang
Zhiyong Wu
Aolan Sun
...
Ning Cheng
Huaizhen Tang
Xintao Zhao
Jie Wang
Helen Meng
DRL
27
38
0
18 Aug 2022
Deep Neural Network Approximation of Invariant Functions through Dynamical Systems
Qianxiao Li
T. Lin
Zuowei Shen
34
6
0
18 Aug 2022
Musika! Fast Infinite Waveform Music Generation
Marco Pasini
Jan Schlüter
MGen
20
29
0
18 Aug 2022
Differentiable WORLD Synthesizer-based Neural Vocoder With Application To End-To-End Audio Style Transfer
S. Nercessian
15
9
0
15 Aug 2022
Towards Parametric Speech Synthesis Using Gaussian-Markov Model of Spectral Envelope and Wavelet-Based Decomposition of F0
M. S. Al-Radhi
Tamás Gábor Csapó
Csaba Zainkó
Géza Németh
19
1
0
15 Aug 2022
DDX7: Differentiable FM Synthesis of Musical Instrument Sounds
Franco Caspe
Andrew McPherson
Mark Sandler
33
30
0
12 Aug 2022
Uncertainty Quantification for Traffic Forecasting: A Unified Approach
Weizhu Qian
Dalin Zhang
Yan Zhao
Kai Zheng
James J. Q. Yu
BDL
AI4TS
40
22
0
11 Aug 2022
DDSP-based Singing Vocoders: A New Subtractive-based Synthesizer and A Comprehensive Evaluation
Da-Yi Wu
Wen-Yi Hsiao
Fu-Rong Yang
Oscar D. Friedman
Warren Jackson
Scott Bruzenak
Yi-Wen Liu
Yi-Hsuan Yang
DiffM
39
24
0
09 Aug 2022
Vision-Based Activity Recognition in Children with Autism-Related Behaviors
P. Wei
David Ahmedt-Aristizabal
Harshala Gammulle
Simon Denman
M. Armin
48
31
0
08 Aug 2022
fMRI-S4: learning short- and long-range dynamic fMRI dependencies using 1D Convolutions and State Space Models
A. E. Gazzar
R. Thomas
G. Wingen
29
3
0
08 Aug 2022
Mining Reaction and Diffusion Dynamics in Social Activities
Taichi Murayama
Yasuko Matsubara
Yasushi Sakurai
27
1
0
07 Aug 2022
SSDPT: Self-Supervised Dual-Path Transformer for Anomalous Sound Detection in Machine Condition Monitoring
Jisheng Bai
Jianfeng Chen
Mou Wang
Muhammad Saad Ayub
Qingli Yan
54
15
0
06 Aug 2022
Model Blending for Text Classification
Ramit Pahwa
26
0
0
05 Aug 2022
AdaCat: Adaptive Categorical Discretization for Autoregressive Models
Qiyang Li
Ajay Jain
Pieter Abbeel
OffRL
45
4
0
03 Aug 2022
A Study of Modeling Rising Intonation in Cantonese Neural Speech Synthesis
Qibing Bai
Tom Ko
Yu Zhang
32
4
0
03 Aug 2022
Conv-NILM-Net, a causal and multi-appliance model for energy source separation
Mohamed Alami Chehboune
Jérémie Decock
Rim Kaddah
Jesse Read
35
1
0
03 Aug 2022
Neuro-Symbolic Learning: Principles and Applications in Ophthalmology
Muhammad Hassan
Haifei Guan
Aikaterini Melliou
Yuqi Wang
Qianhui Sun
...
Qi Huang
Jiefu Tan
Qinwang Xing
Peiwu Qin
Dongmei Yu
NAI
54
14
0
31 Jul 2022
Geometric deep learning for computational mechanics Part II: Graph embedding for interpretable multiscale plasticity
Nikolaos N. Vlassis
WaiChing Sun
AI4CE
37
33
0
30 Jul 2022
Low-data? No problem: low-resource, language-agnostic conversational text-to-speech via F0-conditioned data augmentation
Giulia Comini
Goeric Huybrechts
M. Ribeiro
Adam Gabryś
Jaime Lorenzo-Trueba
35
5
0
29 Jul 2022
Generative Extraction of Audio Classifiers for Speaker Identification
Tejumade Afonja
Lucas Bourtoule
Varun Chandrasekaran
Sageev Oore
Nicolas Papernot
AAML
15
1
0
26 Jul 2022
Dive into Big Model Training
Qinghua Liu
Yuxiang Jiang
MoMe
AI4CE
LRM
21
3
0
25 Jul 2022
A Proposal for Foley Sound Synthesis Challenge
Keunwoo Choi
Sangshin Oh
Minsung Kang
Brian McFee
26
11
0
21 Jul 2022
Diffsound: Discrete Diffusion Model for Text-to-sound Generation
Dongchao Yang
Jianwei Yu
Helin Wang
Wen Wang
Chao Weng
Yuexian Zou
Dong Yu
DiffM
36
297
0
20 Jul 2022
Robust Multivariate Time-Series Forecasting: Adversarial Attacks and Defense Mechanisms
Linbo Liu
Youngsuk Park
T. Hoang
Hilaf Hasson
Jun Huan
AAML
63
6
0
19 Jul 2022
GAFX: A General Audio Feature eXtractor
Zhaoyang Bu
Han Zhang
Xiaohu Zhu
30
0
0
19 Jul 2022
Latent-Domain Predictive Neural Speech Coding
Xue Jiang
Xiulian Peng
Huaying Xue
Yuan Zhang
Yan Lu
46
17
0
18 Jul 2022
Toward reliable signals decoding for electroencephalogram: A benchmark study to EEGNeX
Xia Chen
Xiangbin Teng
Hannah S. Chen
Yafeng Pan
Philipp Geyer
34
44
0
15 Jul 2022
ProDiff: Progressive Fast Diffusion Model For High-Quality Text-to-Speech
Rongjie Huang
Zhou Zhao
Huadai Liu
Jinglin Liu
Chenye Cui
Yi Ren
DiffM
44
196
0
13 Jul 2022
Controllable and Lossless Non-Autoregressive End-to-End Text-to-Speech
Zhengxi Liu
Qiao Tian
Chenxu Hu
Xudong Liu
Meng-Che Wu
Yuping Wang
Hang Zhao
Yuxuan Wang
36
10
0
13 Jul 2022
SATTS: Speaker Attractor Text to Speech, Learning to Speak by Learning to Separate
Nabarun Goswami
Tatsuya Harada
26
5
0
13 Jul 2022
A Cyclical Approach to Synthetic and Natural Speech Mismatch Refinement of Neural Post-filter for Low-cost Text-to-speech System
Yi-Chiao Wu
Patrick Lumban Tobing
Kazuki Yasuhara
Noriyuki Matsunaga
Yamato Ohtani
Tomoki Toda
42
0
0
13 Jul 2022
CFAD: A Chinese Dataset for Fake Audio Detection
Haoxin Ma
Jiangyan Yi
Chenglong Wang
Xin Yan
J. Tao
Tao Wang
Shiming Wang
Ruibo Fu
24
26
0
12 Jul 2022
Multi-task Envisioning Transformer-based Autoencoder for Corporate Credit Rating Migration Early Prediction
Han Yue
Steve Q. Xia
Hongfu Liu
26
1
0
10 Jul 2022
Seasonal Encoder-Decoder Architecture for Forecasting
Avinash Achar
Soumen Pachal
BDL
AI4TS
19
0
0
08 Jul 2022
End-to-End Binaural Speech Synthesis
Wen-Chin Huang
Dejan Marković
Alexander Richard
I. D. Gebru
Anjali Menon
32
8
0
08 Jul 2022