1609.03499
Cited By
WaveNet: A Generative Model for Raw Audio
12 September 2016
Aaron van den Oord
Sander Dieleman
Heiga Zen
Karen Simonyan
Oriol Vinyals
Alex Graves
Nal Kalchbrenner
A. Senior
Koray Kavukcuoglu
DiffM
Papers citing
"WaveNet: A Generative Model for Raw Audio"
50 / 3,039 papers shown
Title
Diffusion Models: A Comprehensive Survey of Methods and Applications
Ling Yang
Zhilong Zhang
Yingxia Shao
Shenda Hong
Runsheng Xu
Yue Zhao
Wentao Zhang
Bin Cui
Ming-Hsuan Yang
DiffM
MedIm
226
1,320
0
02 Sep 2022
Evaluating generative audio systems and their metrics
Ashvala Vinay
Alexander Lerch
35
19
0
31 Aug 2022
A Circular Window-based Cascade Transformer for Online Action Detection
Shuyuan Cao
Weihua Luo
Bairui Wang
Wei Emma Zhang
Lin Ma
57
6
0
30 Aug 2022
Spatio-Temporal Wind Speed Forecasting using Graph Networks and Novel Transformer Architectures
Lars Odegaard Bentsen
N. Warakagoda
R. Stenbro
P. Engelstad
AI4TS
29
101
0
29 Aug 2022
Training Text-To-Speech Systems From Synthetic Data: A Practical Approach For Accent Transfer Tasks
L. Finkelstein
Heiga Zen
Norman Casagrande
Chun-an Chan
Ye Jia
...
Jonathan Shen
V. Wan
Yu Zhang
Yonghui Wu
R. Clark
33
9
0
28 Aug 2022
Mel Spectrogram Inversion with Stable Pitch
Bruno Di Giorgi
M. Levy
Richard Sharp
28
6
0
26 Aug 2022
Ab-initio quantum chemistry with neural-network wavefunctions
J. Hermann
J. Spencer
Kenny Choo
Antonio Mezzacapo
W. Foulkes
David Pfau
Giuseppe Carleo
Frank Noé
AI4CE
42
73
0
26 Aug 2022
Interpretable Multimodal Emotion Recognition using Hybrid Fusion of Speech and Image Data
Puneet Kumar
Sarthak Malik
Balasubramanian Raman
CVBM
33
22
0
25 Aug 2022
VISTANet: VIsual Spoken Textual Additive Net for Interpretable Multimodal Emotion Recognition
Puneet Kumar
Sarthak Malik
Balasubramanian Raman
Xiaobai Li
31
2
0
24 Aug 2022
Deepfake: Definitions, Performance Metrics and Standards, Datasets and Benchmarks, and a Meta-Review
Enes Altuncu
V. N. Franqueira
Shujun Li
40
11
0
21 Aug 2022
Visualising Model Training via Vowel Space for Text-To-Speech Systems
Binu Abeysinghe
Jesin James
C. Watson
Felix Marattukalam
32
2
0
21 Aug 2022
An Initial Investigation for Detecting Vocoder Fingerprints of Fake Audio
Xin Yan
Jiangyan Yi
J. Tao
Chenglong Wang
Haoxin Ma
Tao Wang
Shiming Wang
Ruibo Fu
30
27
0
20 Aug 2022
Expressing Multivariate Time Series as Graphs with Time Series Attention Transformer
W. Ng
K. Siu
Albert C. Cheung
Michael K. Ng
AI4TS
24
7
0
19 Aug 2022
Sequence Prediction Under Missing Data: An RNN Approach Without Imputation
Soumen Pachal
Avinash Achar
AI4TS
14
4
0
18 Aug 2022
Pathway to Future Symbiotic Creativity
Yi-Ting Guo
Qi-fei Liu
Jie Chen
Wei Xue
Jie Fu
...
Fernando Rosas
Jeffrey Shaw
Xing Wu
Jiji Zhang
Jianliang Xu
39
0
0
18 Aug 2022
Speech Representation Disentanglement with Adversarial Mutual Information Learning for One-shot Voice Conversion
Sicheng Yang
Methawee Tantrawenith
Hao-Wen Zhuang
Zhiyong Wu
Aolan Sun
...
Ning Cheng
Huaizhen Tang
Xintao Zhao
Jie Wang
Helen Meng
DRL
27
38
0
18 Aug 2022
Deep Neural Network Approximation of Invariant Functions through Dynamical Systems
Qianxiao Li
T. Lin
Zuowei Shen
34
6
0
18 Aug 2022
Musika! Fast Infinite Waveform Music Generation
Marco Pasini
Jan Schlüter
MGen
20
29
0
18 Aug 2022
Differentiable WORLD Synthesizer-based Neural Vocoder With Application To End-To-End Audio Style Transfer
S. Nercessian
15
9
0
15 Aug 2022
Towards Parametric Speech Synthesis Using Gaussian-Markov Model of Spectral Envelope and Wavelet-Based Decomposition of F0
M. S. Al-Radhi
Tamás Gábor Csapó
Csaba Zainkó
Géza Németh
19
1
0
15 Aug 2022
DDX7: Differentiable FM Synthesis of Musical Instrument Sounds
Franco Caspe
Andrew McPherson
Mark Sandler
33
30
0
12 Aug 2022
Uncertainty Quantification for Traffic Forecasting: A Unified Approach
Weizhu Qian
Dalin Zhang
Yan Zhao
Kai Zheng
James J. Q. Yu
BDL
AI4TS
40
22
0
11 Aug 2022
DDSP-based Singing Vocoders: A New Subtractive-based Synthesizer and A Comprehensive Evaluation
Da-Yi Wu
Wen-Yi Hsiao
Fu-Rong Yang
Oscar D. Friedman
Warren Jackson
Scott Bruzenak
Yi-Wen Liu
Yi-Hsuan Yang
DiffM
39
24
0
09 Aug 2022
Vision-Based Activity Recognition in Children with Autism-Related Behaviors
P. Wei
David Ahmedt-Aristizabal
Harshala Gammulle
Simon Denman
M. Armin
48
31
0
08 Aug 2022
fMRI-S4: learning short- and long-range dynamic fMRI dependencies using 1D Convolutions and State Space Models
A. E. Gazzar
R. Thomas
G. Wingen
29
3
0
08 Aug 2022
Mining Reaction and Diffusion Dynamics in Social Activities
Taichi Murayama
Yasuko Matsubara
Yasushi Sakurai
27
1
0
07 Aug 2022
SSDPT: Self-Supervised Dual-Path Transformer for Anomalous Sound Detection in Machine Condition Monitoring
Jisheng Bai
Jianfeng Chen
Mou Wang
Muhammad Saad Ayub
Qingli Yan
54
15
0
06 Aug 2022
Model Blending for Text Classification
Ramit Pahwa
26
0
0
05 Aug 2022
AdaCat: Adaptive Categorical Discretization for Autoregressive Models
Qiyang Li
Ajay Jain
Pieter Abbeel
OffRL
45
4
0
03 Aug 2022
A Study of Modeling Rising Intonation in Cantonese Neural Speech Synthesis
Qibing Bai
Tom Ko
Yu Zhang
32
4
0
03 Aug 2022
Conv-NILM-Net, a causal and multi-appliance model for energy source separation
Mohamed Alami Chehboune
Jérémie Decock
Rim Kaddah
Jesse Read
35
1
0
03 Aug 2022
Neuro-Symbolic Learning: Principles and Applications in Ophthalmology
Muhammad Hassan
Haifei Guan
Aikaterini Melliou
Yuqi Wang
Qianhui Sun
...
Qi Huang
Jiefu Tan
Qinwang Xing
Peiwu Qin
Dongmei Yu
NAI
54
14
0
31 Jul 2022
Geometric deep learning for computational mechanics Part II: Graph embedding for interpretable multiscale plasticity
Nikolaos N. Vlassis
WaiChing Sun
AI4CE
37
33
0
30 Jul 2022
Low-data? No problem: low-resource, language-agnostic conversational text-to-speech via F0-conditioned data augmentation
Giulia Comini
Goeric Huybrechts
M. Ribeiro
Adam Gabryś
Jaime Lorenzo-Trueba
35
5
0
29 Jul 2022
Generative Extraction of Audio Classifiers for Speaker Identification
Tejumade Afonja
Lucas Bourtoule
Varun Chandrasekaran
Sageev Oore
Nicolas Papernot
AAML
15
1
0
26 Jul 2022
Dive into Big Model Training
Qinghua Liu
Yuxiang Jiang
MoMe
AI4CE
LRM
21
3
0
25 Jul 2022
A Proposal for Foley Sound Synthesis Challenge
Keunwoo Choi
Sangshin Oh
Minsung Kang
Brian McFee
26
11
0
21 Jul 2022
Diffsound: Discrete Diffusion Model for Text-to-sound Generation
Dongchao Yang
Jianwei Yu
Helin Wang
Wen Wang
Chao Weng
Yuexian Zou
Dong Yu
DiffM
36
297
0
20 Jul 2022
Robust Multivariate Time-Series Forecasting: Adversarial Attacks and Defense Mechanisms
Linbo Liu
Youngsuk Park
T. Hoang
Hilaf Hasson
Jun Huan
AAML
63
6
0
19 Jul 2022
GAFX: A General Audio Feature eXtractor
Zhaoyang Bu
Han Zhang
Xiaohu Zhu
30
0
0
19 Jul 2022
Latent-Domain Predictive Neural Speech Coding
Xue Jiang
Xiulian Peng
Huaying Xue
Yuan Zhang
Yan Lu
46
17
0
18 Jul 2022
Toward reliable signals decoding for electroencephalogram: A benchmark study to EEGNeX
Xia Chen
Xiangbin Teng
Hannah S. Chen
Yafeng Pan
Philipp Geyer
34
44
0
15 Jul 2022
ProDiff: Progressive Fast Diffusion Model For High-Quality Text-to-Speech
Rongjie Huang
Zhou Zhao
Huadai Liu
Jinglin Liu
Chenye Cui
Yi Ren
DiffM
44
196
0
13 Jul 2022
Controllable and Lossless Non-Autoregressive End-to-End Text-to-Speech
Zhengxi Liu
Qiao Tian
Chenxu Hu
Xudong Liu
Meng-Che Wu
Yuping Wang
Hang Zhao
Yuxuan Wang
36
10
0
13 Jul 2022
SATTS: Speaker Attractor Text to Speech, Learning to Speak by Learning to Separate
Nabarun Goswami
Tatsuya Harada
26
5
0
13 Jul 2022
A Cyclical Approach to Synthetic and Natural Speech Mismatch Refinement of Neural Post-filter for Low-cost Text-to-speech System
Yi-Chiao Wu
Patrick Lumban Tobing
Kazuki Yasuhara
Noriyuki Matsunaga
Yamato Ohtani
Tomoki Toda
42
0
0
13 Jul 2022
CFAD: A Chinese Dataset for Fake Audio Detection
Haoxin Ma
Jiangyan Yi
Chenglong Wang
Xin Yan
J. Tao
Tao Wang
Shiming Wang
Ruibo Fu
24
26
0
12 Jul 2022
Multi-task Envisioning Transformer-based Autoencoder for Corporate Credit Rating Migration Early Prediction
Han Yue
Steve Q. Xia
Hongfu Liu
26
1
0
10 Jul 2022
Seasonal Encoder-Decoder Architecture for Forecasting
Avinash Achar
Soumen Pachal
BDL
AI4TS
19
0
0
08 Jul 2022
End-to-End Binaural Speech Synthesis
Wen-Chin Huang
Dejan Marković
Alexander Richard
I. D. Gebru
Anjali Menon
32
8
0
08 Jul 2022