1609.03499
WaveNet: A Generative Model for Raw Audio
12 September 2016
Aaron van den Oord
Sander Dieleman
Heiga Zen
Karen Simonyan
Oriol Vinyals
Alex Graves
Nal Kalchbrenner
A. Senior
Koray Kavukcuoglu
DiffM
Papers citing
"WaveNet: A Generative Model for Raw Audio"
50 / 3,082 papers shown
Title
HEAR: Holistic Evaluation of Audio Representations
Joseph P. Turian
Jordie Shier
H. Khan
Bhiksha Raj
Björn W. Schuller
...
P. Esling
Pranay Manocha
Shinji Watanabe
Zeyu Jin
Yonatan Bisk
137
108
0
06 Mar 2022
Variational Auto-Encoder based Mandarin Speech Cloning
Qingyu Xing
Xiaohan Ma
133
0
0
06 Mar 2022
NeuralDPS: Neural Deterministic Plus Stochastic Model with Multiband Excitation for Noise-Controllable Waveform Generation
Tao Wang
Ruibo Fu
Jiangyan Yi
J. Tao
Zhengqi Wen
25
2
0
05 Mar 2022
iSTFTNet: Fast and Lightweight Mel-Spectrogram Vocoder Incorporating Inverse Short-Time Fourier Transform
Takuhiro Kaneko
Kou Tanaka
Hirokazu Kameoka
Shogo Seki
89
62
0
04 Mar 2022
Look&Listen: Multi-Modal Correlation Learning for Active Speaker Detection and Speech Enhancement
Jun Xiong
Yu Zhou
Peng Zhang
Lei Xie
Wei Huang
Yufei Zha
72
22
0
04 Mar 2022
Real time spectrogram inversion on mobile phone
Oleg Rybakov
Marco Tagliasacchi
Yunpeng Li
Liyang Jiang
Xia Zhang
Fadi Biadsy
131
4
0
01 Mar 2022
A Brief Overview of Unsupervised Neural Speech Representation Learning
Lasse Borgholt
Jakob Drachmann Havtorn
Joakim Edin
Lars Maaløe
Christian Igel
BDL
AI4TS
SSL
96
11
0
01 Mar 2022
Explainable deepfake and spoofing detection: an attack analysis using SHapley Additive exPlanations
W. Ge
Massimiliano Todisco
Nicholas W. D. Evans
AAML
54
9
0
28 Feb 2022
Concept Graph Neural Networks for Surgical Video Understanding
Yutong Ban
J. Eckhoff
Thomas M. Ward
Daniel A. Hashimoto
O. Meireles
Daniela Rus
Guy Rosman
NAI
86
18
0
27 Feb 2022
Learning the Beauty in Songs: Neural Singing Voice Beautifier
Jinglin Liu
Chengxi Li
Yi Ren
Zhiying Zhu
Zhou Zhao
DiffM
94
17
0
27 Feb 2022
Continuous Human Action Recognition for Human-Machine Interaction: A Review
Harshala Gammulle
David Ahmedt-Aristizabal
Simon Denman
Lachlan Tychsen-Smith
L. Petersson
Clinton Fookes
124
28
0
26 Feb 2022
Revisiting Over-Smoothness in Text to Speech
Yi Ren
Xu Tan
Tao Qin
Zhou Zhao
Tie-Yan Liu
148
64
0
26 Feb 2022
Spatio-Temporal Latent Graph Structure Learning for Traffic Forecasting
Jiabin Tang
Tang Qian
Shikun Liu
Shengdong Du
Jie Hu
Tianrui Li
AI4TS
58
23
0
25 Feb 2022
Preformer: Predictive Transformer with Multi-Scale Segment-wise Correlations for Long-Term Time Series Forecasting
Dazhao Du
Fuchun Sun
Zhewei Wei
AI4TS
87
51
0
23 Feb 2022
End-to-end LPCNet: A Neural Vocoder With Fully-Differentiable LPC Estimation
Krishna Subramani
J. Valin
Umut Isik
Paris Smaragdis
A. Krishnaswamy
70
11
0
23 Feb 2022
Neural Speech Synthesis on a Shoestring: Improving the Efficiency of LPCNet
J. Valin
Umut Isik
Paris Smaragdis
A. Krishnaswamy
62
4
0
22 Feb 2022
Wavebender GAN: An architecture for phonetically meaningful speech manipulation
Gustavo Teodoro Döhler Beck
Ulme Wennberg
Zofia Malisz
G. Henter
AI4CE
88
8
0
22 Feb 2022
Benchmarking Generative Latent Variable Models for Speech
Jakob Drachmann Havtorn
Lasse Borgholt
Søren Hauberg
J. Frellsen
Lars Maaløe
80
3
0
22 Feb 2022
CampNet: Context-Aware Mask Prediction for End-to-End Text-Based Speech Editing
Tao Wang
Jiangyan Yi
Ruibo Fu
J. Tao
Zhengqi Wen
KELM
69
20
0
21 Feb 2022
It's Raw! Audio Generation with State-Space Models
Karan Goel
Albert Gu
Chris Donahue
Christopher Ré
110
195
0
20 Feb 2022
Learning to Detect Slip with Barometric Tactile Sensors and a Temporal Convolutional Neural Network
Abhinav Grover
Philippe Nadeau
C. Grebe
Jonathan Kelly
69
10
0
19 Feb 2022
Rethinking Pareto Frontier for Performance Evaluation of Deep Neural Networks
V. Nia
Alireza Ghaffari
Mahdi Zolnouri
Yvon Savaria
58
5
0
18 Feb 2022
Dynamic Relation Discovery and Utilization in Multi-Entity Time Series Forecasting
Lin Huang
Lijun Wu
Jia Zhang
Jiang Bian
Tie-Yan Liu
AI4TS
48
2
0
18 Feb 2022
PGCN: Progressive Graph Convolutional Networks for Spatial-Temporal Traffic Forecasting
Y. Shin
Yoonjin Yoon
GNN
AI4TS
74
47
0
18 Feb 2022
Speech Denoising in the Waveform Domain with Self-Attention
Zhifeng Kong
Ming-Yu Liu
Ambrish Dantrey
Bryan Catanzaro
89
63
0
15 Feb 2022
General-purpose, long-context autoregressive modeling with Perceiver AR
Curtis Hawthorne
Andrew Jaegle
Cătălina Cangea
Sebastian Borgeaud
C. Nash
...
Hannah R. Sheahan
Neil Zeghidour
Jean-Baptiste Alayrac
João Carreira
Jesse Engel
118
66
0
15 Feb 2022
Interpreting a Machine Learning Model for Detecting Gravitational Waves
M. Safarzadeh
Asad Khan
Eliu A. Huerta
Martin Wattenberg
108
2
0
15 Feb 2022
NewsPod: Automatic and Interactive News Podcasts
Philippe Laban
Elicia Ye
Srujay Korlakunta
John F. Canny
Marti A. Hearst
54
22
0
15 Feb 2022
Visual Acoustic Matching
Changan Chen
Ruohan Gao
P. Calamia
Kristen Grauman
79
58
0
14 Feb 2022
An Introduction to Neural Data Compression
Yibo Yang
Stephan Mandt
Lucas Theis
149
125
0
14 Feb 2022
Distribution augmentation for low-resource expressive text-to-speech
Mateusz Lajszczak
Animesh Prasad
Arent van Korlaar
Bajibabu Bollepalli
Antonio Bonafonte
...
M. Nicolis
Alexis Moinet
Thomas Drugman
Trevor Wood
Elena Sokolova
61
7
0
13 Feb 2022
SleepPPG-Net: a deep learning algorithm for robust sleep staging from continuous photoplethysmography
Kevin Kotzen
Peter H. Charlton
Sharon Salabi
Lea Amar
A. Landesberg
Joachim A. Behar
67
33
0
11 Feb 2022
Bernstein Flows for Flexible Posteriors in Variational Bayes
Oliver Durr
Stephan Hörling
Daniel Dold
Ivonne Kovylov
Beate Sick
BDL
102
4
0
11 Feb 2022
A Graph-based U-Net Model for Predicting Traffic in unseen Cities
L. Hermes
Barbara Hammer
Andrew Melnik
Riza Velioglu
Markus Vieth
M. Schilling
GNN
AI4TS
AI4CE
77
6
0
11 Feb 2022
Conditional Diffusion Probabilistic Model for Speech Enhancement
Yen-Ju Lu
Zhongqiu Wang
Shinji Watanabe
Alexander Richard
Cheng Yu
Yu Tsao
DiffM
84
191
0
10 Feb 2022
Diffusion bridges vector quantized Variational AutoEncoders
Max H. Cohen
Guillaume Quispe
Sylvain Le Corff
Charles Ollion
Eric Moulines
DiffM
90
15
0
10 Feb 2022
Deconstructing the Inductive Biases of Hamiltonian Neural Networks
Nate Gruver
Marc Finzi
Samuel Stanton
A. Wilson
AI4CE
69
42
0
10 Feb 2022
InferGrad: Improving Diffusion Models for Vocoder by Considering Inference in Training
Zehua Chen
Xu Tan
Ke Wang
Shifeng Pan
Danilo Mandic
Lei He
Sheng Zhao
DiffM
71
31
0
08 Feb 2022
TACTiS: Transformer-Attentional Copulas for Time Series
Alexandre Drouin
Étienne Marcotte
Nicolas Chapados
AI4TS
283
39
0
07 Feb 2022
Deep Impulse Responses: Estimating and Parameterizing Filters with Deep Networks
Alexander Richard
Peter Dodds
V. Ithapu
71
37
0
07 Feb 2022
Building Synthetic Speaker Profiles in Text-to-Speech Systems
Jie Pu
Yi Meng
Oguz H. Elibol
48
2
0
07 Feb 2022
Tubes Among Us: Analog Attack on Automatic Speaker Identification
Shimaa Ahmed
Yash R. Wani
Ali Shahin Shamsabadi
Mohammad Yaghini
Ilia Shumailov
Nicolas Papernot
Kassem Fawaz
AAML
62
4
0
06 Feb 2022
GhostTalk: Interactive Attack on Smartphone Voice System Through Power Line
Yuanda Wang
Hanqing Guo
Qiben Yan
AAML
77
41
0
05 Feb 2022
EcoFlow: Efficient Convolutional Dataflows for Low-Power Neural Network Accelerators
Lois Orosa
Skanda Koppula
Yaman Umuroglu
Konstantinos Kanellopoulos
Juan Gómez Luna
Michaela Blott
K. Vissers
O. Mutlu
82
4
0
04 Feb 2022
A Survey on Safety-Critical Driving Scenario Generation -- A Methodological Perspective
Wenhao Ding
Chejian Xu
Mansur Arief
Hao-ming Lin
Yue Liu
Ding Zhao
119
165
0
04 Feb 2022
Deep Learning for Epidemiologists: An Introduction to Neural Networks
S. Serghiou
K. Rough
FedML
54
14
0
02 Feb 2022
The HCCL-DKU system for fake audio generation task of the 2022 ICASSP ADD Challenge
Ziyi Chen
Hua Hua
Yuxiang Zhang
Ming Li
Pengyuan Zhang
102
0
0
29 Jan 2022
ItôWave: Itô Stochastic Differential Equation Is All You Need For Wave Generation
Shoule Wu
Ziqiang Shi
DiffM
456
9
0
29 Jan 2022
Electra: Conditional Generative Model based Predicate-Aware Query Approximation
Nikhil Sheoran
Subrata Mitra
Vibhor Porwal
Siddharth Ghetia
Jatin Varshney
Tung Mai
Anup B. Rao
Vikas Maddukuri
91
13
0
28 Jan 2022
DiffGAN-TTS: High-Fidelity and Efficient Text-to-Speech with Denoising Diffusion GANs
Songxiang Liu
Jane Polak Scowcroft
Dong Yu
DiffM
150
67
0
28 Jan 2022