1609.03499
WaveNet: A Generative Model for Raw Audio
12 September 2016
Aaron van den Oord
Sander Dieleman
Heiga Zen
Karen Simonyan
Oriol Vinyals
Alex Graves
Nal Kalchbrenner
A. Senior
Koray Kavukcuoglu
DiffM
Papers citing
"WaveNet: A Generative Model for Raw Audio"
50 / 3,082 papers shown
Title
A Survey of Machine Learning Methods and Challenges for Windows Malware Classification
Edward Raff
Charles K. Nicholas
AAML
74
57
0
15 Jun 2020
COT-GAN: Generating Sequential Data via Causal Optimal Transport
Tianlin Xu
L. Wenliang
Michael Munn
Beatrice Acciaio
GAN
CML
89
99
0
15 Jun 2020
Self-supervised Learning: Generative or Contrastive
Xiao Liu
Fanjin Zhang
Zhenyu Hou
Zhaoyu Wang
Li Mian
Jing Zhang
Jie Tang
SSL
223
1,650
0
15 Jun 2020
NeuroCard: One Cardinality Estimator for All Tables
Zongheng Yang
Amog Kamsetty
Sifei Luan
Eric Liang
Yan Duan
Xi Chen
Ion Stoica
72
107
0
15 Jun 2020
Exponential Tilting of Generative Models: Improving Sample Quality by Training and Sampling from Latent Energy
Zhisheng Xiao
Qing Yan
Y. Amit
DRL
58
8
0
15 Jun 2020
UWSpeech: Speech to Speech Translation for Unwritten Languages
Chen Zhang
Xu Tan
Yi Ren
Tao Qin
Ke-jun Zhang
Tie-Yan Liu
59
56
0
14 Jun 2020
SE-MelGAN -- Speaker Agnostic Rapid Speech Enhancement
Luka Chkhetiani
Levan Bejanidze
44
1
0
13 Jun 2020
Are we done with ImageNet?
Lucas Beyer
Olivier J. Hénaff
Alexander Kolesnikov
Xiaohua Zhai
Aaron van den Oord
VLM
159
408
0
12 Jun 2020
Seq2Tens: An Efficient Representation of Sequences by Low-Rank Tensor Projections
Csaba Tóth
Patric Bonnier
Harald Oberhauser
AI4TS
87
14
0
12 Jun 2020
Neural voice cloning with a few low-quality samples
Sunghee Jung
Hoi-Rim Kim
37
3
0
12 Jun 2020
FastPitch: Parallel Text-to-speech with Pitch Prediction
Adrian Łańcucki
117
342
0
11 Jun 2020
NADS: Neural Architecture Distribution Search for Uncertainty Awareness
Randy Ardywibowo
Shahin Boluki
Xinyu Gong
Zhangyang Wang
Xiaoning Qian
UQCV
72
18
0
11 Jun 2020
NanoFlow: Scalable Normalizing Flows with Sublinear Parameter Complexity
Sang-gil Lee
Sungwon Kim
Sungroh Yoon
77
17
0
11 Jun 2020
Dance Revolution: Long-Term Dance Generation with Music via Curriculum Learning
Ruozi Huang
Huang Hu
Wei Wu
Kei Sawada
Mi Zhang
Daxin Jiang
125
122
0
11 Jun 2020
HiFi-GAN: High-Fidelity Denoising and Dereverberation Based on Speech Deep Features in Adversarial Networks
Jiaqi Su
Zeyu Jin
Adam Finkelstein
69
140
0
10 Jun 2020
Deep generative models for musical audio synthesis
M. Huzaifah
L. Wyse
213
20
0
10 Jun 2020
Input-independent Attention Weights Are Expressive Enough: A Study of Attention in Self-supervised Audio Transformers
Tsung-Han Wu
Chun-Chen Hsieh
Yen-Hao Chen
Po-Han Chi
Hung-yi Lee
48
1
0
09 Jun 2020
MultiSpeech: Multi-Speaker Text to Speech with Transformer
Mingjian Chen
Xu Tan
Yi Ren
Jin Xu
Hao Sun
Sheng Zhao
Tao Qin
Tie-Yan Liu
65
110
0
08 Jun 2020
WaveNODE: A Continuous Normalizing Flow for Speech Synthesis
Hyeongju Kim
Hyeongseung Lee
Woohyun Kang
Sung Jun Cheon
Byoung Jin Choi
N. Kim
67
12
0
08 Jun 2020
FastSpeech 2: Fast and High-Quality End-to-End Text to Speech
Yi Ren
Chenxu Hu
Xu Tan
Tao Qin
Sheng Zhao
Zhou Zhao
Tie-Yan Liu
169
1,415
0
08 Jun 2020
A non-causal FFTNet architecture for speech enhancement
P. V. M. Shifas
Nagaraj Adiga
Vassilis Tsiaras
Y. Stylianou
AI4TS
29
11
0
08 Jun 2020
End-to-End Adversarial Text-to-Speech
Jeff Donahue
Sander Dieleman
Mikolaj Binkowski
Erich Elsen
Karen Simonyan
98
187
0
05 Jun 2020
CSTNet: Contrastive Speech Translation Network for Self-Supervised Speech Representation Learning
Sameer Khurana
Antoine Laurent
James R. Glass
SSL
72
12
0
04 Jun 2020
A study on more realistic room simulation for far-field keyword spotting
Eric Bezzam
Robin Scheibler
C. Cadoux
Thibault Gisselbrecht
42
10
0
04 Jun 2020
A Convolutional Deep Markov Model for Unsupervised Speech Representation Learning
Sameer Khurana
Antoine Laurent
Wei-Ning Hsu
J. Chorowski
A. Lancucki
R. Marxer
James R. Glass
SSL
BDL
80
29
0
03 Jun 2020
Least kth-Order and Rényi Generative Adversarial Networks
Himesh Bhatia
William Paul
F. Alajaji
Bahman Gharesifard
Philippe Burlina
GAN
66
8
0
03 Jun 2020
Training End-to-End Analog Neural Networks with Equilibrium Propagation
Jack D. Kendall
Ross D. Pantone
Kalpana Manickavasagam
Yoshua Bengio
B. Scellier
95
85
0
02 Jun 2020
Event Arguments Extraction via Dilate Gated Convolutional Neural Network with Enhanced Local Features
Zhigang Kan
Linbo Qiao
Sen Yang
Feng Liu
Feng Huang
14
10
0
02 Jun 2020
Dilated U-net based approach for multichannel speech enhancement from First-Order Ambisonics recordings
Amélie Bosca
Alexandre Guérin
L. Perotin
Srdan Kitic
50
20
0
02 Jun 2020
An ASR Guided Speech Intelligibility Measure for TTS Model Selection
Arun Baby
Saranya Vinnaitherthan
Nagaraj Adiga
Pranav Jawale
Sumukh Badam
Sharath Adavanne
Srikanth Konjeti
49
7
0
02 Jun 2020
Hyperparameter optimization with REINFORCE and Transformers
C. Krishna
Ashish Gupta
Swarnim Narayan
Himanshu Rai
Diksha Manchanda
58
2
0
01 Jun 2020
EEG-TCNet: An Accurate Temporal Convolutional Network for Embedded Motor-Imagery Brain-Machine Interfaces
T. Ingolfsson
Michael Hersche
Xiaying Wang
Nobuaki Kobayashi
Lukas Cavigelli
Luca Benini
89
199
0
31 May 2020
Introducing Latent Timbre Synthesis
Kıvanç Tatar
D. Bisig
Philippe Pasquier
48
14
0
31 May 2020
IMUTube: Automatic Extraction of Virtual on-body Accelerometry from Video for Human Activity Recognition
Hyeokhyen Kwon
C. Tong
H. Haresamudram
Yan Gao
G. Abowd
Nicholas D. Lane
Thomas Ploetz
129
88
0
29 May 2020
SNR-Based Teachers-Student Technique for Speech Enhancement
Xiang Hao
Xiangdong Su
Zhiyu Wang
Qiang Zhang
Huali Xu
Guanglai Gao
48
15
0
29 May 2020
Speech-to-Singing Conversion based on Boundary Equilibrium GAN
Da-Yi Wu
Yi-Hsuan Yang
GAN
74
8
0
28 May 2020
3D human pose estimation with adaptive receptive fields and dilated temporal convolutions
Michael Shin
Eduardo Castillo
Irene Font Peradejordi
S. Jayaraman
3DH
23
0
0
28 May 2020
DeepSonar: Towards Effective and Robust Detection of AI-Synthesized Fake Voices
Run Wang
Felix Juefei Xu
Yihao Huang
Qing Guo
Xiaofei Xie
Lei Ma
Yang Liu
AAML
80
107
0
28 May 2020
Network-to-Network Translation with Conditional Invertible Neural Networks
Robin Rombach
Patrick Esser
Bjorn Ommer
40
3
0
27 May 2020
Semi-supervised source localization with deep generative modeling
Michael J. Bianco
Sharon Gannot
Peter Gerstoft
DRL
70
21
0
27 May 2020
A comparison of Vietnamese Statistical Parametric Speech Synthesis Systems
Phan Huy Kinh
V. Phung
Anh-Tuan Dinh
Quoc Bao Nguyen
29
1
0
26 May 2020
Network Bending: Expressive Manipulation of Deep Generative Models
Terence Broad
F. Leymarie
M. Grierson
AI4CE
41
2
0
25 May 2020
How to Build a Graph-Based Deep Learning Architecture in Traffic Domain: A Survey
Jiexia Ye
Juanjuan Zhao
Kejiang Ye
Chengzhong Xu
GNN
AI4TS
AI4CE
99
199
0
24 May 2020
Connecting the Dots: Multivariate Time Series Forecasting with Graph Neural Networks
Zonghan Wu
Shirui Pan
Guodong Long
Jing Jiang
Xiaojun Chang
Chengqi Zhang
AI4TS
123
1,424
0
24 May 2020
Effective and Efficient Computation with Multiple-timescale Spiking Recurrent Neural Networks
Bojian Yin
Federico Corradi
Sander M. Bohté
78
104
0
24 May 2020
TIPRDC: Task-Independent Privacy-Respecting Data Crowdsourcing Framework for Deep Learning with Anonymized Intermediate Representations
Ang Li
Yixiao Duan
Huanrui Yang
Yiran Chen
Jianlei Yang
99
50
0
23 May 2020
Glow-TTS: A Generative Flow for Text-to-Speech via Monotonic Alignment Search
Jaehyeon Kim
Sungwon Kim
Jungil Kong
Sungroh Yoon
132
499
0
22 May 2020
NAUTILUS: a Versatile Voice Cloning System
Hieu-Thi Luong
Junichi Yamagishi
100
53
0
22 May 2020
Conversational End-to-End TTS for Voice Agent
Haohan Guo
Shaofei Zhang
Frank Soong
Lei He
Lei Xie
94
69
0
21 May 2020
Investigation of learning abilities on linguistic features in sequence-to-sequence text-to-speech synthesis
Yusuke Yasuda
Xin Wang
Junichi Yamagishi
AI4TS
76
31
0
20 May 2020