ResearchTrend.AI

arXiv:1609.03499
WaveNet: A Generative Model for Raw Audio
Versions: v1, v2 (latest)

12 September 2016
Aaron van den Oord
Sander Dieleman
Heiga Zen
Karen Simonyan
Oriol Vinyals
Alex Graves
Nal Kalchbrenner
A. Senior
Koray Kavukcuoglu
    DiffM

Papers citing "WaveNet: A Generative Model for Raw Audio"

50 / 3,082 papers shown
Title
A Survey of Machine Learning Methods and Challenges for Windows Malware Classification
Edward Raff
Charles K. Nicholas
AAML
74
57
0
15 Jun 2020
COT-GAN: Generating Sequential Data via Causal Optimal Transport
Tianlin Xu
L. Wenliang
Michael Munn
Beatrice Acciaio
GAN, CML
89
99
0
15 Jun 2020
Self-supervised Learning: Generative or Contrastive
Xiao Liu
Fanjin Zhang
Zhenyu Hou
Zhaoyu Wang
Li Mian
Jing Zhang
Jie Tang
SSL
223
1,650
0
15 Jun 2020
NeuroCard: One Cardinality Estimator for All Tables
Zongheng Yang
Amog Kamsetty
Sifei Luan
Eric Liang
Yan Duan
Xi Chen
Ion Stoica
72
107
0
15 Jun 2020
Exponential Tilting of Generative Models: Improving Sample Quality by Training and Sampling from Latent Energy
Zhisheng Xiao
Qing Yan
Y. Amit
DRL
58
8
0
15 Jun 2020
UWSpeech: Speech to Speech Translation for Unwritten Languages
Chen Zhang
Xu Tan
Yi Ren
Tao Qin
Ke-jun Zhang
Tie-Yan Liu
59
56
0
14 Jun 2020
SE-MelGAN -- Speaker Agnostic Rapid Speech Enhancement
Luka Chkhetiani
Levan Bejanidze
44
1
0
13 Jun 2020
Are we done with ImageNet?
Lucas Beyer
Olivier J. Hénaff
Alexander Kolesnikov
Xiaohua Zhai
Aaron van den Oord
VLM
159
408
0
12 Jun 2020
Seq2Tens: An Efficient Representation of Sequences by Low-Rank Tensor Projections
Csaba Tóth
Patric Bonnier
Harald Oberhauser
AI4TS
87
14
0
12 Jun 2020
Neural voice cloning with a few low-quality samples
Sunghee Jung
Hoi-Rim Kim
37
3
0
12 Jun 2020
FastPitch: Parallel Text-to-speech with Pitch Prediction
Adrian Łańcucki
117
342
0
11 Jun 2020
NADS: Neural Architecture Distribution Search for Uncertainty Awareness
Randy Ardywibowo
Shahin Boluki
Xinyu Gong
Zhangyang Wang
Xiaoning Qian
UQCV
72
18
0
11 Jun 2020
NanoFlow: Scalable Normalizing Flows with Sublinear Parameter Complexity
Sang-gil Lee
Sungwon Kim
Sungroh Yoon
77
17
0
11 Jun 2020
Dance Revolution: Long-Term Dance Generation with Music via Curriculum Learning
Ruozi Huang
Huang Hu
Wei Wu
Kei Sawada
Mi Zhang
Daxin Jiang
125
122
0
11 Jun 2020
HiFi-GAN: High-Fidelity Denoising and Dereverberation Based on Speech Deep Features in Adversarial Networks
Jiaqi Su
Zeyu Jin
Adam Finkelstein
69
140
0
10 Jun 2020
Deep generative models for musical audio synthesis
M. Huzaifah
L. Wyse
213
20
0
10 Jun 2020
Input-independent Attention Weights Are Expressive Enough: A Study of Attention in Self-supervised Audio Transformers
Tsung-Han Wu
Chun-Chen Hsieh
Yen-Hao Chen
Po-Han Chi
Hung-yi Lee
48
1
0
09 Jun 2020
MultiSpeech: Multi-Speaker Text to Speech with Transformer
Mingjian Chen
Xu Tan
Yi Ren
Jin Xu
Hao Sun
Sheng Zhao
Tao Qin
Tie-Yan Liu
65
110
0
08 Jun 2020
WaveNODE: A Continuous Normalizing Flow for Speech Synthesis
Hyeongju Kim
Hyeongseung Lee
Woohyun Kang
Sung Jun Cheon
Byoung Jin Choi
N. Kim
67
12
0
08 Jun 2020
FastSpeech 2: Fast and High-Quality End-to-End Text to Speech
Yi Ren
Chenxu Hu
Xu Tan
Tao Qin
Sheng Zhao
Zhou Zhao
Tie-Yan Liu
169
1,415
0
08 Jun 2020
A non-causal FFTNet architecture for speech enhancement
P. V. M. Shifas
Nagaraj Adiga
Vassilis Tsiaras
Y. Stylianou
AI4TS
29
11
0
08 Jun 2020
End-to-End Adversarial Text-to-Speech
Jeff Donahue
Sander Dieleman
Mikolaj Binkowski
Erich Elsen
Karen Simonyan
98
187
0
05 Jun 2020
CSTNet: Contrastive Speech Translation Network for Self-Supervised Speech Representation Learning
Sameer Khurana
Antoine Laurent
James R. Glass
SSL
72
12
0
04 Jun 2020
A study on more realistic room simulation for far-field keyword spotting
Eric Bezzam
Robin Scheibler
C. Cadoux
Thibault Gisselbrecht
42
10
0
04 Jun 2020
A Convolutional Deep Markov Model for Unsupervised Speech Representation Learning
Sameer Khurana
Antoine Laurent
Wei-Ning Hsu
J. Chorowski
A. Lancucki
R. Marxer
James R. Glass
SSL, BDL
80
29
0
03 Jun 2020
Least $k$th-Order and Rényi Generative Adversarial Networks
Himesh Bhatia
William Paul
F. Alajaji
Bahman Gharesifard
Philippe Burlina
GAN
66
8
0
03 Jun 2020
Training End-to-End Analog Neural Networks with Equilibrium Propagation
Jack D. Kendall
Ross D. Pantone
Kalpana Manickavasagam
Yoshua Bengio
B. Scellier
95
85
0
02 Jun 2020
Event Arguments Extraction via Dilate Gated Convolutional Neural Network with Enhanced Local Features
Zhigang Kan
Linbo Qiao
Sen Yang
Feng Liu
Feng Huang
14
10
0
02 Jun 2020
Dilated U-net based approach for multichannel speech enhancement from First-Order Ambisonics recordings
Amélie Bosca
Alexandre Guérin
L. Perotin
Srdan Kitic
50
20
0
02 Jun 2020
An ASR Guided Speech Intelligibility Measure for TTS Model Selection
Arun Baby
Saranya Vinnaitherthan
Nagaraj Adiga
Pranav Jawale
Sumukh Badam
Sharath Adavanne
Srikanth Konjeti
49
7
0
02 Jun 2020
Hyperparameter optimization with REINFORCE and Transformers
C. Krishna
Ashish Gupta
Swarnim Narayan
Himanshu Rai
Diksha Manchanda
58
2
0
01 Jun 2020
EEG-TCNet: An Accurate Temporal Convolutional Network for Embedded Motor-Imagery Brain-Machine Interfaces
T. Ingolfsson
Michael Hersche
Xiaying Wang
Nobuaki Kobayashi
Lukas Cavigelli
Luca Benini
89
199
0
31 May 2020
Introducing Latent Timbre Synthesis
Kıvanç Tatar
D. Bisig
Philippe Pasquier
48
14
0
31 May 2020
IMUTube: Automatic Extraction of Virtual on-body Accelerometry from Video for Human Activity Recognition
Hyeokhyen Kwon
C. Tong
H. Haresamudram
Yan Gao
G. Abowd
Nicholas D. Lane
Thomas Ploetz
129
88
0
29 May 2020
SNR-Based Teachers-Student Technique for Speech Enhancement
Xiang Hao
Xiangdong Su
Zhiyu Wang
Qiang Zhang
Huali Xu
Guanglai Gao
48
15
0
29 May 2020
Speech-to-Singing Conversion based on Boundary Equilibrium GAN
Da-Yi Wu
Yi-Hsuan Yang
GAN
74
8
0
28 May 2020
3D human pose estimation with adaptive receptive fields and dilated temporal convolutions
Michael Shin
Eduardo Castillo
Irene Font Peradejordi
S. Jayaraman
3DH
23
0
0
28 May 2020
DeepSonar: Towards Effective and Robust Detection of AI-Synthesized Fake Voices
Run Wang
Felix Juefei Xu
Yihao Huang
Qing Guo
Xiaofei Xie
Lei Ma
Yang Liu
AAML
80
107
0
28 May 2020
Network-to-Network Translation with Conditional Invertible Neural Networks
Robin Rombach
Patrick Esser
Bjorn Ommer
40
3
0
27 May 2020
Semi-supervised source localization with deep generative modeling
Michael J. Bianco
Sharon Gannot
Peter Gerstoft
DRL
70
21
0
27 May 2020
A comparison of Vietnamese Statistical Parametric Speech Synthesis Systems
Phan Huy Kinh
V. Phung
Anh-Tuan Dinh
Quoc Bao Nguyen
29
1
0
26 May 2020
Network Bending: Expressive Manipulation of Deep Generative Models
Terence Broad
F. Leymarie
M. Grierson
AI4CE
41
2
0
25 May 2020
How to Build a Graph-Based Deep Learning Architecture in Traffic Domain: A Survey
Jiexia Ye
Juanjuan Zhao
Kejiang Ye
Chengzhong Xu
GNN, AI4TS, AI4CE
99
199
0
24 May 2020
Connecting the Dots: Multivariate Time Series Forecasting with Graph Neural Networks
Zonghan Wu
Shirui Pan
Guodong Long
Jing Jiang
Xiaojun Chang
Chengqi Zhang
AI4TS
123
1,424
0
24 May 2020
Effective and Efficient Computation with Multiple-timescale Spiking Recurrent Neural Networks
Bojian Yin
Federico Corradi
Sander M. Bohté
78
104
0
24 May 2020
TIPRDC: Task-Independent Privacy-Respecting Data Crowdsourcing Framework for Deep Learning with Anonymized Intermediate Representations
Ang Li
Yixiao Duan
Huanrui Yang
Yiran Chen
Jianlei Yang
99
50
0
23 May 2020
Glow-TTS: A Generative Flow for Text-to-Speech via Monotonic Alignment Search
Jaehyeon Kim
Sungwon Kim
Jungil Kong
Sungroh Yoon
132
499
0
22 May 2020
NAUTILUS: a Versatile Voice Cloning System
Hieu-Thi Luong
Junichi Yamagishi
100
53
0
22 May 2020
Conversational End-to-End TTS for Voice Agent
Haohan Guo
Shaofei Zhang
Frank Soong
Lei He
Lei Xie
94
69
0
21 May 2020
Investigation of learning abilities on linguistic features in sequence-to-sequence text-to-speech synthesis
Yusuke Yasuda
Xin Wang
Junichi Yamagishi
AI4TS
76
31
0
20 May 2020