1609.03499
WaveNet: A Generative Model for Raw Audio
12 September 2016
Aaron van den Oord
Sander Dieleman
Heiga Zen
Karen Simonyan
Oriol Vinyals
Alex Graves
Nal Kalchbrenner
A. Senior
Koray Kavukcuoglu
DiffM
Papers citing
"WaveNet: A Generative Model for Raw Audio"
50 / 3,082 papers shown
Title
Multi-Scale Sub-Band Constant-Q Transform Discriminator for High-Fidelity Vocoder
Yicheng Gu
Xueyao Zhang
Liumeng Xue
Zhizheng Wu
72
12
0
25 Nov 2023
An NMF-Based Building Block for Interpretable Neural Networks With Continual Learning
Brian K. Vogel
43
0
0
20 Nov 2023
Advancements in Generative AI: A Comprehensive Review of GANs, GPT, Autoencoders, Diffusion Model, and Transformers
Staphord Bengesi
Hoda El-Sayed
Md Kamruzzaman Sarker
Yao Houkpati
John Irungu
T. Oladunni
128
93
0
17 Nov 2023
Formal Verification of Long Short-Term Memory based Audio Classifiers: A Star based Approach
Neelanjana Pal
Taylor T. Johnson
55
0
0
16 Nov 2023
CLN-VC: Text-Free Voice Conversion Based on Fine-Grained Style Control and Contrastive Learning with Negative Samples Augmentation
Yimin Deng
Xulong Zhang
Jianzong Wang
Ning Cheng
Jing Xiao
77
3
0
15 Nov 2023
EDMSound: Spectrogram Based Diffusion Models for Efficient and High-Quality Audio Synthesis
Ge Zhu
Yutong Wen
M. Carbonneau
Zhiyao Duan
DiffM
76
8
0
15 Nov 2023
Reimagining Speech: A Scoping Review of Deep Learning-Powered Voice Conversion
A. R. Bargum
Stefania Serafin
Cumhur Erkut
70
4
0
14 Nov 2023
DQR-TTS: Semi-supervised Text-to-speech Synthesis with Dynamic Quantized Representation
Jianzong Wang
Pengcheng Li
Xulong Zhang
Ning Cheng
Jing Xiao
81
0
0
14 Nov 2023
CSLP-AE: A Contrastive Split-Latent Permutation Autoencoder Framework for Zero-Shot Electroencephalography Signal Conversion
Anders Vestergaard Norskov
Alexander Neergaard Zahid
Morten Morup
65
3
0
13 Nov 2023
Efficient bandwidth extension of musical signals using a differentiable harmonic plus noise model
Pierre-Amaury Grumiaux
Mathieu Lagrange
68
3
0
13 Nov 2023
SponTTS: modeling and transferring spontaneous style for TTS
Hanzhao Li
Xinfa Zhu
Liumeng Xue
Yang Song
Yunlin Chen
Lei Xie
89
7
0
13 Nov 2023
Music ControlNet: Multiple Time-varying Controls for Music Generation
Shih-Lun Wu
Chris Donahue
Shinji Watanabe
Nicholas J. Bryan
DiffM
MGen
111
61
0
13 Nov 2023
Instant3D: Fast Text-to-3D with Sparse-View Generation and Large Reconstruction Model
Jiahao Li
Hao Tan
Kai Zhang
Zexiang Xu
Fujun Luan
Yinghao Xu
Yicong Hong
Kalyan Sunkavalli
Greg Shakhnarovich
Sai Bi
131
275
0
10 Nov 2023
FlashFFTConv: Efficient Convolutions for Long Sequences with Tensor Cores
Daniel Y. Fu
Hermann Kumbong
Eric N. D. Nguyen
Christopher Ré
VLM
100
30
0
10 Nov 2023
Semantic Map Guided Synthesis of Wireless Capsule Endoscopy Images using Diffusion Models
Haejin Lee
Jeongwoo Ju
Jonghyuck Lee
Yeoun Joo Lee
Heechul Jung
DiffM
MedIm
65
0
0
10 Nov 2023
Synthetic Speaking Children -- Why We Need Them and How to Make Them
Muhammad Ali Farooq
Dan Bigioi
Rishabh Jain
Wang Yao
Mariam Yiwere
Peter Corcoran
86
0
0
08 Nov 2023
Diff-HierVC: Diffusion-based Hierarchical Voice Conversion with Robust Pitch Generation and Masked Prior for Zero-shot Speaker Adaptation
Haram Choi
Sang-Hoon Lee
Seong-Whan Lee
DiffM
72
30
0
08 Nov 2023
Improved DDIM Sampling with Moment Matching Gaussian Mixtures
Prasad Gabbur
DiffM
50
1
0
08 Nov 2023
Impact of HPO on AutoML Forecasting Ensembles
David Hoffmann
46
0
0
07 Nov 2023
TS-Diffusion: Generating Highly Complex Time Series with Diffusion Models
Yangming Li
DiffM
AI4TS
97
5
0
06 Nov 2023
AV-Lip-Sync+: Leveraging AV-HuBERT to Exploit Multimodal Inconsistency for Video Deepfake Detection
Sahibzada Adil Shahzad
Ammarah Hashmi
Yan-Tsung Peng
Yu Tsao
Hsin-Min Wang
96
7
0
05 Nov 2023
Sounding Bodies: Modeling 3D Spatial Sound of Humans Using Body Pose and Audio
Xudong Xu
Dejan Marković
Jacob Sandakly
Todd Keebler
Steven Krenn
Alexander Richard
44
5
0
01 Nov 2023
REBAR: Retrieval-Based Reconstruction for Time-series Contrastive Learning
Maxwell A. Xu
Alexander Moreno
Hui Wei
Benjamin M. Marlin
James M. Rehg
AI4TS
SSL
105
13
0
01 Nov 2023
Semantic Hearing: Programming Acoustic Scenes with Binaural Hearables
Bandhav Veluri
Malek Itani
Justin Chan
Takuya Yoshioka
Shyamnath Gollakota
67
18
0
01 Nov 2023
Deepfake detection by exploiting surface anomalies: the SurFake approach
Andrea Ciamarra
R. Caldelli
Federico Becattini
Lorenzo Seidenari
A. Bimbo
86
14
0
31 Oct 2023
Enabling Acoustic Audience Feedback in Large Virtual Events
Tamay Aykut
M. Hofbauer
Christopher B. Kuhn
Eckehard Steinbach
Bernd Girod
77
0
0
27 Oct 2023
Learning an Inventory Control Policy with General Inventory Arrival Dynamics
Sohrab Andaz
Carson Eisenach
Dhruv Madeka
Kari Torkkola
Randy Jia
Dean Phillips Foster
Sham Kakade
59
2
0
26 Oct 2023
Real-time Neonatal Chest Sound Separation using Deep Learning
Yang Yi Poh
Ethan Grooby
Kenneth Tan
Lindsay Zhou
Arrabella King
Ashwin Ramanathan
Atul Malhotra
Mehrtash Harandi
F. Marzbanrad
57
1
0
26 Oct 2023
Subtle Signals: Video-based Detection of Infant Non-nutritive Sucking as a Neurodevelopmental Cue
Shaotong Zhu
Michael Wan
Sai Kumar Reddy Manne
Emily B. Zimmerman
Sarah Ostadabbas
31
2
0
24 Oct 2023
Synthetic Data as Validation
Qixing Hu
Alan Yuille
Zongwei Zhou
SyDa
OOD
73
8
0
24 Oct 2023
LC-TTFS: Towards Lossless Network Conversion for Spiking Neural Networks with TTFS Coding
Qu Yang
Malu Zhang
Jibin Wu
Kay Chen Tan
Haizhou Li
63
10
0
23 Oct 2023
Mid-Long Term Daily Electricity Consumption Forecasting Based on Piecewise Linear Regression and Dilated Causal CNN
Zhou Lan
Ben Liu
Yi Feng
Danhuang Dong
Peng Zhang
AI4TS
30
1
0
23 Oct 2023
An overview of text-to-speech systems and media applications
Mohammad Reza Hasanabadi
28
3
0
22 Oct 2023
MFCC-GAN Codec: A New AI-based Audio Coding
Mohammad Reza Hasanabadi
40
0
0
22 Oct 2023
Neural Likelihood Approximation for Integer Valued Time Series Data
Luke O'Loughlin
John Maclean
Andrew Black
AI4TS
53
0
0
19 Oct 2023
Physics-informed neural network for acoustic resonance analysis in a one-dimensional acoustic tube
Kazuya Yokota
Takahiko Kurahashi
Masajiro Abe
28
5
0
18 Oct 2023
Leveraging Diverse Semantic-based Audio Pretrained Models for Singing Voice Conversion
Xueyao Zhang
Yicheng Gu
Haopeng Chen
Zihao Fang
Lexiao Zou
Junan Zhang
Liumeng Xue
Jinchao Zhang
Jie Zhou
Zhizheng Wu
DiffM
64
2
0
17 Oct 2023
BeatDance: A Beat-Based Model-Agnostic Contrastive Learning Framework for Music-Dance Retrieval
Kaixing Yang
Xukun Zhou
Xulong Tang
Ran Diao
Hongyan Liu
Jun He
Zhaoxin Fan
71
3
0
16 Oct 2023
MoConVQ: Unified Physics-Based Motion Control via Scalable Discrete Representations
Heyuan Yao
Zhenhua Song
Yuyang Zhou
Tenglong Ao
Baoquan Chen
Libin Liu
135
44
0
16 Oct 2023
Generative Adversarial Training for Text-to-Speech Synthesis Based on Raw Phonetic Input and Explicit Prosody Modelling
Tiberiu Boros
Stefan Daniel Dumitrescu
Ionut Mironica
Radu Chivereanu
GAN
38
1
0
14 Oct 2023
Machine Learning for Urban Air Quality Analytics: A Survey
Jindong Han
Weijiao Zhang
Hao Liu
Hui Xiong
AI4CE
114
12
0
14 Oct 2023
A decoder-only foundation model for time-series forecasting
Abhimanyu Das
Weihao Kong
Rajat Sen
Yichen Zhou
AI4TS
AI4CE
137
243
0
14 Oct 2023
ARM: Refining Multivariate Forecasting with Adaptive Temporal-Contextual Learning
Jiecheng Lu
Xu Han
Shihao Yang
AI4TS
49
4
0
14 Oct 2023
LL-VQ-VAE: Learnable Lattice Vector-Quantization For Efficient Representations
Ahmed Khalil
Robert Piechocki
Raúl Santos-Rodríguez
54
2
0
13 Oct 2023
Large Language Models Are Zero-Shot Time Series Forecasters
Nate Gruver
Marc Finzi
Shikai Qiu
Andrew Gordon Wilson
AI4TS
97
375
0
11 Oct 2023
Prosody Analysis of Audiobooks
Charuta Pethe
Felix D Childress
Yunting Yin
Steven Skiena
89
1
0
10 Oct 2023
Generative Spoken Language Model based on continuous word-sized audio tokens
Robin Algayres
Yossi Adi
Tu Nguyen
Jade Copet
Gabriel Synnaeve
Benoît Sagot
Emmanuel Dupoux
AuLLM
119
16
0
08 Oct 2023
Comparative Analysis of Transfer Learning in Deep Learning Text-to-Speech Models on a Few-Shot, Low-Resource, Customized Dataset
Ze Liu
53
1
0
08 Oct 2023
FM Tone Transfer with Envelope Learning
Franco Caspe
Andrew Mcpherson
Mark Sandler
57
2
0
07 Oct 2023
Hate Speech Detection in Limited Data Contexts using Synthetic Data Generation
Aman Khullar
Daniel K. Nkemelu
Cuong V. Nguyen
Michael L. Best
80
5
0
04 Oct 2023