1609.03499
WaveNet: A Generative Model for Raw Audio
12 September 2016
Aaron van den Oord
Sander Dieleman
Heiga Zen
Karen Simonyan
Oriol Vinyals
Alex Graves
Nal Kalchbrenner
A. Senior
Koray Kavukcuoglu
DiffM
Papers citing
"WaveNet: A Generative Model for Raw Audio"
50 / 3,039 papers shown
Title
LL-VQ-VAE: Learnable Lattice Vector-Quantization For Efficient Representations
Ahmed Khalil
Robert Piechocki
Raúl Santos-Rodríguez
15
2
0
13 Oct 2023
Large Language Models Are Zero-Shot Time Series Forecasters
Nate Gruver
Marc Finzi
Shikai Qiu
Andrew Gordon Wilson
AI4TS
38
325
0
11 Oct 2023
Prosody Analysis of Audiobooks
Charuta Pethe
Bach Pham
Felix D Childress
Yunting Yin
Steven Skiena
32
1
0
10 Oct 2023
Generative Spoken Language Model based on continuous word-sized audio tokens
Robin Algayres
Yossi Adi
Tu Nguyen
Jade Copet
Gabriel Synnaeve
Benoît Sagot
Emmanuel Dupoux
AuLLM
46
13
0
08 Oct 2023
Comparative Analysis of Transfer Learning in Deep Learning Text-to-Speech Models on a Few-Shot, Low-Resource, Customized Dataset
Ze Liu
24
1
0
08 Oct 2023
FM Tone Transfer with Envelope Learning
Franco Caspe
Andrew McPherson
Mark Sandler
40
2
0
07 Oct 2023
Hate Speech Detection in Limited Data Contexts using Synthetic Data Generation
Aman Khullar
Daniel K. Nkemelu
Cuong V. Nguyen
Michael L. Best
45
2
0
04 Oct 2023
Generative Modeling of Regular and Irregular Time Series Data via Koopman VAEs
Ilan Naiman
N. Benjamin Erichson
Pu Ren
Michael W. Mahoney
Omri Azencot
AI4TS
37
20
0
04 Oct 2023
SEA: Sparse Linear Attention with Estimated Attention Mask
Heejun Lee
Jina Kim
Jeffrey Willette
Sung Ju Hwang
38
6
0
03 Oct 2023
DiffAR: Denoising Diffusion Autoregressive Model for Raw Speech Waveform Generation
Roi Benita
Michael Elad
Joseph Keshet
DiffM
38
7
0
02 Oct 2023
A Comprehensive Review of Generative AI in Healthcare
Yasin Shokrollahi
Sahar Yarmohammadtoosky
Matthew M. Nikahd
Pengfei Dong
Xianqi Li
Linxia Gu
MedIm
AI4CE
32
19
0
01 Oct 2023
UniAudio: An Audio Foundation Model Toward Universal Audio Generation
Dongchao Yang
Jinchuan Tian
Xuejiao Tan
Rongjie Huang
Songxiang Liu
...
Jiang Bian
Xixin Wu
Zhou Zhao
Shinji Watanabe
Helen M. Meng
CVBM
AuLLM
30
116
0
01 Oct 2023
AI ensemble for signal detection of higher order gravitational wave modes of quasi-circular, spinning, non-precessing binary black hole mergers
Minyang Tian
Eliu A. Huerta
Huihuo Zheng
22
1
0
29 Sep 2023
MotionLM: Multi-Agent Motion Forecasting as Language Modeling
Ari Seff
Brian Cera
Dian Chen
Mason Ng
Aurick Zhou
Nigamaa Nayakanti
Khaled S. Refaat
Rami Al-Rfou
Benjamin Sapp
35
92
0
28 Sep 2023
A Unified View of Differentially Private Deep Generative Modeling
Dingfan Chen
Raouf Kerkouche
Mario Fritz
SyDa
38
4
0
27 Sep 2023
High-Fidelity Speech Synthesis with Minimal Supervision: All Using Diffusion Models
Chunyu Qiang
Hao Li
Yixin Tian
Yi Zhao
Ying Zhang
Longbiao Wang
Jianwu Dang
DiffM
41
2
0
27 Sep 2023
Privacy-preserving and Privacy-attacking Approaches for Speech and Audio -- A Survey
Yuchen Liu
Apu Kapadia
Donald Williamson
AAML
44
0
0
26 Sep 2023
Deep Generative Methods for Producing Forecast Trajectories in Power Systems
Nathan Weill
Jonathan Dumas
AI4TS
33
0
0
26 Sep 2023
Optimization Techniques for a Physical Model of Human Vocalisation
Mateo Cámara
Zhiyuan Xu
Yi-Chen Zong
José-Luis Blanco
Joshua D. Reiss
19
3
0
26 Sep 2023
Audio classification with Dilated Convolution with Learnable Spacings
Ismail Khalfaoui-Hassani
T. Masquelier
Thomas Pellegrini
25
1
0
25 Sep 2023
DurIAN-E: Duration Informed Attention Network For Expressive Text-to-Speech Synthesis
Yu Gu
Yianrao Bian
Guangzhi Lei
Chao Weng
Dan Su
DiffM
26
2
0
22 Sep 2023
CrossSinger: A Cross-Lingual Multi-Singer High-Fidelity Singing Voice Synthesizer Trained on Monolingual Singers
Xintong Wang
Chang Zeng
Jun Chen
Chunhui Wang
27
6
0
22 Sep 2023
Performance Conditioning for Diffusion-Based Multi-Instrument Music Synthesis
Ben Maman
Johannes Zeitler
Meinard Müller
Amit H. Bermano
DiffM
22
4
0
21 Sep 2023
The Impact of Silence on Speech Anti-Spoofing
Yuxiang Zhang
Zhuo Li
Jingze Lu
Hua Hua
Wenchao Wang
Pengyuan Zhang
40
19
0
21 Sep 2023
SpeechAlign: a Framework for Speech Translation Alignment Evaluation
Belen Alastruey
Aleix Sant
Gerard I. Gállego
David Dale
Marta R. Costa-jussá
AuLLM
33
3
0
20 Sep 2023
Speak While You Think: Streaming Speech Synthesis During Text Generation
Avihu Dekel
Slava Shechtman
Raul Fernandez
David Haws
Zvi Kons
R. Hoory
27
8
0
20 Sep 2023
Towards Generative Modeling of Urban Flow through Knowledge-enhanced Denoising Diffusion
Zhilun Zhou
Jingtao Ding
Yu Liu
Depeng Jin
Yong Li
DiffM
AI4CE
50
22
0
19 Sep 2023
Speeding Up Speech Synthesis In Diffusion Models By Reducing Data Distribution Recovery Steps Via Content Transfer
Peter Ochieng
DiffM
33
0
0
18 Sep 2023
PromptVC: Flexible Stylistic Voice Conversion in Latent Space Driven by Natural Language Prompts
Jixun Yao
Yuguang Yang
Yinjiao Lei
Ziqian Ning
Yanni Hu
Yu Pan
Jingjing Yin
Hongbin Zhou
Heng Lu
Linfu Xie
DiffM
50
19
0
17 Sep 2023
Test-Time Compensated Representation Learning for Extreme Traffic Forecasting
Zhiwei Zhang
Weizhong Zhang
Yaowei Huang
Kani Chen
AI4TS
22
0
0
16 Sep 2023
Fewer-token Neural Speech Codec with Time-invariant Codes
Yong Ren
Tao Wang
Jiangyan Yi
Le Xu
Jianhua Tao
Chuyuan Zhang
Jun Zhou
25
33
0
15 Sep 2023
MASTERKEY: Practical Backdoor Attack Against Speaker Verification Systems
Hanqing Guo
Xun Chen
Junfeng Guo
Li Xiao
Qiben Yan
20
11
0
13 Sep 2023
CleanUNet 2: A Hybrid Speech Denoising Model on Waveform and Spectrogram
Zhifeng Kong
Ming-Yu Liu
Ambrish Dantrey
Bryan Catanzaro
27
7
0
12 Sep 2023
AudRandAug: Random Image Augmentations for Audio Classification
Teerath Kumar
Muhammad Turab
Alessandra Mileo
Malika Bendechache
Takfarinas Saber
28
7
0
09 Sep 2023
A Two-Stage Training Framework for Joint Speech Compression and Enhancement
Jiayi Huang
Zeyu Yan
Wenbin Jiang
Fei Wen
32
0
0
08 Sep 2023
Large-Scale Automatic Audiobook Creation
Brendan Walsh
Mark Hamilton
Greg Newby
Xi Wang
Serena Ruan
...
Lei He
Shaofei Zhang
Eric Dettinger
William T. Freeman
Markus Weimer
41
1
0
07 Sep 2023
BigVSAN: Enhancing GAN-based Neural Vocoders with Slicing Adversarial Network
Takashi Shibuya
Yuhta Takida
Yuki Mitsufuji
23
11
0
06 Sep 2023
Self-Supervised Disentanglement of Harmonic and Rhythmic Features in Music Audio Signals
Yiming Wu
CoGe
DRL
38
0
0
06 Sep 2023
MuLanTTS: The Microsoft Speech Synthesis System for Blizzard Challenge 2023
Zhihang Xu
Shaofei Zhang
Xi Wang
Jiajun Zhang
Wenning Wei
Lei He
Sheng Zhao
23
2
0
06 Sep 2023
Object Size-Driven Design of Convolutional Neural Networks: Virtual Axle Detection based on Raw Data
Henik Riedel
Robert Steven Lorenzen
Clemens Hübler
36
1
0
04 Sep 2023
NADiffuSE: Noise-aware Diffusion-based Model for Speech Enhancement
Wen Wang
Dongchao Yang
Qichen Ye
Bowen Cao
Yuexian Zou
DiffM
40
3
0
03 Sep 2023
Advances in machine-learning-based sampling motivated by lattice quantum chromodynamics
Kyle Cranmer
G. Kanwar
S. Racanière
Danilo Jimenez Rezende
P. Shanahan
AI4CE
37
27
0
03 Sep 2023
Timbre-reserved Adversarial Attack in Speaker Identification
Qing Wang
Jixun Yao
Li Lyna Zhang
Pengcheng Guo
Linfu Xie
AAML
37
4
0
02 Sep 2023
The FruitShell French synthesis system at the Blizzard 2023 Challenge
Xin Qi
Xiaopeng Wang
Zhiyong Wang
Wang Liu
Mingming Ding
Shuchen Shi
18
1
0
01 Sep 2023
Ten Years of Generative Adversarial Nets (GANs): A survey of the state-of-the-art
Tanujit Chakraborty
Ujjwal Reddy K S
Shraddha M. Naik
Madhurima Panja
B. Manvitha
40
62
0
30 Aug 2023
RoboTAP: Tracking Arbitrary Points for Few-Shot Visual Imitation
Mel Vecerík
Carl Doersch
Yi Yang
Todor Davchev
Y. Aytar
Guangyao Zhou
R. Hadsell
Lourdes Agapito
Jonathan Scholz
64
47
0
30 Aug 2023
MASA-TCN: Multi-anchor Space-aware Temporal Convolutional Neural Networks for Continuous and Discrete EEG Emotion Recognition
Yi Ding
Su Zhang
Chuangao Tang
Cuntai Guan
29
11
0
30 Aug 2023
A Review of Differentiable Digital Signal Processing for Music & Speech Synthesis
B. Hayes
Jordie Shier
Gyorgy Fazekas
Andrew McPherson
C. Saitis
27
21
0
29 Aug 2023
MSFlow: Multi-Scale Flow-based Framework for Unsupervised Anomaly Detection
Yixuan Zhou
Xing Xu
Jingkuan Song
Fumin Shen
Hengtao Shen
AI4CE
33
19
0
29 Aug 2023
Audio Deepfake Detection: A Survey
Jiangyan Yi
Chenglong Wang
J. Tao
Xiaohui Zhang
Chu Yuan Zhang
Yan Zhao
40
45
0
29 Aug 2023