1609.03499
WaveNet: A Generative Model for Raw Audio
12 September 2016
Aaron van den Oord
Sander Dieleman
Heiga Zen
Karen Simonyan
Oriol Vinyals
Alex Graves
Nal Kalchbrenner
A. Senior
Koray Kavukcuoglu
DiffM
Papers citing
"WaveNet: A Generative Model for Raw Audio"
50 / 3,039 papers shown
Title
Cross-Scale Vector Quantization for Scalable Neural Speech Coding
Xue Jiang
Xiulian Peng
Huaying Xue
Yuan Zhang
Yan Lu
MQ
44
9
0
07 Jul 2022
Ultra-Low-Bitrate Speech Coding with Pretrained Transformers
Ali Siahkoohi
Michael Chinen
Tom Denton
W. Kleijn
Jan Skoglund
35
8
0
05 Jul 2022
A survey of multimodal deep generative models
Masahiro Suzuki
Y. Matsuo
SyDa
DRL
62
76
0
05 Jul 2022
Towards trustworthy Energy Disaggregation: A review of challenges, methods and perspectives for Non-Intrusive Load Monitoring
Maria Kaselimi
Eftychios E. Protopapadakis
A. Voulodimos
N. Doulamis
Anastasios Doulamis
30
64
0
05 Jul 2022
Glow-WaveGAN 2: High-quality Zero-shot Text-to-speech Synthesis and Any-to-any Voice Conversion
Yinjiao Lei
Shan Yang
Jian Cong
Linfu Xie
Dan Su
DiffM
64
12
0
05 Jul 2022
An adaptive music generation architecture for games based on the deep learning Transformer model
Gustavo Amaral Costa dos Santos
A. Baffa
Jean-Pierre Briot
Bruno Feijó
Antonio Luz Furtado
MGen
30
2
0
04 Jul 2022
Multivariate Time Series Anomaly Detection with Few Positive Samples
Feng Xue
Weizhong Yan
AI4TS
24
3
0
02 Jul 2022
Simulating financial time series using attention
Weilong Fu
Ali Hirsa
Jörg Osterrieder
AI4TS
AIFin
GAN
24
4
0
01 Jul 2022
DrumGAN VST: A Plugin for Drum Sound Analysis/Synthesis With Autoencoding Generative Adversarial Networks
J. Nistal
Cyran Aouameur
Ithan Velarde
Stefan Lattner
GAN
45
4
0
29 Jun 2022
Expressive, Variable, and Controllable Duration Modelling in TTS
Ammar Abbas
Thomas Merritt
Alexis Moinet
S. Karlapati
Ewa Muszyńska
Simon Slangen
Elia Gatti
Thomas Drugman
38
10
0
28 Jun 2022
Show Me Your Face, And I'll Tell You How You Speak
Christen Millerdurai
L. A. Khaliq
Timon Ulrich
CVBM
68
0
0
28 Jun 2022
Avocodo: Generative Adversarial Network for Artifact-free Vocoder
Taejun Bak
Junmo Lee
Hanbin Bae
Jinhyeok Yang
Jaesung Bae
Young-Sun Joo
27
28
0
27 Jun 2022
Attack Agnostic Dataset: Towards Generalization and Stabilization of Audio DeepFake Detection
Piotr Kawa
Marcin Plata
P. Syga
AAML
51
23
0
27 Jun 2022
Sound Model Factory: An Integrated System Architecture for Generative Audio Modelling
L. Wyse
Purnima Kamath
Chitralekha Gupta
16
9
0
27 Jun 2022
Detection of Doctored Speech: Towards an End-to-End Parametric Learn-able Filter Approach
Rohit Arora
13
0
0
27 Jun 2022
Your Autoregressive Generative Model Can be Better If You Treat It as an Energy-Based One
Yezhen Wang
Tong Che
Yue Liu
Kaitao Song
Hengzhi Pei
Yoshua Bengio
Dongsheng Li
37
3
0
26 Jun 2022
Data Augmentation techniques in time series domain: A survey and taxonomy
Guillermo Iglesias
Edgar Talavera
Ángel González-Prieto
Alberto Mozo
S. Gómez-Canaval
AI4TS
27
157
0
25 Jun 2022
Efficient Transformer-based Speech Enhancement Using Long Frames and STFT Magnitudes
Danilo de Oliveira
Tal Peer
Timo Gerkmann
29
20
0
23 Jun 2022
Restoring speech intelligibility for hearing aid users with deep learning
P. U. Diehl
Y. Singer
Hannes Zilly
U. Schonfeld
Paul Meyer-Rachner
Mark Berry
Henning Sprekeler
Elias Sprengel
A. Pudszuhn
V. Hofmann
19
19
0
23 Jun 2022
DP-Parse: Finding Word Boundaries from Raw Speech with an Instance Lexicon
Robin Algayres
Tristan Ricoul
Julien Karadayi
Hugo Laurençon
Salah Zaiem
Abdel-rahman Mohamed
Benoît Sagot
Emmanuel Dupoux
16
13
0
22 Jun 2022
Behavior Transformers: Cloning k modes with one stone
Nur Muhammad (Mahi) Shafiullah
Zichen Jeff Cui
Ariuntuya Altanzaya
Lerrel Pinto
OffRL
28
225
0
22 Jun 2022
WOLONet: Wave Outlooker for Efficient and High Fidelity Speech Synthesis
Yi Wang
Yi Si
28
0
0
20 Jun 2022
Predicting Hate Intensity of Twitter Conversation Threads
Qing Meng
Tharun Suresh
Roy Ka-wei Lee
Tanmoy Chakraborty
27
19
0
16 Jun 2022
Deep Neural Imputation: A Framework for Recovering Incomplete Brain Recordings
Sabera Talukder
Jennifer J. Sun
Matthew K. Leonard
Bingni W. Brunton
Yisong Yue
SyDa
19
17
0
16 Jun 2022
Acoustic Modeling for End-to-End Empathetic Dialogue Speech Synthesis Using Linguistic and Prosodic Contexts of Dialogue History
Yuto Nishimura
Yuki Saito
Shinnosuke Takamichi
Kentaro Tachibana
Hiroshi Saruwatari
AI4TS
32
7
0
16 Jun 2022
Training Discrete Deep Generative Models via Gapped Straight-Through Estimator
Ting-Han Fan
Ta-Chung Chi
Alexander I. Rudnicky
Peter J. Ramadge
BDL
40
7
0
15 Jun 2022
Learning Behavior Representations Through Multi-Timescale Bootstrapping
Mehdi Azabou
Michael J. Mendelson
Maks Sorokin
S. Thakoor
Nauman Ahad
Carolina Urzay
Eva L. Dyer
AI4CE
37
6
0
14 Jun 2022
LPCSE: Neural Speech Enhancement through Linear Predictive Coding
Yang Liu
Na Tang
Xia Chu
Yang Yang
Jun Wang
39
1
0
14 Jun 2022
Adversarial Audio Synthesis with Complex-valued Polynomial Networks
Yongtao Wu
Grigorios G. Chrysos
V. Cevher
DiffM
27
4
0
14 Jun 2022
Semi-Autoregressive Energy Flows: Exploring Likelihood-Free Training of Normalizing Flows
Phillip Si
Zeyi Chen
Subham S. Sahoo
Yair Schiff
Volodymyr Kuleshov
32
7
0
14 Jun 2022
RF-Next: Efficient Receptive Field Search for Convolutional Neural Networks
Shanghua Gao
Zhong-Yu Li
Qi Han
Ming-Ming Cheng
Liang Wang
44
35
0
14 Jun 2022
Invariant Structure Learning for Better Generalization and Causal Explainability
Yunhao Ge
Sercan O. Arik
Jinsung Yoon
Ao Xu
Laurent Itti
Tomas Pfister
OOD
CML
40
2
0
13 Jun 2022
Multi-instrument Music Synthesis with Spectrogram Diffusion
Curtis Hawthorne
Ian Simon
Adam Roberts
Neil Zeghidour
Josh Gardner
Ethan Manilow
Jesse Engel
DiffM
25
49
0
11 Jun 2022
BigVGAN: A Universal Neural Vocoder with Large-Scale Training
Sang-gil Lee
Ming-Yu Liu
Boris Ginsburg
Bryan Catanzaro
Sung-Hoon Yoon
33
232
0
09 Jun 2022
On Neural Architecture Inductive Biases for Relational Tasks
Giancarlo Kerg
Sarthak Mittal
David Rolnick
Yoshua Bengio
Blake A. Richards
Guillaume Lajoie
OOD
25
25
0
09 Jun 2022
Reinforced Inverse Scattering
Hanyang Jiang
Y. Khoo
Haizhao Yang
6
6
0
08 Jun 2022
Patch-based Object-centric Transformers for Efficient Video Generation
Wilson Yan
Ryogo Okumura
Stephen James
Pieter Abbeel
DiffM
ViT
33
6
0
08 Jun 2022
Towards a General Purpose CNN for Long Range Dependencies in ND
David W. Romero
David M. Knigge
Albert Gu
Erik J. Bekkers
E. Gavves
Jakub M. Tomczak
Mark Hoogendoorn
26
19
0
07 Jun 2022
Improving trajectory calculations using deep learning inspired single image superresolution
Rudiger Brecht
L. Bakels
Alexander Bihlo
A. Stohl
27
0
0
07 Jun 2022
Improving the Diagnosis of Psychiatric Disorders with Self-Supervised Graph State Space Models
A. E. Gazzar
R. Thomas
G. Wingen
AI4MH
18
6
0
07 Jun 2022
FlexLip: A Controllable Text-to-Lip System
Dan Oneaţă
Beáta Lőrincz
Adriana Stan
H. Cucu
31
3
0
07 Jun 2022
Dict-TTS: Learning to Pronounce with Prior Dictionary Knowledge for Text-to-Speech
Ziyue Jiang
Zhe Su
Zhou Zhao
Qian Yang
Yi Ren
Jinglin Liu
Zhe Ye
26
4
0
05 Jun 2022
AdaVITS: Tiny VITS for Low Computing Resource Speaker Adaptation
Kun Song
Heyang Xue
Xinsheng Wang
Jian Cong
Yongmao Zhang
Linfu Xie
Bing Yang
Xiong Zhang
Dan Su
27
5
0
01 Jun 2022
Extensive Study of Multiple Deep Neural Networks for Complex Random Telegraph Signals
Marcel Robitaille
HeeBong Yang
Lu Wang
Na Young Kim
21
1
0
31 May 2022
Sepsis Prediction with Temporal Convolutional Networks
Xing Wang
Yuntian He
38
2
0
31 May 2022
StyleTTS: A Style-Based Generative Model for Natural and Diverse Text-to-Speech Synthesis
Yinghao Aaron Li
Cong Han
N. Mesgarani
53
38
0
30 May 2022
PreBit -- A multimodal model with Twitter FinBERT embeddings for extreme price movement prediction of Bitcoin
Yanzhao Zou
Dorien Herremans
26
33
0
30 May 2022
Guided-TTS 2: A Diffusion Model for High-quality Adaptive Text-to-Speech with Untranscribed Data
Sungwon Kim
Heeseung Kim
Sung-Hoon Yoon
DiffM
204
52
0
30 May 2022
A Graph and Attentive Multi-Path Convolutional Network for Traffic Prediction
Jianzhong Qi
Zhuowei Zhao
E. Tanin
Tingru Cui
Neema Nassir
Majid Sarvi
GNN
28
24
0
30 May 2022
BinauralGrad: A Two-Stage Conditional Diffusion Probabilistic Model for Binaural Audio Synthesis
Yichong Leng
Zehua Chen
Junliang Guo
Haohe Liu
Jiawei Chen
...
Lei He
Xiang-Yang Li
Tao Qin
Sheng Zhao
Tie-Yan Liu
DiffM
55
58
0
30 May 2022