WaveNet: A Generative Model for Raw Audio

12 September 2016
Aaron van den Oord
Sander Dieleman
Heiga Zen
Karen Simonyan
Oriol Vinyals
Alex Graves
Nal Kalchbrenner
A. Senior
Koray Kavukcuoglu
    DiffM

Papers citing "WaveNet: A Generative Model for Raw Audio"

50 / 3,082 papers shown
An adaptive music generation architecture for games based on the deep learning Transformer mode
Gustavo Amaral Costa dos Santos
A. Baffa
Jean-Pierre Briot
Bruno Feijó
Antonio Luz Furtado
MGen
56
2
0
04 Jul 2022
Multivariate Time Series Anomaly Detection with Few Positive Samples
Feng Xue
Weizhong Yan
AI4TS
49
3
0
02 Jul 2022
Simulating financial time series using attention
Weilong Fu
Ali Hirsa
Jörg Osterrieder
AI4TS AIFin GAN
68
4
0
01 Jul 2022
DrumGAN VST: A Plugin for Drum Sound Analysis/Synthesis With Autoencoding Generative Adversarial Networks
J. Nistal
Cyran Aouameur
Ithan Velarde
Stefan Lattner
GAN
75
5
0
29 Jun 2022
Expressive, Variable, and Controllable Duration Modelling in TTS
Ammar Abbas
Thomas Merritt
Alexis Moinet
S. Karlapati
Ewa Muszyńska
Simon Slangen
Elia Gatti
Thomas Drugman
65
10
0
28 Jun 2022
Show Me Your Face, And I'll Tell You How You Speak
Christen Millerdurai
L. A. Khaliq
Timon Ulrich
CVBM
102
0
0
28 Jun 2022
Avocodo: Generative Adversarial Network for Artifact-free Vocoder
Taejun Bak
Junmo Lee
Hanbin Bae
Jinhyeok Yang
Jaesung Bae
Young-Sun Joo
107
28
0
27 Jun 2022
Attack Agnostic Dataset: Towards Generalization and Stabilization of Audio DeepFake Detection
Piotr Kawa
Marcin Plata
P. Syga
AAML
95
23
0
27 Jun 2022
Sound Model Factory: An Integrated System Architecture for Generative Audio Modelling
L. Wyse
Purnima Kamath
Chitralekha Gupta
41
9
0
27 Jun 2022
Detection of Doctored Speech: Towards an End-to-End Parametric Learn-able Filter Approach
Rohit Arora
29
0
0
27 Jun 2022
Your Autoregressive Generative Model Can be Better If You Treat It as an Energy-Based One
Yezhen Wang
Tong Che
Yue Liu
Kaitao Song
Hengzhi Pei
Yoshua Bengio
Dongsheng Li
56
3
0
26 Jun 2022
Data Augmentation techniques in time series domain: A survey and taxonomy
Guillermo Iglesias
Edgar Talavera
Ángel González-Prieto
Alberto Mozo
S. Gómez-Canaval
AI4TS
109
171
0
25 Jun 2022
Efficient Transformer-based Speech Enhancement Using Long Frames and STFT Magnitudes
Danilo de Oliveira
Tal Peer
Timo Gerkmann
64
21
0
23 Jun 2022
Restoring speech intelligibility for hearing aid users with deep learning
P. U. Diehl
Y. Singer
Hannes Zilly
U. Schonfeld
Paul Meyer-Rachner
Mark Berry
Henning Sprekeler
Elias Sprengel
A. Pudszuhn
V. Hofmann
42
20
0
23 Jun 2022
DP-Parse: Finding Word Boundaries from Raw Speech with an Instance Lexicon
Robin Algayres
Tristan Ricoul
Julien Karadayi
Hugo Laurençon
Salah Zaiem
Abdel-rahman Mohamed
Benoît Sagot
Emmanuel Dupoux
79
14
0
22 Jun 2022
Behavior Transformers: Cloning $k$ modes with one stone
Nur Muhammad (Mahi) Shafiullah
Zichen Jeff Cui
Ariuntuya Altanzaya
Lerrel Pinto
OffRL
78
242
0
22 Jun 2022
WOLONet: Wave Outlooker for Efficient and High Fidelity Speech Synthesis
Yi Wang
Yi Si
37
0
0
20 Jun 2022
Predicting Hate Intensity of Twitter Conversation Threads
Qing Meng
Tharun Suresh
Roy Ka-wei Lee
Tanmoy Chakraborty
124
20
0
16 Jun 2022
Deep Neural Imputation: A Framework for Recovering Incomplete Brain Recordings
Sabera Talukder
Jennifer J. Sun
Matthew K. Leonard
Bingni W. Brunton
Yisong Yue
SyDa
51
16
0
16 Jun 2022
Acoustic Modeling for End-to-End Empathetic Dialogue Speech Synthesis Using Linguistic and Prosodic Contexts of Dialogue History
Yuto Nishimura
Yuki Saito
Shinnosuke Takamichi
Kentaro Tachibana
Hiroshi Saruwatari
AI4TS
62
8
0
16 Jun 2022
Training Discrete Deep Generative Models via Gapped Straight-Through Estimator
Ting-Han Fan
Ta-Chung Chi
Alexander I. Rudnicky
Peter J. Ramadge
BDL
67
7
0
15 Jun 2022
Learning Behavior Representations Through Multi-Timescale Bootstrapping
Mehdi Azabou
Michael J. Mendelson
Maks Sorokin
S. Thakoor
Nauman Ahad
Carolina Urzay
Eva L. Dyer
AI4CE
71
6
0
14 Jun 2022
LPCSE: Neural Speech Enhancement through Linear Predictive Coding
Yang Liu
Na Tang
Xia Chu
Yang Yang
Jun Wang
68
1
0
14 Jun 2022
Adversarial Audio Synthesis with Complex-valued Polynomial Networks
Yongtao Wu
Grigorios G. Chrysos
Volkan Cevher
DiffM
144
4
0
14 Jun 2022
Semi-Autoregressive Energy Flows: Exploring Likelihood-Free Training of Normalizing Flows
Phillip Si
Zeyi Chen
Subham S. Sahoo
Yair Schiff
Volodymyr Kuleshov
109
7
0
14 Jun 2022
RF-Next: Efficient Receptive Field Search for Convolutional Neural Networks
Shanghua Gao
Zhong-Yu Li
Qi Han
Ming-Ming Cheng
Liang Wang
102
35
0
14 Jun 2022
Invariant Structure Learning for Better Generalization and Causal Explainability
Yunhao Ge
Sercan O. Arik
Jinsung Yoon
Ao Xu
Laurent Itti
Tomas Pfister
OOD CML
51
2
0
13 Jun 2022
Multi-instrument Music Synthesis with Spectrogram Diffusion
Curtis Hawthorne
Ian Simon
Adam Roberts
Neil Zeghidour
Josh Gardner
Ethan Manilow
Jesse Engel
DiffM
79
51
0
11 Jun 2022
BigVGAN: A Universal Neural Vocoder with Large-Scale Training
Sang-gil Lee
Ming-Yu Liu
Boris Ginsburg
Bryan Catanzaro
Sung-Hoon Yoon
165
255
0
09 Jun 2022
On Neural Architecture Inductive Biases for Relational Tasks
Giancarlo Kerg
Sarthak Mittal
David Rolnick
Yoshua Bengio
Blake A. Richards
Guillaume Lajoie
OOD
104
25
0
09 Jun 2022
Reinforced Inverse Scattering
Hanyang Jiang
Y. Khoo
Haizhao Yang
57
7
0
08 Jun 2022
Patch-based Object-centric Transformers for Efficient Video Generation
Wilson Yan
Ryogo Okumura
Stephen James
Pieter Abbeel
DiffM ViT
85
6
0
08 Jun 2022
Towards a General Purpose CNN for Long Range Dependencies in $N$D
David W. Romero
David M. Knigge
Albert Gu
Erik J. Bekkers
E. Gavves
Jakub M. Tomczak
Mark Hoogendoorn
89
20
0
07 Jun 2022
Improving trajectory calculations using deep learning inspired single image superresolution
Rudiger Brecht
L. Bakels
Alexander Bihlo
A. Stohl
67
0
0
07 Jun 2022
Improving the Diagnosis of Psychiatric Disorders with Self-Supervised Graph State Space Models
A. E. Gazzar
R. Thomas
G. Wingen
AI4MH
59
7
0
07 Jun 2022
FlexLip: A Controllable Text-to-Lip System
Dan Oneaţă
Beáta Lőrincz
Adriana Stan
H. Cucu
55
3
0
07 Jun 2022
Dict-TTS: Learning to Pronounce with Prior Dictionary Knowledge for Text-to-Speech
Ziyue Jiang
Zhe Su
Zhou Zhao
Qian Yang
Yi Ren
Jinglin Liu
Zhe Ye
73
5
0
05 Jun 2022
AdaVITS: Tiny VITS for Low Computing Resource Speaker Adaptation
Kun Song
Heyang Xue
Xinsheng Wang
Jian Cong
Yongmao Zhang
Linfu Xie
Bing Yang
Xiong Zhang
Jane Polak Scowcroft
105
5
0
01 Jun 2022
Extensive Study of Multiple Deep Neural Networks for Complex Random Telegraph Signals
Marcel Robitaille
HeeBong Yang
Lu Wang
Na Young Kim
29
1
0
31 May 2022
Sepsis Prediction with Temporal Convolutional Networks
Xing Wang
Yuntian He
53
2
0
31 May 2022
StyleTTS: A Style-Based Generative Model for Natural and Diverse Text-to-Speech Synthesis
Yinghao Aaron Li
Cong Han
N. Mesgarani
114
40
0
30 May 2022
PreBit -- A multimodal model with Twitter FinBERT embeddings for extreme price movement prediction of Bitcoin
Yanzhao Zou
Dorien Herremans
61
36
0
30 May 2022
Guided-TTS 2: A Diffusion Model for High-quality Adaptive Text-to-Speech with Untranscribed Data
Sungwon Kim
Heeseung Kim
Sung-Hoon Yoon
DiffM
249
53
0
30 May 2022
A Graph and Attentive Multi-Path Convolutional Network for Traffic Prediction
Jianzhong Qi
Zhuowei Zhao
E. Tanin
Tingru Cui
Neema Nassir
Majid Sarvi
GNN
59
28
0
30 May 2022
BinauralGrad: A Two-Stage Conditional Diffusion Probabilistic Model for Binaural Audio Synthesis
Yichong Leng
Zehua Chen
Junliang Guo
Haohe Liu
Jiawei Chen
...
Lei He
Xiang-Yang Li
Tao Qin
Sheng Zhao
Tie-Yan Liu
DiffM
157
61
0
30 May 2022
Machine Learning for Microcontroller-Class Hardware: A Review
Swapnil Sayan Saha
S. Sandha
Mani B. Srivastava
109
125
0
29 May 2022
Deep Learning-based Spatially Explicit Emulation of an Agent-Based Simulator for Pandemic in a City
Varun Madhavan
Adway Mitra
P. Chakrabarti
AI4CE
50
0
0
28 May 2022
Group-level Brain Decoding with Deep Learning
Richard Csaky
M. Es
Oiwi Parker Jones
M. Woolrich
53
12
0
27 May 2022
Do we really need temporal convolutions in action segmentation?
Dazhao Du
Fuchun Sun
Yu Li
Zhongang Qi
Hui Xiong
Ying Shan
ViT
72
17
0
26 May 2022
TDASS: Target Domain Adaptation Speech Synthesis Framework for Multi-speaker Low-Resource TTS
Xulong Zhang
Jianzong Wang
Ning Cheng
Jing Xiao
69
14
0
24 May 2022