ResearchTrend.AI
WaveNet: A Generative Model for Raw Audio


12 September 2016
Aaron van den Oord
Sander Dieleman
Heiga Zen
Karen Simonyan
Oriol Vinyals
Alex Graves
Nal Kalchbrenner
A. Senior
Koray Kavukcuoglu
    DiffM

Papers citing "WaveNet: A Generative Model for Raw Audio"

50 / 3,039 papers shown
Title
Cross-Scale Vector Quantization for Scalable Neural Speech Coding
Xue Jiang
Xiulian Peng
Huaying Xue
Yuan Zhang
Yan Lu
MQ
44
9
0
07 Jul 2022
Ultra-Low-Bitrate Speech Coding with Pretrained Transformers
Ali Siahkoohi
Michael Chinen
Tom Denton
W. Kleijn
Jan Skoglund
35
8
0
05 Jul 2022
A survey of multimodal deep generative models
Masahiro Suzuki
Y. Matsuo
SyDa
DRL
62
76
0
05 Jul 2022
Towards trustworthy Energy Disaggregation: A review of challenges, methods and perspectives for Non-Intrusive Load Monitoring
Maria Kaselimi
Eftychios E. Protopapadakis
A. Voulodimos
N. Doulamis
Anastasios Doulamis
30
64
0
05 Jul 2022
Glow-WaveGAN 2: High-quality Zero-shot Text-to-speech Synthesis and Any-to-any Voice Conversion
Yinjiao Lei
Shan Yang
Jian Cong
Linfu Xie
Dan Su
DiffM
64
12
0
05 Jul 2022
An adaptive music generation architecture for games based on the deep learning Transformer mode
Gustavo Amaral Costa dos Santos
A. Baffa
Jean-Pierre Briot
Bruno Feijó
Antonio Luz Furtado
MGen
30
2
0
04 Jul 2022
Multivariate Time Series Anomaly Detection with Few Positive Samples
Feng Xue
Weizhong Yan
AI4TS
24
3
0
02 Jul 2022
Simulating financial time series using attention
Weilong Fu
Ali Hirsa
Jörg Osterrieder
AI4TS
AIFin
GAN
24
4
0
01 Jul 2022
DrumGAN VST: A Plugin for Drum Sound Analysis/Synthesis With Autoencoding Generative Adversarial Networks
J. Nistal
Cyran Aouameur
Ithan Velarde
Stefan Lattner
GAN
45
4
0
29 Jun 2022
Expressive, Variable, and Controllable Duration Modelling in TTS
Ammar Abbas
Thomas Merritt
Alexis Moinet
S. Karlapati
Ewa Muszyńska
Simon Slangen
Elia Gatti
Thomas Drugman
38
10
0
28 Jun 2022
Show Me Your Face, And I'll Tell You How You Speak
Christen Millerdurai
L. A. Khaliq
Timon Ulrich
CVBM
68
0
0
28 Jun 2022
Avocodo: Generative Adversarial Network for Artifact-free Vocoder
Taejun Bak
Junmo Lee
Hanbin Bae
Jinhyeok Yang
Jaesung Bae
Young-Sun Joo
27
28
0
27 Jun 2022
Attack Agnostic Dataset: Towards Generalization and Stabilization of Audio DeepFake Detection
Piotr Kawa
Marcin Plata
P. Syga
AAML
51
23
0
27 Jun 2022
Sound Model Factory: An Integrated System Architecture for Generative Audio Modelling
L. Wyse
Purnima Kamath
Chitralekha Gupta
16
9
0
27 Jun 2022
Detection of Doctored Speech: Towards an End-to-End Parametric Learn-able Filter Approach
Rohit Arora
13
0
0
27 Jun 2022
Your Autoregressive Generative Model Can be Better If You Treat It as an Energy-Based One
Yezhen Wang
Tong Che
Yue Liu
Kaitao Song
Hengzhi Pei
Yoshua Bengio
Dongsheng Li
37
3
0
26 Jun 2022
Data Augmentation techniques in time series domain: A survey and taxonomy
Guillermo Iglesias
Edgar Talavera
Ángel González-Prieto
Alberto Mozo
S. Gómez-Canaval
AI4TS
27
157
0
25 Jun 2022
Efficient Transformer-based Speech Enhancement Using Long Frames and STFT Magnitudes
Danilo de Oliveira
Tal Peer
Timo Gerkmann
29
20
0
23 Jun 2022
Restoring speech intelligibility for hearing aid users with deep learning
P. U. Diehl
Y. Singer
Hannes Zilly
U. Schonfeld
Paul Meyer-Rachner
Mark Berry
Henning Sprekeler
Elias Sprengel
A. Pudszuhn
V. Hofmann
19
19
0
23 Jun 2022
DP-Parse: Finding Word Boundaries from Raw Speech with an Instance Lexicon
Robin Algayres
Tristan Ricoul
Julien Karadayi
Hugo Laurençon
Salah Zaiem
Abdel-rahman Mohamed
Benoît Sagot
Emmanuel Dupoux
16
13
0
22 Jun 2022
Behavior Transformers: Cloning $k$ modes with one stone
Nur Muhammad (Mahi) Shafiullah
Zichen Jeff Cui
Ariuntuya Altanzaya
Lerrel Pinto
OffRL
28
225
0
22 Jun 2022
WOLONet: Wave Outlooker for Efficient and High Fidelity Speech Synthesis
Yi Wang
Yi Si
28
0
0
20 Jun 2022
Predicting Hate Intensity of Twitter Conversation Threads
Qing Meng
Tharun Suresh
Roy Ka-wei Lee
Tanmoy Chakraborty
27
19
0
16 Jun 2022
Deep Neural Imputation: A Framework for Recovering Incomplete Brain Recordings
Sabera Talukder
Jennifer J. Sun
Matthew K. Leonard
Bingni W. Brunton
Yisong Yue
SyDa
19
17
0
16 Jun 2022
Acoustic Modeling for End-to-End Empathetic Dialogue Speech Synthesis Using Linguistic and Prosodic Contexts of Dialogue History
Yuto Nishimura
Yuki Saito
Shinnosuke Takamichi
Kentaro Tachibana
Hiroshi Saruwatari
AI4TS
32
7
0
16 Jun 2022
Training Discrete Deep Generative Models via Gapped Straight-Through Estimator
Ting-Han Fan
Ta-Chung Chi
Alexander I. Rudnicky
Peter J. Ramadge
BDL
40
7
0
15 Jun 2022
Learning Behavior Representations Through Multi-Timescale Bootstrapping
Mehdi Azabou
Michael J. Mendelson
Maks Sorokin
S. Thakoor
Nauman Ahad
Carolina Urzay
Eva L. Dyer
AI4CE
37
6
0
14 Jun 2022
LPCSE: Neural Speech Enhancement through Linear Predictive Coding
Yang Liu
Na Tang
Xia Chu
Yang Yang
Jun Wang
39
1
0
14 Jun 2022
Adversarial Audio Synthesis with Complex-valued Polynomial Networks
Yongtao Wu
Grigorios G. Chrysos
V. Cevher
DiffM
27
4
0
14 Jun 2022
Semi-Autoregressive Energy Flows: Exploring Likelihood-Free Training of Normalizing Flows
Phillip Si
Zeyi Chen
Subham S. Sahoo
Yair Schiff
Volodymyr Kuleshov
32
7
0
14 Jun 2022
RF-Next: Efficient Receptive Field Search for Convolutional Neural Networks
Shanghua Gao
Zhong-Yu Li
Qi Han
Ming-Ming Cheng
Liang Wang
44
35
0
14 Jun 2022
Invariant Structure Learning for Better Generalization and Causal Explainability
Yunhao Ge
Sercan O. Arik
Jinsung Yoon
Ao Xu
Laurent Itti
Tomas Pfister
OOD
CML
40
2
0
13 Jun 2022
Multi-instrument Music Synthesis with Spectrogram Diffusion
Curtis Hawthorne
Ian Simon
Adam Roberts
Neil Zeghidour
Josh Gardner
Ethan Manilow
Jesse Engel
DiffM
25
49
0
11 Jun 2022
BigVGAN: A Universal Neural Vocoder with Large-Scale Training
Sang-gil Lee
Ming-Yu Liu
Boris Ginsburg
Bryan Catanzaro
Sung-Hoon Yoon
33
232
0
09 Jun 2022
On Neural Architecture Inductive Biases for Relational Tasks
Giancarlo Kerg
Sarthak Mittal
David Rolnick
Yoshua Bengio
Blake A. Richards
Guillaume Lajoie
OOD
25
25
0
09 Jun 2022
Reinforced Inverse Scattering
Hanyang Jiang
Y. Khoo
Haizhao Yang
6
6
0
08 Jun 2022
Patch-based Object-centric Transformers for Efficient Video Generation
Wilson Yan
Ryogo Okumura
Stephen James
Pieter Abbeel
DiffM
ViT
33
6
0
08 Jun 2022
Towards a General Purpose CNN for Long Range Dependencies in $N$D
David W. Romero
David M. Knigge
Albert Gu
Erik J. Bekkers
E. Gavves
Jakub M. Tomczak
Mark Hoogendoorn
26
19
0
07 Jun 2022
Improving trajectory calculations using deep learning inspired single image superresolution
Rudiger Brecht
L. Bakels
Alexander Bihlo
A. Stohl
27
0
0
07 Jun 2022
Improving the Diagnosis of Psychiatric Disorders with Self-Supervised Graph State Space Models
A. E. Gazzar
R. Thomas
G. Wingen
AI4MH
18
6
0
07 Jun 2022
FlexLip: A Controllable Text-to-Lip System
Dan Oneaţă
Beáta Lőrincz
Adriana Stan
H. Cucu
31
3
0
07 Jun 2022
Dict-TTS: Learning to Pronounce with Prior Dictionary Knowledge for Text-to-Speech
Ziyue Jiang
Zhe Su
Zhou Zhao
Qian Yang
Yi Ren
Jinglin Liu
Zhe Ye
26
4
0
05 Jun 2022
AdaVITS: Tiny VITS for Low Computing Resource Speaker Adaptation
Kun Song
Heyang Xue
Xinsheng Wang
Jian Cong
Yongmao Zhang
Linfu Xie
Bing Yang
Xiong Zhang
Dan Su
27
5
0
01 Jun 2022
Extensive Study of Multiple Deep Neural Networks for Complex Random Telegraph Signals
Marcel Robitaille
HeeBong Yang
Lu Wang
Na Young Kim
21
1
0
31 May 2022
Sepsis Prediction with Temporal Convolutional Networks
Xing Wang
Yuntian He
38
2
0
31 May 2022
StyleTTS: A Style-Based Generative Model for Natural and Diverse Text-to-Speech Synthesis
Yinghao Aaron Li
Cong Han
N. Mesgarani
53
38
0
30 May 2022
PreBit -- A multimodal model with Twitter FinBERT embeddings for extreme price movement prediction of Bitcoin
Yanzhao Zou
Dorien Herremans
26
33
0
30 May 2022
Guided-TTS 2: A Diffusion Model for High-quality Adaptive Text-to-Speech with Untranscribed Data
Sungwon Kim
Heeseung Kim
Sung-Hoon Yoon
DiffM
204
52
0
30 May 2022
A Graph and Attentive Multi-Path Convolutional Network for Traffic Prediction
Jianzhong Qi
Zhuowei Zhao
E. Tanin
Tingru Cui
Neema Nassir
Majid Sarvi
GNN
28
24
0
30 May 2022
BinauralGrad: A Two-Stage Conditional Diffusion Probabilistic Model for Binaural Audio Synthesis
Yichong Leng
Zehua Chen
Junliang Guo
Haohe Liu
Jiawei Chen
...
Lei He
Xiang-Yang Li
Tao Qin
Sheng Zhao
Tie-Yan Liu
DiffM
55
58
0
30 May 2022