ResearchTrend.AI
WaveNet: A Generative Model for Raw Audio

12 September 2016
Aaron van den Oord
Sander Dieleman
Heiga Zen
Karen Simonyan
Oriol Vinyals
Alex Graves
Nal Kalchbrenner
A. Senior
Koray Kavukcuoglu
    DiffM

Papers citing "WaveNet: A Generative Model for Raw Audio"

50 / 3,082 papers shown
Title
Your Diffusion Model is Secretly a Zero-Shot Classifier
Alexander C. Li
Mihir Prabhudesai
Shivam Duggal
Ellis L Brown
Deepak Pathak
DiffM, VLM
177
240
0
28 Mar 2023
Learning Generative Models with Goal-conditioned Reinforcement Learning
Mariana Vargas Vieyra
Pierre Ménard
GAN
26
0
0
26 Mar 2023
Spatio-Temporal Graph Neural Networks for Predictive Learning in Urban Computing: A Survey
G. Jin
Yuxuan Liang
Yuchen Fang
Zezhi Shao
Jincai Huang
Junbo Zhang
Yu Zheng
AI4TS, AI4CE
143
211
0
25 Mar 2023
Autoregressive Conditional Neural Processes
W. Bruinsma
Stratis Markou
James Requeima
Andrew Y. K. Foong
Tom R. Andersson
Anna Vaughan
Anthony Buonomo
J. S. Hosking
Richard Turner
BDL, UQCV
88
25
0
25 Mar 2023
Deep Augmentation: Dropout as Augmentation for Self-Supervised Learning
Rickard Brüel-Gabrielsson
Tongzhou Wang
Manel Baradad
Justin Solomon
ViT
58
0
0
25 Mar 2023
Wave-U-Net Discriminator: Fast and Lightweight Discriminator for Generative Adversarial Network-Based Speech Synthesis
Takuhiro Kaneko
Hirokazu Kameoka
Kou Tanaka
Shogo Seki
52
9
0
24 Mar 2023
A Survey on Audio Diffusion Models: Text To Speech Synthesis and Enhancement in Generative AI
Chenshuang Zhang
Chaoning Zhang
Sheng Zheng
Mengchun Zhang
Maryam Qamar
Sung-Ho Bae
In So Kweon
DiffM, MedIm
132
73
0
23 Mar 2023
It is all Connected: A New Graph Formulation for Spatio-Temporal Forecasting
Lars Odegaard Bentsen
N. Warakagoda
R. Stenbro
P. Engelstad
AI4TS
28
1
0
23 Mar 2023
A dynamic risk score for early prediction of cardiogenic shock using machine learning
Yuxuan Hu
Albert Y Lui
M. Goldstein
Mukund Sudarshan
Andrea Tinsay
...
G. Fishman
J. Hochman
S. Katz
S. Bernard
Rajesh Ranganath
61
2
0
22 Mar 2023
A Complete Survey on Generative AI (AIGC): Is ChatGPT from GPT-4 to GPT-5 All You Need?
Chaoning Zhang
Chenshuang Zhang
Sheng Zheng
Yu Qiao
Chenghao Li
...
Lik-Hang Lee
Yang Yang
Heng Tao Shen
In So Kweon
Choong Seon Hong
186
170
0
21 Mar 2023
Transformers in Speech Processing: A Survey
S. Latif
Aun Zaidi
Heriberto Cuayáhuitl
Fahad Shamshad
Moazzam Shoukat
Muhammad Usama
Junaid Qadir
169
48
0
21 Mar 2023
Lipschitz-bounded 1D convolutional neural networks using the Cayley transform and the controllability Gramian
Patricia Pauli
Ruigang Wang
I. Manchester
Frank Allgöwer
71
8
0
20 Mar 2023
Knowledge Distillation from Multiple Foundation Models for End-to-End Speech Recognition
Xiaoyu Yang
Qiujia Li
Chuxu Zhang
P. Woodland
85
7
0
20 Mar 2023
Configurable EBEN: Extreme Bandwidth Extension Network to enhance body-conducted speech capture
Hauret Julien
Joubaud Thomas
V. Zimpfer
Bavu Éric
61
7
0
17 Mar 2023
DiffusionRet: Generative Text-Video Retrieval with Diffusion Model
Peng Jin
Hao Li
Ze-Long Cheng
Kehan Li
Xiang Ji
Chang-rui Liu
Li-ming Yuan
Jie Chen
DiffM, VGen
93
58
0
17 Mar 2023
Effectively Modeling Time Series with Simple Discrete State Spaces
Michael Zhang
Khaled Kamal Saab
Michael Poli
Tri Dao
Karan Goel
Christopher Ré
AI4TS
71
48
0
16 Mar 2023
Relax, it doesn't matter how you get there: A new self-supervised approach for multi-timescale behavior analysis
Mehdi Azabou
Michael J. Mendelson
Nauman Ahad
Maks Sorokin
S. Thakoor
Carolina Urzay
Eva L. Dyer
76
4
0
15 Mar 2023
Evaluating gesture generation in a large-scale open challenge: The GENEA Challenge 2022
Taras Kucherenko
Pieter Wolfert
Youngwoo Yoon
Carla Viegas
Teodor Nikolov
Mihail Tsakov
G. Henter
69
24
0
15 Mar 2023
Native Multi-Band Audio Coding within Hyper-Autoencoded Reconstruction Propagation Networks
Darius Petermann
Inseon Jang
Minje Kim
47
1
0
14 Mar 2023
Lightweight feature encoder for wake-up word detection based on self-supervised speech representation
Hyungjun Lim
Younggwan Kim
Ki-Woong Yeom
E. Seo
Hoodong Lee
Stanley Jungkyu Choi
Honglak Lee
71
1
0
14 Mar 2023
Transformer Encoder with Multiscale Deep Learning for Pain Classification Using Physiological Signals
Zhenyu Lu
Burcu Ozek
S. Kamarthi
ViT, MedIm
58
15
0
13 Mar 2023
Resurrecting Recurrent Neural Networks for Long Sequences
Antonio Orvieto
Samuel L. Smith
Albert Gu
Anushan Fernando
Çağlar Gülçehre
Razvan Pascanu
Soham De
341
299
0
11 Mar 2023
An End-to-End Neural Network for Image-to-Audio Transformation
Liu Chen
Michael Deisher
Munir Georges
52
3
0
10 Mar 2023
Distribution Preserving Source Separation With Time Frequency Predictive Models
Pedro J. Villasana T
J. Klejsa
Lars Villemoes
P. Hedelin
50
2
0
10 Mar 2023
Baldur: Whole-Proof Generation and Repair with Large Language Models
E. First
M. Rabe
Talia Ringer
Yuriy Brun
141
108
0
08 Mar 2023
Vector Quantized Time Series Generation with a Bidirectional Prior Model
Daesoo Lee
Sara Malacarne
Erlend Aune
BDL
88
29
0
08 Mar 2023
Speak Foreign Languages with Your Own Voice: Cross-Lingual Neural Codec Language Modeling
Zi-Hua Zhang
Long Zhou
Chengyi Wang
Sanyuan Chen
Yu Wu
...
Huaming Wang
Jinyu Li
Lei He
Sheng Zhao
Furu Wei
VLM
105
187
0
07 Mar 2023
A Challenging Benchmark for Low-Resource Learning
Yudong Wang
Chang Ma
Qingxiu Dong
Lingpeng Kong
Jingjing Xu
74
4
0
07 Mar 2023
On Hierarchical Multi-Resolution Graph Generative Models
Mahdi Karami
Jun Luo
AI4CE
82
0
0
06 Mar 2023
Guiding Energy-based Models via Contrastive Latent Variables
Hankook Lee
Jongheon Jeong
Sejun Park
Jinwoo Shin
BDL
88
15
0
06 Mar 2023
FoundationTTS: Text-to-Speech for ASR Customization with Generative Language Model
Rui Xue
Yanqing Liu
Lei He
Xuejiao Tan
Linquan Liu
Ed Lin
Sheng Zhao
118
7
0
06 Mar 2023
A General Framework for Learning Procedural Audio Models of Environmental Sounds
Danzel Serrano
M. Cartwright
DiffM, DRL
63
1
0
04 Mar 2023
Low-Complexity Audio Embedding Extractors
Florian Schmid
Khaled Koutini
Gerhard Widmer
46
4
0
03 Mar 2023
Miipher: A Robust Speech Restoration Model Integrating Self-Supervised Speech and Text Representations
Yuma Koizumi
Heiga Zen
Shigeki Karita
Yifan Ding
Kohei Yatabe
Nobuyuki Morioka
Yu Zhang
Wei Han
Ankur Bapna
M. Bacchiani
94
29
0
03 Mar 2023
Speaker-Aware Anti-Spoofing
Xuechen Liu
Md. Sahidullah
Kong Aik Lee
Tomi Kinnunen
81
3
0
02 Mar 2023
ParrotTTS: Text-to-Speech synthesis by exploiting self-supervised representations
N. Shah
Saiteja Kosgi
Vishal Tambrahalli
Neha Sahipjohn
Anil Nelakanti
Vineet Gandhi
74
8
0
01 Mar 2023
DTW-SiameseNet: Dynamic Time Warped Siamese Network for Mispronunciation Detection and Correction
R. Anantha
Kriti Bhasin
Daniela Aguilar
Prabal Vashisht
Becci Williamson
Srinivas Chappidi
63
0
0
01 Mar 2023
UniFLG: Unified Facial Landmark Generator from Text or Speech
Kentaro Mitsui
Yukiya Hono
Kei Sawada
CVBM
54
7
0
28 Feb 2023
A Brief Survey on the Approximation Theory for Sequence Modelling
Hao Jiang
Qianxiao Li
Zhong Li
Shida Wang
AI4TS
96
12
0
27 Feb 2023
Imaginary Voice: Face-styled Diffusion Model for Text-to-Speech
Jiyoung Lee
Joon Son Chung
Soo-Whan Chung
DiffM
101
31
0
27 Feb 2023
An algorithmic framework for the optimization of deep neural networks architectures and hyperparameters
Julie Keisler
El-Ghazali Talbi
Sandra Claudel
Gilles Cabriel
84
6
0
27 Feb 2023
SurvivalGAN: Generating Time-to-Event Data for Survival Analysis
Alexander Norcliffe
B. Cebere
F. Imrie
Pietro Lio
M. Schaar
SyDa
74
15
0
24 Feb 2023
LightCTS: A Lightweight Framework for Correlated Time Series Forecasting
Zhichen Lai
Dalin Zhang
Huan Li
Christian S. Jensen
Hua Lu
Yan Zhao
AI4TS
69
32
0
23 Feb 2023
Weakly Supervised Temporal Convolutional Networks for Fine-grained Surgical Activity Recognition
Sanat Ramesh
Diego Dall'Alba
Cristians Gonzalez
Tong Yu
Pietro Mascagni
Didier Mutter
J. Marescaux
Paolo Fiorini
N. Padoy
MedIm
62
4
0
21 Feb 2023
Hello Me, Meet the Real Me: Audio Deepfake Attacks on Voice Assistants
Domna Bilika
Nikoletta Michopoulou
E. Alepis
Constantinos Patsakis
69
10
0
20 Feb 2023
Spatio-Temporal Momentum: Jointly Learning Time-Series and Cross-Sectional Strategies
Wee Ling Tan
Stephen J. Roberts
S. Zohren
AI4TS, AIFin
39
10
0
20 Feb 2023
Because Every Sensor Is Unique, so Is Every Pair: Handling Dynamicity in Traffic Forecasting
Arian Prabowo
Wei Shao
Hao Xue
Piotr Koniusz
Flora D. Salim
AI4TS
92
16
0
20 Feb 2023
Exposing AI-Synthesized Human Voices Using Neural Vocoder Artifacts
Chengzhe Sun
Shan Jia
Shuwei Hou
Ehab AlBadawy
Siwei Lyu
163
3
0
18 Feb 2023
DTAAD: Dual Tcn-Attention Networks for Anomaly Detection in Multivariate Time Series Data
Ling Yu
AI4TS
104
33
0
17 Feb 2023
Continuous-time convolutions model of event sequences
Vladislav Zhuzhel
Vsevolod Grabar
Galina Boeva
Artem Zabolotnyi
Alexander Stepikin
...
Mikhail Orlov
Ivan Kireev
Evgeny Burnaev
Rodrigo Rivera-Castro
Alexey Zaytsev
AI4TS
41
0
0
13 Feb 2023