ResearchTrend.AI
WaveNet: A Generative Model for Raw Audio

12 September 2016
Aaron van den Oord
Sander Dieleman
Heiga Zen
Karen Simonyan
Oriol Vinyals
Alex Graves
Nal Kalchbrenner
A. Senior
Koray Kavukcuoglu
    DiffM

Papers citing "WaveNet: A Generative Model for Raw Audio"

50 / 3,082 papers shown
Multi-Scale Sub-Band Constant-Q Transform Discriminator for High-Fidelity Vocoder
Yicheng Gu
Xueyao Zhang
Liumeng Xue
Zhizheng Wu
72
12
0
25 Nov 2023
An NMF-Based Building Block for Interpretable Neural Networks With Continual Learning
Brian K. Vogel
43
0
0
20 Nov 2023
Advancements in Generative AI: A Comprehensive Review of GANs, GPT, Autoencoders, Diffusion Model, and Transformers
Staphord Bengesi
Hoda El-Sayed
Md Kamruzzaman Sarker
Yao Houkpati
John Irungu
T. Oladunni
128
93
0
17 Nov 2023
Formal Verification of Long Short-Term Memory based Audio Classifiers: A Star based Approach
Neelanjana Pal
Taylor T. Johnson
55
0
0
16 Nov 2023
CLN-VC: Text-Free Voice Conversion Based on Fine-Grained Style Control and Contrastive Learning with Negative Samples Augmentation
Yimin Deng
Xulong Zhang
Jianzong Wang
Ning Cheng
Jing Xiao
77
3
0
15 Nov 2023
EDMSound: Spectrogram Based Diffusion Models for Efficient and High-Quality Audio Synthesis
Ge Zhu
Yutong Wen
M. Carbonneau
Zhiyao Duan
DiffM
76
8
0
15 Nov 2023
Reimagining Speech: A Scoping Review of Deep Learning-Powered Voice Conversion
A. R. Bargum
Stefania Serafin
Cumhur Erkut
70
4
0
14 Nov 2023
DQR-TTS: Semi-supervised Text-to-speech Synthesis with Dynamic Quantized Representation
Jiangzong Wang
Pengcheng Li
Xulong Zhang
Ning Cheng
Jing Xiao
81
0
0
14 Nov 2023
CSLP-AE: A Contrastive Split-Latent Permutation Autoencoder Framework for Zero-Shot Electroencephalography Signal Conversion
Anders Vestergaard Norskov
Alexander Neergaard Zahid
Morten Morup
65
3
0
13 Nov 2023
Efficient bandwidth extension of musical signals using a differentiable harmonic plus noise model
Pierre-Amaury Grumiaux
Mathieu Lagrange
68
3
0
13 Nov 2023
SponTTS: modeling and transferring spontaneous style for TTS
Hanzhao Li
Xinfa Zhu
Liumeng Xue
Yang Song
Yunlin Chen
Lei Xie
89
7
0
13 Nov 2023
Music ControlNet: Multiple Time-varying Controls for Music Generation
Shih-Lun Wu
Chris Donahue
Shinji Watanabe
Nicholas J. Bryan
DiffM, MGen
111
61
0
13 Nov 2023
Instant3D: Fast Text-to-3D with Sparse-View Generation and Large Reconstruction Model
Jiahao Li
Hao Tan
Kai Zhang
Zexiang Xu
Fujun Luan
Yinghao Xu
Yicong Hong
Kalyan Sunkavalli
Greg Shakhnarovich
Sai Bi
131
275
0
10 Nov 2023
FlashFFTConv: Efficient Convolutions for Long Sequences with Tensor Cores
Daniel Y. Fu
Hermann Kumbong
Eric N. D. Nguyen
Christopher Ré
VLM
100
30
0
10 Nov 2023
Semantic Map Guided Synthesis of Wireless Capsule Endoscopy Images using Diffusion Models
Haejin Lee
Jeongwoo Ju
Jonghyuck Lee
Yeoun Joo Lee
Heechul Jung
DiffM, MedIm
65
0
0
10 Nov 2023
Synthetic Speaking Children -- Why We Need Them and How to Make Them
Muhammad Ali Farooq
Dan Bigioi
Rishabh Jain
Wang Yao
Mariam Yiwere
Peter Corcoran
86
0
0
08 Nov 2023
Diff-HierVC: Diffusion-based Hierarchical Voice Conversion with Robust Pitch Generation and Masked Prior for Zero-shot Speaker Adaptation
Haram Choi
Sang-Hoon Lee
Seong-Whan Lee
DiffM
72
30
0
08 Nov 2023
Improved DDIM Sampling with Moment Matching Gaussian Mixtures
Prasad Gabbur
DiffM
50
1
0
08 Nov 2023
Impact of HPO on AutoML Forecasting Ensembles
David Hoffmann
46
0
0
07 Nov 2023
TS-Diffusion: Generating Highly Complex Time Series with Diffusion Models
Yangming Li
DiffM, AI4TS
97
5
0
06 Nov 2023
AV-Lip-Sync+: Leveraging AV-HuBERT to Exploit Multimodal Inconsistency for Video Deepfake Detection
Sahibzada Adil Shahzad
Ammarah Hashmi
Yan-Tsung Peng
Yu Tsao
Hsin-Min Wang
96
7
0
05 Nov 2023
Sounding Bodies: Modeling 3D Spatial Sound of Humans Using Body Pose and Audio
Xudong Xu
Dejan Marković
Jacob Sandakly
Todd Keebler
Steven Krenn
Alexander Richard
44
5
0
01 Nov 2023
REBAR: Retrieval-Based Reconstruction for Time-series Contrastive Learning
Maxwell A. Xu
Alexander Moreno
Hui Wei
Benjamin M. Marlin
James M. Rehg
AI4TS, SSL
105
13
0
01 Nov 2023
Semantic Hearing: Programming Acoustic Scenes with Binaural Hearables
Bandhav Veluri
Malek Itani
Justin Chan
Takuya Yoshioka
Shyamnath Gollakota
67
18
0
01 Nov 2023
Deepfake detection by exploiting surface anomalies: the SurFake approach
Andrea Ciamarra
R. Caldelli
Federico Becattini
Lorenzo Seidenari
A. Bimbo
86
14
0
31 Oct 2023
Enabling Acoustic Audience Feedback in Large Virtual Events
Tamay Aykut
M. Hofbauer
Christopher B. Kuhn
Eckehard Steinbach
Bernd Girod
77
0
0
27 Oct 2023
Learning an Inventory Control Policy with General Inventory Arrival Dynamics
Sohrab Andaz
Carson Eisenach
Dhruv Madeka
Kari Torkkola
Randy Jia
Dean Phillips Foster
Sham Kakade
59
2
0
26 Oct 2023
Real-time Neonatal Chest Sound Separation using Deep Learning
Yang Yi Poh
Ethan Grooby
Kenneth Tan
Lindsay Zhou
Arrabella King
Ashwin Ramanathan
Atul Malhotra
Mehrtash Harandi
F. Marzbanrad
57
1
0
26 Oct 2023
Subtle Signals: Video-based Detection of Infant Non-nutritive Sucking as a Neurodevelopmental Cue
Shaotong Zhu
Michael Wan
Sai Kumar Reddy Manne
Emily B. Zimmerman
Sarah Ostadabbas
31
2
0
24 Oct 2023
Synthetic Data as Validation
Qixing Hu
Alan Yuille
Zongwei Zhou
SyDa, OOD
73
8
0
24 Oct 2023
LC-TTFS: Towards Lossless Network Conversion for Spiking Neural Networks with TTFS Coding
Qu Yang
Malu Zhang
Jibin Wu
Kay Chen Tan
Haizhou Li
63
10
0
23 Oct 2023
Mid-Long Term Daily Electricity Consumption Forecasting Based on Piecewise Linear Regression and Dilated Causal CNN
Zhou Lan
Ben Liu
Yi Feng
Danhuang Dong
Peng Zhang
AI4TS
30
1
0
23 Oct 2023
An overview of text-to-speech systems and media applications
Mohammad Reza Hasanabadi
28
3
0
22 Oct 2023
MFCC-GAN Codec: A New AI-based Audio Coding
Mohammad Reza Hasanabadi
40
0
0
22 Oct 2023
Neural Likelihood Approximation for Integer Valued Time Series Data
Luke O'Loughlin
John Maclean
Andrew Black
AI4TS
53
0
0
19 Oct 2023
Physics-informed neural network for acoustic resonance analysis in a one-dimensional acoustic tube
Kazuya Yokota
Takahiko Kurahashi
Masajiro Abe
28
5
0
18 Oct 2023
Leveraging Diverse Semantic-based Audio Pretrained Models for Singing Voice Conversion
Xueyao Zhang
Yicheng Gu
Haopeng Chen
Zihao Fang
Lexiao Zou
Junan Zhang
Liumeng Xue
Jinchao Zhang
Jie Zhou
Zhizheng Wu
DiffM
64
2
0
17 Oct 2023
BeatDance: A Beat-Based Model-Agnostic Contrastive Learning Framework for Music-Dance Retrieval
Kaixing Yang
Xukun Zhou
Xulong Tang
Ran Diao
Hongyan Liu
Jun He
Zhaoxin Fan
71
3
0
16 Oct 2023
MoConVQ: Unified Physics-Based Motion Control via Scalable Discrete Representations
Heyuan Yao
Zhenhua Song
Yuyang Zhou
Tenglong Ao
Baoquan Chen
Libin Liu
135
44
0
16 Oct 2023
Generative Adversarial Training for Text-to-Speech Synthesis Based on Raw Phonetic Input and Explicit Prosody Modelling
Tiberiu Boros
Stefan Daniel Dumitrescu
Ionut Mironica
Radu Chivereanu
GAN
38
1
0
14 Oct 2023
Machine Learning for Urban Air Quality Analytics: A Survey
Jindong Han
Weijiao Zhang
Hao Liu
Hui Xiong
AI4CE
114
12
0
14 Oct 2023
A decoder-only foundation model for time-series forecasting
Abhimanyu Das
Weihao Kong
Rajat Sen
Yichen Zhou
AI4TS, AI4CE
137
243
0
14 Oct 2023
ARM: Refining Multivariate Forecasting with Adaptive Temporal-Contextual Learning
Jiecheng Lu
Xu Han
Shihao Yang
AI4TS
49
4
0
14 Oct 2023
LL-VQ-VAE: Learnable Lattice Vector-Quantization For Efficient Representations
Ahmed Khalil
Robert Piechocki
Raúl Santos-Rodríguez
54
2
0
13 Oct 2023
Large Language Models Are Zero-Shot Time Series Forecasters
Nate Gruver
Marc Finzi
Shikai Qiu
Andrew Gordon Wilson
AI4TS
97
375
0
11 Oct 2023
Prosody Analysis of Audiobooks
Charuta Pethe
Felix D Childress
Yunting Yin
Steven Skiena
89
1
0
10 Oct 2023
Generative Spoken Language Model based on continuous word-sized audio tokens
Robin Algayres
Yossi Adi
Tu Nguyen
Jade Copet
Gabriel Synnaeve
Benoît Sagot
Emmanuel Dupoux
AuLLM
119
16
0
08 Oct 2023
Comparative Analysis of Transfer Learning in Deep Learning Text-to-Speech Models on a Few-Shot, Low-Resource, Customized Dataset
Ze Liu
53
1
0
08 Oct 2023
FM Tone Transfer with Envelope Learning
Franco Caspe
Andrew Mcpherson
Mark Sandler
57
2
0
07 Oct 2023
Hate Speech Detection in Limited Data Contexts using Synthetic Data Generation
Aman Khullar
Daniel K. Nkemelu
Cuong V. Nguyen
Michael L. Best
80
5
0
04 Oct 2023