ResearchTrend.AI
WaveNet: A Generative Model for Raw Audio


12 September 2016
Aaron van den Oord
Sander Dieleman
Heiga Zen
Karen Simonyan
Oriol Vinyals
Alex Graves
Nal Kalchbrenner
A. Senior
Koray Kavukcuoglu
    DiffM

Papers citing "WaveNet: A Generative Model for Raw Audio"

50 / 3,039 papers shown
Detecting Voice Cloning Attacks via Timbre Watermarking
Chang-rui Liu
Jie Zhang
Tianwei Zhang
Xi Yang
Weiming Zhang
Neng H. Yu
33
29
0
06 Dec 2023
Rapid Speaker Adaptation in Low Resource Text to Speech Systems using Synthetic Data and Transfer learning
Raviraj Joshi
Nikesh Garera
33
0
0
02 Dec 2023
Context Retrieval via Normalized Contextual Latent Interaction for Conversational Agent
Junfeng Liu
Zhuocheng Mei
Kewen Peng
R. Vatsavai
27
1
0
01 Dec 2023
Spatial-Temporal-Decoupled Masked Pre-training for Spatiotemporal Forecasting
Haotian Gao
Renhe Jiang
Zheng Dong
Jinliang Deng
Yuxin Ma
Xuan Song
AI4TS
46
15
0
01 Dec 2023
DREAM: Diffusion Rectification and Estimation-Adaptive Models
Jinxin Zhou
Tianyu Ding
Tianyi Chen
Jiachen Jiang
Ilya Zharkov
Zhihui Zhu
Luming Liang
36
7
0
30 Nov 2023
DiffusionTalker: Personalization and Acceleration for Speech-Driven 3D Face Diffuser
Peng Chen
Xiaobao Wei
Ming Lu
Yitong Zhu
Nai-Ming Yao
Xingyu Xiao
Hui Chen
34
12
0
28 Nov 2023
Stability-Informed Initialization of Neural Ordinary Differential Equations
Theodor Westny
Arman Mohammadi
Daniel Jung
Erik Frisk
28
0
0
27 Nov 2023
Multi-Scale Sub-Band Constant-Q Transform Discriminator for High-Fidelity Vocoder
Yicheng Gu
Xueyao Zhang
Liumeng Xue
Zhizheng Wu
31
12
0
25 Nov 2023
An NMF-Based Building Block for Interpretable Neural Networks With Continual Learning
Brian K. Vogel
25
0
0
20 Nov 2023
Advancements in Generative AI: A Comprehensive Review of GANs, GPT, Autoencoders, Diffusion Model, and Transformers
Staphord Bengesi
Hoda El-Sayed
Md Kamruzzaman Sarker
Yao Houkpati
John Irungu
T. Oladunni
58
77
0
17 Nov 2023
Formal Verification of Long Short-Term Memory based Audio Classifiers: A Star based Approach
Neelanjana Pal
Taylor T. Johnson
27
0
0
16 Nov 2023
CLN-VC: Text-Free Voice Conversion Based on Fine-Grained Style Control and Contrastive Learning with Negative Samples Augmentation
Yimin Deng
Xulong Zhang
Jianzong Wang
Ning Cheng
Jing Xiao
35
3
0
15 Nov 2023
EDMSound: Spectrogram Based Diffusion Models for Efficient and High-Quality Audio Synthesis
Ge Zhu
Yutong Wen
M. Carbonneau
Zhiyao Duan
DiffM
51
7
0
15 Nov 2023
Reimagining Speech: A Scoping Review of Deep Learning-Powered Voice Conversion
A. R. Bargum
Stefania Serafin
Cumhur Erkut
26
3
0
14 Nov 2023
DQR-TTS: Semi-supervised Text-to-speech Synthesis with Dynamic Quantized Representation
Jiangzong Wang
Pengcheng Li
Xulong Zhang
Ning Cheng
Jing Xiao
32
0
0
14 Nov 2023
CSLP-AE: A Contrastive Split-Latent Permutation Autoencoder Framework for Zero-Shot Electroencephalography Signal Conversion
Anders Vestergaard Norskov
Alexander Neergaard Zahid
Morten Morup
45
2
0
13 Nov 2023
Efficient bandwidth extension of musical signals using a differentiable harmonic plus noise model
Pierre-Amaury Grumiaux
Mathieu Lagrange
16
2
0
13 Nov 2023
SponTTS: modeling and transferring spontaneous style for TTS
Hanzhao Li
Xinfa Zhu
Liumeng Xue
Yang Song
Yunlin Chen
Lei Xie
48
7
0
13 Nov 2023
Music ControlNet: Multiple Time-varying Controls for Music Generation
Shih-Lun Wu
Chris Donahue
Shinji Watanabe
Nicholas J. Bryan
DiffM
MGen
39
50
0
13 Nov 2023
Instant3D: Fast Text-to-3D with Sparse-View Generation and Large Reconstruction Model
Jiahao Li
Hao Tan
Kai Zhang
Zexiang Xu
Fujun Luan
Yinghao Xu
Yicong Hong
Kalyan Sunkavalli
Greg Shakhnarovich
Sai Bi
61
254
0
10 Nov 2023
FlashFFTConv: Efficient Convolutions for Long Sequences with Tensor Cores
Daniel Y. Fu
Hermann Kumbong
Eric N. D. Nguyen
Christopher Ré
VLM
46
29
0
10 Nov 2023
Semantic Map Guided Synthesis of Wireless Capsule Endoscopy Images using Diffusion Models
Haejin Lee
Jeongwoo Ju
Jonghyuck Lee
Yeoun Joo Lee
Heechul Jung
DiffM
MedIm
43
0
0
10 Nov 2023
Synthetic Speaking Children -- Why We Need Them and How to Make Them
Muhammad Ali Farooq
Dan Bigioi
Rishabh Jain
Wang Yao
Mariam Yiwere
Peter Corcoran
27
0
0
08 Nov 2023
Diff-HierVC: Diffusion-based Hierarchical Voice Conversion with Robust Pitch Generation and Masked Prior for Zero-shot Speaker Adaptation
Haram Choi
Sang-Hoon Lee
Seong-Whan Lee
DiffM
34
24
0
08 Nov 2023
Improved DDIM Sampling with Moment Matching Gaussian Mixtures
Prasad Gabbur
DiffM
35
1
0
08 Nov 2023
Impact of HPO on AutoML Forecasting Ensembles
David Hoffmann
30
0
0
07 Nov 2023
TS-Diffusion: Generating Highly Complex Time Series with Diffusion Models
Yangming Li
DiffM
AI4TS
45
5
0
06 Nov 2023
AV-Lip-Sync+: Leveraging AV-HuBERT to Exploit Multimodal Inconsistency for Video Deepfake Detection
Sahibzada Adil Shahzad
Ammarah Hashmi
Yan-Tsung Peng
Yu Tsao
Hsin-Min Wang
34
5
0
05 Nov 2023
Sounding Bodies: Modeling 3D Spatial Sound of Humans Using Body Pose and Audio
Xudong Xu
Dejan Marković
Jacob Sandakly
Todd Keebler
Steven Krenn
Alexander Richard
20
2
0
01 Nov 2023
REBAR: Retrieval-Based Reconstruction for Time-series Contrastive Learning
Maxwell A. Xu
Alexander Moreno
Hui Wei
Benjamin M. Marlin
James M. Rehg
AI4TS
SSL
34
11
0
01 Nov 2023
Semantic Hearing: Programming Acoustic Scenes with Binaural Hearables
Bandhav Veluri
Malek Itani
Justin Chan
Takuya Yoshioka
Shyamnath Gollakota
31
15
0
01 Nov 2023
Deepfake detection by exploiting surface anomalies: the SurFake approach
Andrea Ciamarra
R. Caldelli
Federico Becattini
Lorenzo Seidenari
A. Bimbo
38
14
0
31 Oct 2023
Enabling Acoustic Audience Feedback in Large Virtual Events
Tamay Aykut
M. Hofbauer
Christopher B. Kuhn
Eckehard Steinbach
Bernd Girod
55
0
0
27 Oct 2023
Learning an Inventory Control Policy with General Inventory Arrival Dynamics
Sohrab Andaz
Carson Eisenach
Dhruv Madeka
Kari Torkkola
Randy Jia
Dean Phillips Foster
Sham Kakade
33
2
0
26 Oct 2023
Real-time Neonatal Chest Sound Separation using Deep Learning
Yang Yi Poh
Ethan Grooby
Kenneth Tan
Lindsay Zhou
Arrabella King
Ashwin Ramanathan
Atul Malhotra
Mehrtash Harandi
F. Marzbanrad
33
1
0
26 Oct 2023
Subtle Signals: Video-based Detection of Infant Non-nutritive Sucking as a Neurodevelopmental Cue
Shaotong Zhu
Michael Wan
Sai Kumar Reddy Manne
Emily B. Zimmerman
Sarah Ostadabbas
27
2
0
24 Oct 2023
Synthetic Data as Validation
Qixing Hu
Alan Yuille
Zongwei Zhou
SyDa
OOD
26
8
0
24 Oct 2023
LC-TTFS: Towards Lossless Network Conversion for Spiking Neural Networks with TTFS Coding
Qu Yang
Malu Zhang
Jibin Wu
Kay Chen Tan
Haizhou Li
32
9
0
23 Oct 2023
Mid-Long Term Daily Electricity Consumption Forecasting Based on Piecewise Linear Regression and Dilated Causal CNN
Zhou Lan
Ben Liu
Yi Feng
Danhuang Dong
Peng Zhang
AI4TS
18
1
0
23 Oct 2023
An overview of text-to-speech systems and media applications
Mohammad Reza Hasanabadi
21
3
0
22 Oct 2023
MFCC-GAN Codec: A New AI-based Audio Coding
Mohammad Reza Hasanabadi
21
0
0
22 Oct 2023
Neural Likelihood Approximation for Integer Valued Time Series Data
Luke O'Loughlin
John Maclean
Andrew Black
AI4TS
15
0
0
19 Oct 2023
Physics-informed neural network for acoustic resonance analysis in a one-dimensional acoustic tube
Kazuya Yokota
Takahiko Kurahashi
Masajiro Abe
21
5
0
18 Oct 2023
Leveraging Diverse Semantic-based Audio Pretrained Models for Singing Voice Conversion
Xueyao Zhang
Yicheng Gu
Haopeng Chen
Zihao Fang
Lexiao Zou
Junan Zhang
Liumeng Xue
Jinchao Zhang
Jie Zhou
Zhizheng Wu
DiffM
38
1
0
17 Oct 2023
BeatDance: A Beat-Based Model-Agnostic Contrastive Learning Framework for Music-Dance Retrieval
Kaixing Yang
Xukun Zhou
Xulong Tang
Ran Diao
Hongyan Liu
Jun He
Zhaoxin Fan
37
1
0
16 Oct 2023
MoConVQ: Unified Physics-Based Motion Control via Scalable Discrete Representations
Heyuan Yao
Zhenhua Song
Yuyang Zhou
Tenglong Ao
Baoquan Chen
Libin Liu
28
38
0
16 Oct 2023
Generative Adversarial Training for Text-to-Speech Synthesis Based on Raw Phonetic Input and Explicit Prosody Modelling
Tiberiu Boros
Stefan Daniel Dumitrescu
Ionut Mironica
Radu Chivereanu
GAN
22
1
0
14 Oct 2023
Machine Learning for Urban Air Quality Analytics: A Survey
Jindong Han
Weijiao Zhang
Hao Liu
Hui Xiong
AI4CE
80
12
0
14 Oct 2023
A decoder-only foundation model for time-series forecasting
Abhimanyu Das
Weihao Kong
Rajat Sen
Yichen Zhou
AI4TS
AI4CE
33
199
0
14 Oct 2023
ARM: Refining Multivariate Forecasting with Adaptive Temporal-Contextual Learning
Jiecheng Lu
Xu Han
Shihao Yang
AI4TS
27
2
0
14 Oct 2023