ResearchTrend.AI
WaveNet: A Generative Model for Raw Audio

12 September 2016
Aaron van den Oord
Sander Dieleman
Heiga Zen
Karen Simonyan
Oriol Vinyals
Alex Graves
Nal Kalchbrenner
A. Senior
Koray Kavukcuoglu
    DiffM

Papers citing "WaveNet: A Generative Model for Raw Audio"

50 / 3,039 papers shown
Comparing AutoML and Deep Learning Methods for Condition Monitoring using Realistic Validation Scenarios
P. Goodarzi
A. Schütze
T. Schneider
31
0
0
28 Aug 2023
Meta Attentive Graph Convolutional Recurrent Network for Traffic Forecasting
Adnan Zeb
Yongchao Ye
Shiyao Zhang
James J. Q. Yu
AI4TS
34
0
0
28 Aug 2023
AI-Generated Content (AIGC) for Various Data Modalities: A Survey
Lin Geng Foo
Hossein Rahmani
Jing Liu
78
31
0
27 Aug 2023
Unaligned 2D to 3D Translation with Conditional Vector-Quantized Code Diffusion using Transformers
Abril Corona-Figueroa
Sam Bond-Taylor
Neelanjan Bhowmik
Yona Falinie A. Gaus
T. Breckon
Hubert P. H. Shum
Chris G. Willcocks
DiffM
31
4
0
27 Aug 2023
A Comprehensive Survey for Evaluation Methodologies of AI-Generated Music
Zeyu Xiong
Weitao Wang
Jing Yu
Yue Lin
Ziyan Wang
MGen
38
6
0
26 Aug 2023
Business Metric-Aware Forecasting for Inventory Management
Helen Zhou
Sercan O. Arik
Jingtao Wang
AI4TS
26
4
0
24 Aug 2023
Unified Data Management and Comprehensive Performance Evaluation for Urban Spatial-Temporal Prediction [Experiment, Analysis & Benchmark]
Jiawei Jiang
Chengkai Han
Wayne Xin Zhao
Jingyuan Wang
29
2
0
24 Aug 2023
Sparks of Large Audio Models: A Survey and Outlook
S. Latif
Moazzam Shoukat
Fahad Shamshad
Muhammad Usama
Yi Ren
...
Wenwu Wang
Xulong Zhang
Roberto Togneri
Min Zhang
Björn W. Schuller
LM&MA
AuLLM
40
38
0
24 Aug 2023
A Survey of AI Music Generation Tools and Models
Yueyue Zhu
Jared Baca
Banafsheh Rekabdar
Reza Rawassizadeh
MGen
42
14
0
24 Aug 2023
An Initial Exploration: Learning to Generate Realistic Audio for Silent Video
Matthew Martel
Jack Wagner
VGen
21
0
0
23 Aug 2023
Efficient Transfer Learning in Diffusion Models via Adversarial Noise
Xiyu Wang
Baijiong Lin
Daochang Liu
Chang Xu
DiffM
39
3
0
23 Aug 2023
Example-Based Framework for Perceptually Guided Audio Texture Generation
Purnima Kamath
Chitralekha Gupta
L. Wyse
Suranga Nanayakkara
24
4
0
23 Aug 2023
Complex-valued neural networks for voice anti-spoofing
Nicolas Müller
Philip Sperl
Konstantin Böttinger
33
14
0
22 Aug 2023
MoCLIM: Towards Accurate Cancer Subtyping via Multi-Omics Contrastive Learning with Omics-Inference Modeling
Ziwei Yang
Zhengjun Chen
Yasuko Matsubara
Yasushi Sakurai
29
2
0
17 Aug 2023
AudioFormer: Audio Transformer learns audio feature representations from discrete acoustic codes
Zhaohui Li
Haitao Wang
Xinghua Jiang
42
1
0
14 Aug 2023
Human Voice Pitch Estimation: A Convolutional Network with Auto-Labeled and Synthetic Data
Jérémy Cochoy
27
0
0
14 Aug 2023
iSTFTNet2: Faster and More Lightweight iSTFT-Based Neural Vocoder Using 1D-2D CNN
Takuhiro Kaneko
Hirokazu Kameoka
Kou Tanaka
Shogo Seki
38
4
0
14 Aug 2023
Text-to-Video: a Two-stage Framework for Zero-shot Identity-agnostic Talking-head Generation
Zhichao Wang
M. Dai
Keld Lundgaard
VGen
DiffM
45
2
0
12 Aug 2023
AudioLDM 2: Learning Holistic Audio Generation with Self-supervised Pretraining
Haohe Liu
Yiitan Yuan
Xubo Liu
Xinhao Mei
Qiuqiang Kong
Qiao Tian
Yuping Wang
Wenwu Wang
Yuxuan Wang
Mark D. Plumbley
DiffM
47
224
0
10 Aug 2023
VAST: Vivify Your Talking Avatar via Zero-Shot Expressive Facial Style Transfer
Liyang Chen
Zhiyong Wu
Runnan Li
Weihong Bao
Jun Ling
Xuejiao Tan
Sheng Zhao
29
5
0
09 Aug 2023
JEN-1: Text-Guided Universal Music Generation with Omnidirectional Diffusion Models
Peike Li
Bo-Yu Chen
Yao Yao
Yikai Wang
Allen Wang
Alex Jinpeng Wang
MGen
VLM
DiffM
72
37
0
09 Aug 2023
Weakly Supervised Multi-Task Representation Learning for Human Activity Analysis Using Wearables
Taoran Sheng
Manfred Huber
SSL
HAI
24
20
0
06 Aug 2023
Adversarial Training of Denoising Diffusion Model Using Dual Discriminators for High-Fidelity Multi-Speaker TTS
Myeongji Ko
Yong-Hoon Choi
DiffM
28
1
0
03 Aug 2023
From Discrete Tokens to High-Fidelity Audio Using Multi-Band Diffusion
Robin San Roman
Yossi Adi
Antoine Deleforge
Romain Serizel
Gabriel Synnaeve
Alexandre Défossez
DiffM
27
21
0
02 Aug 2023
Music De-limiter Networks via Sample-wise Gain Inversion
Chang-Bin Jeon
Kyogu Lee
18
1
0
02 Aug 2023
A Novel Temporal Multi-Gate Mixture-of-Experts Approach for Vehicle Trajectory and Driving Intention Prediction
Renteng Yuan
Mohamed Abdel-Aty
Q. Xiang
Zijin Wang
Xin Gu
11
10
0
01 Aug 2023
Generative models for wearables data
Arinbjorn Kolbeinsson
L. Foschini
MedIm
28
0
0
31 Jul 2023
Audio-visual video-to-speech synthesis with synthesized input audio
Triantafyllos Kefalas
Yannis Panagakis
Maja Pantic
VGen
DiffM
38
1
0
31 Jul 2023
DiffProsody: Diffusion-based Latent Prosody Generation for Expressive Speech Synthesis with Prosody Conditional Adversarial Training
H. Oh
Sang-Hoon Lee
Seong-Whan Lee
DiffM
33
14
0
31 Jul 2023
VITS2: Improving Quality and Efficiency of Single-Stage Text-to-Speech with Adversarial Learning and Architecture Design
Jungil Kong
Jihoon Park
Beomjeong Kim
Jeongmin Kim
Dohee Kong
Sangjin Kim
37
37
0
31 Jul 2023
DCTM: Dilated Convolutional Transformer Model for Multimodal Engagement Estimation in Conversation
Vu Ngoc Tu
V. Huynh
Hyung-Jeong Yang
M. Zaheer
Shah Nawaz
Karthik Nandakumar
Soo-Hyung Kim
27
4
0
31 Jul 2023
Minimally-Supervised Speech Synthesis with Conditional Diffusion Model and Language Model: A Comparative Study of Semantic Coding
Chunyu Qiang
Hao Li
Hao Ni
He Qu
Ruibo Fu
Tao Wang
Longbiao Wang
J. Dang
DiffM
30
8
0
28 Jul 2023
Self-Supervised Visual Acoustic Matching
Arjun Somayazulu
Changan Chen
Kristen Grauman
SSL
43
11
0
27 Jul 2023
CQNV: A combination of coarsely quantized bitstream and neural vocoder for low rate speech coding
Youqiang Zheng
Li Xiao
Weiping Tu
Yuhong Yang
Xinmeng Xu
41
6
0
25 Jul 2023
SCRAPS: Speech Contrastive Representations of Acoustic and Phonetic Spaces
Iván Vallés-Pérez
Grzegorz Beringer
Piotr Bilinski
G. Cook
Roberto Barra-Chicote
19
1
0
23 Jul 2023
PartDiff: Image Super-resolution with Partial Diffusion Models
Kai Zhao
A. Hung
Kai-Lin Pang
Haoxin Zheng
Kyunghyun Sung
DiffM
MedIm
25
3
0
21 Jul 2023
Learning minimal representations of stochastic processes with variational autoencoders
Gabriel Fernández-Fernández
Carlo Manzo
M. Lewenstein
A. Dauphin
Gorka Muñoz-Gil
DiffM
33
4
0
21 Jul 2023
Progressive distillation diffusion for raw music generation
Svetlana Pavlova
DiffM
28
0
0
20 Jul 2023
SC VALL-E: Style-Controllable Zero-Shot Text to Speech Synthesizer
Daegyeom Kim
Seong-soo Hong
Yong-Hoon Choi
25
2
0
20 Jul 2023
TUNeS: A Temporal U-Net with Self-Attention for Video-based Surgical Phase Recognition
Isabel Funke
Dominik Rivoir
Stefanie Krell
Stefanie Speidel
31
3
0
19 Jul 2023
Synthetic Lagrangian Turbulence by Generative Diffusion Models
Tianyi Li
Luca Biferale
F. Bonaccorso
M. A. Scarpolini
M. Buzzicotti
DiffM
44
33
0
17 Jul 2023
GBT: Two-stage transformer framework for non-stationary time series forecasting
Li Shen
Yuning Wei
Yangzhu Wang
AI4TS
27
23
0
17 Jul 2023
Complexity Matters: Rethinking the Latent Space for Generative Modeling
Tianyang Hu
Fei Chen
Hong Wang
Jiawei Li
Wei Cao
Jiacheng Sun
Zechao Li
DiffM
35
8
0
17 Jul 2023
Is Imitation All You Need? Generalized Decision-Making with Dual-Phase Training
Yao Wei
Yanchao Sun
Ruijie Zheng
Sai H. Vemprala
Rogerio Bonatti
Shuhang Chen
Ratnesh Madaan
Zhongjie Ba
Ashish Kapoor
Shuang Ma
OffRL
32
15
0
16 Jul 2023
Benchmarks and Custom Package for Electrical Load Forecasting
Zhixian Wang
Qingsong Wen
Chaoli Zhang
Liang Sun
Leandro Von Krannichfeldt
Yi Wang
AI4TS
36
3
0
14 Jul 2023
Tapestry of Time and Actions: Modeling Human Activity Sequences using Temporal Point Process Flows
Vinayak Gupta
Srikanta J. Bedathur
AI4TS
35
1
0
13 Jul 2023
Uncovering the Deceptions: An Analysis on Audio Spoofing Detection and Future Prospects
Rishabh Ranjan
Mayank Vatsa
Richa Singh
26
4
0
13 Jul 2023
SnakeSynth: New Interactions for Generative Audio Synthesis
Eric Easthope
43
0
0
11 Jul 2023
DE-TGN: Uncertainty-Aware Human Motion Forecasting using Deep Ensembles
Kareem A. Eltouny
Wansong Liu
Sibo Tian
Minghui Zheng
Xiao Liang
3DH
31
7
0
07 Jul 2023
Encoder-Decoder Networks for Self-Supervised Pretraining and Downstream Signal Bandwidth Regression on Digital Antenna Arrays
R. Bhattacharjea
Nathan E. West
SSL
23
1
0
06 Jul 2023