1609.03499
Cited By
WaveNet: A Generative Model for Raw Audio
12 September 2016
Aaron van den Oord
Sander Dieleman
Heiga Zen
Karen Simonyan
Oriol Vinyals
Alex Graves
Nal Kalchbrenner
A. Senior
Koray Kavukcuoglu
DiffM
Papers citing "WaveNet: A Generative Model for Raw Audio"
50 / 3,039 papers shown
Comparing AutoML and Deep Learning Methods for Condition Monitoring using Realistic Validation Scenarios
P. Goodarzi
A. Schütze
T. Schneider
31
0
0
28 Aug 2023
Meta Attentive Graph Convolutional Recurrent Network for Traffic Forecasting
Adnan Zeb
Yongchao Ye
Shiyao Zhang
James J. Q. Yu
AI4TS
34
0
0
28 Aug 2023
AI-Generated Content (AIGC) for Various Data Modalities: A Survey
Lin Geng Foo
Hossein Rahmani
Jing Liu
78
31
0
27 Aug 2023
Unaligned 2D to 3D Translation with Conditional Vector-Quantized Code Diffusion using Transformers
Abril Corona-Figueroa
Sam Bond-Taylor
Neelanjan Bhowmik
Yona Falinie A. Gaus
T. Breckon
Hubert P. H. Shum
Chris G. Willcocks
DiffM
31
4
0
27 Aug 2023
A Comprehensive Survey for Evaluation Methodologies of AI-Generated Music
Zeyu Xiong
Weitao Wang
Jing Yu
Yue Lin
Ziyan Wang
MGen
38
6
0
26 Aug 2023
Business Metric-Aware Forecasting for Inventory Management
Helen Zhou
Sercan O. Arik
Jingtao Wang
AI4TS
26
4
0
24 Aug 2023
Unified Data Management and Comprehensive Performance Evaluation for Urban Spatial-Temporal Prediction [Experiment, Analysis & Benchmark]
Jiawei Jiang
Chengkai Han
Wayne Xin Zhao
Jingyuan Wang
29
2
0
24 Aug 2023
Sparks of Large Audio Models: A Survey and Outlook
S. Latif
Moazzam Shoukat
Fahad Shamshad
Muhammad Usama
Yi Ren
...
Wenwu Wang
Xulong Zhang
Roberto Togneri
Min Zhang
Björn W. Schuller
LM&MA
AuLLM
40
38
0
24 Aug 2023
A Survey of AI Music Generation Tools and Models
Yueyue Zhu
Jared Baca
Banafsheh Rekabdar
Reza Rawassizadeh
MGen
42
14
0
24 Aug 2023
An Initial Exploration: Learning to Generate Realistic Audio for Silent Video
Matthew Martel
Jack Wagner
VGen
21
0
0
23 Aug 2023
Efficient Transfer Learning in Diffusion Models via Adversarial Noise
Xiyu Wang
Baijiong Lin
Daochang Liu
Chang Xu
DiffM
39
3
0
23 Aug 2023
Example-Based Framework for Perceptually Guided Audio Texture Generation
Purnima Kamath
Chitralekha Gupta
L. Wyse
Suranga Nanayakkara
24
4
0
23 Aug 2023
Complex-valued neural networks for voice anti-spoofing
Nicolas Müller
Philip Sperl
Konstantin Böttinger
33
14
0
22 Aug 2023
MoCLIM: Towards Accurate Cancer Subtyping via Multi-Omics Contrastive Learning with Omics-Inference Modeling
Ziwei Yang
Zhengjun Chen
Yasuko Matsubara
Yasushi Sakurai
29
2
0
17 Aug 2023
AudioFormer: Audio Transformer learns audio feature representations from discrete acoustic codes
Zhaohui Li
Haitao Wang
Xinghua Jiang
42
1
0
14 Aug 2023
Human Voice Pitch Estimation: A Convolutional Network with Auto-Labeled and Synthetic Data
Jérémy Cochoy
27
0
0
14 Aug 2023
iSTFTNet2: Faster and More Lightweight iSTFT-Based Neural Vocoder Using 1D-2D CNN
Takuhiro Kaneko
Hirokazu Kameoka
Kou Tanaka
Shogo Seki
38
4
0
14 Aug 2023
Text-to-Video: a Two-stage Framework for Zero-shot Identity-agnostic Talking-head Generation
Zhichao Wang
M. Dai
Keld Lundgaard
VGen
DiffM
45
2
0
12 Aug 2023
AudioLDM 2: Learning Holistic Audio Generation with Self-supervised Pretraining
Haohe Liu
Yiitan Yuan
Xubo Liu
Xinhao Mei
Qiuqiang Kong
Qiao Tian
Yuping Wang
Wenwu Wang
Yuxuan Wang
Mark D. Plumbley
DiffM
47
224
0
10 Aug 2023
VAST: Vivify Your Talking Avatar via Zero-Shot Expressive Facial Style Transfer
Liyang Chen
Zhiyong Wu
Runnan Li
Weihong Bao
Jun Ling
Xuejiao Tan
Sheng Zhao
29
5
0
09 Aug 2023
JEN-1: Text-Guided Universal Music Generation with Omnidirectional Diffusion Models
Peike Li
Bo-Yu Chen
Yao Yao
Yikai Wang
Allen Wang
Alex Jinpeng Wang
MGen
VLM
DiffM
72
37
0
09 Aug 2023
Weakly Supervised Multi-Task Representation Learning for Human Activity Analysis Using Wearables
Taoran Sheng
Manfred Huber
SSL
HAI
24
20
0
06 Aug 2023
Adversarial Training of Denoising Diffusion Model Using Dual Discriminators for High-Fidelity Multi-Speaker TTS
Myeongji Ko
Yong-Hoon Choi
DiffM
28
1
0
03 Aug 2023
From Discrete Tokens to High-Fidelity Audio Using Multi-Band Diffusion
Robin San Roman
Yossi Adi
Antoine Deleforge
Romain Serizel
Gabriel Synnaeve
Alexandre Défossez
DiffM
27
21
0
02 Aug 2023
Music De-limiter Networks via Sample-wise Gain Inversion
Chang-Bin Jeon
Kyogu Lee
18
1
0
02 Aug 2023
A Novel Temporal Multi-Gate Mixture-of-Experts Approach for Vehicle Trajectory and Driving Intention Prediction
Renteng Yuan
Mohamed Abdel-Aty
Q. Xiang
Zijin Wang
Xin Gu
11
10
0
01 Aug 2023
Generative models for wearables data
Arinbjorn Kolbeinsson
L. Foschini
MedIm
28
0
0
31 Jul 2023
Audio-visual video-to-speech synthesis with synthesized input audio
Triantafyllos Kefalas
Yannis Panagakis
Maja Pantic
VGen
DiffM
38
1
0
31 Jul 2023
DiffProsody: Diffusion-based Latent Prosody Generation for Expressive Speech Synthesis with Prosody Conditional Adversarial Training
H. Oh
Sang-Hoon Lee
Seong-Whan Lee
DiffM
33
14
0
31 Jul 2023
VITS2: Improving Quality and Efficiency of Single-Stage Text-to-Speech with Adversarial Learning and Architecture Design
Jungil Kong
Jihoon Park
Beomjeong Kim
Jeongmin Kim
Dohee Kong
Sangjin Kim
37
37
0
31 Jul 2023
DCTM: Dilated Convolutional Transformer Model for Multimodal Engagement Estimation in Conversation
Vu Ngoc Tu
V. Huynh
Hyung-Jeong Yang
M. Zaheer
Shah Nawaz
Karthik Nandakumar
Soo-Hyung Kim
27
4
0
31 Jul 2023
Minimally-Supervised Speech Synthesis with Conditional Diffusion Model and Language Model: A Comparative Study of Semantic Coding
Chunyu Qiang
Hao Li
Hao Ni
He Qu
Ruibo Fu
Tao Wang
Longbiao Wang
J. Dang
DiffM
30
8
0
28 Jul 2023
Self-Supervised Visual Acoustic Matching
Arjun Somayazulu
Changan Chen
Kristen Grauman
SSL
43
11
0
27 Jul 2023
CQNV: A combination of coarsely quantized bitstream and neural vocoder for low rate speech coding
Youqiang Zheng
Li Xiao
Weiping Tu
Yuhong Yang
Xinmeng Xu
41
6
0
25 Jul 2023
SCRAPS: Speech Contrastive Representations of Acoustic and Phonetic Spaces
Iván Vallés-Pérez
Grzegorz Beringer
Piotr Bilinski
G. Cook
Roberto Barra-Chicote
19
1
0
23 Jul 2023
PartDiff: Image Super-resolution with Partial Diffusion Models
Kai Zhao
A. Hung
Kai-Lin Pang
Haoxin Zheng
Kyunghyun Sung
DiffM
MedIm
25
3
0
21 Jul 2023
Learning minimal representations of stochastic processes with variational autoencoders
Gabriel Fernández-Fernández
Carlo Manzo
M. Lewenstein
A. Dauphin
Gorka Muñoz-Gil
DiffM
33
4
0
21 Jul 2023
Progressive distillation diffusion for raw music generation
Svetlana Pavlova
DiffM
28
0
0
20 Jul 2023
SC VALL-E: Style-Controllable Zero-Shot Text to Speech Synthesizer
Daegyeom Kim
Seong-soo Hong
Yong-Hoon Choi
25
2
0
20 Jul 2023
TUNeS: A Temporal U-Net with Self-Attention for Video-based Surgical Phase Recognition
Isabel Funke
Dominik Rivoir
Stefanie Krell
Stefanie Speidel
31
3
0
19 Jul 2023
Synthetic Lagrangian Turbulence by Generative Diffusion Models
Tianyi Li
Luca Biferale
F. Bonaccorso
M. A. Scarpolini
M. Buzzicotti
DiffM
44
33
0
17 Jul 2023
GBT: Two-stage transformer framework for non-stationary time series forecasting
Li Shen
Yuning Wei
Yangzhu Wang
AI4TS
27
23
0
17 Jul 2023
Complexity Matters: Rethinking the Latent Space for Generative Modeling
Tianyang Hu
Fei Chen
Hong Wang
Jiawei Li
Wei Cao
Jiacheng Sun
Zechao Li
DiffM
35
8
0
17 Jul 2023
Is Imitation All You Need? Generalized Decision-Making with Dual-Phase Training
Yao Wei
Yanchao Sun
Ruijie Zheng
Sai H. Vemprala
Rogerio Bonatti
Shuhang Chen
Ratnesh Madaan
Zhongjie Ba
Ashish Kapoor
Shuang Ma
OffRL
32
15
0
16 Jul 2023
Benchmarks and Custom Package for Electrical Load Forecasting
Zhixian Wang
Qingsong Wen
Chaoli Zhang
Liang Sun
Leandro Von Krannichfeldt
Yi Wang
AI4TS
36
3
0
14 Jul 2023
Tapestry of Time and Actions: Modeling Human Activity Sequences using Temporal Point Process Flows
Vinayak Gupta
Srikanta J. Bedathur
AI4TS
35
1
0
13 Jul 2023
Uncovering the Deceptions: An Analysis on Audio Spoofing Detection and Future Prospects
Rishabh Ranjan
Mayank Vatsa
Richa Singh
26
4
0
13 Jul 2023
SnakeSynth: New Interactions for Generative Audio Synthesis
Eric Easthope
43
0
0
11 Jul 2023
DE-TGN: Uncertainty-Aware Human Motion Forecasting using Deep Ensembles
Kareem A. Eltouny
Wansong Liu
Sibo Tian
Minghui Zheng
Xiao Liang
3DH
31
7
0
07 Jul 2023
Encoder-Decoder Networks for Self-Supervised Pretraining and Downstream Signal Bandwidth Regression on Digital Antenna Arrays
R. Bhattacharjea
Nathan E. West
SSL
23
1
0
06 Jul 2023