1609.03499
WaveNet: A Generative Model for Raw Audio
12 September 2016
Aaron van den Oord
Sander Dieleman
Heiga Zen
Karen Simonyan
Oriol Vinyals
Alex Graves
Nal Kalchbrenner
A. Senior
Koray Kavukcuoglu
DiffM
Papers citing
"WaveNet: A Generative Model for Raw Audio"
50 / 3,082 papers shown
Title
NAS-FM: Neural Architecture Search for Tunable and Interpretable Sound Synthesis based on Frequency Modulation
Zhe Ye
Wei Xue
Xuejiao Tan
Qi-fei Liu
Yi-Ting Guo
79
2
0
22 May 2023
ViT-TTS: Visual Text-to-Speech with Scalable Diffusion Transformer
Huadai Liu
Rongjie Huang
Xuan Lin
Wenqiang Xu
Maozong Zheng
Hong Chen
Jinzheng He
Zhou Zhao
DiffM
127
20
0
22 May 2023
VAKTA-SETU: A Speech-to-Speech Machine Translation Service in Select Indic Languages
Shivam Mhaskar
Vineet Bhat
Akshay Batheja
S. Deoghare
Paramveer Choudhary
P. Bhattacharyya
76
5
0
21 May 2023
Exploring How Generative Adversarial Networks Learn Phonological Representations
Jing Chen
Micha Elsner
GAN
63
4
0
21 May 2023
Laughter Synthesis using Pseudo Phonetic Tokens with a Large-scale In-the-wild Laughter Corpus
Detai Xin
Shinnosuke Takamichi
Ai Morimatsu
Hiroshi Saruwatari
66
10
0
21 May 2023
North Sámi Dialect Identification with Self-supervised Speech Models
Sofoklis Kakouros
Katri Hiovain-Asikainen
55
5
0
19 May 2023
mdctGAN: Taming transformer-based GAN for speech super-resolution with Modified DCT spectra
Chenhao Shuai
Chaohua Shi
Lu Gan
Hongqing Liu
69
8
0
18 May 2023
FastFit: Towards Real-Time Iterative Neural Vocoder by Replacing U-Net Encoder With Multiple STFTs
Won Jang
D. Lim
Heayoung Park
88
1
0
18 May 2023
RMSSinger: Realistic-Music-Score based Singing Voice Synthesis
Jinzheng He
Jinglin Liu
Zhenhui Ye
Rongjie Huang
Chenye Cui
Huadai Liu
Zhou Zhao
DiffM
136
20
0
18 May 2023
Evaluation of self-supervised pre-training for automatic infant movement classification using wearable movement sensors
Einari Vaaras
Manu Airaksinen
S. Vanhatalo
Okko Räsänen
102
4
0
16 May 2023
LoViT: Long Video Transformer for Surgical Phase Recognition
Yang Liu
Maxence Boels
Luis C. Garcia-Peraza-Herrera
Tom Vercauteren
P. Dasgupta
Alejandro Granados
Sebastien Ourselin
127
35
0
15 May 2023
Smart Home Energy Management: VAE-GAN synthetic dataset generator and Q-learning
Mina Razghandi
Hao Zhou
Melike Erol-Kantarci
D. Turgut
63
27
0
14 May 2023
APNet: An All-Frame-Level Neural Vocoder Incorporating Direct Prediction of Amplitude and Phase Spectra
Yang Ai
Zhenhua Ling
101
14
0
13 May 2023
Using Deepfake Technologies for Word Emphasis Detection
Eran Kaufman
Lee-Ad Gottlieb
66
0
0
12 May 2023
MEGABYTE: Predicting Million-byte Sequences with Multiscale Transformers
L. Yu
Daniel Simig
Colin Flaherty
Armen Aghajanyan
Luke Zettlemoyer
M. Lewis
116
93
0
12 May 2023
CoMoSpeech: One-Step Speech and Singing Voice Synthesis via Consistency Model
Zhe Ye
Wei Xue
Xuejiao Tan
Jie Chen
Qi-fei Liu
Yi-Ting Guo
DiffM
95
46
0
11 May 2023
V2Meow: Meowing to the Visual Beat via Video-to-Music Generation
Kun Su
Judith Yue Li
Qingqing Huang
Dima Kuzmin
Joonseok Lee
...
Fei Sha
A. Jansen
Yu Wang
Mauro Verzetti
Timo I. Denk
VGen
86
14
0
11 May 2023
Message Passing Neural Networks for Traffic Forecasting
Arian Prabowo
Hao Xue
Wei Shao
Piotr Koniusz
Flora D. Salim
GNN
58
6
0
09 May 2023
VSMask: Defending Against Voice Synthesis Attack via Real-Time Predictive Perturbation
Yuanda Wang
Hanqing Guo
Guangjing Wang
Bocheng Chen
Qiben Yan
AAML
60
18
0
09 May 2023
Learn to Sing by Listening: Building Controllable Virtual Singer by Unsupervised Learning from Voice Recordings
Wei Xue
Yiwen Wang
Qi-fei Liu
Yi-Ting Guo
77
1
0
09 May 2023
Traffic Forecasting on New Roads Using Spatial Contrastive Pre-Training (SCPT)
Arian Prabowo
Hao Xue
Wei Shao
Piotr Koniusz
Flora D. Salim
AI4TS
94
14
0
09 May 2023
Joint Multi-scale Cross-lingual Speaking Style Transfer with Bidirectional Attention Mechanism for Automatic Dubbing
Jingbei Li
Sipan Li
Ping Chen
Lu Zhang
Yi Meng
Zhiyong Wu
Helen Meng
Qiao Tian
Yuping Wang
Yuxuan Wang
79
3
0
09 May 2023
Transformer Working Memory Enables Regular Language Reasoning and Natural Language Length Extrapolation
Ta-Chung Chi
Ting-Han Fan
Alexander I. Rudnicky
Peter J. Ramadge
LRM
75
13
0
05 May 2023
Glitch in the Matrix: A Large Scale Benchmark for Content Driven Audio-Visual Forgery Detection and Localization
Théophile Cabannes
Shreya Ghosh
Raphaël Marinier
Tom Gedeon
Alexandre M. Bayen
Munawar Hayat
159
29
0
03 May 2023
HappyQuokka System for ICASSP 2023 Auditory EEG Challenge
Zhenyu Piao
Miseul Kim
Hyungchan Yoon
Hong-Goo Kang
27
7
0
03 May 2023
Cheap and Deterministic Inference for Deep State-Space Models of Interacting Dynamical Systems
Andreas Look
M. Kandemir
Barbara Rakitsch
Jan Peters
BDL
65
6
0
02 May 2023
Sequence Modeling with Multiresolution Convolutional Memory
Jiaxin Shi
Ke Alexander Wang
E. Fox
104
14
0
02 May 2023
Long-Term Rhythmic Video Soundtracker
Jiashuo Yu
Yaohui Wang
Xinyuan Chen
Xiao Sun
Yu Qiao
DiffM
105
13
0
02 May 2023
Diffusion Models for Time Series Applications: A Survey
Lequan Lin
Zhengkun Li
Ruikun Li
Xuliang Li
Junbin Gao
MedIm
DiffM
111
71
0
01 May 2023
Deep Transfer Learning for Automatic Speech Recognition: Towards Better Generalization
Hamza Kheddar
Yassine Himeur
S. Al-Maadeed
Abbes Amira
F. Bensaali
148
85
0
27 Apr 2023
TR0N: Translator Networks for 0-Shot Plug-and-Play Conditional Generation
Zhaoyan Liu
Noël Vouitsis
S. Gorti
Jimmy Ba
Gabriel Loaiza-Ganem
ViT
73
1
0
26 Apr 2023
Source-Filter-Based Generative Adversarial Neural Vocoder for High Fidelity Speech Synthesis
Ye-Xin Lu
Yang Ai
Zhenhua Ling
105
1
0
26 Apr 2023
Lane Change Intention Recognition and Vehicle Status Prediction for Autonomous Vehicles
Renteng Yuan
Mohamed Abdel-Aty
Xin Gu
Ou Zheng
Q. Xiang
52
2
0
25 Apr 2023
AI-Synthesized Voice Detection Using Neural Vocoder Artifacts
Chengzhe Sun
Shan Jia
Shuwei Hou
Siwei Lyu
72
45
0
25 Apr 2023
A Two-part Transformer Network for Controllable Motion Synthesis
Shuaiying Hou
Hongyu Tao
Hujun Bao
Weiwei Xu
ViT
78
6
0
25 Apr 2023
A Cookbook of Self-Supervised Learning
Randall Balestriero
Mark Ibrahim
Vlad Sobal
Ari S. Morcos
Shashank Shekhar
...
Pierre Fernandez
Amir Bar
Hamed Pirsiavash
Yann LeCun
Micah Goldblum
SyDa
FedML
SSL
161
284
0
24 Apr 2023
Restoring Original Signal From Pile-up Signal using Deep Learning
C. H. Kim
S. Ahn
K. Y. Chae
J. Hooker
G. Rogachev
18
1
0
24 Apr 2023
Text-to-Audio Generation using Instruction-Tuned LLM and Latent Diffusion Model
Deepanway Ghosal
Navonil Majumder
Ambuj Mehrish
Soujanya Poria
234
152
0
24 Apr 2023
SAR: Self-Supervised Anti-Distortion Representation for End-To-End Speech Model
Jianzong Wang
Xulong Zhang
Haobin Tang
Aolan Sun
Ning Cheng
Jing Xiao
129
1
0
23 Apr 2023
Affective social anthropomorphic intelligent system
Md. Adyelullahil Mamun
Hasnat Md. Abdullah
Md. Golam Rabiul Alam
Muhammad Mehedi Hassan
Md. Zia Uddin
52
1
0
19 Apr 2023
NaturalSpeech 2: Latent Diffusion Models are Natural and Zero-Shot Speech and Singing Synthesizers
Kai Shen
Zeqian Ju
Xu Tan
Yanqing Liu
Yichong Leng
Lei He
Tao Qin
Sheng Zhao
Jiang Bian
DiffM
117
247
0
18 Apr 2023
A Deep Learning Framework for Traffic Data Imputation Considering Spatiotemporal Dependencies
Li Jiang
Ting Zhang
Qiruyi Zuo
Chenyu Tian
George P. Chan
Wai Kin Victor Chan
AI4TS
16
2
0
18 Apr 2023
HGWaveNet: A Hyperbolic Graph Neural Network for Temporal Link Prediction
Qijie Bai
Chang Nie
Haiwei Zhang
Dongming Zhao
Xiaojie Yuan
75
21
0
14 Apr 2023
Dynamic Graph Representation Learning with Neural Networks: A Survey
Leshanshui Yang
Sébastien Adam
Clément Chatelain
AI4TS
AI4CE
83
19
0
12 Apr 2023
ChiroDiff: Modelling chirographic data with Diffusion Models
Ayan Das
Yongxin Yang
Timothy M. Hospedales
Tao Xiang
Yi-Zhe Song
DiffM
110
10
0
07 Apr 2023
ArmanTTS single-speaker Persian dataset
Mohammd Hasan Shamgholi
Vahid Saeedi
J. Peymanfard
Leila Alhabib
Hossein Zeinali
48
2
0
07 Apr 2023
One Small Step for Generative AI, One Giant Leap for AGI: A Complete Survey on ChatGPT in AIGC Era
Chaoning Zhang
Chenshuang Zhang
Chenghao Li
Yu Qiao
Sheng Zheng
...
Sung-Ho Bae
Lik-Hang Lee
Pan Hui
In So Kweon
Choong Seon Hong
LM&MA
AI4MH
LRM
ELM
106
137
0
04 Apr 2023
AUDIT: Audio Editing by Following Instructions with Latent Diffusion Models
Yuancheng Wang
Zeqian Ju
Xuejiao Tan
Lei He
Zhizheng Wu
Jiang Bian
Sheng Zhao
DiffM
154
55
0
03 Apr 2023
Streaming Video Model
Yucheng Zhao
Chong Luo
Chuanxin Tang
DongDong Chen
Noel Codella
Zhengjun Zha
86
13
0
30 Mar 2023
Implicit Diffusion Models for Continuous Super-Resolution
Sicheng Gao
Xuhui Liu
Bo-Wen Zeng
Sheng Xu
Yanjing Li
Xiaonan Luo
Jianzhuang Liu
Xiantong Zhen
Baochang Zhang
DiffM
111
232
0
29 Mar 2023