1609.03499
Cited By
WaveNet: A Generative Model for Raw Audio
12 September 2016
Aaron van den Oord
Sander Dieleman
Heiga Zen
Karen Simonyan
Oriol Vinyals
Alex Graves
Nal Kalchbrenner
A. Senior
Koray Kavukcuoglu
DiffM
Papers citing
"WaveNet: A Generative Model for Raw Audio"
50 / 3,039 papers shown
Title
Training Transitive and Commutative Multimodal Transformers with LoReTTa
Manuel Tran
Yashin Dicente Cid
Amal Lahiani
Fabian J. Theis
Tingying Peng
Eldad Klaiman
26
2
0
23 May 2023
FluentSpeech: Stutter-Oriented Automatic Speech Editing with Context-Aware Diffusion Models
Ziyue Jiang
Qiang Yang
Jia-li Zuo
Zhe Ye
Rongjie Huang
Yixiang Ren
Zhou Zhao
DiffM
70
14
0
23 May 2023
Handling Label Uncertainty on the Example of Automatic Detection of Shepherd's Crook RCA in Coronary CT Angiography
Felix Denzinger
M. Wels
O. Taubmann
Florian Kordon
Fabian Wagner
...
F. André
S. Buss
Johannes Görich
M. Sühling
Andreas Maier
20
0
0
22 May 2023
RWKV: Reinventing RNNs for the Transformer Era
Bo Peng
Eric Alcaide
Quentin G. Anthony
Alon Albalak
Samuel Arcadinho
...
Qihang Zhao
P. Zhou
Qinghua Zhou
Jian Zhu
Rui-Jie Zhu
97
562
0
22 May 2023
Towards generalizing deep-audio fake detection networks
Konstantin Gasenzer
Moritz Wolter
36
4
0
22 May 2023
Forecasting Irregularly Sampled Time Series using Graphs
Vijaya Krishna Yalavarthi
Kiran Madusudanan
Randolf Scholz
Nourhan Ahmed
Johannes Burchert
Shayan Jawed
Stefan Born
Lars Schmidt-Thieme
AI4TS
24
2
0
22 May 2023
NAS-FM: Neural Architecture Search for Tunable and Interpretable Sound Synthesis based on Frequency Modulation
Zhe Ye
Wei Xue
Xuejiao Tan
Qi-fei Liu
Yi-Ting Guo
26
2
0
22 May 2023
ViT-TTS: Visual Text-to-Speech with Scalable Diffusion Transformer
Huadai Liu
Rongjie Huang
Xuan Lin
Wenqiang Xu
Maozong Zheng
Hong Chen
Jinzheng He
Zhou Zhao
DiffM
55
20
0
22 May 2023
VAKTA-SETU: A Speech-to-Speech Machine Translation Service in Select Indic Languages
Shivam Mhaskar
Vineet Bhat
Akshay Batheja
S. Deoghare
Paramveer Choudhary
P. Bhattacharyya
53
4
0
21 May 2023
Exploring How Generative Adversarial Networks Learn Phonological Representations
Jing Chen
Micha Elsner
GAN
21
3
0
21 May 2023
Laughter Synthesis using Pseudo Phonetic Tokens with a Large-scale In-the-wild Laughter Corpus
Detai Xin
Shinnosuke Takamichi
Ai Morimatsu
Hiroshi Saruwatari
24
10
0
21 May 2023
North Sámi Dialect Identification with Self-supervised Speech Models
Sofoklis Kakouros
Katri Hiovain-Asikainen
36
4
0
19 May 2023
mdctGAN: Taming transformer-based GAN for speech super-resolution with Modified DCT spectra
Chenhao Shuai
Chaohua Shi
Lu Gan
Hongqing Liu
33
8
0
18 May 2023
FastFit: Towards Real-Time Iterative Neural Vocoder by Replacing U-Net Encoder With Multiple STFTs
Won Jang
D. Lim
Heayoung Park
34
1
0
18 May 2023
RMSSinger: Realistic-Music-Score based Singing Voice Synthesis
Jinzheng He
Jinglin Liu
Zhenhui Ye
Rongjie Huang
Chenye Cui
Huadai Liu
Zhou Zhao
DiffM
22
19
0
18 May 2023
Evaluation of self-supervised pre-training for automatic infant movement classification using wearable movement sensors
Einari Vaaras
Manu Airaksinen
S. Vanhatalo
Okko Rasanen
35
4
0
16 May 2023
LoViT: Long Video Transformer for Surgical Phase Recognition
Yang Liu
Maxence Boels
Luis C. García-Peraza-Herrera
Tom Kamiel Magda Vercauteren
P. Dasgupta
Alejandro Granados
Sebastien Ourselin
60
31
0
15 May 2023
Smart Home Energy Management: VAE-GAN synthetic dataset generator and Q-learning
Mina Razghandi
Hao Zhou
Melike Erol-Kantarci
D. Turgut
37
22
0
14 May 2023
APNet: An All-Frame-Level Neural Vocoder Incorporating Direct Prediction of Amplitude and Phase Spectra
Yang Ai
Zhenhua Ling
39
13
0
13 May 2023
Using Deepfake Technologies for Word Emphasis Detection
Eran Kaufman
Lee-Ad Gottlieb
35
0
0
12 May 2023
MEGABYTE: Predicting Million-byte Sequences with Multiscale Transformers
L. Yu
Daniel Simig
Colin Flaherty
Armen Aghajanyan
Luke Zettlemoyer
M. Lewis
32
84
0
12 May 2023
CoMoSpeech: One-Step Speech and Singing Voice Synthesis via Consistency Model
Zhe Ye
Wei Xue
Xuejiao Tan
Jie Chen
Qi-fei Liu
Yi-Ting Guo
DiffM
32
40
0
11 May 2023
V2Meow: Meowing to the Visual Beat via Video-to-Music Generation
Kun Su
Judith Yue Li
Qingqing Huang
Dima Kuzmin
Joonseok Lee
...
Fei Sha
A. Jansen
Yu Wang
Mauro Verzetti
Timo I. Denk
VGen
41
12
0
11 May 2023
Message Passing Neural Networks for Traffic Forecasting
Arian Prabowo
Hao Xue
Wei Shao
Piotr Koniusz
Flora D. Salim
GNN
30
6
0
09 May 2023
VSMask: Defending Against Voice Synthesis Attack via Real-Time Predictive Perturbation
Yuanda Wang
Hanqing Guo
Guangjing Wang
Bocheng Chen
Qiben Yan
AAML
35
17
0
09 May 2023
Learn to Sing by Listening: Building Controllable Virtual Singer by Unsupervised Learning from Voice Recordings
Wei Xue
Yiwen Wang
Qi-fei Liu
Yi-Ting Guo
44
1
0
09 May 2023
Traffic Forecasting on New Roads Using Spatial Contrastive Pre-Training (SCPT)
Arian Prabowo
Hao Xue
Wei Shao
Piotr Koniusz
Flora D. Salim
AI4TS
32
13
0
09 May 2023
Joint Multi-scale Cross-lingual Speaking Style Transfer with Bidirectional Attention Mechanism for Automatic Dubbing
Jingbei Li
Sipan Li
Ping Chen
Lu Zhang
Yi Meng
Zhiyong Wu
Helen Meng
Qiao Tian
Yuping Wang
Yuxuan Wang
40
3
0
09 May 2023
Transformer Working Memory Enables Regular Language Reasoning and Natural Language Length Extrapolation
Ta-Chung Chi
Ting-Han Fan
Alexander I. Rudnicky
Peter J. Ramadge
LRM
14
13
0
05 May 2023
Glitch in the Matrix: A Large Scale Benchmark for Content Driven Audio-Visual Forgery Detection and Localization
Théophile Cabannes
Shreya Ghosh
Raphaël Marinier
Tom Gedeon
Alexandre M. Bayen
Munawar Hayat
86
22
0
03 May 2023
HappyQuokka System for ICASSP 2023 Auditory EEG Challenge
Zhenyu Piao
Miseul Kim
Hyungchan Yoon
Hong-Goo Kang
22
6
0
03 May 2023
Cheap and Deterministic Inference for Deep State-Space Models of Interacting Dynamical Systems
Andreas Look
M. Kandemir
Barbara Rakitsch
Jan Peters
BDL
38
6
0
02 May 2023
Sequence Modeling with Multiresolution Convolutional Memory
Jiaxin Shi
Ke Alexander Wang
E. Fox
47
13
0
02 May 2023
Long-Term Rhythmic Video Soundtracker
Jiashuo Yu
Yaohui Wang
Xinyuan Chen
Xiao Sun
Yu Qiao
DiffM
69
14
0
02 May 2023
Diffusion Models for Time Series Applications: A Survey
Lequan Lin
Zhengkun Li
Ruikun Li
Xuliang Li
Junbin Gao
MedIm
DiffM
39
65
0
01 May 2023
Deep Transfer Learning for Automatic Speech Recognition: Towards Better Generalization
Hamza Kheddar
Yassine Himeur
S. Al-Maadeed
Abbes Amira
F. Bensaali
52
76
0
27 Apr 2023
TR0N: Translator Networks for 0-Shot Plug-and-Play Conditional Generation
Zhaoyan Liu
Noël Vouitsis
S. Gorti
Jimmy Ba
Gabriel Loaiza-Ganem
ViT
35
1
0
26 Apr 2023
Source-Filter-Based Generative Adversarial Neural Vocoder for High Fidelity Speech Synthesis
Ye-Xin Lu
Yang Ai
Zhenhua Ling
24
1
0
26 Apr 2023
Lane Change Intention Recognition and Vehicle Status Prediction for Autonomous Vehicles
Renteng Yuan
Mohamed Abdel-Aty
Xin Gu
Ou Zheng
Q. Xiang
16
2
0
25 Apr 2023
AI-Synthesized Voice Detection Using Neural Vocoder Artifacts
Chengzhe Sun
Shan Jia
Shuwei Hou
Siwei Lyu
38
40
0
25 Apr 2023
A Two-part Transformer Network for Controllable Motion Synthesis
Shuaiying Hou
Hongyu Tao
Hujun Bao
Weiwei Xu
ViT
39
6
0
25 Apr 2023
A Cookbook of Self-Supervised Learning
Randall Balestriero
Mark Ibrahim
Vlad Sobal
Ari S. Morcos
Shashank Shekhar
...
Pierre Fernandez
Amir Bar
Hamed Pirsiavash
Yann LeCun
Micah Goldblum
SyDa
FedML
SSL
50
275
0
24 Apr 2023
Restoring Original Signal From Pile-up Signal using Deep Learning
C. H. Kim
S. Ahn
K. Y. Chae
J. Hooker
G. Rogachev
11
1
0
24 Apr 2023
Text-to-Audio Generation using Instruction-Tuned LLM and Latent Diffusion Model
Deepanway Ghosal
Navonil Majumder
Ambuj Mehrish
Soujanya Poria
152
145
0
24 Apr 2023
SAR: Self-Supervised Anti-Distortion Representation for End-To-End Speech Model
Jianzong Wang
Xulong Zhang
Haobin Tang
Aolan Sun
Ning Cheng
Jing Xiao
26
1
0
23 Apr 2023
Affective social anthropomorphic intelligent system
Md. Adyelullahil Mamun
Hasnat Md. Abdullah
Md. Golam Rabiul Alam
Muhammad Mehedi Hassan
Md. Zia Uddin
24
1
0
19 Apr 2023
NaturalSpeech 2: Latent Diffusion Models are Natural and Zero-Shot Speech and Singing Synthesizers
Kai Shen
Zeqian Ju
Xu Tan
Yanqing Liu
Yichong Leng
Lei He
Tao Qin
Sheng Zhao
Jiang Bian
DiffM
44
228
0
18 Apr 2023
A Deep Learning Framework for Traffic Data Imputation Considering Spatiotemporal Dependencies
Li Jiang
Ting Zhang
Qiruyi Zuo
Chenyu Tian
George P. Chan
Wai Kin Victor Chan
AI4TS
6
2
0
18 Apr 2023
HGWaveNet: A Hyperbolic Graph Neural Network for Temporal Link Prediction
Qijie Bai
Chang Nie
Haiwei Zhang
Dongming Zhao
Xiaojie Yuan
27
20
0
14 Apr 2023
Dynamic Graph Representation Learning with Neural Networks: A Survey
Leshanshui Yang
Sébastien Adam
Clément Chatelain
AI4TS
AI4CE
44
14
0
12 Apr 2023