1609.03499
WaveNet: A Generative Model for Raw Audio
12 September 2016
Aaron van den Oord
Sander Dieleman
Heiga Zen
Karen Simonyan
Oriol Vinyals
Alex Graves
Nal Kalchbrenner
A. Senior
Koray Kavukcuoglu
DiffM
Papers citing
"WaveNet: A Generative Model for Raw Audio"
50 / 3,082 papers shown
Title
Arabic Text-To-Speech (TTS) Data Preparation
Hala Al Masri
Muhy Eddin Za'ter
28
1
0
07 Apr 2022
Adversarial Learning of Intermediate Acoustic Feature for End-to-End Lightweight Text-to-Speech
Hyungchan Yoon
Seyun Um
Changwhan Kim
Hong-Goo Kang
47
0
0
05 Apr 2022
Dual Quaternion Ambisonics Array for Six-Degree-of-Freedom Acoustic Representation
Eleonora Grassucci
Gioia Mancini
Christian Brignone
A. Uncini
Danilo Comminiello
72
16
0
04 Apr 2022
Learning Neural Acoustic Fields
Andrew F. Luo
Yilun Du
Michael J. Tarr
J. Tenenbaum
Antonio Torralba
Chuang Gan
AI4CE
84
84
0
04 Apr 2022
SPECTRE: Spectral Conditioning Helps to Overcome the Expressivity Limits of One-shot Graph Generators
Karolis Martinkus
Andreas Loukas
Nathanael Perraudin
Roger Wattenhofer
102
70
0
04 Apr 2022
Lip to Speech Synthesis with Visual Context Attentional GAN
Minsu Kim
Joanna Hong
Y. Ro
124
54
0
04 Apr 2022
On incorporating social speaker characteristics in synthetic speech
S. Rallabandi
Sebastian Möller
93
0
0
03 Apr 2022
StyleWaveGAN: Style-based synthesis of drum sounds with extensive controls using generative adversarial networks
Antoine Lavault
Axel Roebel
Matthieu Voiry
GAN
46
3
0
02 Apr 2022
Quantized GAN for Complex Music Generation from Dance Videos
Ye Zhu
Kyle Olszewski
Yuehua Wu
Panos Achlioptas
Menglei Chai
Yan Yan
Sergey Tulyakov
MGen
118
46
0
01 Apr 2022
Universal Adaptor: Converting Mel-Spectrograms Between Different Configurations for Speech Synthesis
Fan Wang
Po-Chun Hsu
Da-Rong Liu
Hung-yi Lee
60
0
0
01 Apr 2022
Audio-Visual Speech Codecs: Rethinking Audio-Visual Speech Enhancement by Re-Synthesis
Karren D. Yang
Dejan Marković
Steven Krenn
Vasu Agrawal
Alexander Richard
VGen
81
33
0
31 Mar 2022
Imitate and Repurpose: Learning Reusable Robot Movement Skills From Human and Animal Behaviors
Steven Bohez
S. Tunyasuvunakool
Philemon Brakel
Fereshteh Sadeghi
Leonard Hasenclever
...
Nathan Batchelor
Federico Casarini
J. Merel
R. Hadsell
N. Heess
98
51
0
31 Mar 2022
HiFi-VC: High Quality ASR-Based Voice Conversion
A. Kashkin
I. Karpukhin
S. Shishkin
75
6
0
31 Mar 2022
WavThruVec: Latent speech representation as intermediate features for neural speech synthesis
Hubert Siuzdak
Piotr Dura
Pol van Rijn
Nori Jacoby
AI4TS
135
30
0
31 Mar 2022
JETS: Jointly Training FastSpeech2 and HiFi-GAN for End to End Text to Speech
D. Lim
Sunghee Jung
Eesung Kim
95
53
0
31 Mar 2022
SpecGrad: Diffusion Probabilistic Model based Neural Vocoder with Adaptive Noise Spectral Shaping
Yuma Koizumi
Heiga Zen
Kohei Yatabe
Nanxin Chen
M. Bacchiani
DiffM
101
49
0
31 Mar 2022
Robust Disentangled Variational Speech Representation Learning for Zero-shot Voice Conversion
Jiachen Lian
Chunlei Zhang
Dong Yu
DRL
65
52
0
30 Mar 2022
Online Motion Style Transfer for Interactive Character Control
Ying Tang
Jiangtao Liu
Cheng Zhou
Tingguang Li
OffRL
22
1
0
30 Mar 2022
Does Audio Deepfake Detection Generalize?
Nicolas Müller
Pavel Czempin
Franziska Dieckmann
Adam Froghyar
Konstantin Böttinger
115
154
0
30 Mar 2022
Symbolic music generation conditioned on continuous-valued emotions
Serkan Sulun
M. Davies
Paula Viana
MGen
86
27
0
30 Mar 2022
Enhancing Zero-Shot Many to Many Voice Conversion with Self-Attention VAE
Ziang Long
Yunling Zheng
Meng Yu
Jack Xin
DRL
63
5
0
30 Mar 2022
ReIL: A Framework for Reinforced Intervention-based Imitation Learning
Rom N. Parnichkun
M. Dailey
Atsushi Yamashita
42
3
0
29 Mar 2022
Improving Source Separation by Explicitly Modeling Dependencies Between Sources
Ethan Manilow
Curtis Hawthorne
Cheng-Zhi Anna Huang
Bryan Pardo
Jesse Engel
BDL
70
10
0
28 Mar 2022
vTTS: visual-text to speech
Yoshifumi Nakano
Takaaki Saeki
Shinnosuke Takamichi
Katsuhito Sudoh
Hiroshi Saruwatari
61
4
0
28 Mar 2022
Attacker Attribution of Audio Deepfakes
Nicolas Müller
Franziska Dieckmann
Jennifer Williams
60
15
0
28 Mar 2022
MolGenSurvey: A Systematic Survey in Machine Learning Models for Molecule Design
Yuanqi Du
Tianfan Fu
Jimeng Sun
Shengchao Liu
AI4CE
187
91
0
28 Mar 2022
Bunched LPCNet2: Efficient Neural Vocoders Covering Devices from Cloud to Edge
Sangjun Park
Kihyun Choo
Joohyung Lee
A. Porov
Konstantin Osipov
June Sig Sung
70
6
0
27 Mar 2022
A Neural Vocoder Based Packet Loss Concealment Algorithm
Yaofeng Zhou
C. Bao
54
2
0
26 Mar 2022
BDDM: Bilateral Denoising Diffusion Models for Fast and High-Quality Speech Synthesis
Max W. Y. Lam
Jun Wang
Jane Polak Scowcroft
Dong Yu
DiffM
105
97
0
25 Mar 2022
An Optical Control Environment for Benchmarking Reinforcement Learning Algorithms
Abulikemu Abuduweili
Changliu Liu
37
1
0
23 Mar 2022
A Text-to-Speech Pipeline, Evaluation Methodology, and Initial Fine-Tuning Results for Child Speech Synthesis
Rishabh Jain
Mariam Yiwere
Dan Bigioi
Peter Corcoran
H. Cucu
69
14
0
22 Mar 2022
AutoTTS: End-to-End Text-to-Speech Synthesis through Differentiable Duration Modeling
Bac Nguyen
Fabien Cardinaux
Stefan Uhlich
25
2
0
21 Mar 2022
Spoofing-Aware Speaker Verification with Unsupervised Domain Adaptation
Xuechen Liu
Md. Sahidullah
Tomi Kinnunen
81
7
0
21 Mar 2022
Vocal effort modeling in neural TTS for improving the intelligibility of synthetic speech in noise
T. Raitio
Petko N. Petkov
Jiangchuan Li
M. Shifas
Andrea Davis
Y. Stylianou
48
2
0
20 Mar 2022
AdaVocoder: Adaptive Vocoder for Custom Voice
Xin Yuan
Yongbin Feng
Mingming Ye
Cheng Tuo
Minghang Zhang
128
3
0
18 Mar 2022
Improve few-shot voice cloning using multi-modal learning
Haitong Zhang
Yue Lin
46
8
0
18 Mar 2022
A³T: Alignment-Aware Acoustic and Text Pretraining for Speech Synthesis and Editing
Richard He Bai
Renjie Zheng
Junkun Chen
Xintong Li
Mingbo Ma
Liang Huang
119
53
0
18 Mar 2022
AutoSDF: Shape Priors for 3D Completion, Reconstruction and Generation
Paritosh Mittal
Y. Cheng
Maneesh Singh
Shubham Tulsiani
133
230
0
17 Mar 2022
Transframer: Arbitrary Frame Prediction with Generative Models
C. Nash
João Carreira
Jacob Walker
Iain Barr
Andrew Jaegle
Mateusz Malinowski
Peter W. Battaglia
ViT
121
38
0
17 Mar 2022
DialogueNeRF: Towards Realistic Avatar Face-to-Face Conversation Video Generation
Yichao Yan
Zanwei Zhou
Zi Wang
Chen-Ning Yang
Xiaokang Yang
CVBM
70
21
0
15 Mar 2022
Reinforced Imitative Graph Learning for Mobile User Profiling
Dongjie Wang
Pengyang Wang
Yanjie Fu
Kunpeng Liu
Hui Xiong
C. Hughes
44
11
0
13 Mar 2022
SA-SASV: An End-to-End Spoof-Aggregated Spoofing-Aware Speaker Verification System
Zhongwei Teng
Quchen Fu
Jules White
Maria E. Powell
Douglas C. Schmidt
62
11
0
12 Mar 2022
End-to-End Multi-Tab Website Fingerprinting Attack: A Detection Perspective
Mantun Chen
Yong Chen
Yongjun Wang
Peidai Xie
Shaojing Fu
Xiatian Zhu
44
3
0
12 Mar 2022
Masked Visual Pre-training for Motor Control
Tete Xiao
Ilija Radosavovic
Trevor Darrell
Jitendra Malik
SSL
125
250
0
11 Mar 2022
Neural Forecasting of the Italian Sovereign Bond Market with Economic News
Sergio Consoli
L. Pezzoli
Elisa Tosetti
54
4
0
11 Mar 2022
Climate Change & Computer Audition: A Call to Action and Overview on Audio Intelligence to Help Save the Planet
Björn W. Schuller
Ali Akman
Yi-Fen Chang
H. Coppock
Alexander Gebhard
Alexander Kathan
Esther Rituerto-González
Andreas Triantafyllopoulos
Florian B. Pokorny
67
1
0
10 Mar 2022
Practical cognitive speech compression
Reza Lotfidereshgi
P. Gournay
59
2
0
08 Mar 2022
Dynamic Dual-Output Diffusion Models
Yaniv Benny
Lior Wolf
DiffM
81
26
0
08 Mar 2022
Learning from Few Examples: A Summary of Approaches to Few-Shot Learning
Archit Parnami
Minwoo Lee
MQ
114
168
0
07 Mar 2022
Language-Agnostic Meta-Learning for Low-Resource Text-to-Speech with Articulatory Features
Florian Lux
Ngoc Thang Vu
102
29
0
07 Mar 2022