WaveNet: A Generative Model for Raw Audio

12 September 2016
Aaron van den Oord
Sander Dieleman
Heiga Zen
Karen Simonyan
Oriol Vinyals
Alex Graves
Nal Kalchbrenner
A. Senior
Koray Kavukcuoglu
    DiffM

Papers citing "WaveNet: A Generative Model for Raw Audio"

50 / 3,042 papers shown
End-to-End Multi-Tab Website Fingerprinting Attack: A Detection Perspective
Mantun Chen
Yong Chen
Yongjun Wang
Peidai Xie
Shaojing Fu
Xiatian Zhu
27
3
0
12 Mar 2022
Masked Visual Pre-training for Motor Control
Tete Xiao
Ilija Radosavovic
Trevor Darrell
Jitendra Malik
SSL
45
242
0
11 Mar 2022
Neural Forecasting of the Italian Sovereign Bond Market with Economic News
Sergio Consoli
L. Pezzoli
Elisa Tosetti
32
4
0
11 Mar 2022
Climate Change & Computer Audition: A Call to Action and Overview on Audio Intelligence to Help Save the Planet
Björn W. Schuller
Ali Akman
Yi-Fen Chang
H. Coppock
Alexander Gebhard
Alexander Kathan
Esther Rituerto-González
Andreas Triantafyllopoulos
Florian B. Pokorny
43
1
0
10 Mar 2022
Practical cognitive speech compression
Reza Lotfidereshgi
P. Gournay
40
2
0
08 Mar 2022
Dynamic Dual-Output Diffusion Models
Yaniv Benny
Lior Wolf
DiffM
31
26
0
08 Mar 2022
Learning from Few Examples: A Summary of Approaches to Few-Shot Learning
Archit Parnami
Minwoo Lee
MQ
49
157
0
07 Mar 2022
Language-Agnostic Meta-Learning for Low-Resource Text-to-Speech with Articulatory Features
Florian Lux
Ngoc Thang Vu
54
29
0
07 Mar 2022
HEAR: Holistic Evaluation of Audio Representations
Joseph P. Turian
Jordie Shier
H. Khan
Bhiksha Raj
Björn W. Schuller
...
P. Esling
Pranay Manocha
Shinji Watanabe
Zeyu Jin
Yonatan Bisk
42
101
0
06 Mar 2022
Variational Auto-Encoder based Mandarin Speech Cloning
Qingyu Xing
Xiaohan Ma
26
0
0
06 Mar 2022
NeuralDPS: Neural Deterministic Plus Stochastic Model with Multiband Excitation for Noise-Controllable Waveform Generation
Tao Wang
Ruibo Fu
Jiangyan Yi
J. Tao
Zhengqi Wen
14
2
0
05 Mar 2022
iSTFTNet: Fast and Lightweight Mel-Spectrogram Vocoder Incorporating Inverse Short-Time Fourier Transform
Takuhiro Kaneko
Kou Tanaka
Hirokazu Kameoka
Shogo Seki
33
60
0
04 Mar 2022
Look&Listen: Multi-Modal Correlation Learning for Active Speaker Detection and Speech Enhancement
Jun Xiong
Yu Zhou
Peng Zhang
Lei Xie
Wei Huang
Yufei Zha
38
20
0
04 Mar 2022
Real time spectrogram inversion on mobile phone
Oleg Rybakov
Marco Tagliasacchi
Yunpeng Li
Liyang Jiang
Xia Zhang
Fadi Biadsy
52
4
0
01 Mar 2022
A Brief Overview of Unsupervised Neural Speech Representation Learning
Lasse Borgholt
Jakob Drachmann Havtorn
Joakim Edin
Lars Maaløe
Christian Igel
BDL
AI4TS
SSL
24
11
0
01 Mar 2022
Explainable deepfake and spoofing detection: an attack analysis using SHapley Additive exPlanations
W. Ge
Massimiliano Todisco
Nicholas W. D. Evans
AAML
34
8
0
28 Feb 2022
Concept Graph Neural Networks for Surgical Video Understanding
Yutong Ban
J. Eckhoff
Thomas M. Ward
Daniel A. Hashimoto
O. Meireles
Daniela Rus
Guy Rosman
NAI
38
17
0
27 Feb 2022
Learning the Beauty in Songs: Neural Singing Voice Beautifier
Jinglin Liu
Chengxi Li
Yi Ren
Zhiying Zhu
Zhou Zhao
DiffM
35
16
0
27 Feb 2022
Continuous Human Action Recognition for Human-Machine Interaction: A Review
Harshala Gammulle
David Ahmedt-Aristizabal
Simon Denman
Lachlan Tychsen-Smith
L. Petersson
Clinton Fookes
48
25
0
26 Feb 2022
Revisiting Over-Smoothness in Text to Speech
Yi Ren
Xu Tan
Tao Qin
Zhou Zhao
Tie-Yan Liu
87
61
0
26 Feb 2022
Spatio-Temporal Latent Graph Structure Learning for Traffic Forecasting
Jiabin Tang
Tang Qian
Shikun Liu
Shengdong Du
Jie Hu
Tianrui Li
AI4TS
30
23
0
25 Feb 2022
Preformer: Predictive Transformer with Multi-Scale Segment-wise Correlations for Long-Term Time Series Forecasting
Dazhao Du
Fuchun Sun
Zhewei Wei
AI4TS
29
46
0
23 Feb 2022
End-to-end LPCNet: A Neural Vocoder With Fully-Differentiable LPC Estimation
Krishna Subramani
J. Valin
Umut Isik
Paris Smaragdis
A. Krishnaswamy
42
11
0
23 Feb 2022
Neural Speech Synthesis on a Shoestring: Improving the Efficiency of LPCNet
J. Valin
Umut Isik
Paris Smaragdis
A. Krishnaswamy
29
4
0
22 Feb 2022
Wavebender GAN: An architecture for phonetically meaningful speech manipulation
Gustavo Teodoro Döhler Beck
Ulme Wennberg
Zofia Malisz
G. Henter
AI4CE
43
8
0
22 Feb 2022
Benchmarking Generative Latent Variable Models for Speech
Jakob Drachmann Havtorn
Lasse Borgholt
Søren Hauberg
J. Frellsen
Lars Maaløe
36
3
0
22 Feb 2022
CampNet: Context-Aware Mask Prediction for End-to-End Text-Based Speech Editing
Tao Wang
Jiangyan Yi
Ruibo Fu
J. Tao
Zhengqi Wen
KELM
27
18
0
21 Feb 2022
It's Raw! Audio Generation with State-Space Models
Karan Goel
Albert Gu
Chris Donahue
Christopher Ré
28
188
0
20 Feb 2022
Learning to Detect Slip with Barometric Tactile Sensors and a Temporal Convolutional Neural Network
Abhinav Grover
Philippe Nadeau
C. Grebe
Jonathan Kelly
40
9
0
19 Feb 2022
Rethinking Pareto Frontier for Performance Evaluation of Deep Neural Networks
V. Nia
Alireza Ghaffari
Mahdi Zolnouri
Yvon Savaria
24
4
0
18 Feb 2022
Dynamic Relation Discovery and Utilization in Multi-Entity Time Series Forecasting
Lin Huang
Lijun Wu
Jia Zhang
Jiang Bian
Tie-Yan Liu
AI4TS
34
2
0
18 Feb 2022
PGCN: Progressive Graph Convolutional Networks for Spatial-Temporal Traffic Forecasting
Y. Shin
Yoonjin Yoon
GNN
AI4TS
52
41
0
18 Feb 2022
Speech Denoising in the Waveform Domain with Self-Attention
Zhifeng Kong
Ming-Yu Liu
Ambrish Dantrey
Bryan Catanzaro
34
61
0
15 Feb 2022
General-purpose, long-context autoregressive modeling with Perceiver AR
Curtis Hawthorne
Andrew Jaegle
Cătălina Cangea
Sebastian Borgeaud
C. Nash
...
Hannah R. Sheahan
Neil Zeghidour
Jean-Baptiste Alayrac
João Carreira
Jesse Engel
43
65
0
15 Feb 2022
Interpreting a Machine Learning Model for Detecting Gravitational Waves
M. Safarzadeh
Asad Khan
Eliu A. Huerta
Martin Wattenberg
55
2
0
15 Feb 2022
NewsPod: Automatic and Interactive News Podcasts
Philippe Laban
Elicia Ye
Srujay Korlakunta
John F. Canny
Marti A. Hearst
24
22
0
15 Feb 2022
Visual Acoustic Matching
Changan Chen
Ruohan Gao
P. Calamia
Kristen Grauman
28
56
0
14 Feb 2022
An Introduction to Neural Data Compression
Yibo Yang
Stephan Mandt
Lucas Theis
29
118
0
14 Feb 2022
Distribution augmentation for low-resource expressive text-to-speech
Mateusz Lajszczak
Animesh Prasad
Arent van Korlaar
Bajibabu Bollepalli
Antonio Bonafonte
...
M. Nicolis
Alexis Moinet
Thomas Drugman
Trevor Wood
Elena Sokolova
38
7
0
13 Feb 2022
SleepPPG-Net: a deep learning algorithm for robust sleep staging from continuous photoplethysmography
Kevin Kotzen
Peter H. Charlton
Sharon Salabi
Lea Amar
A. Landesberg
Joachim A. Behar
38
30
0
11 Feb 2022
Bernstein Flows for Flexible Posteriors in Variational Bayes
Oliver Durr
Stephan Hörling
Daniel Dold
Ivonne Kovylov
Beate Sick
BDL
26
4
0
11 Feb 2022
A Graph-based U-Net Model for Predicting Traffic in unseen Cities
L. Hermes
Barbara Hammer
Andrew Melnik
Riza Velioglu
Markus Vieth
M. Schilling
GNN
AI4TS
AI4CE
24
6
0
11 Feb 2022
Conditional Diffusion Probabilistic Model for Speech Enhancement
Yen-Ju Lu
Zhongqiu Wang
Shinji Watanabe
Alexander Richard
Cheng Yu
Yu Tsao
DiffM
31
179
0
10 Feb 2022
Diffusion bridges vector quantized Variational AutoEncoders
Max H. Cohen
Guillaume Quispe
Sylvain Le Corff
Charles Ollion
Eric Moulines
DiffM
24
14
0
10 Feb 2022
Deconstructing the Inductive Biases of Hamiltonian Neural Networks
Nate Gruver
Marc Finzi
Samuel Stanton
A. Wilson
AI4CE
31
40
0
10 Feb 2022
InferGrad: Improving Diffusion Models for Vocoder by Considering Inference in Training
Zehua Chen
Xu Tan
Ke Wang
Shifeng Pan
Danilo Mandic
Lei He
Sheng Zhao
DiffM
33
28
0
08 Feb 2022
TACTiS: Transformer-Attentional Copulas for Time Series
Alexandre Drouin
Étienne Marcotte
Nicolas Chapados
AI4TS
169
37
0
07 Feb 2022
Deep Impulse Responses: Estimating and Parameterizing Filters with Deep Networks
Alexander Richard
Peter Dodds
V. Ithapu
44
36
0
07 Feb 2022
Building Synthetic Speaker Profiles in Text-to-Speech Systems
Jie Pu
Yi Meng
Oguz H. Elibol
23
2
0
07 Feb 2022
Tubes Among Us: Analog Attack on Automatic Speaker Identification
Shimaa Ahmed
Yash R. Wani
Ali Shahin Shamsabadi
Mohammad Yaghini
Ilia Shumailov
Nicolas Papernot
Kassem Fawaz
AAML
46
4
0
06 Feb 2022