arXiv:1609.03499
WaveNet: A Generative Model for Raw Audio

12 September 2016
Aaron van den Oord
Sander Dieleman
Heiga Zen
Karen Simonyan
Oriol Vinyals
Alex Graves
Nal Kalchbrenner
A. Senior
Koray Kavukcuoglu
    DiffM

Papers citing "WaveNet: A Generative Model for Raw Audio"

50 / 3,082 papers shown
HEAR: Holistic Evaluation of Audio Representations
Joseph P. Turian
Jordie Shier
H. Khan
Bhiksha Raj
Björn W. Schuller
...
P. Esling
Pranay Manocha
Shinji Watanabe
Zeyu Jin
Yonatan Bisk
137
108
0
06 Mar 2022
Variational Auto-Encoder based Mandarin Speech Cloning
Qingyu Xing
Xiaohan Ma
133
0
0
06 Mar 2022
NeuralDPS: Neural Deterministic Plus Stochastic Model with Multiband Excitation for Noise-Controllable Waveform Generation
Tao Wang
Ruibo Fu
Jiangyan Yi
J. Tao
Zhengqi Wen
25
2
0
05 Mar 2022
iSTFTNet: Fast and Lightweight Mel-Spectrogram Vocoder Incorporating Inverse Short-Time Fourier Transform
Takuhiro Kaneko
Kou Tanaka
Hirokazu Kameoka
Shogo Seki
89
62
0
04 Mar 2022
Look&Listen: Multi-Modal Correlation Learning for Active Speaker Detection and Speech Enhancement
Jun Xiong
Yu Zhou
Peng Zhang
Lei Xie
Wei Huang
Yufei Zha
72
22
0
04 Mar 2022
Real time spectrogram inversion on mobile phone
Oleg Rybakov
Marco Tagliasacchi
Yunpeng Li
Liyang Jiang
Xia Zhang
Fadi Biadsy
131
4
0
01 Mar 2022
A Brief Overview of Unsupervised Neural Speech Representation Learning
Lasse Borgholt
Jakob Drachmann Havtorn
Joakim Edin
Lars Maaløe
Christian Igel
BDL AI4TS SSL
96
11
0
01 Mar 2022
Explainable deepfake and spoofing detection: an attack analysis using SHapley Additive exPlanations
W. Ge
Massimiliano Todisco
Nicholas W. D. Evans
AAML
54
9
0
28 Feb 2022
Concept Graph Neural Networks for Surgical Video Understanding
Yutong Ban
J. Eckhoff
Thomas M. Ward
Daniel A. Hashimoto
O. Meireles
Daniela Rus
Guy Rosman
NAI
86
18
0
27 Feb 2022
Learning the Beauty in Songs: Neural Singing Voice Beautifier
Jinglin Liu
Chengxi Li
Yi Ren
Zhiying Zhu
Zhou Zhao
DiffM
94
17
0
27 Feb 2022
Continuous Human Action Recognition for Human-Machine Interaction: A Review
Harshala Gammulle
David Ahmedt-Aristizabal
Simon Denman
Lachlan Tychsen-Smith
L. Petersson
Clinton Fookes
124
28
0
26 Feb 2022
Revisiting Over-Smoothness in Text to Speech
Yi Ren
Xu Tan
Tao Qin
Zhou Zhao
Tie-Yan Liu
148
64
0
26 Feb 2022
Spatio-Temporal Latent Graph Structure Learning for Traffic Forecasting
Jiabin Tang
Tang Qian
Shikun Liu
Shengdong Du
Jie Hu
Tianrui Li
AI4TS
58
23
0
25 Feb 2022
Preformer: Predictive Transformer with Multi-Scale Segment-wise Correlations for Long-Term Time Series Forecasting
Dazhao Du
Fuchun Sun
Zhewei Wei
AI4TS
87
51
0
23 Feb 2022
End-to-end LPCNet: A Neural Vocoder With Fully-Differentiable LPC Estimation
Krishna Subramani
J. Valin
Umut Isik
Paris Smaragdis
A. Krishnaswamy
70
11
0
23 Feb 2022
Neural Speech Synthesis on a Shoestring: Improving the Efficiency of LPCNet
J. Valin
Umut Isik
Paris Smaragdis
A. Krishnaswamy
62
4
0
22 Feb 2022
Wavebender GAN: An architecture for phonetically meaningful speech manipulation
Gustavo Teodoro Döhler Beck
Ulme Wennberg
Zofia Malisz
G. Henter
AI4CE
88
8
0
22 Feb 2022
Benchmarking Generative Latent Variable Models for Speech
Jakob Drachmann Havtorn
Lasse Borgholt
Søren Hauberg
J. Frellsen
Lars Maaløe
80
3
0
22 Feb 2022
CampNet: Context-Aware Mask Prediction for End-to-End Text-Based Speech Editing
Tao Wang
Jiangyan Yi
Ruibo Fu
J. Tao
Zhengqi Wen
KELM
69
20
0
21 Feb 2022
It's Raw! Audio Generation with State-Space Models
Karan Goel
Albert Gu
Chris Donahue
Christopher Ré
110
195
0
20 Feb 2022
Learning to Detect Slip with Barometric Tactile Sensors and a Temporal Convolutional Neural Network
Abhinav Grover
Philippe Nadeau
C. Grebe
Jonathan Kelly
69
10
0
19 Feb 2022
Rethinking Pareto Frontier for Performance Evaluation of Deep Neural Networks
V. Nia
Alireza Ghaffari
Mahdi Zolnouri
Yvon Savaria
58
5
0
18 Feb 2022
Dynamic Relation Discovery and Utilization in Multi-Entity Time Series Forecasting
Lin Huang
Lijun Wu
Jia Zhang
Jiang Bian
Tie-Yan Liu
AI4TS
48
2
0
18 Feb 2022
PGCN: Progressive Graph Convolutional Networks for Spatial-Temporal Traffic Forecasting
Y. Shin
Yoonjin Yoon
GNN AI4TS
74
47
0
18 Feb 2022
Speech Denoising in the Waveform Domain with Self-Attention
Zhifeng Kong
Ming-Yu Liu
Ambrish Dantrey
Bryan Catanzaro
89
63
0
15 Feb 2022
General-purpose, long-context autoregressive modeling with Perceiver AR
Curtis Hawthorne
Andrew Jaegle
Cătălina Cangea
Sebastian Borgeaud
C. Nash
...
Hannah R. Sheahan
Neil Zeghidour
Jean-Baptiste Alayrac
João Carreira
Jesse Engel
118
66
0
15 Feb 2022
Interpreting a Machine Learning Model for Detecting Gravitational Waves
M. Safarzadeh
Asad Khan
Eliu A. Huerta
Martin Wattenberg
108
2
0
15 Feb 2022
NewsPod: Automatic and Interactive News Podcasts
Philippe Laban
Elicia Ye
Srujay Korlakunta
John F. Canny
Marti A. Hearst
54
22
0
15 Feb 2022
Visual Acoustic Matching
Changan Chen
Ruohan Gao
P. Calamia
Kristen Grauman
79
58
0
14 Feb 2022
An Introduction to Neural Data Compression
Yibo Yang
Stephan Mandt
Lucas Theis
149
125
0
14 Feb 2022
Distribution augmentation for low-resource expressive text-to-speech
Mateusz Lajszczak
Animesh Prasad
Arent van Korlaar
Bajibabu Bollepalli
Antonio Bonafonte
...
M. Nicolis
Alexis Moinet
Thomas Drugman
Trevor Wood
Elena Sokolova
61
7
0
13 Feb 2022
SleepPPG-Net: a deep learning algorithm for robust sleep staging from continuous photoplethysmography
Kevin Kotzen
Peter H. Charlton
Sharon Salabi
Lea Amar
A. Landesberg
Joachim A. Behar
67
33
0
11 Feb 2022
Bernstein Flows for Flexible Posteriors in Variational Bayes
Oliver Durr
Stephan Hörling
Daniel Dold
Ivonne Kovylov
Beate Sick
BDL
102
4
0
11 Feb 2022
A Graph-based U-Net Model for Predicting Traffic in unseen Cities
L. Hermes
Barbara Hammer
Andrew Melnik
Riza Velioglu
Markus Vieth
M. Schilling
GNN AI4TS AI4CE
77
6
0
11 Feb 2022
Conditional Diffusion Probabilistic Model for Speech Enhancement
Yen-Ju Lu
Zhongqiu Wang
Shinji Watanabe
Alexander Richard
Cheng Yu
Yu Tsao
DiffM
84
191
0
10 Feb 2022
Diffusion bridges vector quantized Variational AutoEncoders
Max H. Cohen
Guillaume Quispe
Sylvain Le Corff
Charles Ollion
Eric Moulines
DiffM
90
15
0
10 Feb 2022
Deconstructing the Inductive Biases of Hamiltonian Neural Networks
Nate Gruver
Marc Finzi
Samuel Stanton
A. Wilson
AI4CE
69
42
0
10 Feb 2022
InferGrad: Improving Diffusion Models for Vocoder by Considering Inference in Training
Zehua Chen
Xu Tan
Ke Wang
Shifeng Pan
Danilo Mandic
Lei He
Sheng Zhao
DiffM
71
31
0
08 Feb 2022
TACTiS: Transformer-Attentional Copulas for Time Series
Alexandre Drouin
Étienne Marcotte
Nicolas Chapados
AI4TS
283
39
0
07 Feb 2022
Deep Impulse Responses: Estimating and Parameterizing Filters with Deep Networks
Alexander Richard
Peter Dodds
V. Ithapu
71
37
0
07 Feb 2022
Building Synthetic Speaker Profiles in Text-to-Speech Systems
Jie Pu
Yi Meng
Oguz H. Elibol
48
2
0
07 Feb 2022
Tubes Among Us: Analog Attack on Automatic Speaker Identification
Shimaa Ahmed
Yash R. Wani
Ali Shahin Shamsabadi
Mohammad Yaghini
Ilia Shumailov
Nicolas Papernot
Kassem Fawaz
AAML
62
4
0
06 Feb 2022
GhostTalk: Interactive Attack on Smartphone Voice System Through Power Line
Yuanda Wang
Hanqing Guo
Qiben Yan
AAML
77
41
0
05 Feb 2022
EcoFlow: Efficient Convolutional Dataflows for Low-Power Neural Network Accelerators
Lois Orosa
Skanda Koppula
Yaman Umuroglu
Konstantinos Kanellopoulos
Juan Gómez Luna
Michaela Blott
K. Vissers
O. Mutlu
82
4
0
04 Feb 2022
A Survey on Safety-Critical Driving Scenario Generation -- A Methodological Perspective
Wenhao Ding
Chejian Xu
Mansur Arief
Hao-ming Lin
Yue Liu
Ding Zhao
119
165
0
04 Feb 2022
Deep Learning for Epidemiologists: An Introduction to Neural Networks
S. Serghiou
K. Rough
FedML
54
14
0
02 Feb 2022
The HCCL-DKU system for fake audio generation task of the 2022 ICASSP ADD Challenge
Ziyi Chen
Hua Hua
Yuxiang Zhang
Ming Li
Pengyuan Zhang
102
0
0
29 Jan 2022
ItôWave: Itô Stochastic Differential Equation Is All You Need For Wave Generation
Shoule Wu
Ziqiang Shi
DiffM
456
9
0
29 Jan 2022
Electra: Conditional Generative Model based Predicate-Aware Query Approximation
Nikhil Sheoran
Subrata Mitra
Vibhor Porwal
Siddharth Ghetia
Jatin Varshney
Tung Mai
Anup B. Rao
Vikas Maddukuri
91
13
0
28 Jan 2022
DiffGAN-TTS: High-Fidelity and Efficient Text-to-Speech with Denoising Diffusion GANs
Songxiang Liu
Jane Polak Scowcroft
Dong Yu
DiffM
150
67
0
28 Jan 2022