ResearchTrend.AI
WaveNet: A Generative Model for Raw Audio

12 September 2016
Aaron van den Oord
Sander Dieleman
Heiga Zen
Karen Simonyan
Oriol Vinyals
Alex Graves
Nal Kalchbrenner
A. Senior
Koray Kavukcuoglu
    DiffM

Papers citing "WaveNet: A Generative Model for Raw Audio"

50 / 3,039 papers shown
Fine-grained Noise Control for Multispeaker Speech Synthesis
Karolos Nikitaras
G. Vamvoukakis
Nikolaos Ellinas
Konstantinos Klapsas
K. Markopoulos
S. Raptis
June Sig Sung
Gunu Jho
Aimilios Chalamandaris
Pirros Tsiakoulis
29
4
0
11 Apr 2022
Model-free optimization of power/efficiency tradeoffs in quantum thermal machines using reinforcement learning
P. A. Erdman
Frank Noé
33
9
0
10 Apr 2022
On Principal Curve-Based Classifiers and Similarity-Based Selective Sampling in Time-Series
Aref Hakimzadeh
K. Ziarati
M. Taheri
AI4TS
18
0
0
10 Apr 2022
Super-Resolved Microbubble Localization in Single-Channel Ultrasound RF Signals Using Deep Learning
N. Blanken
J. Wolterink
H. Delingette
Christoph Brune
M. Versluis
G. Lajoinie
24
13
0
09 Apr 2022
Enhanced exemplar autoencoder with cycle consistency loss in any-to-one voice conversion
Weida Liang
Lantian Li
Wenqiang Du
Dong Wang
61
0
0
08 Apr 2022
Arabic Text-To-Speech (TTS) Data Preparation
Hala Al Masri
Muhy Eddin Za'ter
19
1
0
07 Apr 2022
Adversarial Learning of Intermediate Acoustic Feature for End-to-End Lightweight Text-to-Speech
Hyungchan Yoon
Seyun Um
Changwhan Kim
Hong-Goo Kang
28
0
0
05 Apr 2022
Dual Quaternion Ambisonics Array for Six-Degree-of-Freedom Acoustic Representation
Eleonora Grassucci
Gioia Mancini
Christian Brignone
A. Uncini
Danilo Comminiello
26
16
0
04 Apr 2022
Learning Neural Acoustic Fields
Andrew F. Luo
Yilun Du
Michael J. Tarr
J. Tenenbaum
Antonio Torralba
Chuang Gan
AI4CE
25
79
0
04 Apr 2022
SPECTRE: Spectral Conditioning Helps to Overcome the Expressivity Limits of One-shot Graph Generators
Karolis Martinkus
Andreas Loukas
Nathanael Perraudin
Roger Wattenhofer
52
67
0
04 Apr 2022
Lip to Speech Synthesis with Visual Context Attentional GAN
Minsu Kim
Joanna Hong
Y. Ro
33
51
0
04 Apr 2022
On incorporating social speaker characteristics in synthetic speech
S. Rallabandi
Sebastian Möller
29
0
0
03 Apr 2022
StyleWaveGAN: Style-based synthesis of drum sounds with extensive controls using generative adversarial networks
Antoine Lavault
Axel Roebel
Matthieu Voiry
GAN
17
2
0
02 Apr 2022
Quantized GAN for Complex Music Generation from Dance Videos
Ye Zhu
Kyle Olszewski
Yuehua Wu
Panos Achlioptas
Menglei Chai
Yan Yan
Sergey Tulyakov
MGen
38
44
0
01 Apr 2022
Universal Adaptor: Converting Mel-Spectrograms Between Different Configurations for Speech Synthesis
Fan Wang
Po-Chun Hsu
Da-Rong Liu
Hung-yi Lee
18
0
0
01 Apr 2022
Audio-Visual Speech Codecs: Rethinking Audio-Visual Speech Enhancement by Re-Synthesis
Karren D. Yang
Dejan Marković
Steven Krenn
Vasu Agrawal
Alexander Richard
VGen
22
32
0
31 Mar 2022
Imitate and Repurpose: Learning Reusable Robot Movement Skills From Human and Animal Behaviors
Steven Bohez
S. Tunyasuvunakool
Philemon Brakel
Fereshteh Sadeghi
Leonard Hasenclever
...
Nathan Batchelor
Federico Casarini
J. Merel
R. Hadsell
N. Heess
43
51
0
31 Mar 2022
HiFi-VC: High Quality ASR-Based Voice Conversion
A. Kashkin
I. Karpukhin
S. Shishkin
34
5
0
31 Mar 2022
WavThruVec: Latent speech representation as intermediate features for neural speech synthesis
Hubert Siuzdak
Piotr Dura
Pol van Rijn
Nori Jacoby
AI4TS
18
30
0
31 Mar 2022
JETS: Jointly Training FastSpeech2 and HiFi-GAN for End to End Text to Speech
D. Lim
Sunghee Jung
Eesung Kim
30
51
0
31 Mar 2022
SpecGrad: Diffusion Probabilistic Model based Neural Vocoder with Adaptive Noise Spectral Shaping
Yuma Koizumi
Heiga Zen
Kohei Yatabe
Nanxin Chen
M. Bacchiani
DiffM
48
46
0
31 Mar 2022
Robust Disentangled Variational Speech Representation Learning for Zero-shot Voice Conversion
Jiachen Lian
Chunlei Zhang
Dong Yu
DRL
30
51
0
30 Mar 2022
Online Motion Style Transfer for Interactive Character Control
Ying Tang
Jiangtao Liu
Cheng Zhou
Tingguang Li
OffRL
17
1
0
30 Mar 2022
Does Audio Deepfake Detection Generalize?
Nicolas Müller
Pavel Czempin
Franziska Dieckmann
Adam Froghyar
Konstantin Böttinger
39
141
0
30 Mar 2022
Symbolic music generation conditioned on continuous-valued emotions
Serkan Sulun
M. Davies
Paula Viana
MGen
24
25
0
30 Mar 2022
Enhancing Zero-Shot Many to Many Voice Conversion with Self-Attention VAE
Ziang Long
Yunling Zheng
Meng Yu
Jack Xin
DRL
32
5
0
30 Mar 2022
ReIL: A Framework for Reinforced Intervention-based Imitation Learning
Rom N. Parnichkun
M. Dailey
Atsushi Yamashita
28
2
0
29 Mar 2022
Improving Source Separation by Explicitly Modeling Dependencies Between Sources
Ethan Manilow
Curtis Hawthorne
Cheng-Zhi Anna Huang
Bryan Pardo
Jesse Engel
BDL
31
7
0
28 Mar 2022
vTTS: visual-text to speech
Yoshifumi Nakano
Takaaki Saeki
Shinnosuke Takamichi
Katsuhito Sudoh
Hiroshi Saruwatari
25
4
0
28 Mar 2022
Attacker Attribution of Audio Deepfakes
Nicolas Müller
Franziska Dieckmann
Jennifer Williams
27
13
0
28 Mar 2022
MolGenSurvey: A Systematic Survey in Machine Learning Models for Molecule Design
Yuanqi Du
Tianfan Fu
Jimeng Sun
Shengchao Liu
AI4CE
74
88
0
28 Mar 2022
Bunched LPCNet2: Efficient Neural Vocoders Covering Devices from Cloud to Edge
Sangjun Park
Kihyun Choo
Joohyung Lee
A. Porov
Konstantin Osipov
June Sig Sung
24
6
0
27 Mar 2022
A Neural Vocoder Based Packet Loss Concealment Algorithm
Yaofeng Zhou
C. Bao
25
2
0
26 Mar 2022
BDDM: Bilateral Denoising Diffusion Models for Fast and High-Quality Speech Synthesis
Max W. Y. Lam
Jun Wang
Dan Su
Dong Yu
DiffM
44
92
0
25 Mar 2022
An Optical Control Environment for Benchmarking Reinforcement Learning Algorithms
Abulikemu Abuduweili
Changliu Liu
24
1
0
23 Mar 2022
A Text-to-Speech Pipeline, Evaluation Methodology, and Initial Fine-Tuning Results for Child Speech Synthesis
Rishabh Jain
Mariam Yiwere
Dan Bigioi
Peter Corcoran
H. Cucu
27
14
0
22 Mar 2022
AutoTTS: End-to-End Text-to-Speech Synthesis through Differentiable Duration Modeling
Bac Nguyen
Fabien Cardinaux
Stefan Uhlich
16
2
0
21 Mar 2022
Spoofing-Aware Speaker Verification with Unsupervised Domain Adaptation
Xuechen Liu
Md. Sahidullah
Tomi Kinnunen
57
6
0
21 Mar 2022
Vocal effort modeling in neural TTS for improving the intelligibility of synthetic speech in noise
T. Raitio
Petko N. Petkov
Jiangchuan Li
M. Shifas
Andrea Davis
Y. Stylianou
22
2
0
20 Mar 2022
AdaVocoder: Adaptive Vocoder for Custom Voice
Xin Yuan
Yongbin Feng
Mingming Ye
Cheng Tuo
Minghang Zhang
22
3
0
18 Mar 2022
Improve few-shot voice cloning using multi-modal learning
Haitong Zhang
Yue Lin
21
8
0
18 Mar 2022
A$^3$T: Alignment-Aware Acoustic and Text Pretraining for Speech Synthesis and Editing
Richard He Bai
Renjie Zheng
Junkun Chen
Xintong Li
Mingbo Ma
Liang Huang
32
49
0
18 Mar 2022
AutoSDF: Shape Priors for 3D Completion, Reconstruction and Generation
Paritosh Mittal
Y. Cheng
Maneesh Singh
Shubham Tulsiani
37
227
0
17 Mar 2022
Transframer: Arbitrary Frame Prediction with Generative Models
C. Nash
João Carreira
Jacob Walker
Iain Barr
Andrew Jaegle
Mateusz Malinowski
Peter W. Battaglia
ViT
27
38
0
17 Mar 2022
DialogueNeRF: Towards Realistic Avatar Face-to-Face Conversation Video Generation
Yichao Yan
Zanwei Zhou
Zi Wang
Chen-Ning Yang
Xiaokang Yang
CVBM
21
19
0
15 Mar 2022
Reinforced Imitative Graph Learning for Mobile User Profiling
Dongjie Wang
Pengyang Wang
Yanjie Fu
Kunpeng Liu
Hui Xiong
C. Hughes
21
10
0
13 Mar 2022
SA-SASV: An End-to-End Spoof-Aggregated Spoofing-Aware Speaker Verification System
Zhongwei Teng
Quchen Fu
Jules White
Maria E. Powell
Douglas C. Schmidt
35
11
0
12 Mar 2022
End-to-End Multi-Tab Website Fingerprinting Attack: A Detection Perspective
Mantun Chen
Yong Chen
Yongjun Wang
Peidai Xie
Shaojing Fu
Xiatian Zhu
27
3
0
12 Mar 2022
Masked Visual Pre-training for Motor Control
Tete Xiao
Ilija Radosavovic
Trevor Darrell
Jitendra Malik
SSL
39
242
0
11 Mar 2022
Neural Forecasting of the Italian Sovereign Bond Market with Economic News
Sergio Consoli
L. Pezzoli
Elisa Tosetti
25
4
0
11 Mar 2022