ResearchTrend.AI
arXiv: 1609.03499
WaveNet: A Generative Model for Raw Audio

12 September 2016
Aaron van den Oord
Sander Dieleman
Heiga Zen
Karen Simonyan
Oriol Vinyals
Alex Graves
Nal Kalchbrenner
A. Senior
Koray Kavukcuoglu
    DiffM

Papers citing "WaveNet: A Generative Model for Raw Audio"

50 / 3,039 papers shown
I Hear Your True Colors: Image Guided Audio Generation
Roy Sheffer
Yossi Adi
VLM
18
74
0
06 Nov 2022
Self-Supervised Learning for Speech Enhancement through Synthesis
Bryce Irvin
Marko Stamenovic
M. Kegler
Li-Chia Yang
45
18
0
04 Nov 2022
Cold Diffusion for Speech Enhancement
Hao Yen
François Germain
Gordon Wichern
Jonathan Le Roux
DiffM
29
40
0
04 Nov 2022
Real-Time Target Sound Extraction
Bandhav Veluri
Justin Chan
Malek Itani
Tuochao Chen
Takuya Yoshioka
Shyamnath Gollakota
46
30
0
04 Nov 2022
Translated Skip Connections -- Expanding the Receptive Fields of Fully Convolutional Neural Networks
Joshua Bruton
Hairong Wang
SSeg
17
3
0
03 Nov 2022
HyperSound: Generating Implicit Neural Representations of Audio Signals with Hypernetworks
Filip Szatkowski
Karol J. Piczak
Przemysław Spurek
Jacek Tabor
Tomasz Trzciński
28
12
0
03 Nov 2022
Human in the loop approaches in multi-modal conversational task guidance system development
R. Manuvinakurike
Sovan Biswas
G. Raffa
R. Beckwith
A. Rhodes
Meng Shi
Gesem Gudino Mejia
Saurav Sahay
L. Nachman
43
2
0
03 Nov 2022
Iterative autoregression: a novel trick to improve your low-latency speech enhancement model
Pavel Andreev
Nicholas Babaev
Azat Saginbaev
Ivan Shchekotov
Aibek Alanov
29
4
0
03 Nov 2022
Audio Language Modeling using Perceptually-Guided Discrete Representations
Felix Kreuk
Yaniv Taigman
Adam Polyak
Jade Copet
Gabriel Synnaeve
Alexandre Défossez
Yossi Adi
32
4
0
02 Nov 2022
Inference and Denoise: Causal Inference-based Neural Speech Enhancement
Tsun-An Hsieh
Chao-Han Huck Yang
Pin-Yu Chen
Sabato Marco Siniscalchi
Yu Tsao
CML
63
2
0
02 Nov 2022
Adversarial Guitar Amplifier Modelling With Unpaired Data
Alec Wright
Vesa Valimaki
Lauri Juvela
GAN
28
8
0
02 Nov 2022
SIMD-size aware weight regularization for fast neural vocoding on CPU
Hiroki Kanagawa
Yusuke Ijima
19
0
0
02 Nov 2022
Neural Fourier Shift for Binaural Speech Rendering
Jinkyu Lee
Kyogu Lee
41
7
0
02 Nov 2022
Comparision Of Adversarial And Non-Adversarial LSTM Music Generative Models
Moseli Motsóehli
Anna Sergeevna Bosman
J. D. Villiers
AAML
GAN
MGen
37
0
0
01 Nov 2022
Waveform Boundary Detection for Partially Spoofed Audio
Zexin Cai
Weiqing Wang
Ming Li
24
25
0
01 Nov 2022
Robust MelGAN: A robust universal neural vocoder for high-fidelity TTS
Kun Song
Jian Cong
Xinsheng Wang
Yongmao Zhang
Linfu Xie
Ning Jiang
Haiying Wu
35
0
0
31 Oct 2022
Audio Time-Scale Modification with Temporal Compressing Networks
Ernie Chu
Ju-Ting Chen
Chia-Ping Chen
25
0
0
31 Oct 2022
Towards zero-shot Text-based voice editing using acoustic context conditioning, utterance embeddings, and reference encoders
Jason Fong
Yun Wang
Prabhav Agrawal
Vimal Manohar
Jilong Wu
Thilo Kohler
Qing He
23
0
0
28 Oct 2022
Lightweight and High-Fidelity End-to-End Text-to-Speech with Multi-Band Generation and Inverse Short-Time Fourier Transform
Masaya Kawamura
Yuma Shirahata
Ryuichi Yamamoto
Kentaro Tachibana
34
15
0
28 Oct 2022
Period VITS: Variational Inference with Explicit Pitch Modeling for End-to-end Emotional Speech Synthesis
Yuma Shirahata
Ryuichi Yamamoto
Eunwoo Song
Ryo Terashima
Jae-Min Kim
Kentaro Tachibana
33
10
0
28 Oct 2022
Nonparallel High-Quality Audio Super Resolution with Domain Adaptation and Resampling CycleGANs
Reo Yoneyama
Ryuichi Yamamoto
Kentaro Tachibana
26
4
0
28 Oct 2022
One-Shot Acoustic Matching Of Audio Signals -- Learning to Hear Music In Any Room/ Concert Hall
Prateek Verma
C. Chafe
J. Berger
24
1
0
27 Oct 2022
LyricJam Sonic: A Generative System for Real-Time Composition and Musical Improvisation
Olga Vechtomova
Gaurav Sahu
14
6
0
27 Oct 2022
Learned Inertial Odometry for Autonomous Drone Racing
Giovanni Cioffi
L. Bauersfeld
Elia Kaufmann
Davide Scaramuzza
41
20
0
27 Oct 2022
Cover Reproducible Steganography via Deep Generative Models
Kejiang Chen
Hang Zhou
Yaofei Wang
Meng Li
Weiming Zhang
Neng H. Yu
DiffM
34
9
0
26 Oct 2022
WaveBound: Dynamic Error Bounds for Stable Time Series Forecasting
Youngin Cho
Daejin Kim
Dongmin Kim
Mohammad Azam Khan
Jaegul Choo
AI4TS
34
3
0
25 Oct 2022
EBEN: Extreme bandwidth extension network applied to speech signals captured with noise-resilient body-conduction microphones
J. Hauret
Thomas Joubaud
V. Zimpfer
Éric Bavu
19
9
0
25 Oct 2022
A Survey on Artificial Intelligence for Music Generation: Agents, Domains and Perspectives
Carlos Hernandez-Olivan
Javier Hernandez-Olivan
J. R. Beltrán
MGen
49
6
0
25 Oct 2022
Semi-Supervised Learning Based on Reference Model for Low-resource TTS
Xulong Zhang
Jianzong Wang
Ning Cheng
Jing Xiao
AI4TS
31
5
0
25 Oct 2022
High Fidelity Neural Audio Compression
Alexandre Défossez
Jade Copet
Gabriel Synnaeve
Yossi Adi
37
612
0
24 Oct 2022
Efficiently Trained Low-Resource Mongolian Text-to-Speech System Based On FullConv-TTS
Ziqi Liang
38
0
0
24 Oct 2022
A Machine Learning Approach to Classifying Construction Cost Documents into the International Construction Measurement Standard
J. Ignacio Deza
Hisham Ihshaish
L. Mahdjoubi
36
0
0
24 Oct 2022
Federated Learning and Meta Learning: Approaches, Applications, and Directions
Xiaonan Liu
Yansha Deng
Arumugam Nallanathan
M. Bennis
77
32
0
24 Oct 2022
HiFi-WaveGAN: Generative Adversarial Network with Auxiliary Spectrogram-Phase Loss for High-Fidelity Singing Voice Generation
Chunhui Wang
Chang Zeng
Jun Chen
Xingji He
56
7
0
23 Oct 2022
Boomerang: Local sampling on image manifolds using diffusion models
Lorenzo Luzi
P. Mayer
Josue Casco-Rodriguez
Ali Siahkoohi
Richard G. Baraniuk
DiffM
37
20
0
21 Oct 2022
Adaptive re-calibration of channel-wise features for Adversarial Audio Classification
Vardhan Dongre
Abhinav Thimma Reddy
Nikhitha Reddeddy
AAML
24
0
0
21 Oct 2022
Improved Normalizing Flow-Based Speech Enhancement using an All-pole Gammatone Filterbank for Conditional Input Representation
Martin Strauss
Matteo Torcoli
B. Edler
31
4
0
21 Oct 2022
Robust One-Shot Singing Voice Conversion
Naoya Takahashi
M. Singh
Yuki Mitsufuji
DiffM
42
8
0
20 Oct 2022
DOT-VAE: Disentangling One Factor at a Time
Vaishnavi Patil
Matthew Evanusa
J. JáJá
CoGe
DRL
CML
23
1
0
19 Oct 2022
Transformers Learn Shortcuts to Automata
Bingbin Liu
Jordan T. Ash
Surbhi Goel
A. Krishnamurthy
Cyril Zhang
OffRL
LRM
53
158
0
19 Oct 2022
Autoregressive Generative Modeling with Noise Conditional Maximum Likelihood Estimation
Henry Li
Y. Kluger
30
2
0
19 Oct 2022
Language Does More Than Describe: On The Lack Of Figurative Speech in Text-To-Image Models
Ricardo Kleinlein
Cristina Luna Jiménez
Fernando Fernández-Martínez
DiffM
23
3
0
19 Oct 2022
Spoofed training data for speech spoofing countermeasure can be efficiently created using neural vocoders
Xin Wang
Junichi Yamagishi
26
36
0
19 Oct 2022
Mid-attribute speaker generation using optimal-transport-based interpolation of Gaussian mixture models
Aya Watanabe
Shinnosuke Takamichi
Yuki Saito
Detai Xin
Hiroshi Saruwatari
45
3
0
18 Oct 2022
Fine-mixing: Mitigating Backdoors in Fine-tuned Language Models
Zhiyuan Zhang
Lingjuan Lyu
Xingjun Ma
Chenguang Wang
Xu Sun
AAML
23
41
0
18 Oct 2022
TorchDIVA: An Extensible Computational Model of Speech Production built on an Open-Source Machine Learning Library
Sean M. Kinahan
J. Liss
Visar Berisha
16
2
0
17 Oct 2022
Transformer-Based Speech Synthesizer Attribution in an Open Set Scenario
Emily R. Bartusiak
Edward J. Delp
27
12
0
14 Oct 2022
Hierarchical Diffusion Models for Singing Voice Neural Vocoder
Naoya Takahashi
Mayank Kumar Singh
Yuki Mitsufuji
DiffM
31
16
0
14 Oct 2022
Autoregressive Uncertainty Modeling for 3D Bounding Box Prediction
YuXuan Liu
Nikhil Mishra
Maximilian Sieb
Yide Shentu
Pieter Abbeel
Xi Chen
3DPC
43
5
0
13 Oct 2022
Learning Multivariate CDFs and Copulas using Tensor Factorization
Magda Amiridi
N. Sidiropoulos
25
1
0
13 Oct 2022