1609.03499
WaveNet: A Generative Model for Raw Audio
12 September 2016
Aaron van den Oord
Sander Dieleman
Heiga Zen
Karen Simonyan
Oriol Vinyals
Alex Graves
Nal Kalchbrenner
A. Senior
Koray Kavukcuoglu
DiffM
Papers citing
"WaveNet: A Generative Model for Raw Audio"
50 / 3,039 papers shown
Title
I Hear Your True Colors: Image Guided Audio Generation
Roy Sheffer
Yossi Adi
VLM
18
74
0
06 Nov 2022
Self-Supervised Learning for Speech Enhancement through Synthesis
Bryce Irvin
Marko Stamenovic
M. Kegler
Li-Chia Yang
45
18
0
04 Nov 2022
Cold Diffusion for Speech Enhancement
Hao Yen
François Germain
Gordon Wichern
Jonathan Le Roux
DiffM
29
40
0
04 Nov 2022
Real-Time Target Sound Extraction
Bandhav Veluri
Justin Chan
Malek Itani
Tuochao Chen
Takuya Yoshioka
Shyamnath Gollakota
46
30
0
04 Nov 2022
Translated Skip Connections -- Expanding the Receptive Fields of Fully Convolutional Neural Networks
Joshua Bruton
Hairong Wang
SSeg
17
3
0
03 Nov 2022
HyperSound: Generating Implicit Neural Representations of Audio Signals with Hypernetworks
Filip Szatkowski
Karol J. Piczak
Przemysław Spurek
Jacek Tabor
Tomasz Trzciński
28
12
0
03 Nov 2022
Human in the loop approaches in multi-modal conversational task guidance system development
R. Manuvinakurike
Sovan Biswas
G. Raffa
R. Beckwith
A. Rhodes
Meng Shi
Gesem Gudino Mejia
Saurav Sahay
L. Nachman
43
2
0
03 Nov 2022
Iterative autoregression: a novel trick to improve your low-latency speech enhancement model
Pavel Andreev
Nicholas Babaev
Azat Saginbaev
Ivan Shchekotov
Aibek Alanov
29
4
0
03 Nov 2022
Audio Language Modeling using Perceptually-Guided Discrete Representations
Felix Kreuk
Yaniv Taigman
Adam Polyak
Jade Copet
Gabriel Synnaeve
Alexandre Défossez
Yossi Adi
32
4
0
02 Nov 2022
Inference and Denoise: Causal Inference-based Neural Speech Enhancement
Tsun-An Hsieh
Chao-Han Huck Yang
Pin-Yu Chen
Sabato Marco Siniscalchi
Yu Tsao
CML
63
2
0
02 Nov 2022
Adversarial Guitar Amplifier Modelling With Unpaired Data
Alec Wright
Vesa Välimäki
Lauri Juvela
GAN
28
8
0
02 Nov 2022
SIMD-size aware weight regularization for fast neural vocoding on CPU
Hiroki Kanagawa
Yusuke Ijima
19
0
0
02 Nov 2022
Neural Fourier Shift for Binaural Speech Rendering
Jinkyu Lee
Kyogu Lee
41
7
0
02 Nov 2022
Comparision Of Adversarial And Non-Adversarial LSTM Music Generative Models
Moseli Mots'oehli
Anna Sergeevna Bosman
J. D. Villiers
AAML
GAN
MGen
37
0
0
01 Nov 2022
Waveform Boundary Detection for Partially Spoofed Audio
Zexin Cai
Weiqing Wang
Ming Li
24
25
0
01 Nov 2022
Robust MelGAN: A robust universal neural vocoder for high-fidelity TTS
Kun Song
Jian Cong
Xinsheng Wang
Yongmao Zhang
Linfu Xie
Ning Jiang
Haiying Wu
35
0
0
31 Oct 2022
Audio Time-Scale Modification with Temporal Compressing Networks
Ernie Chu
Ju-Ting Chen
Chia-Ping Chen
25
0
0
31 Oct 2022
Towards zero-shot Text-based voice editing using acoustic context conditioning, utterance embeddings, and reference encoders
Jason Fong
Yun Wang
Prabhav Agrawal
Vimal Manohar
Jilong Wu
Thilo Kohler
Qing He
23
0
0
28 Oct 2022
Lightweight and High-Fidelity End-to-End Text-to-Speech with Multi-Band Generation and Inverse Short-Time Fourier Transform
Masaya Kawamura
Yuma Shirahata
Ryuichi Yamamoto
Kentaro Tachibana
34
15
0
28 Oct 2022
Period VITS: Variational Inference with Explicit Pitch Modeling for End-to-end Emotional Speech Synthesis
Yuma Shirahata
Ryuichi Yamamoto
Eunwoo Song
Ryo Terashima
Jae-Min Kim
Kentaro Tachibana
33
10
0
28 Oct 2022
Nonparallel High-Quality Audio Super Resolution with Domain Adaptation and Resampling CycleGANs
Reo Yoneyama
Ryuichi Yamamoto
Kentaro Tachibana
26
4
0
28 Oct 2022
One-Shot Acoustic Matching Of Audio Signals -- Learning to Hear Music In Any Room/ Concert Hall
Prateek Verma
C. Chafe
J. Berger
24
1
0
27 Oct 2022
LyricJam Sonic: A Generative System for Real-Time Composition and Musical Improvisation
Olga Vechtomova
Gaurav Sahu
14
6
0
27 Oct 2022
Learned Inertial Odometry for Autonomous Drone Racing
Giovanni Cioffi
L. Bauersfeld
Elia Kaufmann
Davide Scaramuzza
41
20
0
27 Oct 2022
Cover Reproducible Steganography via Deep Generative Models
Kejiang Chen
Hang Zhou
Yaofei Wang
Meng Li
Weiming Zhang
Neng H. Yu
DiffM
34
9
0
26 Oct 2022
WaveBound: Dynamic Error Bounds for Stable Time Series Forecasting
Youngin Cho
Daejin Kim
Dongmin Kim
Mohammad Azam Khan
Jaegul Choo
AI4TS
34
3
0
25 Oct 2022
EBEN: Extreme bandwidth extension network applied to speech signals captured with noise-resilient body-conduction microphones
J. Hauret
Thomas Joubaud
V. Zimpfer
Éric Bavu
19
9
0
25 Oct 2022
A Survey on Artificial Intelligence for Music Generation: Agents, Domains and Perspectives
Carlos Hernandez-Olivan
Javier Hernandez-Olivan
J. R. Beltrán
MGen
49
6
0
25 Oct 2022
Semi-Supervised Learning Based on Reference Model for Low-resource TTS
Xulong Zhang
Jianzong Wang
Ning Cheng
Jing Xiao
AI4TS
31
5
0
25 Oct 2022
High Fidelity Neural Audio Compression
Alexandre Défossez
Jade Copet
Gabriel Synnaeve
Yossi Adi
37
612
0
24 Oct 2022
Efficiently Trained Low-Resource Mongolian Text-to-Speech System Based On FullConv-TTS
Ziqi Liang
38
0
0
24 Oct 2022
A Machine Learning Approach to Classifying Construction Cost Documents into the International Construction Measurement Standard
J. Ignacio Deza
Hisham Ihshaish
L. Mahdjoubi
36
0
0
24 Oct 2022
Federated Learning and Meta Learning: Approaches, Applications, and Directions
Xiaonan Liu
Yansha Deng
Arumugam Nallanathan
M. Bennis
77
32
0
24 Oct 2022
HiFi-WaveGAN: Generative Adversarial Network with Auxiliary Spectrogram-Phase Loss for High-Fidelity Singing Voice Generation
Chunhui Wang
Chang Zeng
Jun Chen
Xingji He
56
7
0
23 Oct 2022
Boomerang: Local sampling on image manifolds using diffusion models
Lorenzo Luzi
P. Mayer
Josue Casco-Rodriguez
Ali Siahkoohi
Richard G. Baraniuk
DiffM
37
20
0
21 Oct 2022
Adaptive re-calibration of channel-wise features for Adversarial Audio Classification
Vardhan Dongre
Abhinav Thimma Reddy
Nikhitha Reddeddy
AAML
24
0
0
21 Oct 2022
Improved Normalizing Flow-Based Speech Enhancement using an All-pole Gammatone Filterbank for Conditional Input Representation
Martin Strauss
Matteo Torcoli
B. Edler
31
4
0
21 Oct 2022
Robust One-Shot Singing Voice Conversion
Naoya Takahashi
M. Singh
Yuki Mitsufuji
DiffM
42
8
0
20 Oct 2022
DOT-VAE: Disentangling One Factor at a Time
Vaishnavi Patil
Matthew Evanusa
J. JáJá
CoGe
DRL
CML
23
1
0
19 Oct 2022
Transformers Learn Shortcuts to Automata
Bingbin Liu
Jordan T. Ash
Surbhi Goel
A. Krishnamurthy
Cyril Zhang
OffRL
LRM
53
158
0
19 Oct 2022
Autoregressive Generative Modeling with Noise Conditional Maximum Likelihood Estimation
Henry Li
Y. Kluger
30
2
0
19 Oct 2022
Language Does More Than Describe: On The Lack Of Figurative Speech in Text-To-Image Models
Ricardo Kleinlein
Cristina Luna Jiménez
Fernando Fernández-Martínez
DiffM
23
3
0
19 Oct 2022
Spoofed training data for speech spoofing countermeasure can be efficiently created using neural vocoders
Xin Wang
Junichi Yamagishi
26
36
0
19 Oct 2022
Mid-attribute speaker generation using optimal-transport-based interpolation of Gaussian mixture models
Aya Watanabe
Shinnosuke Takamichi
Yuki Saito
Detai Xin
Hiroshi Saruwatari
45
3
0
18 Oct 2022
Fine-mixing: Mitigating Backdoors in Fine-tuned Language Models
Zhiyuan Zhang
Lingjuan Lyu
Xingjun Ma
Chenguang Wang
Xu Sun
AAML
23
41
0
18 Oct 2022
TorchDIVA: An Extensible Computational Model of Speech Production built on an Open-Source Machine Learning Library
Sean M. Kinahan
J. Liss
Visar Berisha
16
2
0
17 Oct 2022
Transformer-Based Speech Synthesizer Attribution in an Open Set Scenario
Emily R. Bartusiak
Edward J. Delp
27
12
0
14 Oct 2022
Hierarchical Diffusion Models for Singing Voice Neural Vocoder
Naoya Takahashi
Mayank Kumar Singh
Yuki Mitsufuji
DiffM
31
16
0
14 Oct 2022
Autoregressive Uncertainty Modeling for 3D Bounding Box Prediction
YuXuan Liu
Nikhil Mishra
Maximilian Sieb
Yide Shentu
Pieter Abbeel
Xi Chen
3DPC
43
5
0
13 Oct 2022
Learning Multivariate CDFs and Copulas using Tensor Factorization
Magda Amiridi
N. Sidiropoulos
25
1
0
13 Oct 2022