Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1802.08435
Cited By
Efficient Neural Audio Synthesis
23 February 2018
Nal Kalchbrenner
Erich Elsen
Karen Simonyan
Seb Noury
Norman Casagrande
Edward Lockhart
Florian Stimberg
Aaron van den Oord
Sander Dieleman
Koray Kavukcuoglu
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Efficient Neural Audio Synthesis"
50 / 472 papers shown
Title
Neural Feature Predictor and Discriminative Residual Coding for Low-Bitrate Speech Coding
Haici Yang
Wootaek Lim
Minje Kim
24
9
0
04 Nov 2022
Iterative autoregression: a novel trick to improve your low-latency speech enhancement model
Pavel Andreev
Nicholas Babaev
Azat Saginbaev
Ivan Shchekotov
Aibek Alanov
29
4
0
03 Nov 2022
SIMD-size aware weight regularization for fast neural vocoding on CPU
Hiroki Kanagawa
Yusuke Ijima
16
0
0
02 Nov 2022
Robust MelGAN: A robust universal neural vocoder for high-fidelity TTS
Kun Song
Jian Cong
Xinsheng Wang
Yongmao Zhang
Linfu Xie
Ning Jiang
Haiying Wu
35
0
0
31 Oct 2022
Towards zero-shot Text-based voice editing using acoustic context conditioning, utterance embeddings, and reference encoders
Jason Fong
Yun Wang
Prabhav Agrawal
Vimal Manohar
Jilong Wu
Thilo Kohler
Qing He
23
0
0
28 Oct 2022
Lightweight and High-Fidelity End-to-End Text-to-Speech with Multi-Band Generation and Inverse Short-Time Fourier Transform
Masaya Kawamura
Yuma Shirahata
Ryuichi Yamamoto
Kentaro Tachibana
32
15
0
28 Oct 2022
Residual Adapters for Few-Shot Text-to-Speech Speaker Adaptation
Nobuyuki Morioka
Heiga Zen
Nanxin Chen
Yu Zhang
Yifan Ding
39
16
0
28 Oct 2022
Streaming Parrotron for on-device speech-to-speech conversion
Oleg Rybakov
Fadi Biadsy
Xia Zhang
Liyang Jiang
Phoenix Meadowlark
Shivani Agrawal
34
3
0
25 Oct 2022
High Fidelity Neural Audio Compression
Alexandre Défossez
Jade Copet
Gabriel Synnaeve
Yossi Adi
35
607
0
24 Oct 2022
Perfectly Secure Steganography Using Minimum Entropy Coupling
Christian Schroeder de Witt
Samuel Sokota
J. Zico Kolter
Jakob N. Foerster
Martin Strohmeier
19
33
0
24 Oct 2022
Efficiently Trained Low-Resource Mongolian Text-to-Speech System Based On FullConv-TTS
Ziqi Liang
36
0
0
24 Oct 2022
HiFi-WaveGAN: Generative Adversarial Network with Auxiliary Spectrogram-Phase Loss for High-Fidelity Singing Voice Generation
Chunhui Wang
Chang Zeng
Jun Chen
Xingji He
54
7
0
23 Oct 2022
Adaptive re-calibration of channel-wise features for Adversarial Audio Classification
Vardhan Dongre
Abhinav Thimma Reddy
Nikhitha Reddeddy
AAML
21
0
0
21 Oct 2022
Robust One-Shot Singing Voice Conversion
Naoya Takahashi
M. Singh
Yuki Mitsufuji
DiffM
30
8
0
20 Oct 2022
Hierarchical Diffusion Models for Singing Voice Neural Vocoder
Naoya Takahashi
Mayank Kumar
Singh
Yuki Mitsufuji
DiffM
29
16
0
14 Oct 2022
WaveFit: An Iterative and Non-autoregressive Neural Vocoder based on Fixed-Point Iteration
Yuma Koizumi
Kohei Yatabe
Heiga Zen
M. Bacchiani
DiffM
49
29
0
03 Oct 2022
The Chamber Ensemble Generator: Limitless High-Quality MIR Data via Generative Modeling
Yusong Wu
Josh Gardner
Ethan Manilow
Ian Simon
Curtis Hawthorne
Jesse Engel
40
10
0
28 Sep 2022
AutoLV: Automatic Lecture Video Generator
Wen Wang
Yang Song
Sanjay Jha
VGen
29
3
0
19 Sep 2022
Training Recipe for N:M Structured Sparsity with Decaying Pruning Mask
Sheng-Chun Kao
Amir Yazdanbakhsh
Suvinay Subramanian
Shivani Agrawal
Utku Evci
T. Krishna
50
12
0
15 Sep 2022
AudioLM: a Language Modeling Approach to Audio Generation
Zalan Borsos
Raphaël Marinier
Damien Vincent
Eugene Kharitonov
Olivier Pietquin
...
Dominik Roblek
O. Teboul
David Grangier
Marco Tagliasacchi
Neil Zeghidour
AuLLM
73
573
0
07 Sep 2022
Diffusion Models: A Comprehensive Survey of Methods and Applications
Ling Yang
Zhilong Zhang
Yingxia Shao
Shenda Hong
Runsheng Xu
Yue Zhao
Wentao Zhang
Tengjiao Wang
Ming-Hsuan Yang
DiffM
MedIm
226
1,314
0
02 Sep 2022
Training Text-To-Speech Systems From Synthetic Data: A Practical Approach For Accent Transfer Tasks
L. Finkelstein
Heiga Zen
Norman Casagrande
Chun-an Chan
Ye Jia
...
Jonathan Shen
V. Wan
Yu Zhang
Yonghui Wu
R. Clark
25
9
0
28 Aug 2022
Mel Spectrogram Inversion with Stable Pitch
Bruno Di Giorgi
M. Levy
Richard Sharp
28
6
0
26 Aug 2022
Deepfake: Definitions, Performance Metrics and Standards, Datasets and Benchmarks, and a Meta-Review
Enes ALTUNCU
V. N. Franqueira
Shujun Li
28
11
0
21 Aug 2022
An Initial Investigation for Detecting Vocoder Fingerprints of Fake Audio
Xin Yan
Jiangyan Yi
J. Tao
Chenglong Wang
Haoxin Ma
Tao Wang
Shiming Wang
Ruibo Fu
22
26
0
20 Aug 2022
Differentiable WORLD Synthesizer-based Neural Vocoder With Application To End-To-End Audio Style Transfer
S. Nercessian
10
9
0
15 Aug 2022
Towards Parametric Speech Synthesis Using Gaussian-Markov Model of Spectral Envelope and Wavelet-Based Decomposition of F0
M. S. Al-Radhi
Tamás Gábor Csapó
Csaba Zainkó
Géza Németh
14
1
0
15 Aug 2022
DDSP-based Singing Vocoders: A New Subtractive-based Synthesizer and A Comprehensive Evaluation
Da-Yi Wu
Wen-Yi Hsiao
Fu-Rong Yang
Oscar D. Friedman
Warren Jackson
Scott Bruzenak
Yi-Wen Liu
Yi-Hsuan Yang
DiffM
34
24
0
09 Aug 2022
A Cyclical Approach to Synthetic and Natural Speech Mismatch Refinement of Neural Post-filter for Low-cost Text-to-speech System
Yi-Chiao Wu
Patrick Lumban Tobing
Kazuki Yasuhara
Noriyuki Matsunaga
Yamato Ohtani
T. Toda
42
0
0
13 Jul 2022
WeSinger 2: Fully Parallel Singing Voice Synthesis via Multi-Singer Conditional Adversarial Training
Zewang Zhang
Yibin Zheng
Xinhui Li
Li Lu
DiffM
31
11
0
05 Jul 2022
Cross-speaker Emotion Transfer Based On Prosody Compensation for End-to-End Speech Synthesis
Tao Li
Xinsheng Wang
Qicong Xie
Zhichao Wang
Ming Jiang
Linfu Xie
35
15
0
04 Jul 2022
Towards Error-Resilient Neural Speech Coding
Huaying Xue
Xiulian Peng
Xue Jiang
Yan Lu
29
7
0
03 Jul 2022
R-MelNet: Reduced Mel-Spectral Modeling for Neural TTS
Kyle Kastner
Aaron Courville
35
0
0
30 Jun 2022
Expressive, Variable, and Controllable Duration Modelling in TTS
Ammar Abbas
Thomas Merritt
Alexis Moinet
S. Karlapati
Ewa Muszyñska
Simon Slangen
Elia Gatti
Thomas Drugman
33
10
0
28 Jun 2022
Attack Agnostic Dataset: Towards Generalization and Stabilization of Audio DeepFake Detection
Piotr Kawa
Marcin Plata
P. Syga
AAML
49
23
0
27 Jun 2022
WOLONet: Wave Outlooker for Efficient and High Fidelity Speech Synthesis
Yi Wang
Yi Si
28
0
0
20 Jun 2022
The State of Sparse Training in Deep Reinforcement Learning
L. Graesser
Utku Evci
Erich Elsen
Pablo Samuel Castro
OffRL
14
34
0
17 Jun 2022
NatiQ: An End-to-end Text-to-Speech System for Arabic
Ahmed Abdelali
Nadir Durrani
C. Demiroğlu
Fahim Dalvi
Hamdy Mubarak
Kareem Darwish
23
14
0
15 Jun 2022
Estimating the Optimal Covariance with Imperfect Mean in Diffusion Probabilistic Models
Fan Bao
Chongxuan Li
Jiacheng Sun
Jun Zhu
Bo Zhang
DiffM
36
73
0
15 Jun 2022
LPCSE: Neural Speech Enhancement through Linear Predictive Coding
Yang Liu
Na Tang
Xia Chu
Yang Yang
Jun Wang
31
1
0
14 Jun 2022
Adversarial Audio Synthesis with Complex-valued Polynomial Networks
Yongtao Wu
Grigorios G. Chrysos
V. Cevher
DiffM
27
4
0
14 Jun 2022
Multi-instrument Music Synthesis with Spectrogram Diffusion
Curtis Hawthorne
Ian Simon
Adam Roberts
Neil Zeghidour
Josh Gardner
Ethan Manilow
Jesse Engel
DiffM
23
49
0
11 Jun 2022
A Novel Chinese Dialect TTS Frontend with Non-Autoregressive Neural Machine Translation
Junhui Zhang
Wudi Bao
Junjie Pan
Xiang Yin
Zejun Ma
19
2
0
10 Jun 2022
BigVGAN: A Universal Neural Vocoder with Large-Scale Training
Sang-gil Lee
Ming-Yu Liu
Boris Ginsburg
Bryan Catanzaro
Sung-Hoon Yoon
30
230
0
09 Jun 2022
Unified Source-Filter GAN with Harmonic-plus-Noise Source Excitation Generation
Reo Yoneyama
Yi-Chiao Wu
T. Toda
30
14
0
12 May 2022
Real-Time Packet Loss Concealment With Mixed Generative and Predictive Model
J. Valin
Ahmed Mustafa
Christopher Montgomery
Timothy B. Terriberry
Michael Klingbeil
Paris Smaragdis
A. Krishnaswamy
27
18
0
11 May 2022
NaturalSpeech: End-to-End Text to Speech Synthesis with Human-Level Quality
Xu Tan
Jiawei Chen
Haohe Liu
Jian Cong
Chen Zhang
...
Lei He
Frank Soong
Tao Qin
Sheng Zhao
Tie-Yan Liu
44
213
0
09 May 2022
Green Accelerated Hoeffding Tree
E. García-Martín
Albert Bifet
Niklas Lavesson
Rikard König
Henrik Linusson
26
7
0
06 May 2022
Regotron: Regularizing the Tacotron2 architecture via monotonic alignment loss
Efthymios Georgiou
Kosmas Kritsis
Georgios Paraskevopoulos
Athanasios Katsamanis
Vassilis Katsouros
Alexandros Potamianos
23
3
0
28 Apr 2022
Parallel Synthesis for Autoregressive Speech Generation
Po-Chun Hsu
Da-Rong Liu
Andy T. Liu
Hung-yi Lee
42
5
0
25 Apr 2022
Previous
1
2
3
4
5
6
...
8
9
10
Next