Efficient Neural Audio Synthesis

23 February 2018
Nal Kalchbrenner
Erich Elsen
Karen Simonyan
Seb Noury
Norman Casagrande
Edward Lockhart
Florian Stimberg
Aaron van den Oord
Sander Dieleman
Koray Kavukcuoglu
arXiv: 1802.08435

Papers citing "Efficient Neural Audio Synthesis"

50 / 472 papers shown
Prosodic Representation Learning and Contextual Sampling for Neural Text-to-Speech
S. Karlapati
Ammar Abbas
Zack Hodari
Alexis Moinet
Arnaud Joly
Panagiota Karanasou
Thomas Drugman
23
19
0
04 Nov 2020
StyleMelGAN: An Efficient High-Fidelity Adversarial Vocoder with Temporal Adaptive Normalization
Ahmed Mustafa
N. Pia
Guillaume Fuchs
22
71
0
03 Nov 2020
DeviceTTS: A Small-Footprint, Fast, Stable Network for On-Device Text-to-Speech
Zhiying Huang
Hao Li
Ming Lei
6
11
0
29 Oct 2020
PPG-based singing voice conversion with adversarial representation learning
Zhonghao Li
Benlai Tang
Xiang Yin
Yuan Wan
Linjia Xu
Chen Shen
Zejun Ma
19
37
0
28 Oct 2020
Parallel waveform synthesis based on generative adversarial networks with voicing-aware conditional discriminators
Ryuichi Yamamoto
Eunwoo Song
Min-Jae Hwang
Jae-Min Kim
27
18
0
27 Oct 2020
Unsupervised Learning of Disentangled Speech Content and Style Representation
Andros Tjandra
Ruoming Pang
Yu Zhang
Shigeki Karita
BDL
DRL
23
15
0
24 Oct 2020
Listening to Sounds of Silence for Speech Denoising
Ruilin Xu
Rundi Wu
Y. Ishiwaka
Carl Vondrick
Changxi Zheng
28
32
0
22 Oct 2020
Brain-Inspired Learning on Neuromorphic Substrates
Friedemann Zenke
Emre Neftci
38
89
0
22 Oct 2020
The NTU-AISG Text-to-speech System for Blizzard Challenge 2020
Haobo Zhang
Tingzhi Mao
Haihua Xu
Hao-Ming Huang
10
1
0
22 Oct 2020
Parallel Tacotron: Non-Autoregressive and Controllable TTS
Isaac Elias
Heiga Zen
Jonathan Shen
Yu Zhang
Ye Jia
Ron J. Weiss
Yonghui Wu
DRL
30
102
0
22 Oct 2020
An Investigation of the Relation Between Grapheme Embeddings and Pronunciation for Tacotron-based Systems
Antoine Perquin
Erica Cooper
Junichi Yamagishi
9
1
0
21 Oct 2020
The NeteaseGames System for Voice Conversion Challenge 2020 with Vector-quantization Variational Autoencoder and WaveNet
Haitong Zhang
DRL
15
4
0
15 Oct 2020
The NU Voice Conversion System for the Voice Conversion Challenge 2020: On the Effectiveness of Sequence-to-sequence Models and Autoregressive Neural Vocoders
Wen-Chin Huang
Patrick Lumban Tobing
Yi-Chiao Wu
Kazuhiro Kobayashi
T. Toda
19
8
0
09 Oct 2020
Non-Attentive Tacotron: Robust and Controllable Neural TTS Synthesis Including Unsupervised Duration Modeling
Jonathan Shen
Ye Jia
Mike Chrzanowski
Yu Zhang
Isaac Elias
Heiga Zen
Yonghui Wu
27
112
0
08 Oct 2020
VoiceGrad: Non-Parallel Any-to-Many Voice Conversion with Annealed Langevin Dynamics
Hirokazu Kameoka
Takuhiro Kaneko
Kou Tanaka
Nobukatsu Hojo
Shogo Seki
DiffM
23
21
0
06 Oct 2020
Transfer Learning from Speech Synthesis to Voice Conversion with Non-Parallel Training Data
Mingyang Zhang
Yi Zhou
Li Zhao
Haizhou Li
24
52
0
30 Sep 2020
DiffWave: A Versatile Diffusion Model for Audio Synthesis
Zhifeng Kong
Ming-Yu Liu
Jiaji Huang
Kexin Zhao
Bryan Catanzaro
DiffM
BDL
34
1,397
0
21 Sep 2020
Controllable neural text-to-speech synthesis using intuitive prosodic features
T. Raitio
Ramya Rasipuram
D. Castellani
39
66
0
14 Sep 2020
Any-to-Many Voice Conversion with Location-Relative Sequence-to-Sequence Modeling
Songxiang Liu
Yuewen Cao
Disong Wang
Xixin Wu
Xunying Liu
Helen Meng
BDL
26
88
0
06 Sep 2020
WaveGrad: Estimating Gradients for Waveform Generation
Nanxin Chen
Yu Zhang
Heiga Zen
Ron J. Weiss
Mohammad Norouzi
William Chan
DiffM
BDL
14
773
0
02 Sep 2020
Voice Conversion Challenge 2020: Intra-lingual semi-parallel and cross-lingual voice conversion
Yi Zhao
Wen-Chin Huang
Xiaohai Tian
Junichi Yamagishi
Rohan Kumar Das
Tomi Kinnunen
Zhenhua Ling
T. Toda
27
206
0
28 Aug 2020
Nonparallel Voice Conversion with Augmented Classifier Star Generative Adversarial Networks
Hirokazu Kameoka
Takuhiro Kaneko
Kou Tanaka
Nobukatsu Hojo
18
20
0
27 Aug 2020
Laughter Synthesis: Combining Seq2seq modeling with Transfer Learning
Noé Tits
Kevin El Haddad
Thierry Dutoit
12
14
0
20 Aug 2020
Textual Echo Cancellation
Shaojin Ding
Ye Jia
Ke Hu
Quan Wang
24
8
0
13 Aug 2020
Enhancing Speech Intelligibility in Text-To-Speech Synthesis using Speaking Style Conversion
D. Paul
M. Shifas
Yannis Pantazis
Y. Stylianou
6
21
0
13 Aug 2020
Prosody Learning Mechanism for Speech Synthesis System Without Text Length Limit
Zhen Zeng
Jianzong Wang
Ning Cheng
Jing Xiao
19
8
0
13 Aug 2020
Bunched LPCNet : Vocoder for Low-cost Neural Text-To-Speech Systems
Ravichander Vipperla
Sangjun Park
Kihyun Choo
Samin S. Ishtiaq
Kyoungbo Min
S. Bhattacharya
Abhinav Mehrotra
Alberto Gil C. P. Ramos
Nicholas D. Lane
26
26
0
11 Aug 2020
SpeedySpeech: Efficient Neural Speech Synthesis
Jan Vainer
Ondrej Dusek
24
42
0
09 Aug 2020
Speaker Conditional WaveRNN: Towards Universal Neural Vocoder for Unseen Speaker and Recording Conditions
D. Paul
Yannis Pantazis
Y. Stylianou
DRL
16
29
0
09 Aug 2020
An Overview of Voice Conversion and its Challenges: From Statistical Modeling to Deep Learning
Berrak Sisman
Junichi Yamagishi
Simon King
Haizhou Li
BDL
41
318
0
09 Aug 2020
Incremental Text to Speech for Neural Sequence-to-Sequence Models using Reinforcement Learning
D. Mohan
R. Lenain
Lorenzo Foglianti
Tian Huey Teh
Marlene Staib
Alexandra Torresquintero
Jiameng Gao
AI4TS
17
11
0
07 Aug 2020
DurIAN-SC: Duration Informed Attention Network based Singing Voice Conversion System
Liqiang Zhang
Chengzhu Yu
Heng Lu
Chao Weng
Chunlei Zhang
Yusong Wu
Xiang Xie
Zijin Li
Dong Yu
22
34
0
07 Aug 2020
Unsupervised Cross-Domain Singing Voice Conversion
Adam Polyak
Lior Wolf
Yossi Adi
Yaniv Taigman
20
44
0
06 Aug 2020
HooliGAN: Robust, High Quality Neural Vocoding
Ollie McCarthy
Zo Ahmed
13
14
0
06 Aug 2020
Recognition-Synthesis Based Non-Parallel Voice Conversion with Adversarial Learning
Jing-Xuan Zhang
Zhenhua Ling
Lirong Dai
15
6
0
05 Aug 2020
Expressive TTS Training with Frame and Style Reconstruction Loss
Rui Liu
Berrak Sisman
Guanglai Gao
Haizhou Li
39
73
0
04 Aug 2020
A Spectral Energy Distance for Parallel Speech Synthesis
A. Gritsenko
Tim Salimans
Rianne van den Berg
Jasper Snoek
Nal Kalchbrenner
11
70
0
03 Aug 2020
One Model, Many Languages: Meta-learning for Multilingual Text-to-Speech
Tomás Nekvinda
Ondrej Dusek
23
57
0
03 Aug 2020
Audiovisual Speech Synthesis using Tacotron2
Ahmed Hussen Abdelaziz
Anushree Prasanna Kumar
Chloe Seivwright
Gabriele Fanelli
Justin Binder
Y. Stylianou
S. Kajarekar
20
15
0
03 Aug 2020
VocGAN: A High-Fidelity Real-time Vocoder with a Hierarchically-nested Adversarial Network
Jinhyeok Yang
Junmo Lee
Young-Ik Kim
Hoonyoung Cho
Injung Kim
14
72
0
30 Jul 2020
Privacy-preserving Voice Analysis via Disentangled Representations
Ranya Aloufi
Hamed Haddadi
David E. Boyle
DRL
26
58
0
29 Jul 2020
Non-parallel Emotion Conversion using a Deep-Generative Hybrid Network and an Adversarial Pair Discriminator
Ravi Shankar
Jacob Sager
A. Venkataraman
GAN
42
18
0
25 Jul 2020
Sudo rm -rf: Efficient Networks for Universal Audio Source Separation
Efthymios Tzinis
Zhepei Wang
Paris Smaragdis
36
128
0
14 Jul 2020
Quasi-Periodic WaveNet: An Autoregressive Raw Waveform Generative Model with Pitch-dependent Dilated Convolution Neural Network
Yi-Chiao Wu
Tomoki Hayashi
Patrick Lumban Tobing
Kazuhiro Kobayashi
T. Toda
27
18
0
11 Jul 2020
DeepSinger: Singing Voice Synthesis with Data Mined From the Web
Yi Ren
Xu Tan
Tao Qin
Jian Luan
Zhou Zhao
Tie-Yan Liu
39
73
0
09 Jul 2020
Statistical Mechanical Analysis of Neural Network Pruning
Rupam Acharyya
Ankani Chattoraj
Boyu Zhang
Shouman Das
Daniel Stefankovic
32
0
0
30 Jun 2020
Denoising Diffusion Probabilistic Models
Jonathan Ho
Ajay Jain
Pieter Abbeel
DiffM
118
17,084
0
19 Jun 2020
Sparse GPU Kernels for Deep Learning
Trevor Gale
Matei A. Zaharia
C. Young
Erich Elsen
17
230
0
18 Jun 2020
A Practical Sparse Approximation for Real Time Recurrent Learning
Jacob Menick
Erich Elsen
Utku Evci
Simon Osindero
Karen Simonyan
Alex Graves
21
31
0
12 Jun 2020
Deep generative models for musical audio synthesis
M. Huzaifah
L. Wyse
27
20
0
10 Jun 2020