Location-Relative Attention Mechanisms For Robust Long-Form Speech Synthesis

23 October 2019
Eric Battenberg
RJ Skerry-Ryan
Soroosh Mariooryad
Daisy Stanton
David Kao
Matt Shannon
Tom Bagby

Papers citing "Location-Relative Attention Mechanisms For Robust Long-Form Speech Synthesis"

26 / 76 papers shown
Fine-grained Emotion Strength Transfer, Control and Prediction for Emotional Speech Synthesis
Yinjiao Lei
Shan Yang
Lei Xie
25
55
0
17 Nov 2020
Pretraining Strategies, Waveform Model Choice, and Acoustic Configurations for Multi-Speaker End-to-End Speech Synthesis
Erica Cooper
Xin Wang
Yi Zhao
Yusuke Yasuda
Junichi Yamagishi
SyDa
6
3
0
10 Nov 2020
Wave-Tacotron: Spectrogram-free end-to-end text-to-speech synthesis
Ron J. Weiss
RJ Skerry-Ryan
Eric Battenberg
Soroosh Mariooryad
Diederik P. Kingma
24
97
0
06 Nov 2020
FeatherTTS: Robust and Efficient attention based Neural TTS
Qiao Tian
Zewang Zhang
Chao-Jung Liu
Heng Lu
Linghui Chen
Bin Wei
P. He
Shan Liu
15
4
0
02 Nov 2020
PPG-based singing voice conversion with adversarial representation learning
Zhonghao Li
Benlai Tang
Xiang Yin
Yuan Wan
Linjia Xu
Chen Shen
Zejun Ma
16
37
0
28 Oct 2020
AISHELL-3: A Multi-speaker Mandarin TTS Corpus and the Baselines
Yao Shi
Hui Bu
Xin Xu
Shaojing Zhang
Ming Li
30
219
0
22 Oct 2020
Parallel Tacotron: Non-Autoregressive and Controllable TTS
Isaac Elias
Heiga Zen
Jonathan Shen
Yu Zhang
Ye Jia
Ron J. Weiss
Yonghui Wu
DRL
19
102
0
22 Oct 2020
End-to-End Text-to-Speech using Latent Duration based on VQ-VAE
Yusuke Yasuda
Xin Wang
Junichi Yamagishi
23
16
0
19 Oct 2020
Non-Attentive Tacotron: Robust and Controllable Neural TTS Synthesis Including Unsupervised Duration Modeling
Jonathan Shen
Ye Jia
Mike Chrzanowski
Yu Zhang
Isaac Elias
Heiga Zen
Yonghui Wu
19
112
0
08 Oct 2020
Any-to-Many Voice Conversion with Location-Relative Sequence-to-Sequence Modeling
Songxiang Liu
Yuewen Cao
Disong Wang
Xixin Wu
Xunying Liu
Helen Meng
BDL
26
88
0
06 Sep 2020
WaveGrad: Estimating Gradients for Waveform Generation
Nanxin Chen
Yu Zhang
Heiga Zen
Ron J. Weiss
Mohammad Norouzi
William Chan
DiffM
BDL
14
771
0
02 Sep 2020
Prosody Learning Mechanism for Speech Synthesis System Without Text Length Limit
Zhen Zeng
Jianzong Wang
Ning Cheng
Jing Xiao
13
8
0
13 Aug 2020
Data Efficient Voice Cloning from Noisy Samples with Domain Adversarial Training
Jian Cong
Shan Yang
Lei Xie
Guoqiao Yu
Guanglu Wan
24
30
0
10 Aug 2020
Incremental Text to Speech for Neural Sequence-to-Sequence Models using Reinforcement Learning
D. Mohan
R. Lenain
Lorenzo Foglianti
Tian Huey Teh
Marlene Staib
Alexandra Torresquintero
Jiameng Gao
AI4TS
9
11
0
07 Aug 2020
Exploiting Deep Sentential Context for Expressive End-to-End Speech Synthesis
Fengyu Yang
Shan Yang
Qinghua Wu
Yujun Wang
Lei Xie
31
5
0
03 Aug 2020
End-to-End Adversarial Text-to-Speech
Jeff Donahue
Sander Dieleman
Mikolaj Binkowski
Erich Elsen
Karen Simonyan
17
185
0
05 Jun 2020
Glow-TTS: A Generative Flow for Text-to-Speech via Monotonic Alignment Search
Jaehyeon Kim
Sungwon Kim
Jungil Kong
Sungroh Yoon
54
475
0
22 May 2020
Investigation of learning abilities on linguistic features in sequence-to-sequence text-to-speech synthesis
Yusuke Yasuda
Xin Wang
Junichi Yamagishi
AI4TS
14
31
0
20 May 2020
Improving Accent Conversion with Reference Encoder and End-To-End Text-To-Speech
Wenjie Li
Benlai Tang
Xiang Yin
Yushi Zhao
Wei Li
Kang Wang
Hao Huang
Yuxuan Wang
Zejun Ma
6
13
0
19 May 2020
JDI-T: Jointly trained Duration Informed Transformer for Text-To-Speech without Explicit Alignment
D. Lim
Won Jang
Gyeonghwan O
Heayoung Park
Bongwan Kim
Jaesam Yoon
11
36
0
15 May 2020
You Do Not Need More Data: Improving End-To-End Speech Recognition by Text-To-Speech Data Augmentation
A. Laptev
Roman Korostik
A. Svischev
A. Andrusenko
Ivan Medennikov
S. Rybin
16
61
0
14 May 2020
Cotatron: Transcription-Guided Speech Encoder for Any-to-Many Voice Conversion without Parallel Data
Seung-won Park
Doo-young Kim
Myun-chul Joe
10
40
0
07 May 2020
Adversarial Feature Learning and Unsupervised Clustering based Speech Synthesis for Found Data with Acoustic and Textual Noise
Shan Yang
Yuxuan Wang
Lei Xie
6
9
0
28 Apr 2020
ByteSing: A Chinese Singing Voice Synthesis System Using Duration Allocated Encoder-Decoder Acoustic Models and WaveRNN Vocoders
Yu Gu
Xiang Yin
Yonghui Rao
Yuan Wan
Benlai Tang
Yang Zhang
Jitong Chen
Yuxuan Wang
Zejun Ma
12
70
0
23 Apr 2020
Semi-Supervised Generative Modeling for Controllable Speech Synthesis
Raza Habib
Soroosh Mariooryad
Matt Shannon
Eric Battenberg
RJ Skerry-Ryan
Daisy Stanton
David Kao
Tom Bagby
BDL
13
48
0
03 Oct 2019
Effective Use of Variational Embedding Capacity in Expressive End-to-End Speech Synthesis
Eric Battenberg
Soroosh Mariooryad
Daisy Stanton
RJ Skerry-Ryan
Matt Shannon
David Kao
Tom Bagby
BDL
14
45
0
08 Jun 2019