arXiv:2005.07157
You Do Not Need More Data: Improving End-To-End Speech Recognition by Text-To-Speech Data Augmentation
14 May 2020
A. Laptev
Roman Korostik
A. Svischev
A. Andrusenko
Ivan Medennikov
S. Rybin
Papers citing
"You Do Not Need More Data: Improving End-To-End Speech Recognition by Text-To-Speech Data Augmentation"
25 / 25 papers shown
Contrastive Learning from Synthetic Audio Doppelgängers
Manuel Cherep
Nikhil Singh
51
1
0
09 Jun 2024
Towards a Competitive End-to-End Speech Recognition for CHiME-6 Dinner Party Transcription
A. Andrusenko
A. Laptev
Ivan Medennikov
27
16
0
22 Apr 2020
Generating diverse and natural text-to-speech samples using a quantized fine-grained VAE and auto-regressive prosody prior
Guangzhi Sun
Yu Zhang
Ron J. Weiss
Yuan Cao
Heiga Zen
Andrew Rosenberg
Bhuvana Ramabhadran
Yonghui Wu
DiffM
46
92
0
06 Feb 2020
Generating Synthetic Audio Data for Attention-Based Speech Recognition Systems
Nick Rossenbach
Albert Zeyer
Ralf Schluter
Hermann Ney
36
83
0
19 Dec 2019
End-to-end ASR: from Supervised to Semi-Supervised Learning with Modern Architectures
Gabriel Synnaeve
Qiantong Xu
Jacob Kahn
Tatiana Likhomanenko
Edouard Grave
Vineel Pratap
Anuroop Sriram
Vitaliy Liptchinsky
R. Collobert
SSL
AI4TS
86
245
0
19 Nov 2019
Location-Relative Attention Mechanisms For Robust Long-Form Speech Synthesis
Eric Battenberg
RJ Skerry-Ryan
Soroosh Mariooryad
Daisy Stanton
David Kao
Matt Shannon
Tom Bagby
48
114
0
23 Oct 2019
Speech Recognition with Augmented Synthesized Speech
Andrew Rosenberg
Yu Zhang
Bhuvana Ramabhadran
Ye Jia
Pedro J. Moreno
Yonghui Wu
Zelin Wu
40
127
0
25 Sep 2019
A Comparative Study on Transformer vs RNN in Speech Applications
Shigeki Karita
Nanxin Chen
Tomoki Hayashi
Takaaki Hori
Hirofumi Inaguma
...
Ryuichi Yamamoto
Xiao-fei Wang
Shinji Watanabe
Takenori Yoshimura
Wangyou Zhang
44
718
0
13 Sep 2019
Maximizing Mutual Information for Tacotron
Peng Liu
Xixin Wu
Shiyin Kang
Guangzhi Li
Dan Su
Dong Yu
44
16
0
30 Aug 2019
Effective Use of Variational Embedding Capacity in Expressive End-to-End Speech Synthesis
Eric Battenberg
Soroosh Mariooryad
Daisy Stanton
RJ Skerry-Ryan
Matt Shannon
David Kao
Tom Bagby
BDL
32
45
0
08 Jun 2019
RWTH ASR Systems for LibriSpeech: Hybrid vs Attention -- w/o Data Augmentation
Christoph Luscher
Eugen Beck
Kazuki Irie
M. Kitza
Wilfried Michel
Albert Zeyer
Ralf Schluter
Hermann Ney
VLM
83
234
0
08 May 2019
SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition
Daniel S. Park
William Chan
Yu Zhang
Chung-Cheng Chiu
Barret Zoph
E. D. Cubuk
Quoc V. Le
VLM
140
3,435
0
18 Apr 2019
Active and Semi-Supervised Learning in ASR: Benefits on the Acoustic and Language Models
Thomas Drugman
Janne Pylkkönen
Reinhard Kneser
10
59
0
07 Mar 2019
Training Neural Speech Recognition Systems with Synthetic Speech Augmentation
Jason Chun Lok Li
R. Gadde
Boris Ginsburg
Vitaly Lavrukhin
26
55
0
02 Nov 2018
End-to-End Feedback Loss in Speech Chain Framework via Straight-Through Estimator
Andros Tjandra
S. Sakti
Satoshi Nakamura
31
44
0
31 Oct 2018
LPCNet: Improving Neural Speech Synthesis Through Linear Prediction
J. Valin
Jan Skoglund
28
450
0
28 Oct 2018
Hierarchical Generative Modeling for Controllable Speech Synthesis
Wei-Ning Hsu
Yu Zhang
Ron J. Weiss
Heiga Zen
Yonghui Wu
...
Ye Jia
Zhiwen Chen
Jonathan Shen
Patrick Nguyen
Ruoming Pang
BDL
36
275
0
16 Oct 2018
SentencePiece: A simple and language independent subword tokenizer and detokenizer for Neural Text Processing
Taku Kudo
John Richardson
131
3,490
0
19 Aug 2018
ESPnet: End-to-End Speech Processing Toolkit
Shinji Watanabe
Takaaki Hori
Shigeki Karita
Tomoki Hayashi
Jiro Nishitoba
...
Jahn Heymann
Sanjeev Khudanpur
Nanxin Chen
Adithya Renduchintala
Tsubasa Ochiai
VLM
70
1,492
0
30 Mar 2018
Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis
Yuxuan Wang
Daisy Stanton
Yu Zhang
RJ Skerry-Ryan
Eric Battenberg
Joel Shor
Y. Xiao
Fei Ren
Ye Jia
Rif A. Saurous
54
822
0
23 Mar 2018
Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions
Jonathan Shen
Ruoming Pang
Ron J. Weiss
M. Schuster
Navdeep Jaitly
...
Yuxuan Wang
RJ Skerry-Ryan
Rif A. Saurous
Yannis Agiomyrgiannakis
Yonghui Wu
59
2,684
0
16 Dec 2017
Attention Is All You Need
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
3DV
324
129,831
0
12 Jun 2017
Tacotron: Towards End-to-End Speech Synthesis
Yuxuan Wang
RJ Skerry-Ryan
Daisy Stanton
Yonghui Wu
Ron J. Weiss
...
Samy Bengio
Quoc V. Le
Yannis Agiomyrgiannakis
R. Clark
Rif A. Saurous
120
1,817
0
29 Mar 2017
Attention-Based Models for Speech Recognition
J. Chorowski
Dzmitry Bahdanau
Dmitriy Serdyuk
Kyunghyun Cho
Yoshua Bengio
88
2,602
0
24 Jun 2015
Auto-Encoding Variational Bayes
Diederik P. Kingma
Max Welling
BDL
332
16,972
0
20 Dec 2013