ResearchTrend.AI
You Do Not Need More Data: Improving End-To-End Speech Recognition by Text-To-Speech Data Augmentation

14 May 2020
A. Laptev
Roman Korostik
A. Svischev
A. Andrusenko
Ivan Medennikov
S. Rybin

Papers citing "You Do Not Need More Data: Improving End-To-End Speech Recognition by Text-To-Speech Data Augmentation"

25 / 25 papers shown
Contrastive Learning from Synthetic Audio Doppelgängers
Manuel Cherep
Nikhil Singh
51
1
0
09 Jun 2024
Towards a Competitive End-to-End Speech Recognition for CHiME-6 Dinner Party Transcription
A. Andrusenko
A. Laptev
Ivan Medennikov
27
16
0
22 Apr 2020
Generating diverse and natural text-to-speech samples using a quantized fine-grained VAE and auto-regressive prosody prior
Guangzhi Sun
Yu Zhang
Ron J. Weiss
Yuan Cao
Heiga Zen
Andrew Rosenberg
Bhuvana Ramabhadran
Yonghui Wu
DiffM
46
92
0
06 Feb 2020
Generating Synthetic Audio Data for Attention-Based Speech Recognition Systems
Nick Rossenbach
Albert Zeyer
Ralf Schlüter
Hermann Ney
36
83
0
19 Dec 2019
End-to-end ASR: from Supervised to Semi-Supervised Learning with Modern Architectures
Gabriel Synnaeve
Qiantong Xu
Jacob Kahn
Tatiana Likhomanenko
Edouard Grave
Vineel Pratap
Anuroop Sriram
Vitaliy Liptchinsky
R. Collobert
SSL
AI4TS
86
245
0
19 Nov 2019
Location-Relative Attention Mechanisms For Robust Long-Form Speech Synthesis
Eric Battenberg
RJ Skerry-Ryan
Soroosh Mariooryad
Daisy Stanton
David Kao
Matt Shannon
Tom Bagby
48
114
0
23 Oct 2019
Speech Recognition with Augmented Synthesized Speech
Andrew Rosenberg
Yu Zhang
Bhuvana Ramabhadran
Ye Jia
Pedro J. Moreno
Yonghui Wu
Zelin Wu
40
127
0
25 Sep 2019
A Comparative Study on Transformer vs RNN in Speech Applications
Shigeki Karita
Nanxin Chen
Tomoki Hayashi
Takaaki Hori
Hirofumi Inaguma
...
Ryuichi Yamamoto
Xiao-fei Wang
Shinji Watanabe
Takenori Yoshimura
Wangyou Zhang
44
718
0
13 Sep 2019
Maximizing Mutual Information for Tacotron
Peng Liu
Xixin Wu
Shiyin Kang
Guangzhi Li
Dan Su
Dong Yu
44
16
0
30 Aug 2019
Effective Use of Variational Embedding Capacity in Expressive End-to-End Speech Synthesis
Eric Battenberg
Soroosh Mariooryad
Daisy Stanton
RJ Skerry-Ryan
Matt Shannon
David Kao
Tom Bagby
BDL
32
45
0
08 Jun 2019
RWTH ASR Systems for LibriSpeech: Hybrid vs Attention -- w/o Data Augmentation
Christoph Lüscher
Eugen Beck
Kazuki Irie
M. Kitza
Wilfried Michel
Albert Zeyer
Ralf Schlüter
Hermann Ney
VLM
83
234
0
08 May 2019
SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition
Daniel S. Park
William Chan
Yu Zhang
Chung-Cheng Chiu
Barret Zoph
E. D. Cubuk
Quoc V. Le
VLM
140
3,435
0
18 Apr 2019
Active and Semi-Supervised Learning in ASR: Benefits on the Acoustic and Language Models
Thomas Drugman
Janne Pylkkönen
Reinhard Kneser
10
59
0
07 Mar 2019
Training Neural Speech Recognition Systems with Synthetic Speech Augmentation
Jason Chun Lok Li
R. Gadde
Boris Ginsburg
Vitaly Lavrukhin
26
55
0
02 Nov 2018
End-to-End Feedback Loss in Speech Chain Framework via Straight-Through Estimator
Andros Tjandra
S. Sakti
Satoshi Nakamura
31
44
0
31 Oct 2018
LPCNet: Improving Neural Speech Synthesis Through Linear Prediction
J. Valin
Jan Skoglund
28
450
0
28 Oct 2018
Hierarchical Generative Modeling for Controllable Speech Synthesis
Wei-Ning Hsu
Yu Zhang
Ron J. Weiss
Heiga Zen
Yonghui Wu
...
Ye Jia
Zhiwen Chen
Jonathan Shen
Patrick Nguyen
Ruoming Pang
BDL
36
275
0
16 Oct 2018
SentencePiece: A simple and language independent subword tokenizer and detokenizer for Neural Text Processing
Taku Kudo
John Richardson
131
3,490
0
19 Aug 2018
ESPnet: End-to-End Speech Processing Toolkit
Shinji Watanabe
Takaaki Hori
Shigeki Karita
Tomoki Hayashi
Jiro Nishitoba
...
Jahn Heymann
Sanjeev Khudanpur
Nanxin Chen
Adithya Renduchintala
Tsubasa Ochiai
VLM
70
1,492
0
30 Mar 2018
Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis
Yuxuan Wang
Daisy Stanton
Yu Zhang
RJ Skerry-Ryan
Eric Battenberg
Joel Shor
Y. Xiao
Fei Ren
Ye Jia
Rif A. Saurous
54
822
0
23 Mar 2018
Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions
Jonathan Shen
Ruoming Pang
Ron J. Weiss
M. Schuster
Navdeep Jaitly
...
Yuxuan Wang
RJ Skerry-Ryan
Rif A. Saurous
Yannis Agiomyrgiannakis
Yonghui Wu
59
2,684
0
16 Dec 2017
Attention Is All You Need
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
3DV
324
129,831
0
12 Jun 2017
Tacotron: Towards End-to-End Speech Synthesis
Yuxuan Wang
RJ Skerry-Ryan
Daisy Stanton
Yonghui Wu
Ron J. Weiss
...
Samy Bengio
Quoc V. Le
Yannis Agiomyrgiannakis
R. Clark
Rif A. Saurous
120
1,817
0
29 Mar 2017
Attention-Based Models for Speech Recognition
J. Chorowski
Dzmitry Bahdanau
Dmitriy Serdyuk
Kyunghyun Cho
Yoshua Bengio
88
2,602
0
24 Jun 2015
Auto-Encoding Variational Bayes
Diederik P. Kingma
Max Welling
BDL
332
16,972
0
20 Dec 2013