Generating Synthetic Audio Data for Attention-Based Speech Recognition Systems

19 December 2019

Nick Rossenbach

Papers citing "Generating Synthetic Audio Data for Attention-Based Speech Recognition Systems"

Title
Enhancing Vision-Language Compositional Understanding with Multimodal Synthetic Data | Haoxin Li, Boyang Li | CoGe | 73 0 0 | 03 Mar 2025
Can Medical Vision-Language Pre-training Succeed with Purely Synthetic Data? | Che Liu, Zhongwei Wan, Haozhe Wang, Yinda Chen, T. Qaiser, Chen Jin, Fariba Yousefi, Nikolay Burlutskiy, Rossella Arcucci | VLM, SyDa, LM&MA, MedIm | 69 2 0 | 17 Oct 2024
Contrastive Learning from Synthetic Audio Doppelgängers | Manuel Cherep, Nikhil Singh | - | 40 1 0 | 09 Jun 2024
Cross-Corpora Spoken Language Identification with Domain Diversification and Generalization | Spandan Dey, Md. Sahidullah, G. Saha | - | 21 11 0 | 10 Feb 2023
Learning Speech Emotion Representations in the Quaternion Domain | E. Guizzo, Tillman Weyde, Simone Scardapane, Danilo Comminiello | - | 21 18 0 | 05 Apr 2022
Synthesizing Dysarthric Speech Using Multi-talker TTS for Dysarthric Speech Recognition | M. Soleymanpour, Michael T. Johnson, Rahim Soleymanpour, J. Berry | - | 27 27 0 | 27 Jan 2022
Sequence-level self-learning with multiple hypotheses | K. Kumatani, Dimitrios Dimitriadis, Yashesh Gaur, R. Gmyr, Sefik Emre Eskimez, Jinyu Li, Michael Zeng | SSL | 20 1 0 | 10 Dec 2021
SynthASR: Unlocking Synthetic Data for Speech Recognition | A. Fazel, Wei Yang, Yulan Liu, Roberto Barra-Chicote, Yi Meng, Roland Maas, J. Droppo | SyDa | 13 48 0 | 14 Jun 2021
Label-Synchronous Speech-to-Text Alignment for ASR Using Forward and Backward Transformers | Yusuke Kida, Tatsuya Komatsu, M. Togami | - | 21 1 0 | 21 Apr 2021
Comparing the Benefit of Synthetic Training Data for Various Automatic Speech Recognition Architectures | Nick Rossenbach, Mohammad Zeineldeen, Benedikt Hilmes, Ralf Schluter, Hermann Ney | - | 28 12 0 | 12 Apr 2021
Using Synthetic Audio to Improve The Recognition of Out-Of-Vocabulary Words in End-To-End ASR Systems | Xianrui Zheng, Yulan Liu, Deniz Gunceler, D. Willett | - | 17 78 0 | 23 Nov 2020
Early Stage LM Integration Using Local and Global Log-Linear Combination | Wilfried Michel, Ralf Schluter, Hermann Ney | - | 11 11 0 | 20 May 2020
You Do Not Need More Data: Improving End-To-End Speech Recognition by Text-To-Speech Data Augmentation | A. Laptev, Roman Korostik, A. Svischev, A. Andrusenko, Ivan Medennikov, S. Rybin | - | 16 61 0 | 14 May 2020
Listening while Speaking: Speech Chain by Deep Learning | Andros Tjandra, S. Sakti, Satoshi Nakamura | AuLLM | 126 165 0 | 16 Jul 2017