ResearchTrend.AI
arXiv: 2104.02133
SpeechStew: Simply Mix All Available Speech Recognition Data to Train One Large Neural Network

5 April 2021
William Chan
Daniel S. Park
Chris A. Lee
Yu Zhang
Quoc V. Le
Mohammad Norouzi
    AI4TS

Papers citing "SpeechStew: Simply Mix All Available Speech Recognition Data to Train One Large Neural Network"

34 / 34 papers shown
VoiceShop: A Unified Speech-to-Speech Framework for Identity-Preserving Zero-Shot Voice Editing
Philip Anastassiou
Zhenyu Tang
Kainan Peng
Dongya Jia
Jiaxin Li
Ming Tu
Yuping Wang
Yuxuan Wang
Mingbo Ma
42
4
0
10 Apr 2024
Multi-Stage Multi-Modal Pre-Training for Automatic Speech Recognition
Yash Jain
David M. Chan
Pranav Dheram
Aparna Khare
Olabanji Shonibare
Venkatesh Ravichandran
Shalini Ghosh
40
2
0
28 Mar 2024
RADIA -- Radio Advertisement Detection with Intelligent Analytics
Jorge Álvarez
J. C. Armenteros
Camilo Torrón
Miguel Ortega-Martín
Alfonso Ardoiz
...
Íñigo Galdeano
Ignacio Garrido
Adrián Alonso
Fernando Bayón
Oleg Vorontsov
26
0
0
06 Mar 2024
Improved Long-Form Speech Recognition by Jointly Modeling the Primary and Non-primary Speakers
Guru Prakash Arumugam
Shuo-yiin Chang
Tara N. Sainath
Rohit Prabhavalkar
Quan Wang
Shaan Bijwadia
29
3
0
18 Dec 2023
Adaptation of Whisper models to child speech recognition
Rishabh Jain
Andrei Barcovschi
Mariam Yiwere
Peter Corcoran
H. Cucu
16
30
0
24 Jul 2023
DeCoR: Defy Knowledge Forgetting by Predicting Earlier Audio Codes
Xilin Jiang
Yinghao Aaron Li
N. Mesgarani
CLL
24
1
0
29 May 2023
AfroDigits: A Community-Driven Spoken Digit Dataset for African Languages
Chris C. Emezue
Sanchit Gandhi
Lewis Tunstall
Abubakar Abid
Josh Meyer
...
Douwe Kiela
Yacine Jernite
Julien Chaumond
Merve Noyan
Omar Sanseviero
33
2
0
22 Mar 2023
Robust Knowledge Distillation from RNN-T Models With Noisy Training Labels Using Full-Sum Loss
Mohammad Zeineldeen
Kartik Audhkhasi
M. Baskar
Bhuvana Ramabhadran
24
2
0
10 Mar 2023
Efficient Domain Adaptation for Speech Foundation Models
Bo-wen Li
DongSeon Hwang
Zhouyuan Huo
Junwen Bai
Guru Prakash
...
K. Sim
Yu Zhang
Wei Han
Trevor Strohman
F. Beaufays
AI4CE
44
23
0
03 Feb 2023
Sample-Efficient Unsupervised Domain Adaptation of Speech Recognition Systems A case study for Modern Greek
Georgios Paraskevopoulos
Theodoros Kouzelis
Georgios Rouvalis
Athanasios Katsamanis
V. Katsouros
Alexandros Potamianos
VLM
25
7
0
31 Dec 2022
Improved Speech Pre-Training with Supervision-Enhanced Acoustic Unit
Pengcheng Li
Genshun Wan
Fenglin Ding
Hang Chen
Jianqing Gao
Jia-Yu Pan
Cong Liu
SSL
27
1
0
07 Dec 2022
Progressive Multi-Scale Self-Supervised Learning for Speech Recognition
Genshun Wan
Tan Liu
Hang Chen
Jia-Yu Pan
Cong Liu
Z. Ye
SSL
18
0
0
07 Dec 2022
Improved Self-Supervised Multilingual Speech Representation Learning Combined with Auxiliary Language Information
Fenglin Ding
Genshun Wan
Pengcheng Li
Jia-Yu Pan
Cong Liu
SSL
25
1
0
07 Dec 2022
Learning the joint distribution of two sequences using little or no paired data
Soroosh Mariooryad
Matt Shannon
Siyuan Ma
Tom Bagby
David Kao
Daisy Stanton
Eric Battenberg
RJ Skerry-Ryan
17
2
0
06 Dec 2022
Time-Domain Speech Enhancement for Robust Automatic Speech Recognition
Yufeng Yang
Ashutosh Pandey
DeLiang Wang
24
8
0
24 Oct 2022
G-Augment: Searching for the Meta-Structure of Data Augmentation Policies for ASR
Gary Wang
Ekin D. Cubuk
Andrew Rosenberg
Shuyang Cheng
Ron J. Weiss
Bhuvana Ramabhadran
Pedro J. Moreno
Quoc V. Le
Daniel S. Park
30
1
0
19 Oct 2022
A Comparison of Transformer, Convolutional, and Recurrent Neural Networks on Phoneme Recognition
Kyuhong Shim
Wonyong Sung
25
2
0
01 Oct 2022
u-HuBERT: Unified Mixed-Modal Speech Pretraining And Zero-Shot Transfer to Unlabeled Modality
Wei-Ning Hsu
Bowen Shi
SSL
VLM
27
41
0
14 Jul 2022
Boosting Cross-Domain Speech Recognition with Self-Supervision
Hanjing Zhu
Gaofeng Cheng
Jindong Wang
Wenxin Hou
Pengyuan Zhang
Yonghong Yan
19
13
0
20 Jun 2022
MiCS: Near-linear Scaling for Training Gigantic Model on Public Cloud
Zhen Zhang
Shuai Zheng
Yida Wang
Justin Chiu
George Karypis
Trishul Chilimbi
Mu Li
Xin Jin
19
39
0
30 Apr 2022
3M: Multi-loss, Multi-path and Multi-level Neural Networks for speech recognition
Zhao You
Shulin Feng
Dan Su
Dong Yu
22
9
0
07 Apr 2022
Enhanced Direct Speech-to-Speech Translation Using Self-supervised Pre-training and Data Augmentation
Sravya Popuri
Peng-Jen Chen
Changhan Wang
J. Pino
Yossi Adi
Jiatao Gu
Wei-Ning Hsu
Ann Lee
28
56
0
06 Apr 2022
Similarity and Content-based Phonetic Self Attention for Speech Recognition
Kyuhong Shim
Wonyong Sung
18
7
0
19 Mar 2022
WavLM: Large-Scale Self-Supervised Pre-Training for Full Stack Speech Processing
Sanyuan Chen
Chengyi Wang
Zhengyang Chen
Yu-Huan Wu
Shujie Liu
...
Yao Qian
Jian Wu
Michael Zeng
Xiangzhan Yu
Furu Wei
SSL
115
1,704
0
26 Oct 2021
SLAM: A Unified Encoder for Speech and Language Modeling via Speech-Text Joint Pre-Training
Ankur Bapna
Yu-An Chung
Na Wu
Anmol Gulati
Ye Jia
J. Clark
Melvin Johnson
Jason Riesa
Alexis Conneau
Yu Zhang
VLM
61
94
0
20 Oct 2021
Continual learning using lattice-free MMI for speech recognition
Hossein Hadian
Arsenii Gorin
CLL
18
1
0
13 Oct 2021
An Exploration of Self-Supervised Pretrained Representations for End-to-End Speech Recognition
Xuankai Chang
Takashi Maekaku
Pengcheng Guo
Jing Shi
Yen-Ju Lu
...
Tianzi Wang
Shu-Wen Yang
Yu Tsao
Hung-yi Lee
Shinji Watanabe
SSL
AI4TS
24
81
0
09 Oct 2021
Spell my name: keyword boosted speech recognition
Namkyu Jung
Geon-min Kim
Joon Son Chung
51
13
0
06 Oct 2021
Scalable Data Annotation Pipeline for High-Quality Large Speech Datasets Development
Mingkuan Liu
Chi Zhang
Hua Xing
C. Feng
Mon-Chu Chen
Judith Bishop
Grace Ngapo
27
3
0
01 Sep 2021
Injecting Text in Self-Supervised Speech Pretraining
Zhehuai Chen
Yu Zhang
Andrew Rosenberg
Bhuvana Ramabhadran
Gary Wang
Pedro J. Moreno
SSL
25
36
0
27 Aug 2021
A deep convolutional neural network that is invariant to time rescaling
Brandon G. Jacques
Zoran Tiganj
Aakash Sarkar
Marc W Howard
P. Sederberg
AI4TS
21
7
0
09 Jul 2021
Transformer Language Models with LSTM-based Cross-utterance Information Representation
G. Sun
C. Zhang
P. Woodland
76
32
0
12 Feb 2021
Pushing the Limits of Semi-Supervised Learning for Automatic Speech Recognition
Yu Zhang
James Qin
Daniel S. Park
Wei Han
Chung-Cheng Chiu
Ruoming Pang
Quoc V. Le
Yonghui Wu
VLM
SSL
146
308
0
20 Oct 2020
Scaling Laws for Neural Language Models
Jared Kaplan
Sam McCandlish
T. Henighan
Tom B. Brown
B. Chess
R. Child
Scott Gray
Alec Radford
Jeff Wu
Dario Amodei
264
4,489
0
23 Jan 2020