1912.05533
Cited By
SpecAugment on Large Scale Datasets
11 December 2019
Daniel S. Park
Yu Zhang
Chung-Cheng Chiu
Youzheng Chen
Bo-wen Li
William Chan
Quoc V. Le
Yonghui Wu
Papers citing
"SpecAugment on Large Scale Datasets"
34 / 34 papers shown
Title
DCIM-AVSR : Efficient Audio-Visual Speech Recognition via Dual Conformer Interaction Module
Xinyu Wang
Qian Wang
Haolin Huang
Yu Fang
Mengjie Xu
Qian Wang
36
0
0
31 Aug 2024
Investigating the Effect of Label Topology and Training Criterion on ASR Performance and Alignment Quality
Tina Raissi
Christoph Luscher
Simon Berger
Ralf Schluter
Hermann Ney
40
2
0
16 Jul 2024
Tailored Design of Audio-Visual Speech Recognition Models using Branchformers
David Gimeno-Gómez
Carlos David Martínez Hinarejos
96
2
0
09 Jul 2024
Multilingual Audio-Visual Speech Recognition with Hybrid CTC/RNN-T Fast Conformer
Maxime Burchi
Krishna C. Puvvada
Jagadeesh Balam
Boris Ginsburg
Radu Timofte
44
8
0
14 Mar 2024
Self-Supervised Learning for Few-Shot Bird Sound Classification
Ilyass Moummad
Romain Serizel
Nicolas Farrugia
SSL
15
9
0
25 Dec 2023
Efficiency-oriented approaches for self-supervised speech representation learning
Luis Lugo
Valentin Vielzeuf
SSL
31
1
0
18 Dec 2023
Beyond Universal Transformer: block reusing with adaptor in Transformer for automatic speech recognition
Haoyu Tang
Zhaoyi Liu
Chang Zeng
Xinfeng Li
34
1
0
23 Mar 2023
Visual Transformers for Primates Classification and Covid Detection
Steffen Illium
Robert Muller
Andreas Sedlmeier
Claudia Linnhoff-Popien
38
11
0
20 Dec 2022
G-Augment: Searching for the Meta-Structure of Data Augmentation Policies for ASR
Gary Wang
Ekin D. Cubuk
Andrew Rosenberg
Shuyang Cheng
Ron J. Weiss
Bhuvana Ramabhadran
Pedro J. Moreno
Quoc V. Le
Daniel S. Park
30
1
0
19 Oct 2022
A Policy-based Approach to the SpecAugment Method for Low Resource E2E ASR
Rui Li
Guodong Ma
Dexin Zhao
Ranran Zeng
Xiaoyu Li
Haolin Huang
29
2
0
16 Oct 2022
A Comparison of Transformer, Convolutional, and Recurrent Neural Networks on Phoneme Recognition
Kyuhong Shim
Wonyong Sung
25
2
0
01 Oct 2022
Improving Mandarin Speech Recogntion with Block-augmented Transformer
Xiaoming Ren
Huifeng Zhu
Liuwei Wei
Minghui Wu
Jie Hao
38
9
0
24 Jul 2022
Improving Rare Word Recognition with LM-aware MWER Training
Weiran Wang
Tongzhou Chen
Tara N. Sainath
Ehsan Variani
Rohit Prabhavalkar
...
S. Mavandadi
Cal Peyser
Trevor Strohman
Yanzhang He
David Rybach
KELM
40
13
0
15 Apr 2022
Improving the fusion of acoustic and text representations in RNN-T
Chao Zhang
Bo-wen Li
Zhiyun Lu
Tara N. Sainath
Shuo-yiin Chang
AI4CE
43
12
0
25 Jan 2022
SPIRAL: Self-supervised Perturbation-Invariant Representation Learning for Speech Pre-Training
Wenyong Huang
Zhenhe Zhang
Y. Yeung
Xin Jiang
Qun Liu
38
23
0
25 Jan 2022
Multi-turn RNN-T for streaming recognition of multi-party speech
Ilya Sklyar
A. Piunova
Xianrui Zheng
Yulan Liu
24
22
0
19 Dec 2021
PM-MMUT: Boosted Phone-Mask Data Augmentation using Multi-Modeling Unit Training for Phonetic-Reduction-Robust E2E Speech Recognition
Guodong Ma
Pengfei Hu
Nurmemet Yolwas
Shen Huang
Hao-Ming Huang
27
4
0
13 Dec 2021
BigSSL: Exploring the Frontier of Large-Scale Semi-Supervised Learning for Automatic Speech Recognition
Yu Zhang
Daniel S. Park
Wei Han
James Qin
Anmol Gulati
...
Zhifeng Chen
Quoc V. Le
Chung-Cheng Chiu
Ruoming Pang
Yonghui Wu
SSL
27
175
0
27 Sep 2021
ChannelAugment: Improving generalization of multi-channel ASR by training with input channel randomization
M. Gaudesi
F. Weninger
D. Sharma
P. Zhan
AAML
33
1
0
23 Sep 2021
Tied & Reduced RNN-T Decoder
Rami Botros
Tara N. Sainath
R. David
Emmanuel Guzman
Wei Li
Yanzhang He
38
55
0
15 Sep 2021
Multilingual Speech Recognition for Low-Resource Indian Languages using Multi-Task conformer
Krishna D N Freshworks
29
7
0
22 Aug 2021
SpecMix : A Mixed Sample Data Augmentation method for Training with Time-Frequency Domain Features
Gwantae Kim
D. Han
Hanseok Ko
47
42
0
06 Aug 2021
Learning a Neural Diff for Speech Models
J. Macoskey
Grant P. Strimel
Ariya Rastrow
18
2
0
03 Aug 2021
Multi-mode Transformer Transducer with Stochastic Future Context
Kwangyoun Kim
Felix Wu
Prashant Sridhar
Kyu Jeong Han
Shinji Watanabe
30
9
0
17 Jun 2021
Multi-channel Opus compression for far-field automatic speech recognition with a fixed bitrate budget
Lukas Drude
Jahn Heymann
A. Schwarz
J. Valin
14
3
0
15 Jun 2021
SynthASR: Unlocking Synthetic Data for Speech Recognition
A. Fazel
Wei Yang
Yulan Liu
Roberto Barra-Chicote
Yi Meng
Roland Maas
J. Droppo
SyDa
18
48
0
14 Jun 2021
Less Is More: Improved RNN-T Decoding Using Limited Label Context and Path Merging
Rohit Prabhavalkar
Yanzhang He
David Rybach
S. Campbell
A. Narayanan
Trevor Strohman
Tara N. Sainath
49
35
0
12 Dec 2020
Improving RNN-T ASR Accuracy Using Context Audio
A. Schwarz
Ilya Sklyar
Simon Wiesler
24
9
0
20 Nov 2020
Improving RNN Transducer Based ASR with Auxiliary Tasks
Chunxi Liu
Frank Zhang
Duc Le
Suyoun Kim
Yatharth Saraf
Geoffrey Zweig
26
49
0
05 Nov 2020
Two-stage Textual Knowledge Distillation for End-to-End Spoken Language Understanding
Seongbin Kim
Gyuwan Kim
Seongjin Shin
Sangmin Lee
VLM
15
19
0
25 Oct 2020
Conv-Transformer Transducer: Low Latency, Low Frame Rate, Streamable End-to-End Speech Recognition
Wenyong Huang
Wenchao Hu
Y. Yeung
Xiao Chen
25
50
0
13 Aug 2020
Surgical Mask Detection with Convolutional Neural Networks and Data Augmentations on Spectrograms
Steffen Illium
Robert Muller
Andreas Sedlmeier
Claudia Linnhoff-Popien
26
11
0
11 Aug 2020
Conformer: Convolution-augmented Transformer for Speech Recognition
Anmol Gulati
James Qin
Chung-Cheng Chiu
Niki Parmar
Yu Zhang
...
Wei Han
Shibo Wang
Zhengdong Zhang
Yonghui Wu
Ruoming Pang
95
3,038
0
16 May 2020
ContextNet: Improving Convolutional Neural Networks for Automatic Speech Recognition with Global Context
Wei Han
Zhengdong Zhang
Yu Zhang
Jiahui Yu
Chung-Cheng Chiu
James Qin
Anmol Gulati
Ruoming Pang
Yonghui Wu
16
259
0
07 May 2020