1904.08779
SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition
18 April 2019
Daniel S. Park
William Chan
Yu Zhang
Chung-Cheng Chiu
Barret Zoph
E. D. Cubuk
Quoc V. Le
VLM
Papers citing
"SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition"
50 / 1,048 papers shown
Title
Slow-Fast Auditory Streams For Audio Recognition
Evangelos Kazakos
Arsha Nagrani
Andrew Zisserman
Dima Damen
117
68
0
05 Mar 2021
Neural model robustness for skill routing in large-scale conversational AI systems: A design choice exploration
Han Li
Sunghyun Park
Aswarth Abhilash Dara
Jinseok Nam
Sungjin Lee
Young-Bum Kim
Spyros Matsoukas
R. Sarikaya
67
9
0
04 Mar 2021
An Empirical Study of End-to-end Simultaneous Speech Translation Decoding Strategies
H. Nguyen
Yannick Esteve
Laurent Besacier
60
19
0
04 Mar 2021
Perceiver: General Perception with Iterative Attention
Andrew Jaegle
Felix Gimeno
Andrew Brock
Andrew Zisserman
Oriol Vinyals
João Carreira
VLM
ViT
MDE
214
1,030
0
04 Mar 2021
Alignment Knowledge Distillation for Online Streaming Attention-based Speech Recognition
Hirofumi Inaguma
Tatsuya Kawahara
127
14
0
28 Feb 2021
The NPU System for the 2020 Personalized Voice Trigger Challenge
Jingyong Hou
Li Zhang
Yihui Fu
Qing Wang
Zhanheng Yang
Qijie Shao
Lei Xie
61
7
0
26 Feb 2021
MixSpeech: Data Augmentation for Low-resource Automatic Speech Recognition
Linghui Meng
Jin Xu
Xu Tan
Jindong Wang
Tao Qin
Bo Xu
VLM
115
78
0
25 Feb 2021
The Accented English Speech Recognition Challenge 2020: Open Datasets, Tracks, Baselines, Results and Methods
Xian Shi
Fan Yu
Yizhou Lu
Yuhao Liang
Qiangze Feng
Daliang Wang
Y. Qian
Lei Xie
60
68
0
20 Feb 2021
End-to-End Neural Systems for Automatic Children Speech Recognition: An Empirical Study
Prashanth Gurunath Shivakumar
Shrikanth Narayanan
53
54
0
19 Feb 2021
Unit selection synthesis based data augmentation for fixed phrase speaker verification
Houjun Huang
Xu Xiang
Fei Zhao
Shuai Wang
Y. Qian
16
6
0
19 Feb 2021
Fundamental Frequency Feature Normalization and Data Augmentation for Child Speech Recognition
Gary Yeung
Ruchao Fan
Abeer Alwan
74
20
0
18 Feb 2021
End-to-end lyrics Recognition with Voice to Singing Style Transfer
Sakya Basak
Shrutina Agarwal
Sriram Ganapathy
Naoya Takahashi
68
20
0
17 Feb 2021
End-to-End Automatic Speech Recognition with Deep Mutual Learning
Ryo Masumura
Mana Ihori
Akihiko Takashima
Tomohiro Tanaka
Takanori Ashihara
34
5
0
16 Feb 2021
Hierarchical Transformer-based Large-Context End-to-end ASR with Large-Context Knowledge Distillation
Ryo Masumura
Naoki Makishima
Mana Ihori
Akihiko Takashima
Tomohiro Tanaka
Shota Orihashi
77
29
0
16 Feb 2021
Adversarial defense for automatic speaker verification by cascaded self-supervised learning models
Haibin Wu
Xu Li
Andy T. Liu
Zhiyong Wu
Helen Meng
Hung-yi Lee
AAML
86
41
0
14 Feb 2021
End-to-end Audio-visual Speech Recognition with Conformers
Pingchuan Ma
Stavros Petridis
Maja Pantic
157
234
0
12 Feb 2021
Enhancing Audio Augmentation Methods with Consistency Learning
Turab Iqbal
Karim Helwani
A. Krishnaswamy
Wenwu Wang
67
5
0
09 Feb 2021
Intermediate Loss Regularization for CTC-based Speech Recognition
Jaesong Lee
Shinji Watanabe
151
140
0
05 Feb 2021
Data Generation Using Pass-phrase-dependent Deep Auto-encoders for Text-Dependent Speaker Verification
A. K. Sarkar
Md. Sahidullah
Zheng-Hua Tan
26
0
0
03 Feb 2021
Speech Emotion Recognition with Multiscale Area Attention and Data Augmentation
Mingke Xu
Fan Zhang
Xiaodong Cui
Wei Zhang
54
52
0
03 Feb 2021
The Multilingual TEDx Corpus for Speech Recognition and Translation
Elizabeth Salesky
Sanjeev Khudanpur
Jacob Bremerman
R. Cattoni
Matteo Negri
Marco Turchi
Douglas W. Oard
Matt Post
79
126
0
02 Feb 2021
CTC-based Compression for Direct Speech Translation
Marco Gaido
Mauro Cettolo
Matteo Negri
Marco Turchi
104
59
0
02 Feb 2021
WeNet: Production oriented Streaming and Non-streaming End-to-End Speech Recognition Toolkit
Zhuoyuan Yao
Di Wu
Xiong Wang
Binbin Zhang
Fan Yu
Chao Yang
Zhendong Peng
Xiaoyu Chen
Lei Xie
X. Lei
129
268
0
02 Feb 2021
The Hitachi-JHU DIHARD III System: Competitive End-to-End Neural Diarization and X-Vector Clustering Systems Combined by DOVER-Lap
Shota Horiguchi
Nelson Yalta
Leibny Paola García-Perera
Yuki Takashima
Yawen Xue
Desh Raj
Zili Huang
Yusuke Fujita
Shinji Watanabe
Sanjeev Khudanpur
BDL
63
37
0
02 Feb 2021
PSLA: Improving Audio Tagging with Pretraining, Sampling, Labeling, and Aggregation
Yuan Gong
Yu-An Chung
James R. Glass
VLM
199
147
0
02 Feb 2021
BCN2BRNO: ASR System Fusion for Albayzin 2020 Speech to Text Challenge
M. Kocour
Guillermo Cámbara
Jordi Luque
David Bonet
Mireia Farrús
Martin Karafiát
Karel Veselý
Jan "Honza" Cernocký
28
6
0
29 Jan 2021
LEAF: A Learnable Frontend for Audio Classification
Neil Zeghidour
O. Teboul
Félix de Chaumont Quitry
Marco Tagliasacchi
VLM
AAML
137
148
0
21 Jan 2021
Arabic Speech Recognition by End-to-End, Modular Systems and Human
A. Hussein
Shinji Watanabe
Ahmed M. Ali
VLM
72
50
0
21 Jan 2021
On Data-Augmentation and Consistency-Based Semi-Supervised Learning
Atin Ghosh
Alexandre Hoang Thiery
132
21
0
18 Jan 2021
Tiny Transducer: A Highly-efficient Speech Recognition Model on Edge Devices
Yuekai Zhang
Sining Sun
Long Ma
95
29
0
18 Jan 2021
An evaluation of word-level confidence estimation for end-to-end automatic speech recognition
Dan Oneaţă
Alexandru Caranica
Adriana Stan
H. Cucu
UQCV
88
25
0
14 Jan 2021
End-to-End Speaker Height and age estimation using Attention Mechanism with LSTM-RNN
Manav Kaushik
Van Tung Pham
Chng Eng Siong
53
6
0
13 Jan 2021
A Four-Stage Data Augmentation Approach to ResNet-Conformer Based Acoustic Modeling for Sound Event Localization and Detection
Qing Wang
Jun Du
Hua-Xin Wu
Jia Pan
Feng Ma
Chin-Hui Lee
64
83
0
08 Jan 2021
Environment Transfer for Distributed Systems
Chunheng Jiang
Jae-wook Ahn
N. Desai
60
1
0
06 Jan 2021
AutoDropout: Learning Dropout Patterns to Regularize Deep Networks
Hieu H. Pham
Quoc V. Le
128
57
0
05 Jan 2021
Robustness Testing of Language Understanding in Task-Oriented Dialog
Jiexi Liu
Ryuichi Takanobu
Jiaxin Wen
Dazhen Wan
Hongguang Li
Weiran Nie
Cheng Li
Wei Peng
Minlie Huang
ELM
122
49
0
30 Dec 2020
NeurST: Neural Speech Translation Toolkit
Chengqi Zhao
Mingxuan Wang
Qianqian Dong
Rong Ye
Lei Li
89
32
0
18 Dec 2020
CIF-based Collaborative Decoding for End-to-end Contextual Speech Recognition
Minglun Han
Linhao Dong
Shiyu Zhou
Bo Xu
73
23
0
17 Dec 2020
A review of on-device fully neural end-to-end automatic speech recognition algorithms
Chanwoo Kim
Dhananjaya N. Gowda
Dongsoo Lee
Jiyeon Kim
Ankur Kumar
Sungsoo Kim
Abhinav Garg
C. Han
68
27
0
14 Dec 2020
Bayesian Learning for Deep Neural Network Adaptation
Xurong Xie
Xunying Liu
Tan Lee
Lan Wang
BDL
112
22
0
14 Dec 2020
REDAT: Accent-Invariant Representation for End-to-End ASR by Domain Adversarial Training with Relabeling
Hu Hu
Xuesong Yang
Zeynab Raeesy
Jinxi Guo
Gokce Keskin
Harish Arsikere
Ariya Rastrow
A. Stolcke
Roland Maas
64
30
0
14 Dec 2020
Self-supervised Text-independent Speaker Verification using Prototypical Momentum Contrastive Learning
Wei Xia
Chunlei Zhang
Chao Weng
Meng Yu
Dong Yu
SSL
64
80
0
13 Dec 2020
Less Is More: Improved RNN-T Decoding Using Limited Label Context and Path Merging
Rohit Prabhavalkar
Yanzhang He
David Rybach
S. Campbell
A. Narayanan
Trevor Strohman
Tara N. Sainath
125
35
0
12 Dec 2020
Improved Robustness to Disfluencies in RNN-Transducer Based Speech Recognition
Valentin Mendelev
Tina Raissi
Guglielmo Camporese
Manuel Giollo
48
21
0
11 Dec 2020
Unified Streaming and Non-streaming Two-pass End-to-end Model for Speech Recognition
Binbin Zhang
Di Wu
Zhuoyuan Yao
Xiong Wang
F. Yu
Chao Yang
Liyong Guo
Yaguang Hu
Lei Xie
X. Lei
93
81
0
10 Dec 2020
Parameter Efficient Multimodal Transformers for Video Representation Learning
Sangho Lee
Youngjae Yu
Gunhee Kim
Thomas Breuel
Jan Kautz
Yale Song
ViT
104
78
0
08 Dec 2020
Frame-level SpecAugment for Deep Convolutional Neural Networks in Hybrid ASR Systems
Xinwei Li
Yuanyuan Zhang
Xiaodan Zhuang
Daben Liu
28
6
0
07 Dec 2020
MLS: A Large-Scale Multilingual Dataset for Speech Research
Vineel Pratap
Qiantong Xu
Anuroop Sriram
Gabriel Synnaeve
R. Collobert
AuLLM
189
513
0
07 Dec 2020
Triplet Entropy Loss: Improving The Generalisation of Short Speech Language Identification Systems
Ruan van der Merwe
71
8
0
03 Dec 2020
Improving accuracy of rare words for RNN-Transducer through unigram shallow fusion
Vijay Ravi
Yile Gu
Ankur Gandhe
Ariya Rastrow
Linda Liu
Denis Filimonov
Scott Novotney
I. Bulyko
60
9
0
30 Nov 2020