arXiv:1904.08779, v3 (latest)
SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition
18 April 2019
Daniel S. Park
William Chan
Yu Zhang
Chung-Cheng Chiu
Barret Zoph
E. D. Cubuk
Quoc V. Le
VLM
Papers citing
"SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition"
50 / 1,048 papers shown
Title
Robust and Interpretable Temporal Convolution Network for Event Detection in Lung Sound Recordings
Tharindu Fernando
Sridha Sridharan
Simon Denman
H. Ghaemmaghami
Clinton Fookes
61
28
0
30 Jun 2021
DCASE 2021 Task 3: Spectrotemporally-aligned Features for Polyphonic Sound Event Localization and Detection
Thi Ngoc Tho Nguyen
Karn N. Watcharasupat
Ngoc Khanh Nguyen
Douglas L. Jones
W. Gan
70
16
0
29 Jun 2021
SCARF: Self-Supervised Contrastive Learning using Random Feature Corruption
Dara Bahri
Heinrich Jiang
Yi Tay
Donald Metzler
SSL
74
178
0
29 Jun 2021
QASR: QCRI Aljazeera Speech Resource -- A Large Scale Annotated Arabic Speech Corpus
Hamdy Mubarak
A. Hussein
Shammur A. Chowdhury
Ahmed M. Ali
51
49
0
24 Jun 2021
Dealing with training and test segmentation mismatch: FBK@IWSLT2021
Sara Papi
Marco Gaido
Matteo Negri
Marco Turchi
71
6
0
23 Jun 2021
Enrollment-less training for personalized voice activity detection
Naoki Makishima
Mana Ihori
Tomohiro Tanaka
Akihiko Takashima
Shota Orihashi
Ryo Masumura
48
10
0
23 Jun 2021
Do sound event representations generalize to other audio tasks? A case study in audio transfer learning
Anurag Kumar
Yun Wang
V. Ithapu
Christian Fuegen
76
3
0
21 Jun 2021
Towards sound based testing of COVID-19 -- Summary of the first Diagnostics of COVID-19 using Acoustics (DiCOVA) Challenge
N. Sharma
Ananya Muguli
Prashant Krishnan
Rohit Kumar
Srikanth Raj Chetupalli
Sriram Ganapathy
79
14
0
21 Jun 2021
Ensemble of ACCDOA- and EINV2-based Systems with D3Nets and Impulse Response Simulation for Sound Event Localization and Detection
Kazuki Shimada
Naoya Takahashi
Yuichiro Koyama
Shusuke Takahashi
E. Tsunoo
Masafumi Takahashi
Yuki Mitsufuji
56
23
0
21 Jun 2021
Low Resource German ASR with Untranscribed Data Spoken by Non-native Children -- INTERSPEECH 2021 Shared Task SPAPL System
Jinhan Wang
Yunzheng Zhu
Ruchao Fan
Wei Chu
Abeer Alwan
69
8
0
18 Jun 2021
Multi-mode Transformer Transducer with Stochastic Future Context
Kwangyoun Kim
Felix Wu
Prashant Sridhar
Kyu Jeong Han
Shinji Watanabe
71
10
0
17 Jun 2021
WaveGrad 2: Iterative Refinement for Text-to-Speech Synthesis
Nanxin Chen
Yu Zhang
Heiga Zen
Ron J. Weiss
Mohammad Norouzi
Najim Dehak
William Chan
DiffM
99
88
0
17 Jun 2021
Layer Pruning on Demand with Intermediate CTC
Jaesong Lee
Jingu Kang
Shinji Watanabe
40
18
0
17 Jun 2021
Collaborative Training of Acoustic Encoders for Speech Recognition
Varun K. Nagaraja
Yangyang Shi
Ganesh Venkatesh
Ozlem Kalinli
M. Seltzer
Vikas Chandra
92
12
0
16 Jun 2021
VidHarm: A Clip Based Dataset for Harmful Content Detection
Johan Edstedt
Amanda Berg
Michael Felsberg
Johan Karlsson
Francisca Benavente
Anette Novak
G. Pihlgren
42
2
0
15 Jun 2021
SynthASR: Unlocking Synthetic Data for Speech Recognition
A. Fazel
Wei Yang
Yulan Liu
Roberto Barra-Chicote
Yi Meng
Roland Maas
J. Droppo
SyDa
110
51
0
14 Jun 2021
End-to-end Neural Diarization: From Transformer to Conformer
Yi Y. Liu
Eunjung Han
Chul Lee
A. Stolcke
135
42
0
14 Jun 2021
Cross-Modal Attention Consistency for Video-Audio Unsupervised Learning
Shaobo Min
Qi Dai
Hongtao Xie
Chuang Gan
Yongdong Zhang
Jingdong Wang
SSL
59
7
0
13 Jun 2021
A Comparative Study on Neural Architectures and Training Methods for Japanese Speech Recognition
Shigeki Karita
Yotaro Kubo
M. Bacchiani
Llion Jones
48
13
0
09 Jun 2021
RealTranS: End-to-End Simultaneous Speech Translation with Convolutional Weighted-Shrinking Transformer
Xingshan Zeng
Liangyou Li
Qun Liu
69
48
0
09 Jun 2021
SpeechBrain: A General-Purpose Speech Toolkit
Mirco Ravanelli
Titouan Parcollet
Peter William VanHarn Plantinga
Aku Rouhe
Samuele Cornell
...
William Aris
Hwidong Na
Yan Gao
R. Mori
Yoshua Bengio
135
770
0
08 Jun 2021
Broadcasted Residual Learning for Efficient Keyword Spotting
Byeonggeun Kim
Simyung Chang
Jinkyu Lee
Dooyong Sung
124
124
0
08 Jun 2021
SIGTYP 2021 Shared Task: Robust Spoken Language Identification
Elizabeth Salesky
Badr M. Abdullah
Sabrina J. Mielke
Elena Klyachko
O. Serikov
Edoardo Ponti
Ritesh Kumar
Ryan Cotterell
Ekaterina Vylomova
56
11
0
07 Jun 2021
EventDrop: data augmentation for event-based learning
Fuqiang Gu
Weicong Sng
Xuke Hu
Fei Yu
67
40
0
07 Jun 2021
Data Augmentation Methods for End-to-end Speech Recognition on Distant-Talk Scenarios
E. Tsunoo
Kentarou Shibata
Chaitanya Narisetty
Yosuke Kashiwagi
Shinji Watanabe
69
12
0
07 Jun 2021
Impact of data-splits on generalization: Identifying COVID-19 from cough and context
Makkunda Sharma
Nikhil Shenoy
Jigar Doshi
Piyush Bagad
Aman Dalmia
Parag Bhamare
A. Mahale
S. Rane
Neeraj Agrawal
R. Panicker
OOD
121
4
0
05 Jun 2021
Signal Transformer: Complex-valued Attention and Meta-Learning for Signal Recognition
Yihong Dong
Ying Peng
Muqiao Yang
Songtao Lu
Qingjiang Shi
103
9
0
05 Jun 2021
ERANNs: Efficient Residual Audio Neural Networks for Audio Pattern Recognition
S. Verbitskiy
Vladimir Berikov
Viacheslav Vyshegorodtsev
115
75
0
03 Jun 2021
Noisy student-teacher training for robust keyword spotting
Hyun-jin Park
Pai Zhu
Ignacio López Moreno
Niranjan A. Subrahmanya
NoLa
53
17
0
03 Jun 2021
Lightweight Adapter Tuning for Multilingual Speech Translation
Hang Le
J. Pino
Changhan Wang
Jiatao Gu
D. Schwab
Laurent Besacier
144
90
0
02 Jun 2021
Cascade versus Direct Speech Translation: Do the Differences Still Make a Difference?
L. Bentivogli
Mauro Cettolo
Marco Gaido
Alina Karakanta
A. Martinelli
Matteo Negri
Marco Turchi
78
83
0
02 Jun 2021
Should We Always Separate?: Switching Between Enhanced and Observed Signals for Overlapping Speech Recognition
Hiroshi Sato
Tsubasa Ochiai
Marc Delcroix
K. Kinoshita
Takafumi Moriya
Naoyuki Kamo
68
23
0
02 Jun 2021
A Neural Acoustic Echo Canceller Optimized Using An Automatic Speech Recognizer And Large Scale Synthetic Data
N. Howard
Alex Park
T. Shabestary
A. Gruenstein
Rohit Prabhavalkar
52
17
0
01 Jun 2021
Improving the Adversarial Robustness for Speaker Verification by Self-Supervised Learning
Haibin Wu
Xu Li
Andy T. Liu
Zhiyong Wu
Helen Meng
Hung-yi Lee
AAML
SSL
116
30
0
01 Jun 2021
Multilingual Speech Translation with Unified Transformer: Huawei Noah's Ark Lab at IWSLT 2021
Xingshan Zeng
Liangyou Li
Qun Liu
44
2
0
01 Jun 2021
Towards One Model to Rule All: Multilingual Strategy for Dialectal Code-Switching Arabic ASR
Shammur A. Chowdhury
A. Hussein
Ahmed Abdelali
Ahmed M. Ali
78
36
0
31 May 2021
How to Split: the Effect of Word Segmentation on Gender Bias in Speech Translation
Marco Gaido
Beatrice Savoldi
L. Bentivogli
Matteo Negri
Marco Turchi
105
15
0
28 May 2021
The Imaginative Generative Adversarial Network: Automatic Data Augmentation for Dynamic Skeleton-Based Hand Gesture and Human Action Recognition
Junxiao Shen
John J. Dudley
Per Ola Kristensson
SLR
GAN
76
24
0
27 May 2021
Unsupervised Speech Recognition
Alexei Baevski
Wei-Ning Hsu
Alexis Conneau
Michael Auli
SSL
155
275
0
24 May 2021
The Volctrans Neural Speech Translation System for IWSLT 2021
Chengqi Zhao
Zhicheng Liu
Jian-Fei Tong
Tao Wang
Mingxuan Wang
Rong Ye
Qianqian Dong
Jun Cao
Lei Li
59
8
0
16 May 2021
Stacked Acoustic-and-Textual Encoding: Integrating the Pre-trained Models into Speech Translation Encoders
Chen Xu
Bojie Hu
Yanyang Li
Yuhao Zhang
Shen Huang
Qi Ju
Tong Xiao
Jingbo Zhu
76
78
0
12 May 2021
End-to-End Speech Translation with Pre-trained Models and Adapters: UPC at IWSLT 2021
Gerard I. Gállego
Ioannis Tsiamas
Carlos Escolano
José A. R. Fonollosa
Marta R. Costa-jussá
54
31
0
10 May 2021
Voice activity detection in the wild: A data-driven approach using teacher-student training
Heinrich Dinkel
Shuai Wang
Xuenan Xu
Mengyue Wu
K. Yu
VLM
45
33
0
10 May 2021
FastCorrect: Fast Error Correction with Edit Alignment for Automatic Speech Recognition
Yichong Leng
Xu Tan
Linchen Zhu
Jin Xu
Renqian Luo
Linquan Liu
Tao Qin
Xiang-Yang Li
Ed Lin
Tie-Yan Liu
KELM
109
64
0
09 May 2021
Efficient Weight factorization for Multilingual Speech Recognition
Ngoc-Quan Pham
Tuan-Nam Nguyen
S. Stueker
A. Waibel
97
20
0
07 May 2021
Self-Supervised Learning from Automatically Separated Sound Scenes
Eduardo Fonseca
A. Jansen
D. Ellis
Scott Wisdom
Marco Tagliasacchi
J. Hershey
Manoj Plakal
Shawn Hershey
R. C. Moore
Xavier Serra
SSL
81
13
0
05 May 2021
SUPERB: Speech processing Universal PERformance Benchmark
Shu-Wen Yang
Po-Han Chi
Yung-Sung Chuang
Cheng-I Jeff Lai
Kushal Lakhotia
...
Shuyan Dong
Shang-Wen Li
Shinji Watanabe
Abdel-rahman Mohamed
Hung-yi Lee
SSL
194
943
0
03 May 2021
On the limit of English conversational speech recognition
Zoltán Tüske
G. Saon
Brian Kingsbury
90
50
0
03 May 2021
Searchable Hidden Intermediates for End-to-End Models of Decomposable Sequence Tasks
Siddharth Dalmia
Brian Yan
Vikas Raunak
Florian Metze
Shinji Watanabe
89
31
0
02 May 2021
Scaling End-to-End Models for Large-Scale Multilingual ASR
Yue Liu
Ruoming Pang
Tara N. Sainath
Anmol Gulati
Yu Zhang
James Qin
Parisa Haghani
Wenjie Huang
Min Ma
Junwen Bai
CLL
144
77
0
30 Apr 2021