1904.08779
SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition
18 April 2019
Daniel S. Park
William Chan
Yu Zhang
Chung-Cheng Chiu
Barret Zoph
E. D. Cubuk
Quoc V. Le
VLM
Papers citing
"SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition"
50 / 1,048 papers shown
Title
Confidence Estimation for Attention-based Sequence-to-sequence Models for Speech Recognition
Qiujia Li
David Qiu
Yu Zhang
Yue Liu
Yanzhang He
P. Woodland
Liangliang Cao
Trevor Strohman
50
49
0
22 Oct 2020
A General Multi-Task Learning Framework to Leverage Text Data for Speech to Text Tasks
Yun Tang
J. Pino
Changhan Wang
Xutai Ma
Dmitriy Genzel
81
75
0
21 Oct 2020
The IDLAB VoxSRC-20 Submission: Large Margin Fine-Tuning and Quality-Aware Score Calibration in DNN Based Speaker Verification
Jenthe Thienpondt
Brecht Desplanques
Kris Demuynck
82
84
0
21 Oct 2020
How Data Augmentation affects Optimization for Linear Regression
Boris Hanin
Yi Sun
86
16
0
21 Oct 2020
FastEmit: Low-latency Streaming ASR with Sequence-level Emission Regularization
Jiahui Yu
Chung-Cheng Chiu
Yue Liu
Shuo-yiin Chang
Tara N. Sainath
...
A. Narayanan
Wei Han
Anmol Gulati
Yonghui Wu
Ruoming Pang
86
92
0
21 Oct 2020
Emformer: Efficient Memory Transformer Based Acoustic Model For Low Latency Streaming Speech Recognition
Yangyang Shi
Yongqiang Wang
Chunyang Wu
Ching-Feng Yeh
Julian Chan
Frank Zhang
Duc Le
M. Seltzer
200
172
0
21 Oct 2020
Towards End-to-End Training of Automatic Speech Recognition for Nigerian Pidgin
Amina Mardiyyah Rufai
Afolabi Abeeb
Esther Oduntan
Tayo Arulogun
Oluwabukola Adegboro
Daniel Ajisafe
112
4
0
21 Oct 2020
Pushing the Limits of Semi-Supervised Learning for Automatic Speech Recognition
Yu Zhang
James Qin
Daniel S. Park
Wei Han
Chung-Cheng Chiu
Ruoming Pang
Quoc V. Le
Yonghui Wu
VLM
SSL
229
310
0
20 Oct 2020
MicAugment: One-shot Microphone Style Transfer
Zalan Borsos
Yunpeng Li
Beat Gfeller
Marco Tagliasacchi
43
4
0
19 Oct 2020
CLAR: Contrastive Learning of Auditory Representations
Haider Al-Tahan
Y. Mohsenzadeh
SSL
195
56
0
19 Oct 2020
Lightweight End-to-End Speech Recognition from Raw Audio Data Using Sinc-Convolutions
Ludwig Kurzinger
Nicolas Lindae
Palle Klewitz
Gerhard Rigoll
67
5
0
15 Oct 2020
Viewmaker Networks: Learning Views for Unsupervised Representation Learning
Alex Tamkin
Mike Wu
Noah D. Goodman
SSL
131
64
0
14 Oct 2020
Exploiting Spectral Augmentation for Code-Switched Spoken Language Identification
P. Rangan
Sundeep Teki
Hemant Misra
36
22
0
14 Oct 2020
Towards Data-efficient Modeling for Wake Word Spotting
Yixin Gao
Yuriy Mishchenko
Anish Shah
Spyros Matsoukas
S. Vitaladevuni
101
32
0
13 Oct 2020
Improving Low Resource Code-switched ASR using Augmented Code-switched TTS
Yash Sharma
Basil Abraham
Karan Taneja
Preethi Jyothi
61
21
0
12 Oct 2020
fairseq S2T: Fast Speech-to-Text Modeling with fairseq
Changhan Wang
Yun Tang
Xutai Ma
Anne Wu
Sravya Popuri
Dmytro Okhonko
J. Pino
VLM
LRM
116
276
0
11 Oct 2020
Contrastive Representation Learning: A Framework and Review
Phúc H. Lê Khắc
Graham Healy
Alan F. Smeaton
SSL
AI4TS
330
722
0
10 Oct 2020
Leveraging Unpaired Text Data for Training End-to-End Speech-to-Intent Systems
Yinghui Huang
H. Kuo
Samuel Thomas
Zvi Kons
Kartik Audhkhasi
Brian Kingsbury
R. Hoory
M. Picheny
VLM
48
63
0
08 Oct 2020
Population Based Training for Data Augmentation and Regularization in Speech Recognition
Daniel Haziza
Jérémy Rapin
Gabriel Synnaeve
35
1
0
08 Oct 2020
Super-Human Performance in Online Low-latency Recognition of Conversational Speech
T. Nguyen
S. Stueker
A. Waibel
BDL
74
38
0
07 Oct 2020
Transformer Transducer: One Model Unifying Streaming and Non-streaming Speech Recognition
Anshuman Tripathi
Jaeyoung Kim
Qian Zhang
Han Lu
Hasim Sak
71
43
0
07 Oct 2020
Representation Learning for Sequence Data with Deep Autoencoding Predictive Components
Junwen Bai
Weiran Wang
Yingbo Zhou
Caiming Xiong
SSL
AI4TS
75
12
0
07 Oct 2020
SPLAT: Speech-Language Joint Pre-Training for Spoken Language Understanding
Yu-An Chung
Chenguang Zhu
Michael Zeng
VLM
72
8
0
05 Oct 2020
Differentiable Weighted Finite-State Transducers
Awni Y. Hannun
Vineel Pratap
Jacob Kahn
Wei-Ning Hsu
118
29
0
02 Oct 2020
Embedded Emotions -- A Data Driven Approach to Learn Transferable Feature Representations from Raw Speech Input for Emotion Recognition
Dominik Schiller
Silvan Mertes
Elisabeth André
15
0
0
30 Sep 2020
A Crowdsourced Open-Source Kazakh Speech Corpus and Initial Speech Recognition Baseline
Yerbolat Khassanov
Saida Mussakhojayeva
A. Mirzakhmetov
A. Adiyev
Mukhamet Nurpeiissov
H. A. Varol
57
31
0
22 Sep 2020
Consecutive Decoding for Speech-to-text Translation
Qianqian Dong
Mingxuan Wang
Hao Zhou
Shuang Xu
Bo Xu
Lei Li
SLR
111
41
0
21 Sep 2020
"Listen, Understand and Translate": Triple Supervision Decouples End-to-end Speech-to-text Translation
Qianqian Dong
Rong Ye
Mingxuan Wang
Hao Zhou
Shuang Xu
Bo Xu
Lei Li
75
3
0
21 Sep 2020
Cough Against COVID: Evidence of COVID-19 Signature in Cough Sounds
Piyush Bagad
Aman Dalmia
Jigar Doshi
Arsha Nagrani
Parag Bhamare
A. Mahale
S. Rane
N. Agarwal
R. Panicker
105
113
0
17 Sep 2020
EasyASR: A Distributed Machine Learning Platform for End-to-end Automatic Speech Recognition
Chengyu Wang
Mengli Cheng
Xu Hu
Jun Huang
VLM
42
6
0
14 Sep 2020
On Multitask Loss Function for Audio Event Detection and Localization
Huy P Phan
L. D. Pham
P. Koch
Ngoc Q. K. Duong
Ian Mcloughlin
Alfred Mertins
87
14
0
11 Sep 2020
On Target Segmentation for Direct Speech Translation
Mattia Antonino Di Gangi
Marco Gaido
Matteo Negri
Marco Turchi
79
14
0
10 Sep 2020
VoiceFilter-Lite: Streaming Targeted Voice Separation for On-Device Speech Recognition
Quan Wang
Ignacio López Moreno
Mert Saglam
K. Wilson
Alan Chiao
...
Yanzhang He
Wei Li
Jason W. Pelecanos
M. Nika
A. Gruenstein
VLM
71
86
0
09 Sep 2020
AutoKWS: Keyword Spotting with Differentiable Architecture Search
Bo Zhang
WenFeng Li
Qingyuan Li
Weiji Zhuang
Xiangxiang Chu
Yujun Wang
68
23
0
08 Sep 2020
KoSpeech: Open-Source Toolkit for End-to-End Korean Speech Recognition
Soohwan Kim
Seyoung Bae
Cheolhwang Won
VLM
31
5
0
07 Sep 2020
Overview and Evaluation of Sound Event Localization and Detection in DCASE 2019
Archontis Politis
A. Mesaros
Sharath Adavanne
Toni Heittola
Tuomas Virtanen
133
128
0
06 Sep 2020
Multi-Attention-Network for Semantic Segmentation of Fine Resolution Remote Sensing Images
Rui Li
Shunyi Zheng
Chenxi Duan
Ce Zhang
Jianlin Su
P. M. Atkinson
SSeg
108
388
0
03 Sep 2020
Convolutional Speech Recognition with Pitch and Voice Quality Features
Guillermo Cámbara
Jordi Luque
Mireia Farrús
28
8
0
02 Sep 2020
Parallel Rescoring with Transformer for Streaming On-Device Speech Recognition
Wei Li
James Qin
Chung-Cheng Chiu
Ruoming Pang
Yanzhang He
85
14
0
30 Aug 2020
A Survey of Deep Active Learning
Pengzhen Ren
Yun Xiao
Xiaojun Chang
Po-Yao (Bernie) Huang
Zhihui Li
Brij B. Gupta
Xiaojiang Chen
Xin Wang
160
1,161
0
30 Aug 2020
Data augmentation using prosody and false starts to recognize non-native children's speech
H. Kathania
Mittul Singh
Tamás Grósz
M. Kurimo
19
12
0
29 Aug 2020
CRNNs for Urban Sound Tagging with spatiotemporal context
Augustin Arnault
Nicolas Riche
58
7
0
24 Aug 2020
Howl: A Deployed, Open-Source Wake Word Detection System
Raphael Tang
Jaejun Lee
Afsaneh Razi
Julia Cambre
Ian Bicking
Jofish Kaye
Jimmy J. Lin
VLM
52
17
0
21 Aug 2020
Speech To Semantics: Improve ASR and NLU Jointly via All-Neural Interfaces
Milind Rao
A. Raju
Pranav Dheram
Bach Bui
Ariya Rastrow
58
43
0
14 Aug 2020
Subword Regularization: An Analysis of Scalability and Generalization for End-to-End Automatic Speech Recognition
Egor Lakomkin
Jahn Heymann
Ilya Sklyar
Simon Wiesler
51
8
0
10 Aug 2020
Distilling the Knowledge of BERT for Sequence-to-Sequence ASR
Hayato Futami
Hirofumi Inaguma
Sei Ueno
Masato Mimura
S. Sakai
Tatsuya Kawahara
78
53
0
09 Aug 2020
LRSpeech: Extremely Low-Resource Speech Synthesis and Recognition
Jin Xu
Xu Tan
Yi Ren
Tao Qin
Jian Li
Sheng Zhao
Tie-Yan Liu
VLM
70
91
0
09 Aug 2020
Investigation of Speaker-adaptation methods in Transformer based ASR
Vishwas M. Shetty
J. MetildaSagayaMaryN.
S. Umesh
87
5
0
07 Aug 2020
Contextualized Translation of Automatically Segmented Speech
Marco Gaido
Mattia Antonino Di Gangi
Matteo Negri
Mauro Cettolo
Marco Turchi
61
19
0
05 Aug 2020
Land Cover Classification from Remote Sensing Images Based on Multi-Scale Fully Convolutional Network
Rui Li
Shunyi Zheng
Chenxi Duan
Ce Zhang
104
102
0
01 Aug 2020