1904.08779
SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition
18 April 2019
Daniel S. Park
William Chan
Yu Zhang
Chung-Cheng Chiu
Barret Zoph
E. D. Cubuk
Quoc V. Le
VLM
Papers citing
"SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition"
50 / 754 papers shown
Title
Non-Autoregressive ASR with Self-Conditioned Folded Encoders
Tatsuya Komatsu
28
7
0
17 Feb 2022
Multimodal Emotion Recognition using Transfer Learning from Speaker Recognition and BERT-based models
Sarala Padi
S. O. Sadjadi
Tianyi Zhou
Ram D. Sriram
38
37
0
16 Feb 2022
What Does it Mean for a Language Model to Preserve Privacy?
Hannah Brown
Katherine Lee
Fatemehsadat Mireshghallah
Reza Shokri
Florian Tramèr
PILM
61
232
0
11 Feb 2022
Improving Automatic Speech Recognition for Non-Native English with Transfer Learning and Language Model Decoding
Peter Sullivan
Toshiko Shibano
Muhammad Abdul-Mageed
49
11
0
10 Feb 2022
Conversational Agents: Theory and Applications
M. Wahde
M. Virgolin
LLMAG
40
25
0
07 Feb 2022
MFA: TDNN with Multi-scale Frequency-channel Attention for Text-independent Speaker Verification with Short Utterances
Tianchi Liu
Rohan Kumar Das
Kong Aik Lee
Haizhou Li
27
69
0
03 Feb 2022
The RoyalFlush System of Speech Recognition for M2MeT Challenge
Shuaishuai Ye
Peiyao Wang
Shunfei Chen
Xinhui Hu
Xinkang Xu
26
5
0
03 Feb 2022
Keyword localisation in untranscribed speech using visually grounded speech models
Kayode Olaleye
Dan Oneaţă
Herman Kamper
32
7
0
02 Feb 2022
HTS-AT: A Hierarchical Token-Semantic Audio Transformer for Sound Classification and Detection
Ke Chen
Xingjian Du
Bilei Zhu
Zejun Ma
Taylor Berg-Kirkpatrick
Shlomo Dubnov
ViT
127
264
0
02 Feb 2022
BEA-Base: A Benchmark for ASR of Spontaneous Hungarian
P. Mihajlik
A. Balog
T. E. Gráczi
A. Kohári
Balázs Tarján
K. Mády
25
8
0
01 Feb 2022
Reducing language context confusion for end-to-end code-switching automatic speech recognition
Shuai Zhang
Jiangyan Yi
Zhengkun Tian
J. Tao
Y. Yeung
Liqun Deng
27
11
0
28 Jan 2022
Sentiment-Aware Automatic Speech Recognition pre-training for enhanced Speech Emotion Recognition
Ayoub Ghriss
Bo Yang
Viktor Rozgic
Elizabeth Shriberg
Chao Wang
32
21
0
27 Jan 2022
Recency Dropout for Recurrent Recommender Systems
Bo-Yu Chang
Can Xu
Matt Le
Jingchen Feng
Ya Le
Sriraj Badam
Ed H. Chi
Minmin Chen
30
3
0
26 Jan 2022
On the Effectiveness of Pinyin-Character Dual-Decoding for End-to-End Mandarin Chinese ASR
Zhao Yang
Dianwen Ng
Xiao Fu
Liping Han
Wei Xi
Ruimeng Wang
Rui Jiang
Jizhong Zhao
42
2
0
26 Jan 2022
Improving Factored Hybrid HMM Acoustic Modeling without State Tying
Tina Raissi
Eugen Beck
Ralf Schlüter
Hermann Ney
37
5
0
24 Jan 2022
NAS-VAD: Neural Architecture Search for Voice Activity Detection
Daniel Rho
Jinhyeok Park
J. Ko
53
6
0
22 Jan 2022
Supervised and Self-supervised Pretraining Based COVID-19 Detection Using Acoustic Breathing/Cough/Speech Signals
Xing-Yu Chen
Qiu-shi Zhu
Jie Zhang
Lirong Dai
34
14
0
22 Jan 2022
Leveraging Real Talking Faces via Self-Supervision for Robust Forgery Detection
A. Haliassos
Rodrigo Mira
Stavros Petridis
Maja Pantic
CVBM
40
127
0
18 Jan 2022
Recent Progress in the CUHK Dysarthric Speech Recognition System
Shansong Liu
Mengzhe Geng
Shoukang Hu
Xurong Xie
Mingyu Cui
Jianwei Yu
Xunying Liu
Helen Meng
19
58
0
15 Jan 2022
Investigation of Data Augmentation Techniques for Disordered Speech Recognition
Mengzhe Geng
Xurong Xie
Shansong Liu
Jianwei Yu
Shoukang Hu
Xunying Liu
Helen Meng
16
56
0
14 Jan 2022
An Ensemble of Deep Learning Frameworks Applied For Predicting Respiratory Anomalies
L. D. Pham
Dat Ngo
T. Hoang
Alexander Schindler
Ian Mcloughlin
42
5
0
09 Jan 2022
Automatic Speech Recognition Datasets in Cantonese: A Survey and New Dataset
Tiezheng Yu
Rita Frieske
Peng Xu
Samuel Cahyawijaya
Cheuk Tung Shadow Yiu
...
Elham J. Barezi
Qifeng Chen
Xiaojuan Ma
Bertram E. Shi
Pascale Fung
RALM
54
9
0
07 Jan 2022
Improving Mandarin End-to-End Speech Recognition with Word N-gram Language Model
Jinchuan Tian
Jianwei Yu
Chao Weng
Yuexian Zou
Dong Yu
36
10
0
06 Jan 2022
Multi-turn RNN-T for streaming recognition of multi-party speech
Ilya Sklyar
A. Piunova
Xianrui Zheng
Yulan Liu
29
22
0
19 Dec 2021
Denoised Labels for Financial Time-Series Data via Self-Supervised Learning
Yanqing Ma
Carmine Ventre
M. Polukarov
NoLa
25
7
0
19 Dec 2021
Data Augmentation through Expert-guided Symmetry Detection to Improve Performance in Offline Reinforcement Learning
Giorgio Angelotti
Nicolas Drougard
Caroline Ponzoni Carvalho Chanel
OffRL
33
2
0
18 Dec 2021
An Audio-Visual Dataset and Deep Learning Frameworks for Crowded Scene Classification
L. D. Pham
Dat Ngo
Phu X. Nguyen
Hoang Van Truong
Alexander Schindler
32
9
0
16 Dec 2021
Bootstrap Equilibrium and Probabilistic Speaker Representation Learning for Self-supervised Speaker Verification
Sung Hwan Mun
Min Hyun Han
Dongjune Lee
Jihwan Kim
N. Kim
SSL
45
3
0
16 Dec 2021
On the Use of External Data for Spoken Named Entity Recognition
Ankita Pasad
Felix Wu
Suwon Shon
Karen Livescu
Kyu Jeong Han
40
16
0
14 Dec 2021
ImportantAug: a data augmentation agent for speech
V. Trinh
Hassan Salami Kavaki
Michael I. Mandel
32
10
0
14 Dec 2021
PM-MMUT: Boosted Phone-Mask Data Augmentation using Multi-Modeling Unit Training for Phonetic-Reduction-Robust E2E Speech Recognition
Guodong Ma
Pengfei Hu
Nurmemet Yolwas
Shen Huang
Hao-Ming Huang
32
4
0
13 Dec 2021
Improving Code-switching Language Modeling with Artificially Generated Texts using Cycle-consistent Adversarial Networks
Chia-Yu Li
Ngoc Thang Vu
19
12
0
12 Dec 2021
ASCEND: A Spontaneous Chinese-English Dataset for Code-switching in Multi-turn Conversation
Holy Lovenia
Samuel Cahyawijaya
Genta Indra Winata
Peng Xu
Xu Yan
...
Elham J. Barezi
Qifeng Chen
Xiaojuan Ma
Bertram E. Shi
Pascale Fung
44
32
0
12 Dec 2021
Sequence-level self-learning with multiple hypotheses
K. Kumatani
Dimitrios Dimitriadis
Yashesh Gaur
R. Gmyr
Sefik Emre Eskimez
Jinyu Li
Michael Zeng
SSL
25
1
0
10 Dec 2021
Are E2E ASR models ready for an industrial usage?
Valentin Vielzeuf
G. Antipov
31
8
0
09 Dec 2021
Self-Supervised Speaker Verification with Simple Siamese Network and Self-Supervised Regularization
Mufan Sang
Haoqi Li
F. Liu
Andrew O. Arnold
Li Wan
SSL
18
40
0
08 Dec 2021
Consistent Training and Decoding For End-to-end Speech Recognition Using Lattice-free MMI
Jinchuan Tian
Jianwei Yu
Chao Weng
Shi-Xiong Zhang
Dan Su
Dong Yu
Yuexian Zou
AuLLM
50
13
0
05 Dec 2021
BBS-KWS: The Mandarin Keyword Spotting System Won the Video Keyword Wakeup Challenge
Yuting Yang
Binbin Du
Yingxin Zhang
Wenxuan Wang
Yuke Li
21
0
0
03 Dec 2021
Sound-Guided Semantic Image Manipulation
Seung Hyun Lee
Wonseok Roh
Wonmin Byeon
Sang Ho Yoon
Chanyoung Kim
Jinkyu Kim
Sangpil Kim
DiffM
42
43
0
30 Nov 2021
SP-SEDT: Self-supervised Pre-training for Sound Event Detection Transformer
Zhi-qin Ye
Xiangdong Wang
Hong Liu
Yueliang Qian
Ruijie Tao
Long Yan
Kazushige Ouchi
ViT
29
2
0
30 Nov 2021
Classification of animal sounds in a hyperdiverse rainforest using Convolutional Neural Networks
Yuren Sun
Tatiana Midori Maeda
Claudia R. Solís-Lemus
Daniel L. Pimentel-Alarcón
Z. Buřivalová
24
18
0
29 Nov 2021
ESPnet-SLU: Advancing Spoken Language Understanding through ESPnet
Siddhant Arora
Siddharth Dalmia
Pavel Denisov
Xuankai Chang
Yushi Ueda
...
Karthik Ganesan
Brian Yan
Ngoc Thang Vu
A. Black
Shinji Watanabe
VLM
38
74
0
29 Nov 2021
Catch Me If You Hear Me: Audio-Visual Navigation in Complex Unmapped Environments with Moving Sounds
Abdelrahman Younes
Daniel Honerkamp
Tim Welschehold
Abhinav Valada
30
40
0
29 Nov 2021
Romanian Speech Recognition Experiments from the ROBIN Project
Andrei-Marius Avram
Vasile Păiș
Dan Tufiș
27
4
0
23 Nov 2021
Deep Spoken Keyword Spotting: An Overview
Iván López-Espejo
Zheng-Hua Tan
John H. L. Hansen
Jesper Jensen
26
102
0
20 Nov 2021
A comparison of streaming models and data augmentation methods for robust speech recognition
Jiyeon Kim
Mehul Kumar
Dhananjaya N. Gowda
Abhinav Garg
Chanwoo Kim
31
5
0
19 Nov 2021
SALSA-Lite: A Fast and Effective Feature for Polyphonic Sound Event Localization and Detection with Microphone Arrays
Thi Ngoc Tho Nguyen
Douglas L. Jones
Karn N. Watcharasupat
Huy P Phan
W. Gan
33
36
0
16 Nov 2021
Attention based end to end Speech Recognition for Voice Search in Hindi and English
Raviraj Joshi
Venkateshan Kannan
30
7
0
15 Nov 2021
A Convolutional Neural Network Based Approach to Recognize Bangla Spoken Digits from Speech Signal
Ovishake Sen
Al-Mahmud
Pias Roy
9
5
0
12 Nov 2021
Domain Generalization on Efficient Acoustic Scene Classification using Residual Normalization
Byeonggeun Kim
Seunghan Yang
Jang-Hyun Kim
Simyung Chang
31
15
0
12 Nov 2021