SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition

18 April 2019

Papers citing "SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition"

50 / 1,048 papers shown

A Study On Data Augmentation In Voice Anti-Spoofing Ariel Cohen Inbal Rimon Eran Aflalo Haim Permuter 78 46 0 20 Oct 2021
SSAST: Self-Supervised Audio Spectrogram Transformer Yuan Gong Cheng-I Jeff Lai Yu-An Chung James R. Glass ViT 100 277 0 19 Oct 2021
Efficient Sequence Training of Attention Models using Approximative Recombination Nils-Philipp Wynands Wilfried Michel Jan Rosendahl Ralf Schluter Hermann Ney 45 3 0 18 Oct 2021
Improving End-To-End Modeling for Mispronunciation Detection with Effective Augmentation Mechanisms Tien-Hong Lo Y. Sung Berlin Chen 39 7 0 17 Oct 2021
Omni-sparsity DNN: Fast Sparsity Optimization for On-Device Streaming E2E ASR via Supernet Haichuan Yang Yuan Shangguan Dilin Wang Meng Li P. Chuang Xiaohui Zhang Ganesh Venkatesh Ozlem Kalinli Vikas Chandra 84 14 0 15 Oct 2021
Conformer-Based Self-Supervised Learning for Non-Speech Audio Tasks Sangeeta Srivastava Yun Wang Andros Tjandra Anurag Kumar Chunxi Liu Kritika Singh Yatharth Saraf SSL 99 25 0 14 Oct 2021
Multi-ACCDOA: Localizing and Detecting Overlapping Sounds from the Same Class with Auxiliary Duplicating Permutation Invariant Training Kazuki Shimada Yuichiro Koyama Shusuke Takahashi Naoya Takahashi E. Tsunoo Yuki Mitsufuji 72 66 0 14 Oct 2021
Ego4D: Around the World in 3,000 Hours of Egocentric Video Kristen Grauman Andrew Westbury Eugene Byrne Zachary Chavis Antonino Furnari ... Mike Zheng Shou Antonio Torralba Lorenzo Torresani Mingfei Yan Jitendra Malik EgoV 431 1,115 0 13 Oct 2021
Study of positional encoding approaches for Audio Spectrogram Transformers L. Pepino Pablo Riera Luciana Ferrer ViT 53 7 0 13 Oct 2021
Decision Attentive Regularization to Improve Simultaneous Speech Translation Systems Mohd Abbas Zaidi Beomseok Lee Sangha Kim Chanwoo Kim 66 5 0 13 Oct 2021
Duality Temporal-channel-frequency Attention Enhanced Speaker Representation Learning Li Zhang Qing Wang Lei Xie 114 17 0 13 Oct 2021
Exploring Wav2vec 2.0 fine-tuning for improved speech emotion recognition Li-Wei Chen Alexander I. Rudnicky VLM 105 130 0 12 Oct 2021
Multi-Modal Pre-Training for Automated Speech Recognition David M. Chan Shalini Ghosh D. Chakrabarty Björn Hoffmeister SSL 92 16 0 12 Oct 2021
Spatial mixup: Directional loudness modification as data augmentation for sound event localization and detection Ricardo Falcón Pérez Kazuki Shimada Yuichiro Koyama Shusuke Takahashi Yuki Mitsufuji 130 5 0 12 Oct 2021
Improving the Performance of Automated Audio Captioning via Integrating the Acoustic and Semantic Information Zhongjie Ye Helin Wang Dongchao Yang Yuexian Zou 101 28 0 12 Oct 2021
Word Order Does Not Matter For Speech Recognition Vineel Pratap Qiantong Xu Tatiana Likhomanenko Gabriel Synnaeve R. Collobert 79 4 0 12 Oct 2021
SRU++: Pioneering Fast Recurrence with Attention for Speech Recognition Jing Pan Tao Lei Kwangyoun Kim Kyu Jeong Han Shinji Watanabe VLM 57 10 0 11 Oct 2021
A Comparative Study on Non-Autoregressive Modelings for Speech-to-Text Generation Yosuke Higuchi Nanxin Chen Yuya Fujita Hirofumi Inaguma Tatsuya Komatsu Jaesong Lee Jumon Nozaki Tianzi Wang Shinji Watanabe 49 43 0 11 Oct 2021
K-Wav2vec 2.0: Automatic Speech Recognition based on Joint Decoding of Graphemes and Syllables Jounghee Kim Pilsung Kang VLM 44 6 0 11 Oct 2021
Efficient Training of Audio Transformers with Patchout Khaled Koutini Jan Schluter Hamid Eghbalzadeh Gerhard Widmer ViT 176 263 0 11 Oct 2021
Data Augmentation with Locally-time Reversed Speech for Automatic Speech Recognition Si-Ioi Ng Tan Lee 34 2 0 09 Oct 2021
TitaNet: Neural Model for speaker representation with 1D Depth-wise separable convolutions and global context Nithin Rao Koluguri Taejin Park Boris Ginsburg ViT 121 104 0 08 Oct 2021
Improving Pseudo-label Training For End-to-end Speech Recognition Using Gradient Mask Shaoshi Ling Chen Shen Meng Cai Zejun Ma VLM SSL 76 10 0 08 Oct 2021
Neural Model Reprogramming with Similarity Based Mapping for Low-Resource Spoken Command Recognition Hao Yen Pin-Jui Ku Chao-Han Huck Yang Hu Hu Sabato Marco Siniscalchi Pin-Yu Chen Yu Tsao 114 5 0 08 Oct 2021
Phone-to-audio alignment without text: A Semi-supervised Approach Jian Zhu Cong Zhang David Jurgens 63 38 0 08 Oct 2021
PEAF: Learnable Power Efficient Analog Acoustic Features for Audio Recognition Boris Bergsma Minhao Yang Milos Cernak 60 4 0 07 Oct 2021
Enabling On-Device Training of Speech Recognition Models with Federated Dropout Dhruv Guliani Lillian Zhou Changwan Ryu Tien-Ju Yang Harry Zhang Yong Xiao F. Beaufays Giovanni Motta FedML 58 16 0 07 Oct 2021
Peer Collaborative Learning for Polyphonic Sound Event Detection Hayato Endo Hiromitsu Nishizaki 39 4 0 07 Oct 2021
Mandarin-English Code-switching Speech Recognition with Self-supervised Speech Representation Models Liang-Hsuan Tseng Yu-Kuan Fu Heng-Jui Chang Hung-yi Lee SSL 47 14 0 07 Oct 2021
WenetSpeech: A 10000+ Hours Multi-domain Mandarin Corpus for Speech Recognition Binbin Zhang Hang Lv Pengcheng Guo Qijie Shao Chao Yang ... Hui Bu Xiaoyu Chen Chenchen Zeng Di Wu Zhendong Peng 138 231 0 07 Oct 2021
Transferring Voice Knowledge for Acoustic Event Detection: An Empirical Study Dawei Liang Yangyang Shi Yun Wang Nayan Singhal Alex Xiao Jonathan Shaw Edison Thomaz Ozlem Kalinli M. Seltzer 50 4 0 07 Oct 2021
An Investigation of the Effectiveness of Phase for Audio Classification Shunsuke Hidaka Kohei Wakamiya T. Kaburagi 28 4 0 06 Oct 2021
ASR Rescoring and Confidence Estimation with ELECTRA Hayato Futami Hirofumi Inaguma Masato Mimura S. Sakai Tatsuya Kawahara KELM 104 21 0 05 Oct 2021
SALSA: Spatial Cue-Augmented Log-Spectrogram Features for Polyphonic Sound Event Localization and Detection Thi Ngoc Tho Nguyen Karn N. Watcharasupat Ngoc Khanh Nguyen Douglas L. Jones W. Gan 80 49 0 01 Oct 2021
SpliceOut: A Simple and Efficient Audio Augmentation Method Arjit Jain Pranay Reddy Samala Deepak Mittal Preethi Jyothi M. Singh 132 11 0 30 Sep 2021
Fine-tuning wav2vec2 for speaker recognition Nik Vaessen David A. van Leeuwen 116 109 0 30 Sep 2021
FastCorrect 2: Fast Error Correction on Multiple Candidates for Automatic Speech Recognition Yichong Leng Xu Tan Rui Wang Linchen Zhu Jin Xu ... Linquan Liu Tao Qin Xiang-Yang Li Ed Lin Tie-Yan Liu 129 42 0 29 Sep 2021
BigSSL: Exploring the Frontier of Large-Scale Semi-Supervised Learning for Automatic Speech Recognition Yu Zhang Daniel S. Park Wei Han James Qin Anmol Gulati ... Zhifeng Chen Quoc V. Le Chung-Cheng Chiu Ruoming Pang Yonghui Wu SSL 86 176 0 27 Sep 2021
Fast-MD: Fast Multi-Decoder End-to-End Speech Translation with Non-Autoregressive Hidden Intermediates Hirofumi Inaguma Siddharth Dalmia Brian Yan Shinji Watanabe 99 11 0 27 Sep 2021
ChannelAugment: Improving generalization of multi-channel ASR by training with input channel randomization M. Gaudesi F. Weninger D. Sharma P. Zhan AAML 73 1 0 23 Sep 2021
Multi-view Contrastive Self-Supervised Learning of Accounting Data Representations for Downstream Audit Tasks Marco Schreyer Timur Sattarov Damian Borth MLAU 76 15 0 23 Sep 2021
Hybrid Data Augmentation and Deep Attention-based Dilated Convolutional-Recurrent Neural Networks for Speech Emotion Recognition Nhat Truong Pham Duc Ngoc Minh Dang Sy Dzung Nguyen 27 38 0 18 Sep 2021
Dual-Encoder Architecture with Encoder Selection for Joint Close-Talk and Far-Talk Speech Recognition F. Weninger M. Gaudesi Ralf Leibold R. Gemello P. Zhan 46 4 0 17 Sep 2021
Tied & Reduced RNN-T Decoder Rami Botros Tara N. Sainath R. David Emmanuel Guzman Wei Li Yanzhang He 86 55 0 15 Sep 2021
Dialog speech sentiment classification for imbalanced datasets Sergis Nicolaou Lambros Mavrides G. Tryfou Kyriakos Tolias Konstantinos P. Panousis S. Chatzis Sergios Theodoridis 41 0 0 15 Sep 2021
Residual Adapters for Parameter-Efficient ASR Adaptation to Atypical and Accented Speech Katrin Tomanek Vicky Zayats Dirk Padfield K. Vaillancourt Fadi Biadsy 128 58 0 14 Sep 2021
Non-autoregressive Transformer with Unified Bidirectional Decoder for Automatic Speech Recognition Chuan-Fei Zhang Yang Liu Tianren Zhang Songlu Chen Feng Chen Xu-Cheng Yin 56 8 0 14 Sep 2021
Unsupervised Domain Adaptation Schemes for Building ASR in Low-resource Languages A. C. S. Prathosh A P A. G. Ramakrishnan 91 13 0 12 Sep 2021
Self-Attention Channel Combinator Frontend for End-to-End Multichannel Far-field Speech Recognition Rong Gong Carl Quillen D. Sharma Andrew Goderre José Laínez Ljubomir Milanović 94 14 0 10 Sep 2021
Speechformer: Reducing Information Loss in Direct Speech Translation Sara Papi Marco Gaido Matteo Negri Marco Turchi 129 24 0 09 Sep 2021