Title
A Survey on Cost Types, Interaction Schemes, and Annotator Performance Models in Selection Algorithms for Active Learning in Classification M. Herde Denis Huseljic Bernhard Sick A. Calma 89 25 0 23 Sep 2021
Continuous Streaming Multi-Talker ASR with Dual-path Transducers Desh Raj Liang Lu Zhuo Chen Yashesh Gaur Jinyu Li 57 18 0 17 Sep 2021
Tied & Reduced RNN-T Decoder Rami Botros Tara N. Sainath R. David Emmanuel Guzman Wei Li Yanzhang He 83 55 0 15 Sep 2021
Non-autoregressive Transformer with Unified Bidirectional Decoder for Automatic Speech Recognition Chuan-Fei Zhang Yang Liu Tianren Zhang Songlu Chen Feng Chen Xu-Cheng Yin 53 8 0 14 Sep 2021
Adversarial Parameter Defense by Multi-Step Risk Minimization Zhiyuan Zhang Ruixuan Luo Xuancheng Ren Qi Su Liangyou Li Xu Sun AAML 64 6 0 07 Sep 2021
Coarse-To-Fine And Cross-Lingual ASR Transfer Peter Polák Ondrej Bojar 46 3 0 02 Sep 2021
Tree-constrained Pointer Generator for End-to-end Contextual Speech Recognition Guangzhi Sun Chao Zhang P. Woodland 103 33 0 01 Sep 2021
Investigations on Speech Recognition Systems for Low-Resource Dialectal Arabic-English Code-Switching Speech Injy Hamed Pavel Denisov C. Li Mohamed S. Elmahdy Slim Abdennadher Ngoc Thang Vu 66 36 0 29 Aug 2021
Reducing Exposure Bias in Training Recurrent Neural Network Transducers Xiaodong Cui Brian Kingsbury G. Saon David Haws Zoltán Tüske 46 5 0 24 Aug 2021
Generalizing RNN-Transducer to Out-Domain Audio via Sparse Self-Attention Layers Juntae Kim Jee-Hye Lee 41 6 0 22 Aug 2021
UnSplit: Data-Oblivious Model Inversion, Model Stealing, and Label Inference Attacks Against Split Learning Ege Erdogan Alptekin Kupcu A. E. Cicek FedML MIACV 77 79 0 20 Aug 2021
SpecMix: A Mixed Sample Data Augmentation method for Training with Time-Frequency Domain Features Gwantae Kim D. Han Hanseok Ko 101 45 0 06 Aug 2021
The Performance Evaluation of Attention-Based Neural ASR under Mixed Speech Input Bradley He Martin H. Radfar 32 1 0 03 Aug 2021
A Configurable Multilingual Model is All You Need to Recognize All Languages Long Zhou Jinyu Li Eric Sun Shujie Liu 136 42 0 13 Jul 2021
On lattice-free boosted MMI training of HMM and CTC-based full-context ASR models Xiaohui Zhang Vimal Manohar David C. Zhang Frank Zhang Yangyang Shi Nayan Singhal Julian Chan Fuchun Peng Yatharth Saraf M. Seltzer 83 14 0 09 Jul 2021
Relaxed Attention: A Simple Method to Boost Performance of End-to-End Automatic Speech Recognition Timo Lohrenz P. Schwarz Zhengyang Li Tim Fingscheidt 52 11 0 02 Jul 2021
Dual Causal/Non-Causal Self-Attention for Streaming End-to-End Speech Recognition Niko Moritz Takaaki Hori Jonathan Le Roux 59 21 0 02 Jul 2021
On joint training with interfaces for spoken language understanding A. Raju Milind Rao Gautam Tiwari Pranav Dheram Bryan Anderson Zhe Zhang Chul Lee Bach Bui Ariya Rastrow VLM 55 11 0 30 Jun 2021
A Survey on Neural Speech Synthesis Xu Tan Tao Qin Frank Soong Tie-Yan Liu AI4TS 133 359 0 29 Jun 2021
Where are we in semantic concept extraction for Spoken Language Understanding? Sahar Ghannay Antoine Caubrière Salima Mdhaffar G. Laperriere Bassam Jabaian Yannick Esteve 46 18 0 24 Jun 2021
QASR: QCRI Aljazeera Speech Resource -- A Large Scale Annotated Arabic Speech Corpus Hamdy Mubarak A. Hussein Shammur A. Chowdhury Ahmed M. Ali 51 48 0 24 Jun 2021
End-to-End Spoken Language Understanding for Generalized Voice Assistants Michael Stephen Saxon Samridhi Choudhary Joseph P. McKenna Athanasios Mouchtaris VLM 83 26 0 16 Jun 2021
Momentum Pseudo-Labeling for Semi-Supervised Speech Recognition Yosuke Higuchi Niko Moritz Jonathan Le Roux Takaaki Hori VLM 129 52 0 16 Jun 2021
Dynamic Gradient Aggregation for Federated Domain Adaptation Dimitrios Dimitriadis K. Kumatani R. Gmyr Yashesh Gaur Sefik Emre Eskimez FedML 72 5 0 14 Jun 2021
Unsupervised Automatic Speech Recognition: A Review Hanan Aldarmaki Asad Ullah Nazar Zaki VLM SSL 53 59 0 09 Jun 2021
Raw Waveform Encoder with Multi-Scale Globally Attentive Locally Recurrent Networks for End-to-End Speech Recognition Max W. Y. Lam Jun Wang Chao Weng Dan Su Dong Yu 65 6 0 08 Jun 2021
Minimum Word Error Rate Training with Language Model Fusion for End-to-End Speech Recognition Zhong Meng Yu-Huan Wu Naoyuki Kanda Liang Lu Xie Chen Guoli Ye Eric Sun Jinyu Li Jiawei Liu MoMe 94 21 0 04 Jun 2021
A Benchmarking on Cloud based Speech-To-Text Services for French Speech and Background Noise Effect Binbin Xu Chongyang Tao Z. Feng Youssef Raqui Sylvie Ranwez 64 13 0 07 May 2021
A Theoretical-Empirical Approach to Estimating Sample Complexity of DNNs Devansh Bisla Apoorva Nandini Saridena A. Choromańska 64 8 0 05 May 2021
Streaming end-to-end speech recognition with jointly trained neural feature enhancement Chanwoo Kim Abhinav Garg Dhananjaya N. Gowda Seongkyu Mun C. Han AuLLM 56 6 0 04 May 2021
Quantifying and Maximizing the Benefits of Back-End Noise Adaption on Attention-Based Speech Recognition Models Coleman Hooper Thierry Tambe Gu-Yeon Wei 37 0 0 03 May 2021
End-to-End Speech Recognition from Federated Acoustic Models Yan Gao Titouan Parcollet Salah Zaiem Javier Fernandez-Marques Pedro Porto Buarque de Gusmão Daniel J. Beutel Nicholas D. Lane 92 44 0 29 Apr 2021
On Addressing Practical Challenges for RNN-Transducer Rui Zhao Jian Xue Jinyu Li Wenning Wei Lei He Jiawei Liu 72 32 0 27 Apr 2021
Sparse Attention with Linear Units Biao Zhang Ivan Titov Rico Sennrich 54 40 0 14 Apr 2021
End-to-end Keyword Spotting using Neural Architecture Search and Quantization David Peter Wolfgang Roth Franz Pernkopf MQ 50 14 0 14 Apr 2021
Improved Conformer-based End-to-End Speech Recognition Using Neural Architecture Search Yukun Liu Ta Li Pengyuan Zhang Yonghong Yan AI4TS 35 7 0 12 Apr 2021
A Toolbox for Construction and Analysis of Speech Datasets Evelina Bakhturina Vitaly Lavrukhin Boris Ginsburg 48 12 0 11 Apr 2021
Boundary and Context Aware Training for CIF-based Non-Autoregressive End-to-end ASR Fan Yu Haoneng Luo Pengcheng Guo Yuhao Liang Zhuoyuan Yao Lei Xie Yingying Gao Leijing Hou Shilei Zhang 25 11 0 10 Apr 2021
Relaxing the Conditional Independence Assumption of CTC-based ASR by Conditioning on Intermediate Predictions Jumon Nozaki Tatsuya Komatsu 86 75 0 06 Apr 2021
SpeechStew: Simply Mix All Available Speech Recognition Data to Train One Large Neural Network William Chan Daniel S. Park Chris A. Lee Yu Zhang Quoc V. Le Mohammad Norouzi AI4TS 90 138 0 05 Apr 2021
Is Label Smoothing Truly Incompatible with Knowledge Distillation: An Empirical Study Zhiqiang Shen Zechun Liu Dejia Xu Zitian Chen Kwang-Ting Cheng Marios Savvides 76 76 0 01 Apr 2021
A study of latent monotonic attention variants Albert Zeyer Ralf Schluter Hermann Ney 75 5 0 30 Mar 2021
Residual Energy-Based Models for End-to-End Speech Recognition Qiujia Li Yu Zhang Yue Liu Liangliang Cao P. Woodland 67 14 0 25 Mar 2021
Advancing RNN Transducer Technology for Speech Recognition G. Saon Zoltan Tueske Daniel Bolaños Brian Kingsbury 95 88 0 17 Mar 2021
Towards the evaluation of automatic simultaneous speech translation from a communicative perspective Claudio Fantinuoli Bianca Prandi 138 18 0 15 Mar 2021
OkwuGbé: End-to-End Speech Recognition for Fon and Igbo Bonaventure F. P. Dossou Chris C. Emezue 70 14 0 13 Mar 2021
A Distributed Optimisation Framework Combining Natural Gradient with Hessian-Free for Discriminative Sequence Training Adnan Haider Chao Zhang Florian Kreyssig P. Woodland 107 7 0 12 Mar 2021
End-to-end acoustic modelling for phone recognition of young readers Lucile Gelin Morgane Daniel J. Pinquier Thomas Pellegrini 53 13 0 04 Mar 2021
Incorporating VAD into ASR System by Multi-task Learning Meng Li Xiai Yan Feng Lin VLM 16 3 0 02 Mar 2021
Alignment Knowledge Distillation for Online Streaming Attention-based Speech Recognition Hirofumi Inaguma Tatsuya Kawahara 123 14 0 28 Feb 2021