Listen, Attend and Spell

5 August 2015

Papers citing "Listen, Attend and Spell"

50 / 1,041 papers shown

Title
ReconVAT: A Semi-Supervised Automatic Music Transcription Framework for Low-Resource Real-World Data K. Cheuk Dorien Herremans Li Su 208 34 0 11 Jul 2021
On lattice-free boosted MMI training of HMM and CTC-based full-context ASR models Xiaohui Zhang Vimal Manohar David C. Zhang Frank Zhang Yangyang Shi Nayan Singhal Julian Chan Fuchun Peng Yatharth Saraf M. Seltzer 91 14 0 09 Jul 2021
End-to-End Rich Transcription-Style Automatic Speech Recognition with Semi-Supervised Learning Tomohiro Tanaka Ryo Masumura Mana Ihori Akihiko Takashima Shota Orihashi Naoki Makishima 45 4 0 07 Jul 2021
Instant One-Shot Word-Learning for Context-Specific Neural Sequence-to-Sequence Speech Recognition Christian Huber Juan Hussain Sebastian Stüker A. Waibel 73 27 0 05 Jul 2021
Relaxed Attention: A Simple Method to Boost Performance of End-to-End Automatic Speech Recognition Timo Lohrenz P. Schwarz Zhengyang Li Tim Fingscheidt 54 11 0 02 Jul 2021
What do End-to-End Speech Models Learn about Speaker, Language and Channel Information? A Layer-wise and Neuron-level Analysis Shammur A. Chowdhury Nadir Durrani Ahmed M. Ali 120 16 0 01 Jul 2021
On joint training with interfaces for spoken language understanding A. Raju Milind Rao Gautam Tiwari Pranav Dheram Bryan Anderson Zhe Zhang Chul Lee Bach Bui Ariya Rastrow VLM 55 11 0 30 Jun 2021
A Survey on Neural Speech Synthesis Xu Tan Tao Qin Frank Soong Tie-Yan Liu AI4TS 139 359 0 29 Jun 2021
Where are we in semantic concept extraction for Spoken Language Understanding? Sahar Ghannay Antoine Caubrière Salima Mdhaffar G. Laperriere Bassam Jabaian Yannick Esteve 67 18 0 24 Jun 2021
Towards Automatic Speech to Sign Language Generation Parul Kapoor Rudrabha Mukhopadhyay Sindhu B. Hegde Vinay P. Namboodiri C. V. Jawahar SLR 54 12 0 24 Jun 2021
Efficient Conformer with Prob-Sparse Attention Mechanism for End-to-End Speech Recognition Xiong Wang Sining Sun Lei Xie Long Ma 65 20 0 17 Jun 2021
Layer Pruning on Demand with Intermediate CTC Jaesong Lee Jingu Kang Shinji Watanabe 45 18 0 17 Jun 2021
Momentum Pseudo-Labeling for Semi-Supervised Speech Recognition Yosuke Higuchi Niko Moritz Jonathan Le Roux Takaaki Hori VLM 132 53 0 16 Jun 2021
Attention-Based Keyword Localisation in Speech using Visual Grounding Kayode Olaleye Herman Kamper 66 13 0 16 Jun 2021
SynthASR: Unlocking Synthetic Data for Speech Recognition A. Fazel Wei Yang Yulan Liu Roberto Barra-Chicote Yi Meng Roland Maas J. Droppo SyDa 110 51 0 14 Jun 2021
Improving RNN-T ASR Performance with Date-Time and Location Awareness Swayambhu Nath Ray Soumyajit Mitra Raghavendra Bilgi Sri Garimella 38 5 0 11 Jun 2021
Raw Waveform Encoder with Multi-Scale Globally Attentive Locally Recurrent Networks for End-to-End Speech Recognition Max W. Y. Lam Jun Wang Chao Weng Jane Polak Scowcroft Dong Yu 68 6 0 08 Jun 2021
Data Augmentation Methods for End-to-end Speech Recognition on Distant-Talk Scenarios E. Tsunoo Kentarou Shibata Chaitanya Narisetty Yosuke Kashiwagi Shinji Watanabe 71 12 0 07 Jun 2021
Approximate Fixed-Points in Recurrent Neural Networks Zhengxiong Wang Anton Ragni 27 4 0 04 Jun 2021
Minimum Word Error Rate Training with Language Model Fusion for End-to-End Speech Recognition Zhong Meng Yu-Huan Wu Naoyuki Kanda Liang Lu Xie Chen Guoli Ye Eric Sun Jinyu Li Jiawei Liu MoMe 101 21 0 04 Jun 2021
Towards One Model to Rule All: Multilingual Strategy for Dialectal Code-Switching Arabic ASR Shammur A. Chowdhury A. Hussein Ahmed Abdelali Ahmed M. Ali 78 36 0 31 May 2021
Listen with Intent: Improving Speech Recognition with Audio-to-Intent Front-End Swayambhu Nath Ray Minhua Wu A. Raju Pegah Ghahremani Raghavendra Bilgi Milind Rao Harish Arsikere Ariya Rastrow A. Stolcke J. Droppo 68 11 0 14 May 2021
Exploring CTC Based End-to-End Techniques for Myanmar Speech Recognition Khin Me Me Chit Laet Laet Lin 40 4 0 13 May 2021
Quantifying and Maximizing the Benefits of Back-End Noise Adaption on Attention-Based Speech Recognition Models Coleman Hooper Thierry Tambe Gu-Yeon Wei 43 0 0 03 May 2021
On the limit of English conversational speech recognition Zoltán Tüske G. Saon Brian Kingsbury 94 50 0 03 May 2021
On Addressing Practical Challenges for RNN-Transducer Rui Zhao Jian Xue Jinyu Li Wenning Wei Lei He Jiawei Liu 72 32 0 27 Apr 2021
Bridging the gap between streaming and non-streaming ASR systems by distilling ensembles of CTC and RNN-T models Thibault Doutre Wei Han Chung-Cheng Chiu Ruoming Pang Olivier Siohan Liangliang Cao 68 5 0 25 Apr 2021
LeBenchmark: A Reproducible Framework for Assessing Self-Supervised Representation Learning from Speech Solène Evain H. Nguyen Hang Le Marcely Zanon Boito Salima Mdhaffar ... François Portet Solange Rossato Fabien Ringeval D. Schwab Laurent Besacier SSL 119 70 0 23 Apr 2021
Fast Text-Only Domain Adaptation of RNN-Transducer Prediction Network Janne Pylkkönen Antti Ukkonen Juho Kilpikoski Samu Tamminen Hannes Heikinheimo 60 27 0 22 Apr 2021
Advanced Long-context End-to-end Speech Recognition Using Context-expanded Transformers Takaaki Hori Niko Moritz Chiori Hori Jonathan Le Roux 76 34 0 19 Apr 2021
Non-linear Functional Modeling using Neural Networks Aniruddha Rajendra Rao M. Reimherr 66 31 0 19 Apr 2021
Acoustic Data-Driven Subword Modeling for End-to-End Speech Recognition Wei Zhou Mohammad Zeineldeen Zuoyun Zheng Ralf Schluter Hermann Ney 77 14 0 19 Apr 2021
A Method to Reveal Speaker Identity in Distributed ASR Training, and How to Counter It Trung D. Q. Dang Om Thakkar Swaroop Indra Ramaswamy Rajiv Mathews Peter Chin Françoise Beaufays FedML 53 10 0 15 Apr 2021
Integration of Pre-trained Networks with Continuous Token Interface for End-to-End Spoken Language Understanding S. Seo Donghyun Kwak Bowon Lee 85 33 0 15 Apr 2021
Annealing Knowledge Distillation A. Jafari Mehdi Rezagholizadeh Pranav Sharma A. Ghodsi 98 79 0 14 Apr 2021
Efficient conformer-based speech recognition with linear attention Shengqiang Li Menglong Xu Xiao-Lei Zhang 60 23 0 14 Apr 2021
Investigating Methods to Improve Language Model Integration for Attention-based Encoder-Decoder ASR Models Mohammad Zeineldeen Aleksandr Glushko Wilfried Michel Albert Zeyer Ralf Schluter Hermann Ney AuLLM 48 40 0 12 Apr 2021
Non-autoregressive Transformer-based End-to-end ASR using BERT Fu-Hao Yu Kuan-Yu Chen 59 23 0 10 Apr 2021
Lip reading using external viseme decoding J. Peymanfard Mohammad Reza Mohammadi Hossein Zeinali N. Mozayani 53 11 0 10 Apr 2021
Boundary and Context Aware Training for CIF-based Non-Autoregressive End-to-end ASR Fan Yu Haoneng Luo Pengcheng Guo Yuhao Liang Zhuoyuan Yao Lei Xie Yingying Gao Leijing Hou Shilei Zhang 36 11 0 10 Apr 2021
Language model fusion for streaming end to end speech recognition Rodrigo Cabrera Xiaofeng Liu M. Ghodsi Zebulun Matteson Eugene Weinstein Anjuli Kannan MoMe AI4TS 70 14 0 09 Apr 2021
On Architectures and Training for Raw Waveform Feature Extraction in ASR Peter Vieting Christoph Luscher Wilfried Michel Ralf Schluter Hermann Ney 59 10 0 09 Apr 2021
FSR: Accelerating the Inference Process of Transducer-Based Models by Applying Fast-Skip Regularization Zhengkun Tian Jiangyan Yi Ye Bai J. Tao Shuai Zhang Zhengqi Wen 51 16 0 07 Apr 2021
Darts-Conformer: Towards Efficient Gradient-Based Neural Architecture Search For End-to-End ASR Xian Shi Pan Zhou Wei Chen Lei Xie 81 17 0 07 Apr 2021
Extremely Low Footprint End-to-End ASR System for Smart Device Zhifu Gao Yiwu Yao Shiliang Zhang Jun Yang Ming Lei Ian Mcloughlin 43 13 0 06 Apr 2021
Non-autoregressive Mandarin-English Code-switching Speech Recognition Shun-Po Chuang Heng-Jui Chang Sung-Feng Huang Hung-yi Lee 90 15 0 06 Apr 2021
Understanding Medical Conversations: Rich Transcription, Confidence Scores & Information Extraction H. Soltau Mingqiu Wang Izhak Shafran Laurent El Shafey MedIm LM&MA 75 13 0 06 Apr 2021
Dissecting User-Perceived Latency of On-Device E2E Speech Recognition Yuan Shangguan Rohit Prabhavalkar Hang Su Jay Mahadeokar Yangyang Shi ... Chunyang Wu Duc Le Ozlem Kalinli Christian Fuegen M. Seltzer 57 29 0 06 Apr 2021
SpeechStew: Simply Mix All Available Speech Recognition Data to Train One Large Neural Network William Chan Daniel S. Park Chris A. Lee Yu Zhang Quoc V. Le Mohammad Norouzi AI4TS 90 138 0 05 Apr 2021
Streaming Multi-talker Speech Recognition with Joint Speaker Identification Liang Lu Naoyuki Kanda Jinyu Li Jiawei Liu 75 20 0 05 Apr 2021