Advances in Joint CTC-Attention based End-to-End Speech Recognition with a Deep CNN Encoder and RNN-LM

8 June 2017

Papers citing "Advances in Joint CTC-Attention based End-to-End Speech Recognition with a Deep CNN Encoder and RNN-LM"

The Conformer Encoder May Reverse the Time Dimension Robin Schmitt Albert Zeyer Mohammad Zeineldeen Ralf Schlüter Hermann Ney 39 0 0 01 Oct 2024
Speaker Characterization by means of Attention Pooling Federico Costa Miquel India Javier Hernando 33 1 0 07 May 2024
BLSTM-Based Confidence Estimation for End-to-End Speech Recognition A. Ogawa Naohiro Tawara Takatomo Kano Marc Delcroix 46 4 0 22 Dec 2023
BayesSpeech: A Bayesian Transformer Network for Automatic Speech Recognition Will Rieger BDL UQCV 19 0 0 16 Jan 2023
Intermediate Fine-Tuning Using Imperfect Synthetic Speech for Improving Electrolaryngeal Speech Recognition Lester Phillip Violeta D. Ma Wen-Chin Huang T. Toda 39 7 0 02 Nov 2022
A Universally-Deployable ASR Frontend for Joint Acoustic Echo Cancellation, Speech Enhancement, and Voice Separation Tom O'Malley A. Narayanan Quan Wang 27 5 0 14 Sep 2022
FitHuBERT: Going Thinner and Deeper for Knowledge Distillation of Speech Self-Supervised Learning Yeonghyeon Lee Kangwook Jang Jahyun Goo Youngmoon Jung Hoi-Rim Kim 28 29 0 01 Jul 2022
On Comparison of Encoders for Attention based End to End Speech Recognition in Standalone and Rescoring Mode Raviraj Joshi Subodh Kumar 36 2 0 26 Jun 2022
Beyond the Status Quo: A Contemporary Survey of Advances and Challenges in Audio Captioning Xuenan Xu Zeyu Xie Mengyue Wu K. Yu 41 13 0 11 May 2022
3M: Multi-loss, Multi-path and Multi-level Neural Networks for speech recognition Zhao You Shulin Feng Dan Su Dong Yu 22 9 0 07 Apr 2022
Recent Advances in End-to-End Automatic Speech Recognition Jinyu Li VLM 37 363 0 02 Nov 2021
Cross-attention conformer for context modeling in speech enhancement for ASR A. Narayanan Chung-Cheng Chiu Tom O'Malley Quan Wang Yanzhang He 24 14 0 30 Oct 2021
Self-Attention Channel Combinator Frontend for End-to-End Multichannel Far-field Speech Recognition Rong Gong Carl Quillen D. Sharma Andrew Goderre José Laínez Ljubomir Milanović 39 13 0 10 Sep 2021
Investigations on Speech Recognition Systems for Low-Resource Dialectal Arabic-English Code-Switching Speech Injy Hamed Pavel Denisov C. Li M. Elmahdy Slim Abdennadher Ngoc Thang Vu 36 35 0 29 Aug 2021
Greenformers: Improving Computation and Memory Efficiency in Transformer Models via Low-Rank Approximation Samuel Cahyawijaya 28 12 0 24 Aug 2021
Searchable Hidden Intermediates for End-to-End Models of Decomposable Sequence Tasks Siddharth Dalmia Brian Yan Vikas Raunak Florian Metze Shinji Watanabe 47 30 0 02 May 2021
Advanced Long-context End-to-end Speech Recognition Using Context-expanded Transformers Takaaki Hori Niko Moritz Chiori Hori Jonathan Le Roux 30 34 0 19 Apr 2021
SubSpectral Normalization for Neural Audio Data Processing Simyung Chang Hyoungwoo Park Janghoon Cho Hyunsin Park Sungrack Yun Kyuwoong Hwang 31 30 0 25 Mar 2021
Transformer-based ASR Incorporating Time-reduction Layer and Fine-tuning with Self-Knowledge Distillation Md. Akmal Haidar Chao Xing Mehdi Rezagholizadeh 27 7 0 17 Mar 2021
Hierarchical Transformer-based Large-Context End-to-end ASR with Large-Context Knowledge Distillation Ryo Masumura Naoki Makishima Mana Ihori Akihiko Takashima Tomohiro Tanaka Shota Orihashi 31 29 0 16 Feb 2021
Learning Fast Adaptation on Cross-Accented Speech Recognition Genta Indra Winata Samuel Cahyawijaya Zihan Liu Zhaojiang Lin Andrea Madotto Peng Xu Pascale Fung 52 80 0 04 Mar 2020
End-to-End Automatic Speech Recognition Integrated With CTC-Based Voice Activity Detection Takenori Yoshimura Tomoki Hayashi K. Takeda Shinji Watanabe 37 49 0 03 Feb 2020
Recognizing long-form speech using streaming end-to-end models A. Narayanan Rohit Prabhavalkar Chung-Cheng Chiu David Rybach Tara N. Sainath Trevor Strohman 29 129 0 24 Oct 2019
A practical two-stage training strategy for multi-stream end-to-end speech recognition Ruizhi Li Gregory Sell Xiaofei Wang Shinji Watanabe H. Hermansky 21 7 0 23 Oct 2019
A Comparative Study on Transformer vs RNN in Speech Applications Shigeki Karita Nanxin Chen Tomoki Hayashi Takaaki Hori Hirofumi Inaguma ... Ryuichi Yamamoto Xiao-fei Wang Shinji Watanabe Takenori Yoshimura Wangyou Zhang 37 716 0 13 Sep 2019
Self Multi-Head Attention for Speaker Recognition Miquel India Pooyan Safari Javier Hernando 19 110 0 24 Jun 2019
Acoustic-to-Word Models with Conversational Context Information Suyoun Kim Florian Metze 22 7 0 21 May 2019
Semi-supervised Sequence-to-sequence ASR using Unpaired Speech and Text M. Baskar Shinji Watanabe Ramón Fernández Astudillo Takaaki Hori L. Burget J. Černocký 36 41 0 30 Apr 2019
Stream attention-based multi-array end-to-end speech recognition Xiaofei Wang Ruizhi Li Sri Harish Reddy Mallidi Takaaki Hori Shinji Watanabe H. Hermansky 25 21 0 12 Nov 2018
Multi-encoder multi-resolution framework for end-to-end speech recognition Ruizhi Li Xiaofei Wang Sri Harish Reddy Mallidi Takaaki Hori Shinji Watanabe H. Hermansky 22 13 0 12 Nov 2018
Few-shot learning with attention-based sequence-to-sequence models Bertrand Higy P. Bell 24 6 0 08 Nov 2018
Language model integration based on memory control for sequence to sequence speech recognition Aaron Springer Shinji Watanabe Takaaki Hori M. Baskar Hirofumi Inaguma Jesus Villalba Najim Dehak KELM 41 5 0 06 Nov 2018
Adversarial Training of End-to-end Speech Recognition Using a Criticizing Language Model Alexander H. Liu Hung-yi Lee Lin-Shan Lee AuLLM 16 46 0 02 Nov 2018
Dialog-context aware end-to-end speech recognition Suyoun Kim Florian Metze 24 47 0 07 Aug 2018
Extending Recurrent Neural Aligner for Streaming End-to-End Speech Recognition in Mandarin Linhao Dong Shiyu Zhou Wei Chen Bo Xu 24 22 0 17 Jun 2018
ESPnet: End-to-End Speech Processing Toolkit Shinji Watanabe Takaaki Hori Shigeki Karita Tomoki Hayashi Jiro Nishitoba ... Jahn Heymann Sanjeev Khudanpur Nanxin Chen Adithya Renduchintala Tsubasa Ochiai VLM 46 1,481 0 30 Mar 2018