End-to-end Continuous Speech Recognition using Attention-based Recurrent NN: First Results

4 December 2014

Papers citing "End-to-end Continuous Speech Recognition using Attention-based Recurrent NN: First Results"

50 / 65 papers shown

Title
Experimental Study on Time Series Analysis of Lower Limb Rehabilitation Exercise Data Driven by Novel Model Architecture and Large Models Hengyu Lin AI4CE 28 0 0 04 Apr 2025
Automatic Speech Recognition for Non-Native English: Accuracy and Disfluency Handling Michael McGuire 49 0 0 10 Mar 2025
CS-Dialogue: A 104-Hour Dataset of Spontaneous Mandarin-English Code-Switching Dialogues for Speech Recognition Jiaming Zhou Yujie Guo Songtao Zhao Haoqin Sun Hui Wang ... Shiyao Wang Xi Yang Yixuan Wang Yonghua Lin Yong Qin 46 0 0 26 Feb 2025
Aligner-Encoders: Self-Attention Transformers Can Be Self-Transducers Adam Stooke Rohit Prabhavalkar K. Sim P. M. Mengibar 39 0 0 06 Feb 2025
Serialized Speech Information Guidance with Overlapped Encoding Separation for Multi-Speaker Automatic Speech Recognition Hao Shi Yuan Gao Zhaoheng Ni Tatsuya Kawahara 30 2 0 01 Sep 2024
Whistle: Data-Efficient Multilingual and Crosslingual Speech Recognition via Weakly Phonetic Supervision Saierdaer Yusuyin Te Ma Hao Huang Wenbo Zhao Zhijian Ou 46 2 0 04 Jun 2024
Enhancing Efficiency in Vision Transformer Networks: Design Techniques and Insights Moein Heidari Reza Azad Sina Ghorbani Kolahi René Arimond Leon Niggemeier ... Afshin Bozorgpour Ehsan Khodapanah Aghdam A. Kazerouni I. Hacihaliloglu Dorit Merhof 48 7 0 28 Mar 2024
BLSTM-Based Confidence Estimation for End-to-End Speech Recognition A. Ogawa Naohiro Tawara Takatomo Kano Marc Delcroix 46 4 0 22 Dec 2023
Factual Consistency Oriented Speech Recognition Naoyuki Kanda Takuya Yoshioka Yang Liu 43 0 0 24 Feb 2023
LiteLSTM Architecture Based on Weights Sharing for Recurrent Neural Networks Nelly Elsayed Zag ElSayed Anthony Maida 26 0 0 12 Jan 2023
Monotonic segmental attention for automatic speech recognition Albert Zeyer Robin Schmitt Wei Zhou Ralf Schluter Hermann Ney 16 8 0 26 Oct 2022
Improving Semi-supervised End-to-end Automatic Speech Recognition using CycleGAN and Inter-domain Losses C. Li Ngoc Thang Vu 14 2 0 20 Oct 2022
Auditory-Based Data Augmentation for End-to-End Automatic Speech Recognition Zehai Tu Jack Deadman Ning Ma Jon Barker 27 4 0 08 Apr 2022
Dynamic Latency for CTC-Based Streaming Automatic Speech Recognition With Emformer J. Sun Guiping Zhong Dinghao Zhou Baoxiang Li 21 0 0 29 Mar 2022
WeNet 2.0: More Productive End-to-End Speech Recognition Toolkit Binbin Zhang Di Wu Zhendong Peng Xingcheng Song Zhuoyuan Yao Hang Lv Linfu Xie Chao Yang Fuping Pan Jianwei Niu VLM 23 93 0 29 Mar 2022
Adversarial Attacks on Speech Recognition Systems for Mission-Critical Applications: A Survey Ngoc Dung Huynh Mohamed Reda Bouadjenek Imran Razzak Kevin Lee Chetan Arora Ali Hassani A. Zaslavsky AAML 23 6 0 22 Feb 2022
LiteLSTM Architecture for Deep Recurrent Neural Networks Nelly Elsayed Zag ElSayed Anthony Maida 34 5 0 27 Jan 2022
Exploring Non-Autoregressive End-To-End Neural Modeling For English Mispronunciation Detection And Diagnosis Hsin-Wei Wang Bi-Cheng Yan Hsuan-Sheng Chiu Yung-Chang Hsu Berlin Chen 18 7 0 01 Nov 2021
End-to-end acoustic modelling for phone recognition of young readers Lucile Gelin Morgane Daniel J. Pinquier Thomas Pellegrini 18 13 0 04 Mar 2021
Exploring Transformers in Natural Language Generation: GPT, BERT, and XLNet M. O. Topal Anil Bas Imke van Heerden LLMAG AI4CE 26 88 0 16 Feb 2021
Using Synthetic Audio to Improve The Recognition of Out-Of-Vocabulary Words in End-To-End ASR Systems Xianrui Zheng Yulan Liu Deniz Gunceler D. Willett 17 78 0 23 Nov 2020
Error Bounds of Projection Models in Weakly Supervised 3D Human Pose Estimation Nikolas Klug Moritz Einfalt Stephan Brehm Rainer Lienhart 23 1 0 23 Oct 2020
LRSpeech: Extremely Low-Resource Speech Synthesis and Recognition Jin Xu Xu Tan Yi Ren Tao Qin Jian Li Sheng Zhao Tie-Yan Liu VLM 16 90 0 09 Aug 2020
Learning Monocular Visual Odometry via Self-Supervised Long-Term Modeling Yuliang Zou Pan Ji Quoc-Huy Tran Jia-Bin Huang Manmohan Chandraker SSL 22 65 0 21 Jul 2020
Recursive Social Behavior Graph for Trajectory Prediction Jianhua Sun Qinhong Jiang Cewu Lu GNN 16 158 0 22 Apr 2020
Deep Learning for Time Series Forecasting: Tutorial and Literature Survey Konstantinos Benidis Syama Sundar Rangapuram Valentin Flunkert Bernie Wang Danielle C. Maddix ... David Salinas Lorenzo Stella François-Xavier Aubet Laurent Callot Tim Januschowski AI4TS 25 176 0 21 Apr 2020
A Survey of Deep Learning Techniques for Neural Machine Translation Shu Yang Yuxin Wang X. Chu VLM AI4TS AI4CE 22 138 0 18 Feb 2020
End-to-End Automatic Speech Recognition Integrated With CTC-Based Voice Activity Detection Takenori Yoshimura Tomoki Hayashi K. Takeda Shinji Watanabe 23 49 0 03 Feb 2020
Teaching Machines to Converse Jiwei Li 21 4 0 31 Jan 2020
Investigating Target Set Reduction for End-to-End Speech Recognition of Hindi-English Code-Switching Data Kunal Dhawan Ganji Sreeram Kumar Priyadarshi R. Sinha 15 4 0 15 Jul 2019
Acoustic-to-Word Models with Conversational Context Information Suyoun Kim Florian Metze 14 7 0 21 May 2019
Almost Unsupervised Text to Speech and Automatic Speech Recognition Yi Ren Xu Tan Tao Qin Sheng Zhao Zhou Zhao Tie-Yan Liu 44 101 0 13 May 2019
Modality Attention for End-to-End Audio-visual Speech Recognition Pan Zhou Wenwen Yang Wei Chen Yanfeng Wang Jia Jia 24 69 0 13 Nov 2018
Exploring RNN-Transducer for Chinese Speech Recognition Senmao Wang Pan Zhou Wei Chen Jia Jia Lei Xie 14 30 0 13 Nov 2018
Language model integration based on memory control for sequence to sequence speech recognition Aaron Springer Shinji Watanabe Takaaki Hori M. Baskar Hirofumi Inaguma Jesus Villalba Najim Dehak KELM 27 5 0 06 Nov 2018
Sequential Context Encoding for Duplicate Removal Lu Qi Shu Liu Jianping Shi Jiaya Jia 25 23 0 20 Oct 2018
Deep Audio-Visual Speech Recognition Triantafyllos Afouras Joon Son Chung A. Senior Oriol Vinyals Andrew Zisserman 22 687 0 06 Sep 2018
Parallax: Sparsity-aware Data Parallel Training of Deep Neural Networks Soojeong Kim Gyeong-In Yu Hojin Park Sungwoo Cho Eunji Jeong Hyeonmin Ha Sanha Lee Joo Seong Jeong Byung-Gon Chun 15 73 0 08 Aug 2018
Quaternion Convolutional Neural Networks for End-to-End Automatic Speech Recognition Titouan Parcollet Yuhang Zhang Mohamed Morchid C. Trabelsi G. Linarès R. Mori Yoshua Bengio 18 98 0 20 Jun 2018
ESPnet: End-to-End Speech Processing Toolkit Shinji Watanabe Takaaki Hori Shigeki Karita Tomoki Hayashi Jiro Nishitoba ... Jahn Heymann Matthew Wiesner Nanxin Chen Adithya Renduchintala Tsubasa Ochiai VLM 28 1,477 0 30 Mar 2018
Social GAN: Socially Acceptable Trajectories with Generative Adversarial Networks Agrim Gupta Justin Johnson Li Fei-Fei Silvio Savarese Alexandre Alahi GAN 54 1,881 0 29 Mar 2018
A GRU-based Encoder-Decoder Approach with Attention for Online Handwritten Mathematical Expression Recognition Jianshu Zhang Jun Du Lirong Dai 26 59 0 04 Dec 2017
Towards Language-Universal End-to-End Speech Recognition Suyoun Kim M. Seltzer 24 68 0 06 Nov 2017
An Attention-based Collaboration Framework for Multi-View Network Representation Learning Meng Qu Jian Tang Jingbo Shang Xiang Ren Ming Zhang Jiawei Han GNN 16 166 0 19 Sep 2017
Listening while Speaking: Speech Chain by Deep Learning Andros Tjandra S. Sakti Satoshi Nakamura AuLLM 123 165 0 16 Jul 2017
Neural Sequence Model Training via α-divergence Minimization Sotetsu Koyamada Yuta Kikuchi Atsunori Kanemura S. Maeda S. Ishii 65 0 0 30 Jun 2017
An online sequence-to-sequence model for noisy speech recognition Chung-Cheng Chiu Dieterich Lawson Yuping Luo George Tucker Kevin Swersky Ilya Sutskever Navdeep Jaitly 13 7 0 16 Jun 2017
Multichannel End-to-end Speech Recognition Tsubasa Ochiai Shinji Watanabe Takaaki Hori J. Hershey 19 92 0 14 Mar 2017
Lip Reading Sentences in the Wild Joon Son Chung A. Senior Oriol Vinyals Andrew Zisserman 162 784 0 16 Nov 2016
Knowledge Representation via Joint Learning of Sequential Text and Knowledge Graphs Jiawei Wu Ruobing Xie Zhiyuan Liu Maosong Sun 3DV 21 19 0 22 Sep 2016