1508.04395
Cited By
End-to-End Attention-based Large Vocabulary Speech Recognition
18 August 2015
Dzmitry Bahdanau
J. Chorowski
Dmitriy Serdyuk
Philemon Brakel
Yoshua Bengio
Papers citing
"End-to-End Attention-based Large Vocabulary Speech Recognition"
50 / 157 papers shown
Title
Aligner-Encoders: Self-Attention Transformers Can Be Self-Transducers
Adam Stooke
Rohit Prabhavalkar
K. Sim
P. M. Mengibar
39
0
0
06 Feb 2025
Classification Error Bound for Low Bayes Error Conditions in Machine Learning
Zijian Yang
Vahe Eminyan
Ralf Schluter
Hermann Ney
38
0
0
28 Jan 2025
FireRedASR: Open-Source Industrial-Grade Mandarin Speech Recognition Models from Encoder-Decoder to LLM Integration
Kai-Tuo Xu
Feng-Long Xie
Xu Tang
Yao Hu
77
4
0
24 Jan 2025
Attention layers provably solve single-location regression
Pierre Marion
Raphael Berthier
Gérard Biau
Claire Boyer
191
3
0
02 Oct 2024
Seed-ASR: Understanding Diverse Speech and Contexts with LLM-based Speech Recognition
Ye Bai
Jingping Chen
Jitong Chen
Wei Chen
Zhuo Chen
...
Wanyi Zhang
Yang Zhang
Yawei Zhang
Yijie Zheng
Ming Zou
AuLLM
52
19
0
05 Jul 2024
Attention on Personalized Clinical Decision Support System: Federated Learning Approach
Chu Myaet Thwal
K. Thar
Ye Lin Tun
Choong Seon Hong
21
22
0
22 Jan 2024
Short-Term Multi-Horizon Line Loss Rate Forecasting of a Distribution Network Using Attention-GCN-LSTM
Jie Liu
Yijia Cao
Yong Li
Yixiu Guo
Wei Deng
19
1
0
19 Dec 2023
Auxiliary Losses for Learning Generalizable Concept-based Models
Ivaxi Sheth
Samira Ebrahimi Kahou
32
25
0
18 Nov 2023
Multi-pass Training and Cross-information Fusion for Low-resource End-to-end Accented Speech Recognition
Xuefei Wang
Yanhua Long
Yijie Li
Haoran Wei
37
4
0
20 Jun 2023
Lattice-Free Sequence Discriminative Training for Phoneme-Based Neural Transducers
Zijian Yang
Wei Zhou
Ralf Schluter
Hermann Ney
32
4
0
07 Dec 2022
Minimal Width for Universal Property of Deep RNN
Changhoon Song
Geonho Hwang
Jun ho Lee
Myung-joo Kang
25
9
0
25 Nov 2022
Towards continually learning new languages
Ngoc-Quan Pham
Jan Niehues
A. Waibel
CLL
11
1
0
21 Nov 2022
Self-Transriber: Few-shot Lyrics Transcription with Self-training
Xiaoxue Gao
Xianghu Yue
Haizhou Li
30
7
0
18 Nov 2022
Massively Multilingual ASR on 70 Languages: Tokenization, Architecture, and Generalization Capabilities
Andros Tjandra
Nayan Singhal
David C. Zhang
Ozlem Kalinli
Abdel-rahman Mohamed
Duc Le
M. Seltzer
37
12
0
10 Nov 2022
The ISCSLP 2022 Intelligent Cockpit Speech Recognition Challenge (ICSRC): Dataset, Tracks, Baseline and Results
Ao Zhang
F. Yu
Kaixun Huang
Linfu Xie
Longbiao Wang
Eng Siong Chng
Hui Bu
Binbin Zhang
Wei Chen
Xin Xu
32
4
0
03 Nov 2022
Joint Audio/Text Training for Transformer Rescorer of Streaming Speech Recognition
Suyoun Kim
Ke Li
Lucas Kabela
Rongqing Huang
Jiedan Zhu
Ozlem Kalinli
Duc Le
27
8
0
31 Oct 2022
Accelerating RNN-T Training and Inference Using CTC guidance
Yongqiang Wang
Zhehuai Chen
Cheng-yong Zheng
Yu Zhang
Wei Han
Parisa Haghani
40
23
0
29 Oct 2022
Monotonic segmental attention for automatic speech recognition
Albert Zeyer
Robin Schmitt
Wei Zhou
Ralf Schluter
Hermann Ney
16
8
0
26 Oct 2022
Improving Semi-supervised End-to-end Automatic Speech Recognition using CycleGAN and Inter-domain Losses
C. Li
Ngoc Thang Vu
21
2
0
20 Oct 2022
Maestro-U: Leveraging joint speech-text representation learning for zero supervised speech ASR
Zhehuai Chen
Ankur Bapna
Andrew Rosenberg
Yu Zhang
Bhuvana Ramabhadran
Pedro J. Moreno
Nanxin Chen
41
17
0
18 Oct 2022
An Experimental Study on Private Aggregation of Teacher Ensemble Learning for End-to-End Speech Recognition
Chao-Han Huck Yang
I-Fan Chen
A. Stolcke
Sabato Marco Siniscalchi
Chin-Hui Lee
32
2
0
11 Oct 2022
Relaxed Attention for Transformer Models
Timo Lohrenz
Björn Möller
Zhengyang Li
Tim Fingscheidt
KELM
29
11
0
20 Sep 2022
Multimodal Crop Type Classification Fusing Multi-Spectral Satellite Time Series with Farmers Crop Rotations and Local Crop Distribution
Valentin Barrière
M. Claverie
21
4
0
23 Aug 2022
Improving Mandarin Speech Recogntion with Block-augmented Transformer
Xiaoming Ren
Huifeng Zhu
Liuwei Wei
Minghui Wu
Jie Hao
38
9
0
24 Jul 2022
Transfer Learning of wav2vec 2.0 for Automatic Lyric Transcription
Longshen Ou
Xiangming Gu
Ye Wang
30
21
0
20 Jul 2022
Contextual Density Ratio for Language Model Biasing of Sequence to Sequence ASR Systems
Jesús Andrés-Ferrer
Dario Albesano
P. Zhan
Paul Vozila
16
6
0
29 Jun 2022
Sequence-level Speaker Change Detection with Difference-based Continuous Integrate-and-fire
Zhiyun Fan
Linhao Dong
Meng Cai
Zejun Ma
Bo Xu
31
4
0
27 Jun 2022
LegoNN: Building Modular Encoder-Decoder Models
Siddharth Dalmia
Dmytro Okhonko
M. Lewis
Sergey Edunov
Shinji Watanabe
Florian Metze
Luke Zettlemoyer
Abdel-rahman Mohamed
AuLLM
MoE
29
14
0
07 Jun 2022
Improving CTC-based ASR Models with Gated Interlayer Collaboration
Yuting Yang
Yuke Li
Binbin Du
34
11
0
25 May 2022
Minimising Biasing Word Errors for Contextual ASR with the Tree-Constrained Pointer Generator
Guangzhi Sun
C. Zhang
P. Woodland
32
14
0
18 May 2022
A General Survey on Attention Mechanisms in Deep Learning
Gianni Brauwers
Flavius Frasincar
31
296
0
27 Mar 2022
Adversarial Attacks on Speech Recognition Systems for Mission-Critical Applications: A Survey
Ngoc Dung Huynh
Mohamed Reda Bouadjenek
Imran Razzak
Kevin Lee
Chetan Arora
Ali Hassani
A. Zaslavsky
AAML
34
6
0
22 Feb 2022
Non-Autoregressive ASR with Self-Conditioned Folded Encoders
Tatsuya Komatsu
28
7
0
17 Feb 2022
A Study of Transducer based End-to-End ASR with ESPnet: Architecture, Auxiliary Loss and Decoding Strategies
Florian Boyer
Yusuke Shinohara
Takaaki Ishii
Hirofumi Inaguma
Shinji Watanabe
35
34
0
14 Jan 2022
A Likelihood Ratio based Domain Adaptation Method for E2E Models
Chhavi Choudhury
Ankur Gandhe
Xiaohan Ding
I. Bulyko
27
10
0
10 Jan 2022
Learning Audio-Visual Speech Representation by Masked Multimodal Cluster Prediction
Bowen Shi
Wei-Ning Hsu
Kushal Lakhotia
Abdel-rahman Mohamed
SSL
46
305
0
05 Jan 2022
Speech-to-SQL: Towards Speech-driven SQL Query Generation From Natural Language Question
Yuanfeng Song
Raymond Chi-Wing Wong
Xuefang Zhao
Di Jiang
39
13
0
04 Jan 2022
Sequence-level self-learning with multiple hypotheses
K. Kumatani
Dimitrios Dimitriadis
Yashesh Gaur
R. Gmyr
Sefik Emre Eskimez
Jinyu Li
Michael Zeng
SSL
25
1
0
10 Dec 2021
Handwritten Mathematical Expression Recognition via Attention Aggregation based Bi-directional Mutual Learning
Xiaohang Bian
Bo Qin
Xiaozhe Xin
Jianwu Li
Xuefeng Su
Yanfeng Wang
40
49
0
07 Dec 2021
Borrowing from Similar Code: A Deep Learning NLP-Based Approach for Log Statement Automation
Sina Gholamian
Paul A. S. Ward
17
3
0
02 Dec 2021
Exploring Non-Autoregressive End-To-End Neural Modeling For English Mispronunciation Detection And Diagnosis
Hsin-Wei Wang
Bi-Cheng Yan
Hsuan-Sheng Chiu
Yung-Chang Hsu
Berlin Chen
21
7
0
01 Nov 2021
On Language Model Integration for RNN Transducer based Speech Recognition
Wei Zhou
Zuoyun Zheng
Ralf Schluter
Hermann Ney
37
22
0
13 Oct 2021
Back from the future: bidirectional CTC decoding using future information in speech recognition
Namkyu Jung
Geon-min Kim
Han-Gyu Kim
33
3
0
07 Oct 2021
Improving Time Series Classification Algorithms Using Octave-Convolutional Layers
Samuel Harford
Fazle Karim
H. Darabi
AI4TS
22
1
0
28 Sep 2021
Tied & Reduced RNN-T Decoder
Rami Botros
Tara N. Sainath
R. David
Emmanuel Guzman
Wei Li
Yanzhang He
38
55
0
15 Sep 2021
Reducing Exposure Bias in Training Recurrent Neural Network Transducers
Xiaodong Cui
Brian Kingsbury
G. Saon
David Haws
Zoltán Tüske
19
5
0
24 Aug 2021
On lattice-free boosted MMI training of HMM and CTC-based full-context ASR models
Xiaohui Zhang
Vimal Manohar
David C. Zhang
Frank Zhang
Yangyang Shi
Nayan Singhal
Julian Chan
Fuchun Peng
Yatharth Saraf
M. Seltzer
20
14
0
09 Jul 2021
Neural Task Success Classifiers for Robotic Manipulation from Few Real Demonstrations
A. Mohtasib
Amir Ghalamzan
Nicola Bellotto
Heriberto Cuayáhuitl
13
1
0
01 Jul 2021
Efficient Weight factorization for Multilingual Speech Recognition
Ngoc-Quan Pham
Tuan-Nam Nguyen
S. Stueker
A. Waibel
43
19
0
07 May 2021
End-to-End Speech Recognition from Federated Acoustic Models
Yan Gao
Titouan Parcollet
Salah Zaiem
Javier Fernandez-Marques
Pedro Porto Buarque de Gusmão
Daniel J. Beutel
Nicholas D. Lane
28
43
0
29 Apr 2021