Listen, Attend and Spell

5 August 2015
William Chan
Navdeep Jaitly
Quoc V. Le
Oriol Vinyals
    RALM

Papers citing "Listen, Attend and Spell"

50 / 1,041 papers shown
Language-Routing Mixture of Experts for Multilingual and Code-Switching Speech Recognition
Wenxuan Wang
Guodong Ma
Yuke Li
Binbin Du
MoE
74
25
0
12 Jul 2023
Can Generative Large Language Models Perform ASR Error Correction?
Rao Ma
Mengjie Qian
Potsawee Manakul
Mark Gales
Kate Knill
AuLLM KELM
84
60
0
09 Jul 2023
Align With Purpose: Optimize Desired Properties in CTC Models with a General Plug-and-Play Framework
Eliya Segev
Maya Alroy
Ronen Katsir
Noam Wies
Ayana Shenhav
...
D. Zar
Oren Tadmor
Jacob Bitterman
Amnon Shashua
Tal Rosenwein
97
2
0
04 Jul 2023
Sparse-Input Neural Network using Group Concave Regularization
Bin Luo
S. Halabi
79
3
0
01 Jul 2023
Accelerating Transducers through Adjacent Token Merging
Yuang Li
Yu-Huan Wu
Jinyu Li
Shujie Liu
79
4
0
28 Jun 2023
Towards Effective and Compact Contextual Representation for Conformer Transducer Speech Recognition Systems
Mingyu Cui
Jiawen Kang
Jiajun Deng
Xiaoyue Yin
Yutao Xie
Xie Chen
Xunying Liu
81
8
0
23 Jun 2023
Multi-pass Training and Cross-information Fusion for Low-resource End-to-end Accented Speech Recognition
Xuefei Wang
Yanhua Long
Yijie Li
Haoran Wei
66
4
0
20 Jun 2023
MobileASR: A resource-aware on-device learning framework for user voice personalization applications on mobile phones
Zitha Sasindran
Harsha Yelchuri
Pooja S B. Rao
Prabhakar Venkata Tamma
64
1
0
15 Jun 2023
Research on an improved Conformer end-to-end Speech Recognition Model with R-Drop Structure
Weidong Ji
Shijie Zan
Guohui Zhou
Xu Wang
SyDa
66
1
0
14 Jun 2023
DCTX-Conformer: Dynamic context carry-over for low latency unified streaming and non-streaming Conformer ASR
Goeric Huybrechts
S. Ronanki
Xilai Li
H. Nosrati
S. Bodapati
Katrin Kirchhoff
60
1
0
13 Jun 2023
Record Deduplication for Entity Distribution Modeling in ASR Transcripts
Tianyu Huang
Chung Hoon Hong
Carl N. Wivagg
Kanna Shimizu
35
0
0
09 Jun 2023
Improving Frame-level Classifier for Word Timings with Non-peaky CTC in End-to-End Automatic Speech Recognition
Xianzhao Chen
Yist Y. Lin
Kang Wang
Yi He
Zejun Ma
62
2
0
09 Jun 2023
Streaming Speech-to-Confusion Network Speech Recognition
Denis Filimonov
Prabhat Pandey
Ariya Rastrow
Ankur Gandhe
A. Stolcke
HAI
74
0
0
02 Jun 2023
Adapting an Unadaptable ASR System
Rao Ma
Mengjie Qian
Mark Gales
Kate Knill
94
3
0
01 Jun 2023
Adaptive Contextual Biasing for Transducer Based Streaming Speech Recognition
Tianyi Xu
Zhanheng Yang
Kaixun Huang
Pengcheng Guo
Aoting Zhang
Biao Li
Changru Chen
Chong Li
Linfu Xie
94
12
0
01 Jun 2023
Enhancing the Unified Streaming and Non-streaming Model with Contrastive Learning
Yuting Yang
Yuke Li
Binbin Du
AI4TS
70
0
0
01 Jun 2023
Encoder-decoder multimodal speaker change detection
Jee-weon Jung
Soonshin Seo
Hee-Soo Heo
Geon-min Kim
You Jin Kim
Youngki Kwon
Min-Ji Lee
Bong-Jin Lee
57
2
0
01 Jun 2023
Edit Distance based RL for RNNT decoding
DongSeon Hwang
Changwan Ryu
K. Sim
54
0
0
31 May 2023
Graph Neural Networks for Contextual ASR with the Tree-Constrained Pointer Generator
Guangzhi Sun
Chuxu Zhang
P. Woodland
54
6
0
30 May 2023
Adapting Multi-Lingual ASR Models for Handling Multiple Talkers
Chenda Li
Yao Qian
Zhuo Chen
Naoyuki Kanda
Dongmei Wang
Takuya Yoshioka
Y. Qian
Michael Zeng
68
12
0
30 May 2023
Building Accurate Low Latency ASR for Streaming Voice Search
Abhinav Goyal
Nikesh Garera
30
1
0
29 May 2023
Retraining-free Customized ASR for Enharmonic Words Based on a Named-Entity-Aware Model and Phoneme Similarity Estimation
Yui Sudo
K. Hata
K. Nakadai
70
4
0
29 May 2023
RASR2: The RWTH ASR Toolkit for Generic Sequence-to-sequence Speech Recognition
Wei Zhou
Eugen Beck
Simon Berger
Ralf Schlüter
Hermann Ney
VLM
86
5
0
28 May 2023
CIF-PT: Bridging Speech and Text Representations for Spoken Language Understanding via Continuous Integrate-and-Fire Pre-Training
Linhao Dong
Zhecheng An
Peihao Wu
Jun Zhang
Lu Lu
Zejun Ma
54
6
0
27 May 2023
DistriBlock: Identifying adversarial audio samples by leveraging characteristics of the output distribution
Matías P. Pizarro
D. Kolossa
Asja Fischer
AAML
98
1
0
26 May 2023
Rethinking Speech Recognition with A Multimodal Perspective via Acoustic and Semantic Cooperative Decoding
Tianren Zhang
Haibo Qin
Zhibing Lai
Songlu Chen
Qi Liu
Feng Chen
Xinyuan Qian
Xu-Cheng Yin
58
0
0
23 May 2023
CopyNE: Better Contextual ASR by Copying Named Entities
Shilin Zhou
Zhenghua Li
Yu Hong
Hao Fei
Zhefeng Wang
Baoxing Huai
102
8
0
22 May 2023
Hystoc: Obtaining word confidences for fusion of end-to-end ASR systems
Karel Beneš
M. Kocour
L. Burget
61
2
0
21 May 2023
VAKTA-SETU: A Speech-to-Speech Machine Translation Service in Select Indic Languages
Shivam Mhaskar
Vineet Bhat
Akshay Batheja
S. Deoghare
Paramveer Choudhary
P. Bhattacharyya
76
5
0
21 May 2023
Multi-Head State Space Model for Speech Recognition
Yassir Fathullah
Chunyang Wu
Yuan Shangguan
Junteng Jia
Wenhan Xiong
...
Chunxi Liu
Yangyang Shi
Ozlem Kalinli
M. Seltzer
Mark Gales
68
14
0
21 May 2023
Contextualized End-to-End Speech Recognition with Contextual Phrase Prediction Network
Kaixun Huang
Aoting Zhang
Zhanheng Yang
Pengcheng Guo
Bingshen Mu
Tianyi Xu
Linfu Xie
89
24
0
21 May 2023
Language-universal phonetic encoder for low-resource speech recognition
Siyuan Feng
Ming Tu
Rui Xia
Chuanzeng Huang
Yuxuan Wang
81
3
0
19 May 2023
Blank-regularized CTC for Frame Skipping in Neural Transducer
Yifan Yang
Xiaoyu Yang
Liyong Guo
Zengwei Yao
Wei Kang
Fangjun Kuang
Long Lin
Xie Chen
Daniel Povey
49
9
0
19 May 2023
AlignAtt: Using Attention-based Audio-Translation Alignments as a Guide for Simultaneous Speech Translation
Sara Papi
Marco Turchi
Matteo Negri
74
22
0
19 May 2023
A Comparative Study on E-Branchformer vs Conformer in Speech Recognition, Translation, and Understanding Tasks
Yifan Peng
Kwangyoun Kim
Felix Wu
Brian Yan
Siddhant Arora
William Chen
Jiyang Tang
Suwon Shon
Prashant Sridhar
Shinji Watanabe
104
18
0
18 May 2023
FunASR: A Fundamental End-to-End Speech Recognition Toolkit
Zhifu Gao
Zerui Li
Jiaming Wang
Haoneng Luo
Xian Shi
...
Yabin Li
Lingyun Zuo
Zhihao Du
Zhangyu Xiao
Shiliang Zhang
91
67
0
18 May 2023
A Lexical-aware Non-autoregressive Transformer-based ASR Model
Chong Lin
Kuan-Yu Chen
AI4TS
66
1
0
18 May 2023
Accurate and Reliable Confidence Estimation Based on Non-Autoregressive End-to-End Speech Recognition System
Xian Shi
Haoneng Luo
Zhifu Gao
Shiliang Zhang
Zhijie Yan
40
2
0
18 May 2023
Masked Audio Text Encoders are Effective Multi-Modal Rescorers
Jason (Jinglun) Cai
Monica Sunkara
Xilai Li
Anshu Bhatia
Xiao Pan
S. Bodapati
133
3
0
11 May 2023
Robust Acoustic and Semantic Contextual Biasing in Neural Transducers for Speech Recognition
Xuandi Fu
Kanthashree Mysore Sathyendra
Ankur Gandhe
Jing Liu
Grant P. Strimel
Ross McGowan
Athanasios Mouchtaris
100
16
0
09 May 2023
End-to-end spoken language understanding using joint CTC loss and self-supervised, pretrained acoustic encoders
Jixuan Wang
Martin H. Radfar
Kailin Wei
Clement Chung
64
3
0
04 May 2023
Deep Transfer Learning for Automatic Speech Recognition: Towards Better Generalization
Hamza Kheddar
Yassine Himeur
S. Al-Maadeed
Abbes Amira
F. Bensaali
150
85
0
27 Apr 2023
Self-regularised Minimum Latency Training for Streaming Transformer-based Speech Recognition
Mohan Li
R. Doddipatla
Catalin Zorila
147
0
0
24 Apr 2023
Non-autoregressive End-to-end Approaches for Joint Automatic Speech Recognition and Spoken Language Understanding
Mohan Li
R. Doddipatla
70
7
0
21 Apr 2023
DropDim: A Regularization Method for Transformer Networks
Hao Zhang
Dan Qu
Kejia Shao
Xu Yang
79
12
0
20 Apr 2023
A Comparison of Semi-Supervised Learning Techniques for Streaming ASR at Scale
Cal Peyser
M. Picheny
Kyunghyun Cho
Rohit Prabhavalkar
Ronny Huang
Tara N. Sainath
AI4TS
49
1
0
19 Apr 2023
Dynamic Chunk Convolution for Unified Streaming and Non-Streaming Conformer ASR
Xilai Li
Goeric Huybrechts
S. Ronanki
Jeffrey J. Farris
S. Bodapati
80
7
0
18 Apr 2023
A CTC Alignment-based Non-autoregressive Transformer for End-to-end Automatic Speech Recognition
Ruchao Fan
Wei Chu
Peng Chang
Abeer Alwan
36
11
0
15 Apr 2023
Robust and Context-Aware Real-Time Collaborative Robot Handling via Dynamic Gesture Commands
Rui Chen
Alvin C M Shek
Changliu Liu
43
4
0
12 Apr 2023
Online Spatio-Temporal Learning with Target Projection
Thomas Ortner
Lorenzo Pes
Joris Gentinetta
Charlotte Frenkel
A. Pantazi
62
7
0
11 Apr 2023