ResearchTrend.AI
Listen, Attend and Spell (arXiv:1508.01211)

5 August 2015
William Chan
Navdeep Jaitly
Quoc V. Le
Oriol Vinyals
    RALM

Papers citing "Listen, Attend and Spell"

50 / 1,033 papers shown
Key Frame Mechanism For Efficient Conformer Based End-to-end Speech Recognition
Peng Fan
Changhao Shan
Sining Sun
Qing Yang
Jianwei Zhang
30
3
0
23 Oct 2023
Tailoring Adversarial Attacks on Deep Neural Networks for Targeted Class Manipulation Using DeepFool Algorithm
S. M. Fazle
J. Mondal
Meem Arafat Manab
Xi Xiao
Sarfaraz Newaz
AAML
27
0
0
18 Oct 2023
End-to-End real time tracking of children's reading with pointer network
Vishal Sunder
Beulah Karrolla
Eric Fosler-Lussier
13
0
0
17 Oct 2023
Correction Focused Language Model Training for Speech Recognition
Yingyi Ma
Zhe Liu
Ozlem Kalinli
KELM
33
3
0
17 Oct 2023
Personalization of CTC-based End-to-End Speech Recognition Using Pronunciation-Driven Subword Tokenization
Zhihong Lei
Ernest Pusateri
Shiyi Han
Leo Liu
Mingbin Xu
...
R. Travadi
Youyuan Zhang
Mirko Hannemann
Man-Hung Siu
Zhen Huang
23
9
0
16 Oct 2023
Improved Contextual Recognition In Automatic Speech Recognition Systems By Semantic Lattice Rescoring
Ankitha Sudarshan
Vinay Samuel
Parth Patwa
Ibtihel Amara
Aman Chadha
24
2
0
14 Oct 2023
On the Relevance of Phoneme Duration Variability of Synthesized Training Data for Automatic Speech Recognition
Nick Rossenbach
Benedikt Hilmes
Ralf Schluter
12
3
0
12 Oct 2023
Investigating the Effect of Language Models in Sequence Discriminative Training for Neural Transducers
Zijian Yang
Wei Zhou
Ralf Schluter
Hermann Ney
28
0
0
11 Oct 2023
Whispering LLaMA: A Cross-Modal Generative Error Correction Framework for Speech Recognition
S. Radhakrishnan
Chao-Han Huck Yang
S. Khan
Rohit Kumar
N. Kiani
D. Gómez-Cabrero
Jesper N. Tegnér
38
47
0
10 Oct 2023
ed-cec: improving rare word recognition using asr postprocessing based on error detection and context-aware error correction
Jiajun He
Zekun Yang
T. Toda
37
4
0
08 Oct 2023
Spike-Triggered Contextual Biasing for End-to-End Mandarin Speech Recognition
Kaixun Huang
Aoting Zhang
Binbin Zhang
Tianyi Xu
Xingchen Song
Lei Xie
18
3
0
07 Oct 2023
Dementia Assessment Using Mandarin Speech with an Attention-based Speech Recognition Encoder
Zih-Jyun Lin
Yi-Ju Chen
P. Kuo
Likai Huang
Chaur-Jong Hu
Cheng-Yu Chen
18
1
0
06 Oct 2023
Contextual Biasing with the Knuth-Morris-Pratt Matching Algorithm
Weiran Wang
Zelin Wu
D. Caseiro
Tsendsuren Munkhdalai
K. Sim
...
Rohit Prabhavalkar
Zhong Meng
Ding Zhao
Tara N. Sainath
P. M. Mengibar
53
5
0
29 Sep 2023
LAE-ST-MoE: Boosted Language-Aware Encoder Using Speech Translation Auxiliary Task for E2E Code-switching ASR
Guodong Ma
Wenxuan Wang
Yuke Li
Yuting Yang
Binbin Du
Haoran Fu
31
5
0
28 Sep 2023
Cross-Modal Multi-Tasking for Speech-to-Text Translation via Hard Parameter Sharing
B. Grimstad
Xuankai Chang
Antonios Anastasopoulos
Yuya Fujita
Shinji Watanabe
26
2
0
27 Sep 2023
HyPoradise: An Open Baseline for Generative Speech Recognition with Large Language Models
Cheng Chen
Yuchen Hu
Chao-Han Huck Yang
Sabato Marco Siniscalchi
Pin-Yu Chen
E. Chng
32
42
0
27 Sep 2023
Developing automatic verbatim transcripts for international multilingual meetings: an end-to-end solution
Akshat Dewan
Michal Ziemski
Henri Meylan
Lorenzo Concina
Bruno Pouliquen
13
1
0
27 Sep 2023
Segment-Level Vectorized Beam Search Based on Partially Autoregressive Inference
Masao Someki
N. Eng
Yosuke Higuchi
Shinji Watanabe
21
0
0
26 Sep 2023
On the Relation between Internal Language Model and Sequence Discriminative Training for Neural Transducers
Zijian Yang
Wei Zhou
Ralf Schluter
Hermann Ney
38
0
0
25 Sep 2023
Cross-modal Alignment with Optimal Transport for CTC-based ASR
Xugang Lu
Peng Shen
Yu Tsao
Hisashi Kawai
30
5
0
24 Sep 2023
Memory-augmented conformer for improved end-to-end long-form ASR
Carlos Carvalho
A. Abad
RALM
32
1
0
22 Sep 2023
Massive End-to-end Models for Short Search Queries
Weiran Wang
Rohit Prabhavalkar
Dongseong Hwang
Qiujia Li
K. Sim
...
Zhong Meng
CJ Zheng
Yanzhang He
Tara N. Sainath
P. M. Mengibar
32
2
0
22 Sep 2023
Variational Connectionist Temporal Classification for Order-Preserving Sequence Modeling
Zheng Nan
T. Dang
V. Sethu
Beena Ahmed
BDL
27
2
0
21 Sep 2023
Semi-Autoregressive Streaming ASR With Label Context
Siddhant Arora
G. Saon
Shinji Watanabe
Brian Kingsbury
AI4TS
23
5
0
19 Sep 2023
HypR: A comprehensive study for ASR hypothesis revising with a reference corpus
Yi-Wei Wang
Keda Lu
Kuan-Yu Chen
40
2
0
18 Sep 2023
Chunked Attention-based Encoder-Decoder Model for Streaming Speech Recognition
Mohammad Zeineldeen
Albert Zeyer
Ralf Schluter
Hermann Ney
AuLLM
29
4
0
15 Sep 2023
Unimodal Aggregation for CTC-based Speech Recognition
Ying Fang
Xiaofei Li
31
1
0
15 Sep 2023
Echotune: A Modular Extractor Leveraging the Variable-Length Nature of Speech in ASR Tasks
Sizhou Chen
Songyang Gao
Sen Fang
19
0
0
14 Sep 2023
Mask-CTC-based Encoder Pre-training for Streaming End-to-End Speech Recognition
Huaibo Zhao
Yosuke Higuchi
Yusuke Kida
Tetsuji Ogawa
Tetsunori Kobayashi
20
1
0
09 Sep 2023
Text-Only Domain Adaptation for End-to-End Speech Recognition through Down-Sampling Acoustic Representation
Jiaxu Zhu
Weinan Tong
Yaoxun Xu
Chang Song
Zhiyong Wu
Zhao You
Dan Su
Dong Yu
Helen M. Meng
32
0
0
04 Sep 2023
SememeASR: Boosting Performance of End-to-End Speech Recognition against Domain and Long-Tailed Data Shift with Sememe Semantic Knowledge
Jiaxu Zhu
Chang Song
Zhiyong Wu
Helen Meng
VLM
31
0
0
04 Sep 2023
Decoupled Structure for Improved Adaptability of End-to-End Models
Keqi Deng
P. Woodland
AuLLM
27
2
0
25 Aug 2023
KinSPEAK: Improving speech recognition for Kinyarwanda via semi-supervised learning methods
Antoine Nzeyimana
SSL
30
0
0
23 Aug 2023
Improving CTC-AED model with integrated-CTC and auxiliary loss regularization
Daobin Zhu
Xiangdong Su
Hongbin Zhang
18
1
0
15 Aug 2023
Alternative Pseudo-Labeling for Semi-Supervised Automatic Speech Recognition
Hanjing Zhu
Dongji Gao
Gaofeng Cheng
Daniel Povey
Pengyuan Zhang
Yonghong Yan
NoLa
38
4
0
12 Aug 2023
SeACo-Paraformer: A Non-Autoregressive ASR System with Flexible and Effective Hotword Customization Ability
Xian Shi
Yexin Yang
Zerui Li
Yanni Chen
Zhifu Gao
Shiliang Zhang
27
11
0
07 Aug 2023
ApproBiVT: Lead ASR Models to Generalize Better Using Approximated Bias-Variance Tradeoff Guided Early Stopping and Checkpoint Averaging
Fangyuan Wang
Ming Hao
Yuhai Shi
Bo Xu
MoMe
21
0
0
05 Aug 2023
Integration of Frame- and Label-synchronous Beam Search for Streaming Encoder-decoder Speech Recognition
E. Tsunoo
Hayato Futami
Yosuke Kashiwagi
Siddhant Arora
Shinji Watanabe
30
4
0
24 Jul 2023
Globally Normalising the Transducer for Streaming Speech Recognition
Rogier van Dalen
32
0
0
20 Jul 2023
Replay to Remember: Continual Layer-Specific Fine-tuning for German Speech Recognition
Theresa Pekarek-Rosin
S. Wermter
VLM
CLL
32
2
0
14 Jul 2023
Adapting an ASR Foundation Model for Spoken Language Assessment
Rao Ma
Mengjie Qian
Mark J. F. Gales
Kate Knill
19
11
0
13 Jul 2023
Exploring the Integration of Large Language Models into Automatic Speech Recognition Systems: An Empirical Study
Zeping Min
Jinbo Wang
AuLLM
35
13
0
13 Jul 2023
Language-Routing Mixture of Experts for Multilingual and Code-Switching Speech Recognition
Wenxuan Wang
Guodong Ma
Yuke Li
Binbin Du
MoE
14
23
0
12 Jul 2023
Can Generative Large Language Models Perform ASR Error Correction?
Rao Ma
Mengjie Qian
Potsawee Manakul
Mark J. F. Gales
Kate Knill
AuLLM
KELM
27
49
0
09 Jul 2023
Align With Purpose: Optimize Desired Properties in CTC Models with a General Plug-and-Play Framework
Eliya Segev
Maya Alroy
Ronen Katsir
Noam Wies
Ayana Shenhav
...
D. Zar
Oren Tadmor
Jacob Bitterman
Amnon Shashua
Tal Rosenwein
32
2
0
04 Jul 2023
Sparse-Input Neural Network using Group Concave Regularization
Bin Luo
S. Halabi
14
2
0
01 Jul 2023
Accelerating Transducers through Adjacent Token Merging
Yuang Li
Yu-Huan Wu
Jinyu Li
Shujie Liu
27
4
0
28 Jun 2023
Towards Effective and Compact Contextual Representation for Conformer Transducer Speech Recognition Systems
Mingyu Cui
Jiawen Kang
Jiajun Deng
Xiaoyue Yin
Yutao Xie
Xie Chen
Xunying Liu
35
8
0
23 Jun 2023
Multi-pass Training and Cross-information Fusion for Low-resource End-to-end Accented Speech Recognition
Xuefei Wang
Yanhua Long
Yijie Li
Haoran Wei
27
4
0
20 Jun 2023
MobileASR: A resource-aware on-device learning framework for user voice personalization applications on mobile phones
Zitha Sasindran
Harsha Yelchuri
Pooja S B. Rao
Prabhakar Venkata Tamma
17
1
0
15 Jun 2023