ResearchTrend.AI
Listen, Attend and Spell

5 August 2015
William Chan
Navdeep Jaitly
Quoc V. Le
Oriol Vinyals
    RALM

Papers citing "Listen, Attend and Spell"

50 / 1,041 papers shown
Title
D4AM: A General Denoising Framework for Downstream Acoustic Models
H. Wang
Yu Tsao
Hsin-Min Wang
Chu-Song Chen
70
4
0
28 Nov 2023
Weak Alignment Supervision from Hybrid Model Improves End-to-end ASR
Jintao Jiang
Yingbo Gao
Zoltán Tüske
113
1
0
24 Nov 2023
Analysis of Visual Features for Continuous Lipreading in Spanish
David Gimeno-Gómez
Carlos David Martínez Hinarejos
99
2
0
21 Nov 2023
LIP-RTVE: An Audiovisual Database for Continuous Spanish in the Wild
David Gimeno-Gómez
Carlos David Martínez Hinarejos
57
8
0
21 Nov 2023
Phonological Level wav2vec2-based Mispronunciation Detection and Diagnosis Method
M. Shahin
Julien Epps
Beena Ahmed
18
1
0
13 Nov 2023
End-to-End Single-Channel Speaker-Turn Aware Conversational Speech Translation
Juan Pablo Zuluaga
Zhaocheng Huang
Xing Niu
Rohit Paturi
S. Srinivasan
Prashant Mathur
Brian Thompson
Marcello Federico
BDL
76
2
0
01 Nov 2023
Distil-Whisper: Robust Knowledge Distillation via Large-Scale Pseudo Labelling
Sanchit Gandhi
Patrick von Platen
Alexander M. Rush
VLM
100
64
0
01 Nov 2023
MixRep: Hidden Representation Mixup for Low-Resource Speech Recognition
Jiamin Xie
John H. L. Hansen
39
3
0
27 Oct 2023
Key Frame Mechanism For Efficient Conformer Based End-to-end Speech Recognition
Peng Fan
Changhao Shan
Sining Sun
Qing Yang
Jianwei Zhang
68
3
0
23 Oct 2023
Tailoring Adversarial Attacks on Deep Neural Networks for Targeted Class Manipulation Using DeepFool Algorithm
S. M. Fazle
J. Mondal
Meem Arafat Manab
Xi Xiao
Sarfaraz Newaz
AAML
151
0
0
18 Oct 2023
End-to-End real time tracking of children's reading with pointer network
Vishal Sunder
Beulah Karrolla
Eric Fosler-Lussier
20
0
0
17 Oct 2023
Correction Focused Language Model Training for Speech Recognition
Yingyi Ma
Zhe Liu
Ozlem Kalinli
KELM
98
3
0
17 Oct 2023
Personalization of CTC-based End-to-End Speech Recognition Using Pronunciation-Driven Subword Tokenization
Zhihong Lei
Ernest Pusateri
Shiyi Han
Leo Liu
Mingbin Xu
...
R. Travadi
Youyuan Zhang
Mirko Hannemann
Man-Hung Siu
Zhen Huang
70
9
0
16 Oct 2023
Improved Contextual Recognition In Automatic Speech Recognition Systems By Semantic Lattice Rescoring
Ankitha Sudarshan
Vinay Samuel
Parth Patwa
Ibtihel Amara
Aman Chadha
67
2
0
14 Oct 2023
On the Relevance of Phoneme Duration Variability of Synthesized Training Data for Automatic Speech Recognition
Nick Rossenbach
Benedikt Hilmes
Ralf Schlüter
66
3
0
12 Oct 2023
Investigating the Effect of Language Models in Sequence Discriminative Training for Neural Transducers
Zijian Yang
Wei Zhou
Ralf Schlüter
Hermann Ney
58
0
0
11 Oct 2023
Whispering LLaMA: A Cross-Modal Generative Error Correction Framework for Speech Recognition
S. Radhakrishnan
Chao-Han Huck Yang
S. Khan
Rohit Kumar
N. Kiani
D. Gómez-Cabrero
Jesper N. Tegnér
138
53
0
10 Oct 2023
ed-cec: improving rare word recognition using asr postprocessing based on error detection and context-aware error correction
Jiajun He
Zekun Yang
Tomoki Toda
85
7
0
08 Oct 2023
Spike-Triggered Contextual Biasing for End-to-End Mandarin Speech Recognition
Kaixun Huang
Aoting Zhang
Binbin Zhang
Tianyi Xu
Xingchen Song
Lei Xie
56
4
0
07 Oct 2023
Dementia Assessment Using Mandarin Speech with an Attention-based Speech Recognition Encoder
Zih-Jyun Lin
Yi-Ju Chen
P. Kuo
Likai Huang
Chaur-Jong Hu
Cheng-Yu Chen
30
2
0
06 Oct 2023
Contextual Biasing with the Knuth-Morris-Pratt Matching Algorithm
Weiran Wang
Zelin Wu
D. Caseiro
Tsendsuren Munkhdalai
K. Sim
...
Rohit Prabhavalkar
Zhong Meng
Ding Zhao
Tara N. Sainath
P. M. Mengibar
104
6
0
29 Sep 2023
LAE-ST-MoE: Boosted Language-Aware Encoder Using Speech Translation Auxiliary Task for E2E Code-switching ASR
Guodong Ma
Wenxuan Wang
Yuke Li
Yuting Yang
Binbin Du
Haoran Fu
60
6
0
28 Sep 2023
Cross-Modal Multi-Tasking for Speech-to-Text Translation via Hard Parameter Sharing
B. Grimstad
Xuankai Chang
Antonios Anastasopoulos
Yuya Fujita
Shinji Watanabe
89
3
0
27 Sep 2023
HyPoradise: An Open Baseline for Generative Speech Recognition with Large Language Models
Cheng Chen
Yuchen Hu
Chao-Han Huck Yang
Sabato Marco Siniscalchi
Pin-Yu Chen
Eng Siong Chng
99
48
0
27 Sep 2023
Developing automatic verbatim transcripts for international multilingual meetings: an end-to-end solution
Akshat Dewan
Michal Ziemski
Henri Meylan
Lorenzo Concina
Bruno Pouliquen
43
1
0
27 Sep 2023
Segment-Level Vectorized Beam Search Based on Partially Autoregressive Inference
Masao Someki
N. Eng
Yosuke Higuchi
Shinji Watanabe
112
0
0
26 Sep 2023
On the Relation between Internal Language Model and Sequence Discriminative Training for Neural Transducers
Zijian Yang
Wei Zhou
Ralf Schlüter
Hermann Ney
80
1
0
25 Sep 2023
Cross-modal Alignment with Optimal Transport for CTC-based ASR
Xugang Lu
Peng Shen
Yu Tsao
Hisashi Kawai
93
6
0
24 Sep 2023
Memory-augmented conformer for improved end-to-end long-form ASR
Carlos Carvalho
A. Abad
RALM
61
1
0
22 Sep 2023
Massive End-to-end Models for Short Search Queries
Weiran Wang
Rohit Prabhavalkar
Dongseong Hwang
Qiujia Li
K. Sim
...
Zhong Meng
CJ Zheng
Yanzhang He
Tara N. Sainath
P. M. Mengibar
63
2
0
22 Sep 2023
Variational Connectionist Temporal Classification for Order-Preserving Sequence Modeling
Zheng Nan
T. Dang
V. Sethu
Beena Ahmed
BDL
60
3
0
21 Sep 2023
Semi-Autoregressive Streaming ASR With Label Context
Siddhant Arora
G. Saon
Shinji Watanabe
Brian Kingsbury
AI4TS
64
6
0
19 Sep 2023
HypR: A comprehensive study for ASR hypothesis revising with a reference corpus
Yi-Wei Wang
Keda Lu
Kuan-Yu Chen
91
2
0
18 Sep 2023
Chunked Attention-based Encoder-Decoder Model for Streaming Speech Recognition
Mohammad Zeineldeen
Albert Zeyer
Ralf Schlüter
Hermann Ney
AuLLM
95
4
0
15 Sep 2023
Unimodal Aggregation for CTC-based Speech Recognition
Ying Fang
Xiaofei Li
67
2
0
15 Sep 2023
Echotune: A Modular Extractor Leveraging the Variable-Length Nature of Speech in ASR Tasks
Sizhou Chen
Songyang Gao
Sen Fang
28
0
0
14 Sep 2023
Mask-CTC-based Encoder Pre-training for Streaming End-to-End Speech Recognition
Huaibo Zhao
Yosuke Higuchi
Yusuke Kida
Tetsuji Ogawa
Tetsunori Kobayashi
93
1
0
09 Sep 2023
Text-Only Domain Adaptation for End-to-End Speech Recognition through Down-Sampling Acoustic Representation
Jiaxu Zhu
Weinan Tong
Yaoxun Xu
Chang Song
Zhiyong Wu
Zhao You
Jane Polak Scowcroft
Dong Yu
Helen M. Meng
80
0
0
04 Sep 2023
SememeASR: Boosting Performance of End-to-End Speech Recognition against Domain and Long-Tailed Data Shift with Sememe Semantic Knowledge
Jiaxu Zhu
Chang Song
Zhiyong Wu
Helen Meng
VLM
68
0
0
04 Sep 2023
Decoupled Structure for Improved Adaptability of End-to-End Models
Keqi Deng
P. Woodland
AuLLM
70
2
0
25 Aug 2023
KinSPEAK: Improving speech recognition for Kinyarwanda via semi-supervised learning methods
Antoine Nzeyimana
SSL
136
0
0
23 Aug 2023
Improving CTC-AED model with integrated-CTC and auxiliary loss regularization
Daobin Zhu
Xiangdong Su
Hongbin Zhang
86
1
0
15 Aug 2023
Alternative Pseudo-Labeling for Semi-Supervised Automatic Speech Recognition
Hanjing Zhu
Dongji Gao
Gaofeng Cheng
Daniel Povey
Pengyuan Zhang
Yonghong Yan
NoLa
76
4
0
12 Aug 2023
SeACo-Paraformer: A Non-Autoregressive ASR System with Flexible and Effective Hotword Customization Ability
Xian Shi
Yexin Yang
Zerui Li
Yanni Chen
Zhifu Gao
Shiliang Zhang
61
11
0
07 Aug 2023
ApproBiVT: Lead ASR Models to Generalize Better Using Approximated Bias-Variance Tradeoff Guided Early Stopping and Checkpoint Averaging
Fangyuan Wang
Ming Hao
Yuhai Shi
Bo Xu
MoMe
59
0
0
05 Aug 2023
Integration of Frame- and Label-synchronous Beam Search for Streaming Encoder-decoder Speech Recognition
E. Tsunoo
Hayato Futami
Yosuke Kashiwagi
Siddhant Arora
Shinji Watanabe
76
4
0
24 Jul 2023
Globally Normalising the Transducer for Streaming Speech Recognition
Rogier van Dalen
68
0
0
20 Jul 2023
Replay to Remember: Continual Layer-Specific Fine-tuning for German Speech Recognition
Theresa Pekarek-Rosin
S. Wermter
VLM
CLL
84
2
0
14 Jul 2023
Adapting an ASR Foundation Model for Spoken Language Assessment
Rao Ma
Mengjie Qian
Mark Gales
Kate Knill
63
14
0
13 Jul 2023
Exploring the Integration of Large Language Models into Automatic Speech Recognition Systems: An Empirical Study
Zeping Min
Jinbo Wang
AuLLM
92
14
0
13 Jul 2023