1508.01211
Listen, Attend and Spell
5 August 2015
William Chan
Navdeep Jaitly
Quoc V. Le
Oriol Vinyals
RALM
Papers citing "Listen, Attend and Spell"
50 / 1,041 papers shown
Title
D4AM: A General Denoising Framework for Downstream Acoustic Models
H. Wang
Yu Tsao
Hsin-Min Wang
Chu-Song Chen
70
4
0
28 Nov 2023
Weak Alignment Supervision from Hybrid Model Improves End-to-end ASR
Jintao Jiang
Yingbo Gao
Zoltán Tüske
113
1
0
24 Nov 2023
Analysis of Visual Features for Continuous Lipreading in Spanish
David Gimeno-Gómez
Carlos David Martínez Hinarejos
99
2
0
21 Nov 2023
LIP-RTVE: An Audiovisual Database for Continuous Spanish in the Wild
David Gimeno-Gómez
Carlos David Martínez Hinarejos
57
8
0
21 Nov 2023
Phonological Level wav2vec2-based Mispronunciation Detection and Diagnosis Method
M. Shahin
Julien Epps
Beena Ahmed
18
1
0
13 Nov 2023
End-to-End Single-Channel Speaker-Turn Aware Conversational Speech Translation
Juan Pablo Zuluaga
Zhaocheng Huang
Xing Niu
Rohit Paturi
S. Srinivasan
Prashant Mathur
Brian Thompson
Marcello Federico
BDL
76
2
0
01 Nov 2023
Distil-Whisper: Robust Knowledge Distillation via Large-Scale Pseudo Labelling
Sanchit Gandhi
Patrick von Platen
Alexander M. Rush
VLM
100
64
0
01 Nov 2023
MixRep: Hidden Representation Mixup for Low-Resource Speech Recognition
Jiamin Xie
John H. L. Hansen
39
3
0
27 Oct 2023
Key Frame Mechanism For Efficient Conformer Based End-to-end Speech Recognition
Peng Fan
Changhao Shan
Sining Sun
Qing Yang
Jianwei Zhang
68
3
0
23 Oct 2023
Tailoring Adversarial Attacks on Deep Neural Networks for Targeted Class Manipulation Using DeepFool Algorithm
S. M. Fazle
J. Mondal
Meem Arafat Manab
Xi Xiao
Sarfaraz Newaz
AAML
151
0
0
18 Oct 2023
End-to-End real time tracking of children's reading with pointer network
Vishal Sunder
Beulah Karrolla
Eric Fosler-Lussier
20
0
0
17 Oct 2023
Correction Focused Language Model Training for Speech Recognition
Yingyi Ma
Zhe Liu
Ozlem Kalinli
KELM
98
3
0
17 Oct 2023
Personalization of CTC-based End-to-End Speech Recognition Using Pronunciation-Driven Subword Tokenization
Zhihong Lei
Ernest Pusateri
Shiyi Han
Leo Liu
Mingbin Xu
...
R. Travadi
Youyuan Zhang
Mirko Hannemann
Man-Hung Siu
Zhen Huang
70
9
0
16 Oct 2023
Improved Contextual Recognition In Automatic Speech Recognition Systems By Semantic Lattice Rescoring
Ankitha Sudarshan
Vinay Samuel
Parth Patwa
Ibtihel Amara
Aman Chadha
67
2
0
14 Oct 2023
On the Relevance of Phoneme Duration Variability of Synthesized Training Data for Automatic Speech Recognition
Nick Rossenbach
Benedikt Hilmes
Ralf Schluter
66
3
0
12 Oct 2023
Investigating the Effect of Language Models in Sequence Discriminative Training for Neural Transducers
Zijian Yang
Wei Zhou
Ralf Schluter
Hermann Ney
58
0
0
11 Oct 2023
Whispering LLaMA: A Cross-Modal Generative Error Correction Framework for Speech Recognition
S. Radhakrishnan
Chao-Han Huck Yang
S. Khan
Rohit Kumar
N. Kiani
D. Gómez-Cabrero
Jesper N. Tegnér
138
53
0
10 Oct 2023
ED-CEC: Improving Rare Word Recognition Using ASR Postprocessing Based on Error Detection and Context-Aware Error Correction
Jiajun He
Zekun Yang
Tomoki Toda
85
7
0
08 Oct 2023
Spike-Triggered Contextual Biasing for End-to-End Mandarin Speech Recognition
Kaixun Huang
Aoting Zhang
Binbin Zhang
Tianyi Xu
Xingchen Song
Lei Xie
56
4
0
07 Oct 2023
Dementia Assessment Using Mandarin Speech with an Attention-based Speech Recognition Encoder
Zih-Jyun Lin
Yi-Ju Chen
P. Kuo
Likai Huang
Chaur-Jong Hu
Cheng-Yu Chen
30
2
0
06 Oct 2023
Contextual Biasing with the Knuth-Morris-Pratt Matching Algorithm
Weiran Wang
Zelin Wu
D. Caseiro
Tsendsuren Munkhdalai
K. Sim
...
Rohit Prabhavalkar
Zhong Meng
Ding Zhao
Tara N. Sainath
P. M. Mengibar
104
6
0
29 Sep 2023
LAE-ST-MoE: Boosted Language-Aware Encoder Using Speech Translation Auxiliary Task for E2E Code-switching ASR
Guodong Ma
Wenxuan Wang
Yuke Li
Yuting Yang
Binbin Du
Haoran Fu
60
6
0
28 Sep 2023
Cross-Modal Multi-Tasking for Speech-to-Text Translation via Hard Parameter Sharing
B. Grimstad
Xuankai Chang
Antonios Anastasopoulos
Yuya Fujita
Shinji Watanabe
89
3
0
27 Sep 2023
HyPoradise: An Open Baseline for Generative Speech Recognition with Large Language Models
Cheng Chen
Yuchen Hu
Chao-Han Huck Yang
Sabato Marco Siniscalchi
Pin-Yu Chen
Eng Siong Chng
99
48
0
27 Sep 2023
Developing automatic verbatim transcripts for international multilingual meetings: an end-to-end solution
Akshat Dewan
Michal Ziemski
Henri Meylan
Lorenzo Concina
Bruno Pouliquen
43
1
0
27 Sep 2023
Segment-Level Vectorized Beam Search Based on Partially Autoregressive Inference
Masao Someki
N. Eng
Yosuke Higuchi
Shinji Watanabe
112
0
0
26 Sep 2023
On the Relation between Internal Language Model and Sequence Discriminative Training for Neural Transducers
Zijian Yang
Wei Zhou
Ralf Schluter
Hermann Ney
80
1
0
25 Sep 2023
Cross-modal Alignment with Optimal Transport for CTC-based ASR
Xugang Lu
Peng Shen
Yu Tsao
Hisashi Kawai
93
6
0
24 Sep 2023
Memory-augmented conformer for improved end-to-end long-form ASR
Carlos Carvalho
A. Abad
RALM
61
1
0
22 Sep 2023
Massive End-to-end Models for Short Search Queries
Weiran Wang
Rohit Prabhavalkar
Dongseong Hwang
Qiujia Li
K. Sim
...
Zhong Meng
CJ Zheng
Yanzhang He
Tara N. Sainath
P. M. Mengibar
63
2
0
22 Sep 2023
Variational Connectionist Temporal Classification for Order-Preserving Sequence Modeling
Zheng Nan
T. Dang
V. Sethu
Beena Ahmed
BDL
60
3
0
21 Sep 2023
Semi-Autoregressive Streaming ASR With Label Context
Siddhant Arora
G. Saon
Shinji Watanabe
Brian Kingsbury
AI4TS
64
6
0
19 Sep 2023
HypR: A comprehensive study for ASR hypothesis revising with a reference corpus
Yi-Wei Wang
Keda Lu
Kuan-Yu Chen
91
2
0
18 Sep 2023
Chunked Attention-based Encoder-Decoder Model for Streaming Speech Recognition
Mohammad Zeineldeen
Albert Zeyer
Ralf Schluter
Hermann Ney
AuLLM
95
4
0
15 Sep 2023
Unimodal Aggregation for CTC-based Speech Recognition
Ying Fang
Xiaofei Li
67
2
0
15 Sep 2023
Echotune: A Modular Extractor Leveraging the Variable-Length Nature of Speech in ASR Tasks
Sizhou Chen
Songyang Gao
Sen Fang
28
0
0
14 Sep 2023
Mask-CTC-based Encoder Pre-training for Streaming End-to-End Speech Recognition
Huaibo Zhao
Yosuke Higuchi
Yusuke Kida
Tetsuji Ogawa
Tetsunori Kobayashi
93
1
0
09 Sep 2023
Text-Only Domain Adaptation for End-to-End Speech Recognition through Down-Sampling Acoustic Representation
Jiaxu Zhu
Weinan Tong
Yaoxun Xu
Chang Song
Zhiyong Wu
Zhao You
Jane Polak Scowcroft
Dong Yu
Helen M. Meng
80
0
0
04 Sep 2023
SememeASR: Boosting Performance of End-to-End Speech Recognition against Domain and Long-Tailed Data Shift with Sememe Semantic Knowledge
Jiaxu Zhu
Chang Song
Zhiyong Wu
Helen Meng
VLM
68
0
0
04 Sep 2023
Decoupled Structure for Improved Adaptability of End-to-End Models
Keqi Deng
P. Woodland
AuLLM
70
2
0
25 Aug 2023
KinSPEAK: Improving speech recognition for Kinyarwanda via semi-supervised learning methods
Antoine Nzeyimana
SSL
136
0
0
23 Aug 2023
Improving CTC-AED model with integrated-CTC and auxiliary loss regularization
Daobin Zhu
Xiangdong Su
Hongbin Zhang
86
1
0
15 Aug 2023
Alternative Pseudo-Labeling for Semi-Supervised Automatic Speech Recognition
Hanjing Zhu
Dongji Gao
Gaofeng Cheng
Daniel Povey
Pengyuan Zhang
Yonghong Yan
NoLa
76
4
0
12 Aug 2023
SeACo-Paraformer: A Non-Autoregressive ASR System with Flexible and Effective Hotword Customization Ability
Xian Shi
Yexin Yang
Zerui Li
Yanni Chen
Zhifu Gao
Shiliang Zhang
61
11
0
07 Aug 2023
ApproBiVT: Lead ASR Models to Generalize Better Using Approximated Bias-Variance Tradeoff Guided Early Stopping and Checkpoint Averaging
Fangyuan Wang
Ming Hao
Yuhai Shi
Bo Xu
MoMe
59
0
0
05 Aug 2023
Integration of Frame- and Label-synchronous Beam Search for Streaming Encoder-decoder Speech Recognition
E. Tsunoo
Hayato Futami
Yosuke Kashiwagi
Siddhant Arora
Shinji Watanabe
76
4
0
24 Jul 2023
Globally Normalising the Transducer for Streaming Speech Recognition
Rogier van Dalen
68
0
0
20 Jul 2023
Replay to Remember: Continual Layer-Specific Fine-tuning for German Speech Recognition
Theresa Pekarek-Rosin
S. Wermter
VLM
CLL
84
2
0
14 Jul 2023
Adapting an ASR Foundation Model for Spoken Language Assessment
Rao Ma
Mengjie Qian
Mark Gales
Kate Knill
63
14
0
13 Jul 2023
Exploring the Integration of Large Language Models into Automatic Speech Recognition Systems: An Empirical Study
Zeping Min
Jinbo Wang
AuLLM
92
14
0
13 Jul 2023