Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1508.01211
Cited By
Listen, Attend and Spell
5 August 2015
William Chan
Navdeep Jaitly
Quoc V. Le
Oriol Vinyals
RALM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Listen, Attend and Spell"
50 / 1,033 papers shown
Title
Blank Collapse: Compressing CTC emission for the faster decoding
Minkyu Jung
Ohhyeok Kwon
S. Seo
Soonshin Seo
36
3
0
31 Oct 2022
Partitioned Gradient Matching-based Data Subset Selection for Compute-Efficient Robust ASR Training
Ashish R. Mittal
D. Sivasubramanian
Rishabh K. Iyer
P. Jyothi
Ganesh Ramakrishnan
19
3
0
30 Oct 2022
BERT Meets CTC: New Formulation of End-to-End Speech Recognition with Pre-trained Masked Language Model
Yosuke Higuchi
Brian Yan
Siddhant Arora
Tetsuji Ogawa
Tetsunori Kobayashi
Shinji Watanabe
54
25
0
29 Oct 2022
Accelerating RNN-T Training and Inference Using CTC guidance
Yongqiang Wang
Zhehuai Chen
Cheng-yong Zheng
Yu Zhang
Wei Han
Parisa Haghani
40
23
0
29 Oct 2022
Filter and evolve: progressive pseudo label refining for semi-supervised automatic speech recognition
Zezhong Jin
Dading Zhong
Xiao Song
Zhaoyi Liu
Naipeng Ye
Qingcheng Zeng
11
2
0
28 Oct 2022
Random Utterance Concatenation Based Data Augmentation for Improving Short-video Speech Recognition
Yist Y. Lin
Tao Han
Haihua Xu
Van Tung Pham
Yerbolat Khassanov
Tze Yuang Chong
Yi He
Lu Lu
Zejun Ma
13
2
0
28 Oct 2022
Token-level Sequence Labeling for Spoken Language Understanding using Compositional End-to-End Models
Siddhant Arora
Siddharth Dalmia
Brian Yan
Florian Metze
A. Black
Shinji Watanabe
23
12
0
27 Oct 2022
Monotonic segmental attention for automatic speech recognition
Albert Zeyer
Robin Schmitt
Wei Zhou
Ralf Schluter
Hermann Ney
16
8
0
26 Oct 2022
Linguistic-Enhanced Transformer with CTC Embedding for Speech Recognition
Xulong Zhang
Jianzong Wang
Ning Cheng
Mengyuan Zhao
Zhiyong Zhang
Jing Xiao
6
0
0
25 Oct 2022
ESB: A Benchmark For Multi-Domain End-to-End Speech Recognition
Sanchit Gandhi
Patrick von Platen
Alexander M. Rush
30
24
0
24 Oct 2022
Optimizing Bilingual Neural Transducer with Synthetic Code-switching Text Generation
Thien Nguyen
Nathalie Tran
Liuhui Deng
Thiago Fraga da Silva
Matthew Radzihovsky
...
Honza Silovsky
Arnab Ghoshal
M. Martel
Bharat Ram Ambati
Mohamed Ali
35
5
0
21 Oct 2022
Improving Semi-supervised End-to-end Automatic Speech Recognition using CycleGAN and Inter-domain Losses
C. Li
Ngoc Thang Vu
21
2
0
20 Oct 2022
Anchored Speech Recognition with Neural Transducers
Desh Raj
J. Jia
Jay Mahadeokar
Chunyang Wu
Niko Moritz
Xiaohui Zhang
Ozlem Kalinli
13
2
0
20 Oct 2022
End-to-End Integration of Speech Recognition, Dereverberation, Beamforming, and Self-Supervised Learning Representation
Yoshiki Masuyama
Xuankai Chang
Samuele Cornell
Shinji Watanabe
Nobutaka Ono
17
19
0
19 Oct 2022
Helpful Neighbors: Leveraging Neighbors in Geographic Feature Pronunciation
Llion Jones
R. Sproat
Haruko Ishikawa
Alexander Gutkin
30
1
0
18 Oct 2022
Acoustic-aware Non-autoregressive Spell Correction with Mask Sample Decoding
Ruchao Fan
Guoli Ye
Yashesh Gaur
Jinyu Li
19
4
0
16 Oct 2022
A Policy-based Approach to the SpecAugment Method for Low Resource E2E ASR
Rui Li
Guodong Ma
Dexin Zhao
Ranran Zeng
Xiaoyu Li
Haolin Huang
29
2
0
16 Oct 2022
On Compressing Sequences for Self-Supervised Speech Models
Yen Meng
Hsuan-Jui Chen
Jiatong Shi
Shinji Watanabe
Paola García
Hung-yi Lee
Hao Tang
SSL
21
14
0
13 Oct 2022
An Experimental Study on Private Aggregation of Teacher Ensemble Learning for End-to-End Speech Recognition
Chao-Han Huck Yang
I-Fan Chen
A. Stolcke
Sabato Marco Siniscalchi
Chin-Hui Lee
32
2
0
11 Oct 2022
CTC Alignments Improve Autoregressive Translation
Brian Yan
Siddharth Dalmia
Yosuke Higuchi
Graham Neubig
Florian Metze
A. Black
Shinji Watanabe
46
33
0
11 Oct 2022
DeepPerform: An Efficient Approach for Performance Testing of Resource-Constrained Neural Networks
Simin Chen
Mirazul Haque
Cong Liu
Wei Yang
47
21
0
10 Oct 2022
JoeyS2T: Minimalistic Speech-to-Text Modeling with JoeyNMT
Mayumi Ohta
Julia Kreutzer
Stefan Riezler
19
0
0
05 Oct 2022
Relaxed Attention for Transformer Models
Timo Lohrenz
Björn Möller
Zhengyang Li
Tim Fingscheidt
KELM
29
11
0
20 Sep 2022
Watch What You Pretrain For: Targeted, Transferable Adversarial Examples on Self-Supervised Speech Recognition models
R. Olivier
H. Abdullah
Bhiksha Raj
AAML
26
1
0
17 Sep 2022
Parameter-Efficient Conformers via Sharing Sparsely-Gated Experts for End-to-End Speech Recognition
Ye Bai
Jie Li
W. Han
Hao Ni
Kaituo Xu
Zhuo Zhang
Cheng Yi
Xiaorui Wang
MoE
26
1
0
17 Sep 2022
Analysis of Self-Attention Head Diversity for Conformer-based Automatic Speech Recognition
Kartik Audhkhasi
Yinghui Huang
Bhuvana Ramabhadran
Pedro J. Moreno
24
3
0
13 Sep 2022
Non-autoregressive Error Correction for CTC-based ASR with Phone-conditioned Masked LM
Hayato Futami
Hirofumi Inaguma
Sei Ueno
Masato Mimura
S. Sakai
Tatsuya Kawahara
KELM
53
12
0
08 Sep 2022
Distilling the Knowledge of BERT for CTC-based ASR
Hayato Futami
Hirofumi Inaguma
Masato Mimura
S. Sakai
Tatsuya Kawahara
21
8
0
05 Sep 2022
Vision-Language Adaptive Mutual Decoder for OOV-STR
Jinshui Hu
Chenyu Liu
Qiandong Yan
Xuyang Zhu
Jiajia Wu
Feng Yu
Bing Yin
VLM
32
0
0
02 Sep 2022
Bayesian Neural Network Language Modeling for Speech Recognition
Boyang Xue
Shoukang Hu
Junhao Xu
Mengzhe Geng
Xunying Liu
Helen M. Meng
UQCV
BDL
44
14
0
28 Aug 2022
Interpretable Multimodal Emotion Recognition using Hybrid Fusion of Speech and Image Data
Puneet Kumar
Sarthak Malik
Balasubramanian Raman
CVBM
30
22
0
25 Aug 2022
Comparison and Analysis of New Curriculum Criteria for End-to-End ASR
Georgios Karakasidis
Tamás Grósz
M. Kurimo
22
2
0
10 Aug 2022
ASR Error Correction with Constrained Decoding on Operation Prediction
J. Yang
Rong-Zhi Li
Wei Peng
32
9
0
09 Aug 2022
Adversarial Attacks on ASR Systems: An Overview
Xiao Zhang
Hao Tan
Xuan Huang
Denghui Zhang
Keke Tang
Zhaoquan Gu
AAML
19
3
0
03 Aug 2022
VQ-T: RNN Transducers using Vector-Quantized Prediction Network States
Jiatong Shi
G. Saon
David Haws
Shinji Watanabe
Brian Kingsbury
32
3
0
03 Aug 2022
Pronunciation-aware unique character encoding for RNN Transducer-based Mandarin speech recognition
Peng Shen
Xugang Lu
Hisashi Kawai
19
2
0
29 Jul 2022
Improving Mandarin Speech Recogntion with Block-augmented Transformer
Xiaoming Ren
Huifeng Zhu
Liuwei Wei
Minghui Wu
Jie Hao
38
9
0
24 Jul 2022
Reducing Geographic Disparities in Automatic Speech Recognition via Elastic Weight Consolidation
V. Trinh
Pegah Ghahremani
Brian King
J. Droppo
A. Stolcke
Roland Maas
MoMe
19
5
0
16 Jul 2022
PoLyScriber: Integrated Fine-tuning of Extractor and Lyrics Transcriber for Polyphonic Music
Xiaoxue Gao
Chitralekha Gupta
Haizhou Li
33
7
0
15 Jul 2022
Branchformer: Parallel MLP-Attention Architectures to Capture Local and Global Context for Speech Recognition and Understanding
Yifan Peng
Siddharth Dalmia
Ian Lane
Shinji Watanabe
30
143
0
06 Jul 2022
Tree-constrained Pointer Generator with Graph Neural Network Encodings for Contextual Speech Recognition
Guangzhi Sun
C. Zhang
P. Woodland
22
12
0
02 Jul 2022
FitHuBERT: Going Thinner and Deeper for Knowledge Distillation of Speech Self-Supervised Learning
Yeonghyeon Lee
Kangwook Jang
Jahyun Goo
Youngmoon Jung
Hoi-Rim Kim
23
29
0
01 Jul 2022
Language-specific Characteristic Assistance for Code-switching Speech Recognition
Tongtong Song
Qiang Xu
Meng Ge
Longbiao Wang
Hao Shi
Yongjie Lv
Yuqin Lin
J. Dang
11
26
0
29 Jun 2022
Contextual Density Ratio for Language Model Biasing of Sequence to Sequence ASR Systems
Jesús Andrés-Ferrer
Dario Albesano
P. Zhan
Paul Vozila
16
6
0
29 Jun 2022
On Comparison of Encoders for Attention based End to End Speech Recognition in Standalone and Rescoring Mode
Raviraj Joshi
Subodh Kumar
36
2
0
26 Jun 2022
Towards Green ASR: Lossless 4-bit Quantization of a Hybrid TDNN System on the 300-hr Switchboard Corpus
Junhao Xu
Shoukang Hu
Xunying Liu
Helen M. Meng
MQ
19
5
0
23 Jun 2022
Two-pass Decoding and Cross-adaptation Based System Combination of End-to-end Conformer and Hybrid TDNN ASR Systems
Mingyu Cui
Jiajun Deng
Shoukang Hu
Xurong Xie
Tianzi Wang
Shujie Hu
Mengzhe Geng
Boyang Xue
Xunying Liu
Helen M. Meng
33
9
0
23 Jun 2022
Boosting Cross-Domain Speech Recognition with Self-Supervision
Hanjing Zhu
Gaofeng Cheng
Jindong Wang
Wenxin Hou
Pengyuan Zhang
Yonghong Yan
19
13
0
20 Jun 2022
Avoid Overfitting User Specific Information in Federated Keyword Spotting
Xin-Chun Li
Jin-Lin Tang
Shaoming Song
Bingshuai Li
Yinchuan Li
Yunfeng Shao
Le Gan
De-Chuan Zhan
FedML
AAML
30
9
0
17 Jun 2022
Paraformer: Fast and Accurate Parallel Transformer for Non-autoregressive End-to-End Speech Recognition
Zhifu Gao
Shiliang Zhang
Ian Mcloughlin
Zhijie Yan
14
92
0
16 Jun 2022
Previous
1
2
3
...
5
6
7
...
19
20
21
Next