Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1508.01211
Cited By
v1
v2 (latest)
Listen, Attend and Spell
5 August 2015
William Chan
Navdeep Jaitly
Quoc V. Le
Oriol Vinyals
RALM
Re-assign community
ArXiv (abs)
PDF
HTML
Papers citing
"Listen, Attend and Spell"
50 / 1,041 papers shown
Title
FitHuBERT: Going Thinner and Deeper for Knowledge Distillation of Speech Self-Supervised Learning
Yeonghyeon Lee
Kangwook Jang
Jahyun Goo
Youngmoon Jung
Hoi-Rim Kim
121
33
0
01 Jul 2022
Language-specific Characteristic Assistance for Code-switching Speech Recognition
Tongtong Song
Qiang Xu
Meng Ge
Longbiao Wang
Hao Shi
Yongjie Lv
Yuqin Lin
Jianwu Dang
84
27
0
29 Jun 2022
Contextual Density Ratio for Language Model Biasing of Sequence to Sequence ASR Systems
Jesús Andrés-Ferrer
Dario Albesano
P. Zhan
Paul Vozila
58
6
0
29 Jun 2022
On Comparison of Encoders for Attention based End to End Speech Recognition in Standalone and Rescoring Mode
Raviraj Joshi
Subodh Kumar
77
2
0
26 Jun 2022
Towards Green ASR: Lossless 4-bit Quantization of a Hybrid TDNN System on the 300-hr Switchboard Corpus
Junhao Xu
Shoukang Hu
Xunying Liu
Helen M. Meng
MQ
77
5
0
23 Jun 2022
Two-pass Decoding and Cross-adaptation Based System Combination of End-to-end Conformer and Hybrid TDNN ASR Systems
Mingyu Cui
Jiajun Deng
Shoukang Hu
Xurong Xie
Tianzi Wang
Shujie Hu
Mengzhe Geng
Boyang Xue
Xunying Liu
Helen M. Meng
80
9
0
23 Jun 2022
Boosting Cross-Domain Speech Recognition with Self-Supervision
Hanjing Zhu
Gaofeng Cheng
Jindong Wang
Wenxin Hou
Pengyuan Zhang
Yonghong Yan
102
16
0
20 Jun 2022
Avoid Overfitting User Specific Information in Federated Keyword Spotting
Xin-Chun Li
Jin-Lin Tang
Shaoming Song
Bingshuai Li
Yinchuan Li
Yunfeng Shao
Le Gan
De-Chuan Zhan
FedML
AAML
64
9
0
17 Jun 2022
Paraformer: Fast and Accurate Parallel Transformer for Non-autoregressive End-to-End Speech Recognition
Zhifu Gao
Shiliang Zhang
Ian Mcloughlin
Zhijie Yan
98
110
0
16 Jun 2022
Residual Language Model for End-to-end Speech Recognition
E. Tsunoo
Yosuke Kashiwagi
Chaitanya Narisetty
Shinji Watanabe
51
11
0
15 Jun 2022
LegoNN: Building Modular Encoder-Decoder Models
Siddharth Dalmia
Dmytro Okhonko
M. Lewis
Sergey Edunov
Shinji Watanabe
Florian Metze
Luke Zettlemoyer
Abdel-rahman Mohamed
AuLLM
MoE
73
14
0
07 Jun 2022
Contextual Adapters for Personalized Speech Recognition in Neural Transducers
Kanthashree Mysore Sathyendra
Thejaswi Muniyappa
Feng-Ju Chang
Jing Liu
Jinru Su
Grant P. Strimel
Athanasios Mouchtaris
Siegfried Kunzmann
85
79
0
26 May 2022
Transcormer: Transformer for Sentence Scoring with Sliding Language Modeling
Kaitao Song
Yichong Leng
Xu Tan
Yicheng Zou
Tao Qin
Dongsheng Li
111
11
0
25 May 2022
Adaptive multilingual speech recognition with pretrained models
Ngoc-Quan Pham
A. Waibel
Jan Niehues
VLM
72
23
0
24 May 2022
Multi-Level Modeling Units for End-to-End Mandarin Speech Recognition
Yuting Yang
Binbin Du
Yuke Li
71
1
0
24 May 2022
Deep Learning for Visual Speech Analysis: A Survey
Changchong Sheng
Gangyao Kuang
L. Bai
Chen Hou
Y. Guo
Xin Xu
M. Pietikäinen
Li Liu
VLM
98
36
0
22 May 2022
Minimising Biasing Word Errors for Contextual ASR with the Tree-Constrained Pointer Generator
Guangzhi Sun
Chuxu Zhang
P. Woodland
103
14
0
18 May 2022
Evaluating Membership Inference Through Adversarial Robustness
Zhaoxi Zhang
L. Zhang
Xufei Zheng
Bilal Hussain Abbasi
Shengshan Hu
AAML
92
17
0
14 May 2022
Improved Consistency Training for Semi-Supervised Sequence-to-Sequence ASR via Speech Chain Reconstruction and Self-Transcribing
Heli Qi
Sashi Novitasari
S. Sakti
Satoshi Nakamura
AI4TS
144
2
0
14 May 2022
Personalized Adversarial Data Augmentation for Dysarthric and Elderly Speech Recognition
Zengrui Jin
Mengzhe Geng
Jiajun Deng
Tianzi Wang
Shujie Hu
Guinan Li
Xunying Liu
91
22
0
13 May 2022
Wav2Seq: Pre-training Speech-to-Text Encoder-Decoder Models Using Pseudo Languages
Felix Wu
Kwangyoun Kim
Shinji Watanabe
Kyu Jeong Han
Ryan T. McDonald
Kilian Q. Weinberger
Yoav Artzi
SyDa
105
39
0
02 May 2022
How does a spontaneously speaking conversational agent affect user behavior?
Takahisa Iizuka
H. Mori
29
3
0
02 May 2022
Bilingual End-to-End ASR with Byte-Level Subwords
Liuhui Deng
Roger Hsiao
Arnab Ghoshal
42
4
0
01 May 2022
Attention Mechanism in Neural Networks: Where it Comes and Where it Goes
Derya Soydaner
3DV
131
183
0
27 Apr 2022
Supervised Attention in Sequence-to-Sequence Models for Speech Recognition
Gene-Ping Yang
Hao Tang
64
3
0
25 Apr 2022
Efficient Training of Neural Transducer for Speech Recognition
Wei Zhou
Wilfried Michel
Ralf Schluter
Hermann Ney
AI4TS
104
24
0
22 Apr 2022
Cross-stitched Multi-modal Encoders
Karan Singla
Daniel Pressel
Ryan Price
Bhargav Srinivas Chinnari
Yeon-Jun Kim
S. Bangalore
55
0
0
20 Apr 2022
An Investigation of Monotonic Transducers for Large-Scale Automatic Speech Recognition
Niko Moritz
Frank Seide
Duc Le
Jay Mahadeokar
Christian Fuegen
94
8
0
19 Apr 2022
Self-critical Sequence Training for Automatic Speech Recognition
Chen Chen
Yuchen Hu
Nana Hou
Xiaofeng Qi
Heqing Zou
Chng Eng Siong
76
16
0
13 Apr 2022
Tokenwise Contrastive Pretraining for Finer Speech-to-BERT Alignment in End-to-End Speech-to-Intent Systems
Vishal Sunder
Eric Fosler-Lussier
Samuel Thomas
H. Kuo
Brian Kingsbury
83
7
0
11 Apr 2022
Adding Connectionist Temporal Summarization into Conformer to Improve Its Decoder Efficiency For Speech Recognition
N. J. Wang
Zongfeng Quan
Shaojun Wang
Jing Xiao
48
1
0
08 Apr 2022
A Complementary Joint Training Approach Using Unpaired Speech and Text for Low-Resource Automatic Speech Recognition
Ye Du
Jie Zhang
Qiu-shi Zhu
Lirong Dai
Ming Wu
Xin Fang
Zhouwang Yang
68
2
0
05 Apr 2022
Class-Incremental Learning by Knowledge Distillation with Adaptive Feature Consolidation
Minsoo Kang
Jaeyoo Park
Bohyung Han
CLL
109
191
0
02 Apr 2022
Leveraging Phone Mask Training for Phonetic-Reduction-Robust E2E Uyghur Speech Recognition
Guodong Ma
Pengfei Hu
Jian Kang
Shen Huang
Hao-Ming Huang
78
9
0
02 Apr 2022
Multi-task RNN-T with Semantic Decoder for Streamable Spoken Language Understanding
Xuandi Fu
Feng-Ju Chang
Martin H. Radfar
Kailin Wei
Jing Liu
Grant P. Strimel
Kanthashree Mysore Sathyendra
55
4
0
01 Apr 2022
Memory-Efficient Training of RNN-Transducer with Sampled Softmax
Jaesong Lee
Lukas Lee
Shinji Watanabe
108
8
0
31 Mar 2022
Open Source MagicData-RAMC: A Rich Annotated Mandarin Conversational(RAMC) Speech Dataset
Zehui Yang
Yifan Chen
Lei Luo
Runyan Yang
Lingxuan Ye
...
Yaohui Jin
Qingqing Zhang
Pengyuan Zhang
Lei Xie
Yonghong Yan
69
51
0
31 Mar 2022
NeuFA: Neural Network Based End-to-End Forced Alignment with Bidirectional Attention Mechanism
Jingbei Li
Yi Meng
Zhiyong Wu
Helen Meng
Qiao Tian
Yuping Wang
Yuxuan Wang
45
21
0
31 Mar 2022
CUSIDE: Chunking, Simulating Future Context and Decoding for Streaming ASR
Keyu An
Huahuan Zheng
Zhijian Ou
Hongyu Xiang
Ke Ding
Guanglu Wan
AI4TS
52
19
0
31 Mar 2022
Enhancing Zero-Shot Many to Many Voice Conversion with Self-Attention VAE
Ziang Long
Yunling Zheng
Meng Yu
Jack Xin
DRL
63
5
0
30 Mar 2022
Recent improvements of ASR models in the face of adversarial attacks
R. Olivier
Bhiksha Raj
AAML
126
14
0
29 Mar 2022
Streaming parallel transducer beam search with fast-slow cascaded encoders
Jay Mahadeokar
Yangyang Shi
Ke Li
Duc Le
Jiedan Zhu
Vikas Chandra
Ozlem Kalinli
M. Seltzer
78
16
0
29 Mar 2022
Integrating Lattice-Free MMI into End-to-End Speech Recognition
Jinchuan Tian
Jianwei Yu
Chao Weng
Yuexian Zou
Dong Yu
108
8
0
29 Mar 2022
WeNet 2.0: More Productive End-to-End Speech Recognition Toolkit
Binbin Zhang
Di Wu
Zhendong Peng
Xingcheng Song
Zhuoyuan Yao
Hang Lv
Linfu Xie
Chao Yang
Fuping Pan
Jianwei Niu
VLM
112
99
0
29 Mar 2022
Investigating Self-supervised Pretraining Frameworks for Pathological Speech Recognition
Lester Phillip Violeta
Wen-Chin Huang
Tomoki Toda
94
34
0
29 Mar 2022
Noise-robust Speech Recognition with 10 Minutes Unparalleled In-domain Data
Chen Chen
Nana Hou
Yuchen Hu
Shashank Shirol
Chng Eng Siong
NoLa
103
43
0
29 Mar 2022
Shifted Chunk Encoder for Transformer Based Streaming End-to-End ASR
Fangyuan Wang
Bo Xu
79
5
0
29 Mar 2022
Finnish Parliament ASR corpus - Analysis, benchmarks and statistics
A. Virkkunen
Aku Rouhe
Nhan Phan
M. Kurimo
104
4
0
28 Mar 2022
Dual-Path Style Learning for End-to-End Noise-Robust Speech Recognition
Yuchen Hu
Nana Hou
Chen Chen
Chng Eng Siong
99
15
0
28 Mar 2022
Joint Transformer/RNN Architecture for Gesture Typing in Indic Languages
Emil Biju
Anirudh Sriram
Mitesh M. Khapra
Pratyush Kumar
32
3
0
26 Mar 2022
Previous
1
2
3
...
6
7
8
...
19
20
21
Next