1508.01211
Cited By
Listen, Attend and Spell
5 August 2015
William Chan
Navdeep Jaitly
Quoc V. Le
Oriol Vinyals
RALM
Papers citing
"Listen, Attend and Spell"
50 / 1,033 papers shown
Title
K-Wav2vec 2.0: Automatic Speech Recognition based on Joint Decoding of Graphemes and Syllables
Jounghee Kim
Pilsung Kang
VLM
29
6
0
11 Oct 2021
Advancing Momentum Pseudo-Labeling with Conformer and Initialization Strategy
Yosuke Higuchi
Niko Moritz
Jonathan Le Roux
Takaaki Hori
19
11
0
11 Oct 2021
Have best of both worlds: two-pass hybrid and E2E cascading framework for speech recognition
Guoli Ye
V. Mazalov
Jinyu Li
Jiawei Liu
25
9
0
10 Oct 2021
SCaLa: Supervised Contrastive Learning for End-to-End Speech Recognition
Li Fu
Xiaoxiao Li
Runyu Wang
Lu Fan
Zhengchen Zhang
Meng Chen
Youzheng Wu
Xiaodong He
SSL
8
3
0
08 Oct 2021
Hierarchical Conditional End-to-End ASR with CTC and Multi-Granular Subword Units
Yosuke Higuchi
Keita Karube
Tetsuji Ogawa
Tetsunori Kobayashi
18
23
0
08 Oct 2021
Improving Pseudo-label Training For End-to-end Speech Recognition Using Gradient Mask
Shaoshi Ling
Chen Shen
Meng Cai
Zejun Ma
VLM
SSL
22
8
0
08 Oct 2021
Explaining the Attention Mechanism of End-to-End Speech Recognition Using Decision Trees
Yuanchao Wang
Wenjing Du
Chenghao Cai
Yanyan Xu
34
1
0
08 Oct 2021
WenetSpeech: A 10000+ Hours Multi-domain Mandarin Corpus for Speech Recognition
Binbin Zhang
Hang Lv
Pengcheng Guo
Qijie Shao
Chao Yang
...
Hui Bu
Xiaoyu Chen
Chenchen Zeng
Di Wu
Zhendong Peng
25
219
0
07 Oct 2021
BERT Attends the Conversation: Improving Low-Resource Conversational ASR
Pablo Ortiz
Simen Burud
34
4
0
05 Oct 2021
ASR Rescoring and Confidence Estimation with ELECTRA
Hayato Futami
Hirofumi Inaguma
Masato Mimura
S. Sakai
Tatsuya Kawahara
KELM
62
20
0
05 Oct 2021
Multi-axis Attentive Prediction for Sparse Event Data: An Application to Crime Prediction
Yi Sui
Ga Wu
Scott Sanner
18
2
0
05 Oct 2021
Fast Contextual Adaptation with Neural Associative Memory for On-Device Personalized Speech Recognition
Tsendsuren Munkhdalai
K. Sim
Angad Chandorkar
Fan Gao
Mason Chua
Trevor Strohman
F. Beaufays
32
34
0
05 Oct 2021
Towards efficient end-to-end speech recognition with biologically-inspired neural networks
Thomas Bohnstingl
Ayush Garg
Stanisław Woźniak
G. Saon
E. Eleftheriou
A. Pantazi
29
5
0
04 Oct 2021
Speech Technology for Everyone: Automatic Speech Recognition for Non-Native English with Transfer Learning
Toshiko Shibano
Xinyi Zhang
Miao Li
Haejin Cho
Peter Sullivan
Muhammad Abdul-Mageed
VLM
36
17
0
01 Oct 2021
Multimodal Emotion Recognition with High-level Speech and Text Features
M. R. Makiuchi
Kuniaki Uto
Koichi Shinoda
12
70
0
29 Sep 2021
Word-level confidence estimation for RNN transducers
Mingqiu Wang
H. Soltau
Laurent El Shafey
Izhak Shafran
UQCV
21
5
0
28 Sep 2021
Private Language Model Adaptation for Speech Recognition
Zhe Liu
Ke Li
Shreyan Bakshi
Fuchun Peng
34
6
0
28 Sep 2021
Factorized Neural Transducer for Efficient Language Model Adaptation
Xie Chen
Zhong Meng
S. Parthasarathy
Jinyu Li
21
39
0
27 Sep 2021
Training Spiking Neural Networks Using Lessons From Deep Learning
Jason Eshraghian
Max Ward
Emre Neftci
Xinxin Wang
Gregor Lenz
Girish Dwivedi
Bennamoun
Doo Seok Jeong
Wei D. Lu
40
433
0
27 Sep 2021
ChannelAugment: Improving generalization of multi-channel ASR by training with input channel randomization
M. Gaudesi
F. Weninger
D. Sharma
P. Zhan
AAML
33
1
0
23 Sep 2021
Wav-BERT: Cooperative Acoustic and Linguistic Representation Learning for Low-Resource Speech Recognition
Guolin Zheng
Yubei Xiao
Ke Gong
Pan Zhou
Xiaodan Liang
Liang Lin
32
26
0
19 Sep 2021
Dual-Encoder Architecture with Encoder Selection for Joint Close-Talk and Far-Talk Speech Recognition
F. Weninger
M. Gaudesi
Ralf Leibold
R. Gemello
P. Zhan
35
4
0
17 Sep 2021
PDAugment: Data Augmentation by Pitch and Duration Adjustments for Automatic Lyrics Transcription
Chen Zhang
Jiaxing Yu
Luchin Chang
Xu Tan
Jiawei Chen
Tao Qin
Kecheng Zhang
30
15
0
16 Sep 2021
Utterance-level neural confidence measure for end-to-end children speech recognition
W. Liu
Tan Lee
22
4
0
16 Sep 2021
Performance-Efficiency Trade-offs in Unsupervised Pre-training for Speech Recognition
Felix Wu
Kwangyoun Kim
Jing Pan
Kyu Jeong Han
Kilian Q. Weinberger
Yoav Artzi
27
71
0
14 Sep 2021
Self-Attention Channel Combinator Frontend for End-to-End Multichannel Far-field Speech Recognition
Rong Gong
Carl Quillen
D. Sharma
Andrew Goderre
José Laínez
Ljubomir Milanović
39
13
0
10 Sep 2021
Tree-constrained Pointer Generator for End-to-end Contextual Speech Recognition
Guangzhi Sun
Chao Zhang
P. Woodland
27
30
0
01 Sep 2021
Greenformers: Improving Computation and Memory Efficiency in Transformer Models via Low-Rank Approximation
Samuel Cahyawijaya
26
12
0
24 Aug 2021
Reducing Exposure Bias in Training Recurrent Neural Network Transducers
Xiaodong Cui
Brian Kingsbury
G. Saon
David Haws
Zoltán Tüske
19
5
0
24 Aug 2021
Multilingual Speech Recognition for Low-Resource Indian Languages using Multi-Task conformer
Krishna D N (Freshworks)
29
7
0
22 Aug 2021
A Dual-Decoder Conformer for Multilingual Speech Recognition
Krishna D N (Freshworks)
14
1
0
22 Aug 2021
Using Large Pre-Trained Models with Cross-Modal Attention for Multi-Modal Emotion Recognition
Krishna D N (Freshworks)
24
11
0
22 Aug 2021
Generalizing RNN-Transducer to Out-Domain Audio via Sparse Self-Attention Layers
Juntae Kim
Jee-Hye Lee
32
6
0
22 Aug 2021
A Light-weight contextual spelling correction model for customizing transducer-based speech recognition systems
Xiaoqiang Wang
Yanqing Liu
Sheng Zhao
Jinyu Li
KELM
21
15
0
17 Aug 2021
SpecMix: A Mixed Sample Data Augmentation method for Training with Time-Frequency Domain Features
Gwantae Kim
D. Han
Hanseok Ko
47
42
0
06 Aug 2021
Knowledge Distillation from BERT Transformer to Speech Transformer for Intent Classification
Yiding Jiang
Bidisha Sharma
Maulik C. Madhavi
Haizhou Li
36
25
0
05 Aug 2021
Adversarial Data Augmentation for Disordered Speech Recognition
Zengrui Jin
Mengzhe Geng
Xurong Xie
Jianwei Yu
Shansong Liu
Xunying Liu
Helen Meng
14
35
0
02 Aug 2021
Facetron: A Multi-speaker Face-to-Speech Model based on Cross-modal Latent Representations
Seyun Um
Jihyun Kim
Jihyun Lee
Hong-Goo Kang
CVBM
13
4
0
26 Jul 2021
Ensemble of Convolution Neural Networks on Heterogeneous Signals for Sleep Stage Scoring
Enrique Fernández-Blanco
C. Fernandez-Lozano
A. Pazos
Daniel Rivero
13
4
0
23 Jul 2021
VAD-free Streaming Hybrid CTC/Attention ASR for Unsegmented Recording
Hirofumi Inaguma
Tatsuya Kawahara
19
2
0
15 Jul 2021
A Configurable Multilingual Model is All You Need to Recognize All Languages
Long Zhou
Jinyu Li
Eric Sun
Shujie Liu
100
40
0
13 Jul 2021
ReconVAT: A Semi-Supervised Automatic Music Transcription Framework for Low-Resource Real-World Data
K. Cheuk
Dorien Herremans
Li Su
58
32
0
11 Jul 2021
On lattice-free boosted MMI training of HMM and CTC-based full-context ASR models
Xiaohui Zhang
Vimal Manohar
David C. Zhang
Frank Zhang
Yangyang Shi
Nayan Singhal
Julian Chan
Fuchun Peng
Yatharth Saraf
M. Seltzer
20
14
0
09 Jul 2021
End-to-End Rich Transcription-Style Automatic Speech Recognition with Semi-Supervised Learning
Tomohiro Tanaka
Ryo Masumura
Mana Ihori
Akihiko Takashima
Shota Orihashi
Naoki Makishima
19
4
0
07 Jul 2021
Instant One-Shot Word-Learning for Context-Specific Neural Sequence-to-Sequence Speech Recognition
Christian Huber
Juan Hussain
Sebastian Stüker
A. Waibel
26
24
0
05 Jul 2021
Relaxed Attention: A Simple Method to Boost Performance of End-to-End Automatic Speech Recognition
Timo Lohrenz
P. Schwarz
Zhengyang Li
Tim Fingscheidt
29
11
0
02 Jul 2021
What do End-to-End Speech Models Learn about Speaker, Language and Channel Information? A Layer-wise and Neuron-level Analysis
Shammur A. Chowdhury
Nadir Durrani
Ahmed M. Ali
41
12
0
01 Jul 2021
On joint training with interfaces for spoken language understanding
A. Raju
Milind Rao
Gautam Tiwari
Pranav Dheram
Bryan Anderson
Zhe Zhang
Chul Lee
Bach Bui
Ariya Rastrow
VLM
21
11
0
30 Jun 2021
A Survey on Neural Speech Synthesis
Xu Tan
Tao Qin
Frank Soong
Tie-Yan Liu
AI4TS
18
352
0
29 Jun 2021
Where are we in semantic concept extraction for Spoken Language Understanding?
Sahar Ghannay
Antoine Caubrière
Salima Mdhaffar
G. Laperriere
Bassam Jabaian
Yannick Esteve
12
18
0
24 Jun 2021