ResearchTrend.AI
Listen, Attend and Spell

5 August 2015
William Chan
Navdeep Jaitly
Quoc V. Le
Oriol Vinyals
    RALM

Papers citing "Listen, Attend and Spell"

50 / 1,041 papers shown
Title
Training for Speech Recognition on Coprocessors
Sebastian Baunsgaard
S. Wrede
Pınar Tözün
40
6
0
22 Mar 2020
Deliberation Model Based Two-Pass End-to-End Speech Recognition
Ke Hu
Tara N. Sainath
Ruoming Pang
Rohit Prabhavalkar
97
87
0
17 Mar 2020
Axial-DeepLab: Stand-Alone Axial-Attention for Panoptic Segmentation
Huiyu Wang
Yukun Zhu
Bradley Green
Hartwig Adam
Alan Yuille
Liang-Chieh Chen
3DPC
153
676
0
17 Mar 2020
Hybrid Autoregressive Transducer (HAT)
Ehsan Variani
David Rybach
Cyril Allauzen
Michael Riley
84
160
0
12 Mar 2020
Toward Cross-Domain Speech Recognition with End-to-End Models
T. Nguyen
Sebastian Stüker
A. Waibel
64
7
0
09 Mar 2020
A Density Ratio Approach to Language Model Fusion in End-To-End Automatic Speech Recognition
Erik McDermott
Hasim Sak
Ehsan Variani
75
113
0
26 Feb 2020
Distributed Training of Deep Neural Network Acoustic Models for Automatic Speech Recognition
Xiaodong Cui
Wei Zhang
Ulrich Finkler
G. Saon
M. Picheny
David S. Kung
44
19
0
24 Feb 2020
End-to-End Neural Diarization: Reformulating Speaker Diarization as Simple Multi-label Classification
Yusuke Fujita
Shinji Watanabe
Shota Horiguchi
Yawen Xue
Kenji Nagamatsu
52
49
0
24 Feb 2020
Imputer: Sequence Modelling via Imputation and Dynamic Programming
William Chan
Chitwan Saharia
Geoffrey E. Hinton
Mohammad Norouzi
Navdeep Jaitly
BDL AI4TS
97
116
0
20 Feb 2020
Stroke Constrained Attention Network for Online Handwritten Mathematical Expression Recognition
Jiaming Wang
Jun Du
Jianshu Zhang
69
24
0
20 Feb 2020
Rnn-transducer with language bias for end-to-end Mandarin-English code-switching speech recognition
Shuai Zhang
Jiangyan Yi
Zhengkun Tian
J. Tao
Ye Bai
56
27
0
19 Feb 2020
Small energy masking for improved neural network training for end-to-end speech recognition
Chanwoo Kim
Kwangyoun Kim
S. Indurthi
55
8
0
15 Feb 2020
Looking Enhances Listening: Recovering Missing Speech Using Images
Tejas Srinivasan
Ramon Sanabria
Florian Metze
72
15
0
13 Feb 2020
Attentional Speech Recognition Models Misbehave on Out-of-domain Utterances
Phillip Keung
Wei Niu
Y. Lu
Julian Salazar
Vikas Bhardwaj
72
9
0
12 Feb 2020
Accelerating RNN Transducer Inference via One-Step Constrained Beam Search
Juntae Kim
Yoonhan Lee
67
24
0
10 Feb 2020
Audio-Visual Decision Fusion for WFST-based and seq2seq Models
R. Aralikatti
Sharad Roy
Abhinav Thanda
D. Margam
Pujitha Appan Kandala
Tanay Sharma
S. Venkatesan
34
1
0
29 Jan 2020
Data Techniques For Online End-to-end Speech Recognition
Yang Chen
Weiran Wang
I-Fan Chen
Chao Wang
35
4
0
24 Jan 2020
Semi-supervised ASR by End-to-end Self-training
Yang Chen
Weiran Wang
Chao Wang
72
53
0
24 Jan 2020
Single headed attention based sequence-to-sequence model for state-of-the-art results on Switchboard
Zoltán Tüske
G. Saon
Kartik Audhkhasi
Brian Kingsbury
BDL
104
69
0
20 Jan 2020
Transformer-based Online CTC/attention End-to-End Speech Recognition Architecture
Haoran Miao
Gaofeng Cheng
Changfeng Gao
Pengyuan Zhang
Yonghong Yan
64
104
0
15 Jan 2020
Domain Adaptation via Teacher-Student Learning for End-to-End Speech Recognition
Zhong Meng
Jinyu Li
Yashesh Gaur
Jiawei Liu
85
50
0
06 Jan 2020
Character-Aware Attention-Based End-to-End Speech Recognition
Zhong Meng
Yashesh Gaur
Jinyu Li
Jiawei Liu
62
10
0
06 Jan 2020
Speaker-aware speech-transformer
Zhiyun Fan
Jie Li
Shiyu Zhou
Bo Xu
BDL
81
22
0
02 Jan 2020
Improved Multi-Stage Training of Online Attention-based Encoder-Decoder Models
Abhinav Garg
Dhananjaya N. Gowda
Ankur Kumar
Kwangyoun Kim
Mehul Kumar
Chanwoo Kim
3DV
44
15
0
28 Dec 2019
Cross-scale Attention Model for Acoustic Event Classification
Xugang Lu
Peng Shen
Sheng Li
Yu Tsao
Hisashi Kawai
34
2
0
27 Dec 2019
End-to-End Training of a Large Vocabulary End-to-End Speech Recognition System
Chanwoo Kim
Sungsoo Kim
Kwangyoun Kim
Mehul Kumar
Jiyeon Kim
...
Eunhyang Kim
Minkyoo Shin
Shatrughan Singh
Larry Heck
Dhananjaya N. Gowda
61
27
0
22 Dec 2019
End-to-end training of time domain audio separation and recognition
Thilo von Neumann
K. Kinoshita
Lukas Drude
Christoph Boeddeker
Marc Delcroix
Tomohiro Nakatani
Reinhold Haeb-Umbach
80
34
0
18 Dec 2019
Application of Word2vec in Phoneme Recognition
Xin Feng
Lei Wang
28
3
0
17 Dec 2019
Synchronous Speech Recognition and Speech-to-Text Translation with Interactive Decoding
Yuchen Liu
Jiajun Zhang
Hao Xiong
Long Zhou
Zhongjun He
Hua Wu
Haifeng Wang
Chengqing Zong
90
71
0
16 Dec 2019
Learning to Model Aspects of Hearing Perception Using Neural Loss Functions
Prateek Verma
J. Berger
AAML
44
3
0
11 Dec 2019
SpecAugment on Large Scale Datasets
Daniel S. Park
Yu Zhang
Chung-Cheng Chiu
Youzheng Chen
Yue Liu
William Chan
Quoc V. Le
Yonghui Wu
95
138
0
11 Dec 2019
Neural Machine Translation: A Review and Survey
Felix Stahlberg
3DV AI4TS MedIm
142
332
0
04 Dec 2019
Integrating Knowledge into End-to-End Speech Recognition from External Text-Only Data
Ye Bai
Jiangyan Yi
J. Tao
Zhengqi Wen
Zhengkun Tian
Shuai Zhang
55
2
0
04 Dec 2019
Bimodal Speech Emotion Recognition Using Pre-Trained Language Models
Verena Heusser
Niklas Freymuth
Stefan Constantin
A. Waibel
92
26
0
29 Nov 2019
Minimum Bayes Risk Training of RNN-Transducer for End-to-End Speech Recognition
Chao Weng
Chengzhu Yu
Jia Cui
Chunlei Zhang
Dong Yu
148
39
0
28 Nov 2019
AIPNet: Generative Adversarial Pre-training of Accent-invariant Networks for End-to-end Speech Recognition
Yi-Chen Chen
Zhaojun Yang
Ching-Feng Yeh
Mahaveer Jain
M. Seltzer
72
32
0
27 Nov 2019
Hearing Lips: Improving Lip Reading by Distilling Speech Recognizers
Ya Zhao
Rui Xu
Xinchao Wang
Peng Hou
Haihong Tang
Xiuming Zhang
68
91
0
26 Nov 2019
Independent language modeling architecture for end-to-end ASR
Van Tung Pham
Haihua Xu
Yerbolat Khassanov
Zhiping Zeng
Chng Eng Siong
Chongjia Ni
B. Ma
Haizhou Li
AuLLM
47
15
0
25 Nov 2019
Improving N-gram Language Models with Pre-trained Deep Transformer
Yiren Wang
Hongzhao Huang
Zhe Liu
Yutong Pang
Yongqiang Wang
Chengxiang Zhai
Fuchun Peng
25
8
0
22 Nov 2019
On Using SpecAugment for End-to-End Speech Translation
Parnia Bahar
Albert Zeyer
Ralf Schlüter
Hermann Ney
97
54
0
20 Nov 2019
A Comparative Study on End-to-end Speech to Text Translation
Parnia Bahar
Tobias Bieschke
Hermann Ney
94
80
0
20 Nov 2019
End-to-end ASR: from Supervised to Semi-Supervised Learning with Modern Architectures
Gabriel Synnaeve
Qiantong Xu
Jacob Kahn
Tatiana Likhomanenko
Edouard Grave
Vineel Pratap
Anuroop Sriram
Vitaliy Liptchinsky
R. Collobert
SSL AI4TS
141
248
0
19 Nov 2019
Deep Spiking Neural Networks for Large Vocabulary Automatic Speech Recognition
Jibin Wu
Emre Yilmaz
Malu Zhang
Haizhou Li
Kay Chen Tan
80
107
0
19 Nov 2019
NeuMMU: Architectural Support for Efficient Address Translations in Neural Processing Units
Bongjoon Hyun
Youngeun Kwon
Yujeong Choi
John Kim
Minsoo Rhu
67
29
0
15 Nov 2019
Structured Sparsification of Gated Recurrent Neural Networks
E. Lobacheva
Nadezhda Chirkova
Alexander Markovich
Dmitry Vetrov
59
3
0
13 Nov 2019
Word-level Lexical Normalisation using Context-Dependent Embeddings
Michael Stewart
Wei Liu
R. Cardell-Oliver
8
4
0
13 Nov 2019
The Deep Learning Revolution and Its Implications for Computer Architecture and Chip Design
J. Dean
60
79
0
13 Nov 2019
Listen and Fill in the Missing Letters: Non-Autoregressive Transformer for Speech Recognition
Nanxin Chen
Shinji Watanabe
Jesús Villalba
Najim Dehak
72
16
0
10 Nov 2019
Enforcing Encoder-Decoder Modularity in Sequence-to-Sequence Models
Siddharth Dalmia
Abdel-rahman Mohamed
M. Lewis
Florian Metze
Luke Zettlemoyer
55
11
0
09 Nov 2019
Speaker Adaptation for Attention-Based End-to-End Speech Recognition
Zhong Meng
Yashesh Gaur
Jinyu Li
Jiawei Liu
53
38
0
09 Nov 2019