1508.01211
Listen, Attend and Spell
5 August 2015
William Chan
Navdeep Jaitly
Quoc V. Le
Oriol Vinyals
RALM
Papers citing "Listen, Attend and Spell"
50 / 1,041 papers shown
Title
A Simplified Fully Quantized Transformer for End-to-end Speech Recognition
Alex Bie
Bharat Venkitesh
João Monteiro
Md. Akmal Haidar
Mehdi Rezagholizadeh
MQ
139
27
0
09 Nov 2019
Emotional speech synthesis with rich and granularized control
Seyun Um
Sangshin Oh
Kyungguen Byun
Inseon Jang
C. Ahn
Hong-Goo Kang
85
90
0
05 Nov 2019
RNN-T For Latency Controlled ASR With Improved Beam Search
Mahaveer Jain
Kjell Schubert
Jay Mahadeokar
Ching-Feng Yeh
Kaustubh Kalgaonkar
Anuroop Sriram
Christian Fuegen
M. Seltzer
80
45
0
05 Nov 2019
What does a network layer hear? Analyzing hidden representations of end-to-end ASR through speech synthesis
Chung-Yi Li
Pei-Chieh Yuan
Hung-yi Lee
71
31
0
04 Nov 2019
Improving Generalization of Transformer for Speech Recognition with Parallel Schedule Sampling and Relative Positional Embedding
Pan Zhou
Ruchao Fan
Wei Chen
Jia Jia
93
26
0
01 Nov 2019
Masked Language Model Scoring
Julian Salazar
Davis Liang
Toan Q. Nguyen
Katrin Kirchhoff
37
14
0
31 Oct 2019
Lightweight and Efficient End-to-End Speech Recognition Using Low-Rank Transformer
Genta Indra Winata
Samuel Cahyawijaya
Zhaojiang Lin
Zihan Liu
Pascale Fung
92
76
0
30 Oct 2019
Improving sequence-to-sequence speech recognition training with on-the-fly data augmentation
T. Nguyen
S. Stueker
Jan Niehues
A. Waibel
94
98
0
29 Oct 2019
Transformer-based Cascaded Multimodal Speech Translation
Zixiu "Alex" Wu
Ozan Caglayan
Julia Ive
Josiah Wang
Lucia Specia
72
7
0
29 Oct 2019
Transformer-Transducer: End-to-End Speech Recognition with Self-Attention
Ching-Feng Yeh
Jay Mahadeokar
Kaustubh Kalgaonkar
Yongqiang Wang
Duc Le
Mahaveer Jain
Kjell Schubert
Christian Fuegen
M. Seltzer
100
150
0
28 Oct 2019
Sequence-to-sequence Automatic Speech Recognition with Word Embedding Regularization and Fused Decoding
Alexander H. Liu
Tzu-Wei Sung
Shun-Po Chuang
Hung-yi Lee
Lin-Shan Lee
64
13
0
28 Oct 2019
Training ASR models by Generation of Contextual Information
Kritika Singh
Dmytro Okhonko
Jun Liu
Yongqiang Wang
Frank Zhang
...
Sergey Edunov
Fuchun Peng
Yatharth Saraf
Geoffrey Zweig
Abdel-rahman Mohamed
61
7
0
27 Oct 2019
Exploring Lexicon-Free Modeling Units for End-to-End Korean and Korean-English Code-Switching Speech Recognition
Jisung Wang
Jihwan Kim
Sangki Kim
Yeha Lee
53
5
0
25 Oct 2019
Towards Online End-to-end Transformer Automatic Speech Recognition
E. Tsunoo
Yosuke Kashiwagi
Toshiyuki Kumakura
Shinji Watanabe
86
32
0
25 Oct 2019
Recognizing long-form speech using streaming end-to-end models
A. Narayanan
Rohit Prabhavalkar
Chung-Cheng Chiu
David Rybach
Tara N. Sainath
Trevor Strohman
79
130
0
24 Oct 2019
An Empirical Study of Efficient ASR Rescoring with Transformers
Hongzhao Huang
Fuchun Peng
KELM
35
22
0
24 Oct 2019
Correction of Automatic Speech Recognition with Transformer Sequence-to-sequence Model
Oleksii Hrinchuk
Mariya Popova
Boris Ginsburg
VLM
62
90
0
23 Oct 2019
A practical two-stage training strategy for multi-stream end-to-end speech recognition
Ruizhi Li
Gregory Sell
Xiaofei Wang
Shinji Watanabe
H. Hermansky
50
7
0
23 Oct 2019
A Transformer with Interleaved Self-attention and Convolution for Hybrid Acoustic Models
Liang Lu
89
4
0
23 Oct 2019
G2G: TTS-Driven Pronunciation Learning for Graphemic Hybrid ASR
Duc Le
T. Koehler
Christian Fuegen
M. Seltzer
78
16
0
22 Oct 2019
Improving Transformer-based Speech Recognition Using Unsupervised Pre-training
Dongwei Jiang
Xiaoning Lei
Wubo Li
Ne Luo
Yuxuan Hu
Wei Zou
Xiangang Li
91
99
0
22 Oct 2019
Discriminative Neural Clustering for Speaker Diarisation
Qiujia Li
Florian Kreyssig
Chao Zhang
P. Woodland
69
46
0
22 Oct 2019
Transformer ASR with Contextual Block Processing
E. Tsunoo
Yosuke Kashiwagi
Toshiyuki Kumakura
Shinji Watanabe
120
64
0
16 Oct 2019
MIMO-SPEECH: End-to-End Multi-Channel Multi-Speaker Speech Recognition
Xuankai Chang
Wangyou Zhang
Y. Qian
Jonathan Le Roux
Shinji Watanabe
104
121
0
15 Oct 2019
The Theory behind Controllable Expressive Speech Synthesis: a Cross-disciplinary Approach
Noé Tits
Kevin El Haddad
Thierry Dutoit
57
8
0
14 Oct 2019
One-To-Many Multilingual End-to-end Speech Translation
Mattia Antonino Di Gangi
Matteo Negri
Marco Turchi
89
51
0
08 Oct 2019
From Senones to Chenones: Tied Context-Dependent Graphemes for Hybrid Speech Recognition
Duc Le
Xiaohui Zhang
Weiyi Zheng
C. Fügen
Geoffrey Zweig
M. Seltzer
92
64
0
02 Oct 2019
State-of-the-Art Speech Recognition Using Multi-Stream Self-Attention With Dilated 1D Convolutions
Kyu Jeong Han
R. Prieto
Kaixing (Kai) Wu
T. Ma
134
70
0
01 Oct 2019
Multilingual End-to-End Speech Translation
Hirofumi Inaguma
Kevin Duh
Tatsuya Kawahara
Shinji Watanabe
LRM
107
88
0
01 Oct 2019
End-to-End Code-Switching ASR for Low-Resourced Language Pairs
Xianghu Yue
Grandee Lee
Emre Yilmaz
Fang Deng
Haizhou Li
65
31
0
27 Sep 2019
Improving RNN Transducer Modeling for End-to-End Speech Recognition
Jinyu Li
Rui Zhao
Hu Hu
Jiawei Liu
84
170
0
26 Sep 2019
Optimizing Speech Recognition For The Edge
Yuan Shangguan
Jian Li
Qiao Liang
R. Álvarez
Ian McGraw
87
64
0
26 Sep 2019
Automatic Lyrics Alignment and Transcription in Polyphonic Music: Does Background Music Help?
Chitralekha Gupta
Emre Yilmaz
Haizhou Li
68
14
0
23 Sep 2019
Improving OOV Detection and Resolution with External Language Models in Acoustic-to-Word ASR
Hirofumi Inaguma
Masato Mimura
S. Sakai
Tatsuya Kawahara
45
5
0
22 Sep 2019
Espresso: A Fast End-to-end Neural Speech Recognition Toolkit
Yiming Wang
Tongfei Chen
Hainan Xu
Shuoyang Ding
Hang Lv
Yiwen Shao
Nanyun Peng
Lei Xie
Shinji Watanabe
Sanjeev Khudanpur
VLM
105
73
0
18 Sep 2019
Acoustic scene analysis with multi-head attention networks
Weimin Wang
Weiran Wang
Ming Sun
Chao Wang
44
3
0
16 Sep 2019
An Investigation Into On-device Personalization of End-to-end Automatic Speech Recognition Models
K. Sim
P. Zadrazil
F. Beaufays
88
58
0
14 Sep 2019
Metric-Based Few-Shot Learning for Video Action Recognition
Chris Careaga
Brian Hutchinson
Nathan Oken Hodas
Lawrence Phillips
143
22
0
14 Sep 2019
Integrating Source-channel and Attention-based Sequence-to-sequence Models for Speech Recognition
Qiujia Li
Chao Zhang
P. Woodland
63
20
0
14 Sep 2019
NeMo: a toolkit for building AI applications using Neural Modules
Oleksii Kuchaiev
Jason Chun Lok Li
Huyen Nguyen
Oleksii Hrinchuk
Ryan Leary
...
Jack Cook
P. Castonguay
Mariya Popova
Jocelyn Huang
Jonathan M. Cohen
267
308
0
14 Sep 2019
Harnessing Indirect Training Data for End-to-End Automatic Speech Translation: Tricks of the Trade
J. Pino
Liezl Puzon
Jiatao Gu
Xutai Ma
Arya D. McCarthy
D. Gopinath
25
3
0
14 Sep 2019
A Comparative Study on Transformer vs RNN in Speech Applications
Shigeki Karita
Nanxin Chen
Tomoki Hayashi
Takaaki Hori
Hirofumi Inaguma
...
Ryuichi Yamamoto
Xiao-fei Wang
Shinji Watanabe
Takenori Yoshimura
Wangyou Zhang
99
722
0
13 Sep 2019
End-to-End Neural Speaker Diarization with Self-attention
Yusuke Fujita
Naoyuki Kanda
Shota Horiguchi
Yawen Xue
Kenji Nagamatsu
Shinji Watanabe
240
243
0
13 Sep 2019
Preech: A System for Privacy-Preserving Speech Transcription
Shimaa Ahmed
Amrita Roy Chowdhury
Kassem Fawaz
P. Ramanathan
127
48
0
09 Sep 2019
Learning Alignment for Multimodal Emotion Recognition from Speech
Haiyang Xu
Hui Zhang
Kun Han
Yun Wang
Yiping Peng
Xiangang Li
59
126
0
06 Sep 2019
Environment Sound Classification using Multiple Feature Channels and Attention based Deep Convolutional Neural Network
Jivitesh Sharma
Ole-Christoffer Granmo
M. G. Olsen
124
16
0
28 Aug 2019
Neural Cognitive Diagnosis for Intelligent Education Systems
Fei-Yue Wang
Qi Liu
Enhong Chen
Zhenya Huang
Yuying Chen
Yu Yin
Zai Huang
Shijin Wang
AI4Ed
77
238
0
23 Aug 2019
Survey on Deep Neural Networks in Speech and Vision Systems
M. Alam
Manar D. Samad
Lasitha Vidyaratne
Alexander M. Glandon
Khan M. Iftekharuddin
3DV
VLM
AI4TS
100
212
0
16 Aug 2019
End-to-End Multi-Speaker Speech Recognition using Speaker Embeddings and Transfer Learning
Pavel Denisov
Ngoc Thang Vu
57
27
0
13 Aug 2019
DELTA: A DEep learning based Language Technology plAtform
Kun Han
Junwen Chen
Hui Zhang
Haiyang Xu
Yiping Peng
...
Cheng Gong
Yunbo Wang
Wei Zou
Hui Song
Xiangang Li
VLM
18
10
0
02 Aug 2019