ResearchTrend.AI
Listen, Attend and Spell

5 August 2015
William Chan
Navdeep Jaitly
Quoc V. Le
Oriol Vinyals
    RALM

Papers citing "Listen, Attend and Spell"

50 / 1,041 papers shown
Lahjoita puhetta -- a large-scale corpus of spoken Finnish with some benchmarks
Anssi Moisio
Dejan Porjazovski
Aku Rouhe
Yaroslav Getman
A. Virkkunen
Tamás Grósz
Krister Lindén
M. Kurimo
118
23
0
24 Mar 2022
Modality Competition: What Makes Joint Training of Multi-modal Network Fail in Deep Learning? (Provably)
Yu Huang
Junyang Lin
Chang Zhou
Hongxia Yang
Longbo Huang
68
97
0
23 Mar 2022
Transformer-based Streaming ASR with Cumulative Attention
Mohan Li
Shucong Zhang
Catalin Zorila
R. Doddipatla
111
9
0
11 Mar 2022
aaeCAPTCHA: The Design and Implementation of Audio Adversarial CAPTCHA
Md. Imran Hossen
X. Hei
71
5
0
05 Mar 2022
Towards Contextual Spelling Correction for Customization of End-to-end Speech Recognition Systems
Xiaoqiang Wang
Yanqing Liu
Jinyu Li
Veljko Miljanic
Sheng Zhao
H. Khalil
KELM
76
19
0
02 Mar 2022
A Brief Overview of Unsupervised Neural Speech Representation Learning
Lasse Borgholt
Jakob Drachmann Havtorn
Joakim Edin
Lars Maaløe
Christian Igel
BDL AI4TS SSL
101
11
0
01 Mar 2022
Adversarial Attacks on Speech Recognition Systems for Mission-Critical Applications: A Survey
Ngoc Dung Huynh
Mohamed Reda Bouadjenek
Imran Razzak
Kevin Lee
Chetan Arora
Ali Hassani
A. Zaslavsky
AAML
68
6
0
22 Feb 2022
Speaker Adaptation Using Spectro-Temporal Deep Features for Dysarthric and Elderly Speech Recognition
Mengzhe Geng
Xurong Xie
Zi Ye
Tianzi Wang
Guinan Li
Shujie Hu
Xunying Liu
Helen Meng
79
33
0
21 Feb 2022
Learning Representations Robust to Group Shifts and Adversarial Examples
Ming-Chang Chiu
Xuezhe Ma
OOD
53
0
0
18 Feb 2022
End-to-end contextual asr based on posterior distribution adaptation for hybrid ctc/attention system
Zheng Zhang
Pan Zhou
64
6
0
18 Feb 2022
Knowledge Transfer from Large-scale Pretrained Language Models to End-to-end Speech Recognizers
Yotaro Kubo
Shigeki Karita
M. Bacchiani
53
27
0
16 Feb 2022
Conversational Speech Recognition By Learning Conversation-level Characteristics
Kun Wei
Yike Zhang
Sining Sun
Lei Xie
Long Ma
82
9
0
16 Feb 2022
USTED: Improving ASR with a Unified Speech and Text Encoder-Decoder
Bolaji Yusuf
Ankur Gandhe
Alex Sokolov
117
9
0
12 Feb 2022
Improving Automatic Speech Recognition for Non-Native English with Transfer Learning and Language Model Decoding
Peter Sullivan
Toshiko Shibano
Muhammad Abdul-Mageed
83
11
0
10 Feb 2022
ASRPU: A Programmable Accelerator for Low-Power Automatic Speech Recognition
D. Pinto
J. Arnau
Antonio González
35
0
0
10 Feb 2022
Semantic-aware Speech to Text Transmission with Redundancy Removal
Tian Han
Qianqian Yang
Zhiguo Shi
Shibo He
Zhaoyang Zhang
79
18
0
07 Feb 2022
Joint Speech Recognition and Audio Captioning
Chaitanya Narisetty
E. Tsunoo
Xuankai Chang
Yosuke Kashiwagi
Michael Hentschel
Shinji Watanabe
49
10
0
03 Feb 2022
RescoreBERT: Discriminative Speech Recognition Rescoring with BERT
Liyan Xu
Yile Gu
J. Kolehmainen
Haidar Khan
Ankur Gandhe
Ariya Rastrow
A. Stolcke
I. Bulyko
118
48
0
02 Feb 2022
BEA-Base: A Benchmark for ASR of Spontaneous Hungarian
P. Mihajlik
A. Balog
T. E. Gráczi
A. Kohári
Balázs Tarján
K. Mády
51
8
0
01 Feb 2022
Transformer-based Models of Text Normalization for Speech Applications
Jae Hun Ro
Felix Stahlberg
Ke Wu
Shankar Kumar
69
7
0
01 Feb 2022
Improving End-to-End Contextual Speech Recognition with Fine-Grained Contextual Knowledge Selection
Minglun Han
Linhao Dong
Zhenlin Liang
Meng Cai
Shiyu Zhou
Zejun Ma
Bo Xu
87
46
0
30 Jan 2022
Reducing language context confusion for end-to-end code-switching automatic speech recognition
Shuai Zhang
Jiangyan Yi
Zhengkun Tian
J. Tao
Y. Yeung
Liqun Deng
85
12
0
28 Jan 2022
On the Effectiveness of Pinyin-Character Dual-Decoding for End-to-End Mandarin Chinese ASR
Zhao Yang
Dianwen Ng
Xiao Fu
Liping Han
Wei Xi
Ruimeng Wang
Rui Jiang
Jizhong Zhao
87
3
0
26 Jan 2022
Improving the fusion of acoustic and text representations in RNN-T
Chao Zhang
Yue Liu
Zhiyun Lu
Tara N. Sainath
Shuo-yiin Chang
AI4CE
102
12
0
25 Jan 2022
Run-and-back stitch search: novel block synchronous decoding for streaming encoder-decoder ASR
E. Tsunoo
Chaitanya Narisetty
Michael Hentschel
Yosuke Kashiwagi
Shinji Watanabe
53
2
0
25 Jan 2022
Recent Progress in the CUHK Dysarthric Speech Recognition System
Shansong Liu
Mengzhe Geng
Shoukang Hu
Xurong Xie
Mingyu Cui
Jianwei Yu
Xunying Liu
Helen Meng
57
61
0
15 Jan 2022
Spectro-Temporal Deep Features for Disordered Speech Assessment and Recognition
Mengzhe Geng
Shansong Liu
Jianwei Yu
Xurong Xie
Shoukang Hu
Zi Ye
Zengrui Jin
Xunying Liu
Helen Meng
71
22
0
14 Jan 2022
Neural Architecture Search For LF-MMI Trained Time Delay Neural Networks
Shou-Yong Hu
Xurong Xie
Mingyu Cui
Jiajun Deng
Shansong Liu
Jianwei Yu
Mengzhe Geng
Xunying Liu
Helen Meng
99
27
0
08 Jan 2022
Two-Pass End-to-End ASR Model Compression
Nauman Dawalatabad
Tushar Vatsal
Ashutosh Gupta
Sungsoo Kim
Shatrughan Singh
Dhananjaya N. Gowda
Chanwoo Kim
42
5
0
08 Jan 2022
Sign Language Video Retrieval with Free-Form Textual Queries
A. Duarte
Samuel Albanie
Xavier Giró-i-Nieto
Gül Varol
SLR
90
29
0
07 Jan 2022
Improving Mandarin End-to-End Speech Recognition with Word N-gram Language Model
Jinchuan Tian
Jianwei Yu
Chao Weng
Yuexian Zou
Dong Yu
64
11
0
06 Jan 2022
Discrete and continuous representations and processing in deep learning: Looking forward
Ruben Cartuyvels
Graham Spinks
Marie-Francine Moens
OCL
95
20
0
04 Jan 2022
Speech-to-SQL: Towards Speech-driven SQL Query Generation From Natural Language Question
Yuanfeng Song
Raymond Chi-Wing Wong
Xuefang Zhao
Di Jiang
75
14
0
04 Jan 2022
Voice Quality and Pitch Features in Transformer-Based Speech Recognition
Guillermo Cámbara
Jordi Luque
Mireia Farrús
52
0
0
21 Dec 2021
Saliency Grafting: Innocuous Attribution-Guided Mixup with Calibrated Label Mixing
Joonhyung Park
J. Yang
Jinwoo Shin
Sung Ju Hwang
Eunho Yang
70
24
0
16 Dec 2021
Prompt Tuning GPT-2 language model for parameter-efficient domain adaptation of ASR systems
Saket Dingliwal
Ashish Shenoy
S. Bodapati
Ankur Gandhe
R. Gadde
Katrin Kirchhoff
VLM
72
4
0
16 Dec 2021
Improving Hybrid CTC/Attention End-to-end Speech Recognition with Pretrained Acoustic and Language Model
Keqi Deng
Songjun Cao
Yike Zhang
Long Ma
VLM
56
31
0
14 Dec 2021
PM-MMUT: Boosted Phone-Mask Data Augmentation using Multi-Modeling Unit Training for Phonetic-Reduction-Robust E2E Speech Recognition
Guodong Ma
Pengfei Hu
Nurmemet Yolwas
Shen Huang
Hao-Ming Huang
92
4
0
13 Dec 2021
Consistent Training and Decoding For End-to-end Speech Recognition Using Lattice-free MMI
Jinchuan Tian
Jianwei Yu
Chao Weng
Shi-Xiong Zhang
Jane Polak Scowcroft
Dong Yu
Yuexian Zou
AuLLM
77
13
0
05 Dec 2021
Deliberation of Streaming RNN-Transducer by Non-autoregressive Decoding
Weiran Wang
Ke Hu
Tara N. Sainath
65
21
0
01 Dec 2021
Mixed Precision Low-bit Quantization of Neural Network Language Models for Speech Recognition
Junhao Xu
Jianwei Yu
Shoukang Hu
Xunying Liu
Helen Meng
MQ
109
15
0
29 Nov 2021
Lattention: Lattice-attention in ASR rescoring
Prabhat Pandey
Sergio Duarte Torres
Ali Orkan Bayer
Ankur Gandhe
Volker Leutnant
44
7
0
19 Nov 2021
A comparison of streaming models and data augmentation methods for robust speech recognition
Jiyeon Kim
Mehul Kumar
Dhananjaya N. Gowda
Abhinav Garg
Chanwoo Kim
86
6
0
19 Nov 2021
Integrated Semantic and Phonetic Post-correction for Chinese Speech Recognition
Yi-Chang Chen
Chun-Yen Cheng
Chien-An Chen
Ming-Chieh Sung
Yi-Ren Yeh
61
6
0
16 Nov 2021
Deciphering Speech: a Zero-Resource Approach to Cross-Lingual Transfer in ASR
Ondřej Klejch
E. Wallington
P. Bell
54
12
0
12 Nov 2021
Enhancing Backdoor Attacks with Multi-Level MMD Regularization
Pengfei Xia
Hongjing Niu
Ziqiang Li
Bin Li
AAML
78
31
0
09 Nov 2021
Conformer-based Hybrid ASR System for Switchboard Dataset
Mohammad Zeineldeen
Jingjing Xu
Christoph Lüscher
Wilfried Michel
Alexander Gerstenberger
Ralf Schlüter
Hermann Ney
79
25
0
05 Nov 2021
Context-Aware Transformer Transducer for Speech Recognition
Feng-Ju Chang
Jing Liu
Martin H. Radfar
Athanasios Mouchtaris
M. Omologo
Ariya Rastrow
Siegfried Kunzmann
71
85
0
05 Nov 2021
Recent Advances in End-to-End Automatic Speech Recognition
Jinyu Li
VLM
176
379
0
02 Nov 2021
With a Little Help from my Temporal Context: Multimodal Egocentric Action Recognition
Evangelos Kazakos
Jaesung Huh
Arsha Nagrani
Andrew Zisserman
Dima Damen
EgoV
125
46
0
01 Nov 2021