Listen, Attend and Spell


5 August 2015
William Chan
Navdeep Jaitly
Quoc V. Le
Oriol Vinyals
    RALM

Papers citing "Listen, Attend and Spell"

50 / 1,033 papers shown
Blank Collapse: Compressing CTC emission for the faster decoding
Minkyu Jung
Ohhyeok Kwon
S. Seo
Soonshin Seo
36
3
0
31 Oct 2022
Partitioned Gradient Matching-based Data Subset Selection for Compute-Efficient Robust ASR Training
Ashish R. Mittal
D. Sivasubramanian
Rishabh K. Iyer
P. Jyothi
Ganesh Ramakrishnan
19
3
0
30 Oct 2022
BERT Meets CTC: New Formulation of End-to-End Speech Recognition with Pre-trained Masked Language Model
Yosuke Higuchi
Brian Yan
Siddhant Arora
Tetsuji Ogawa
Tetsunori Kobayashi
Shinji Watanabe
54
25
0
29 Oct 2022
Accelerating RNN-T Training and Inference Using CTC guidance
Yongqiang Wang
Zhehuai Chen
Cheng-yong Zheng
Yu Zhang
Wei Han
Parisa Haghani
40
23
0
29 Oct 2022
Filter and evolve: progressive pseudo label refining for semi-supervised automatic speech recognition
Zezhong Jin
Dading Zhong
Xiao Song
Zhaoyi Liu
Naipeng Ye
Qingcheng Zeng
11
2
0
28 Oct 2022
Random Utterance Concatenation Based Data Augmentation for Improving Short-video Speech Recognition
Yist Y. Lin
Tao Han
Haihua Xu
Van Tung Pham
Yerbolat Khassanov
Tze Yuang Chong
Yi He
Lu Lu
Zejun Ma
13
2
0
28 Oct 2022
Token-level Sequence Labeling for Spoken Language Understanding using Compositional End-to-End Models
Siddhant Arora
Siddharth Dalmia
Brian Yan
Florian Metze
A. Black
Shinji Watanabe
23
12
0
27 Oct 2022
Monotonic segmental attention for automatic speech recognition
Albert Zeyer
Robin Schmitt
Wei Zhou
Ralf Schluter
Hermann Ney
16
8
0
26 Oct 2022
Linguistic-Enhanced Transformer with CTC Embedding for Speech Recognition
Xulong Zhang
Jianzong Wang
Ning Cheng
Mengyuan Zhao
Zhiyong Zhang
Jing Xiao
6
0
0
25 Oct 2022
ESB: A Benchmark For Multi-Domain End-to-End Speech Recognition
Sanchit Gandhi
Patrick von Platen
Alexander M. Rush
30
24
0
24 Oct 2022
Optimizing Bilingual Neural Transducer with Synthetic Code-switching Text Generation
Thien Nguyen
Nathalie Tran
Liuhui Deng
Thiago Fraga da Silva
Matthew Radzihovsky
...
Honza Silovsky
Arnab Ghoshal
M. Martel
Bharat Ram Ambati
Mohamed Ali
35
5
0
21 Oct 2022
Improving Semi-supervised End-to-end Automatic Speech Recognition using CycleGAN and Inter-domain Losses
C. Li
Ngoc Thang Vu
21
2
0
20 Oct 2022
Anchored Speech Recognition with Neural Transducers
Desh Raj
J. Jia
Jay Mahadeokar
Chunyang Wu
Niko Moritz
Xiaohui Zhang
Ozlem Kalinli
13
2
0
20 Oct 2022
End-to-End Integration of Speech Recognition, Dereverberation, Beamforming, and Self-Supervised Learning Representation
Yoshiki Masuyama
Xuankai Chang
Samuele Cornell
Shinji Watanabe
Nobutaka Ono
17
19
0
19 Oct 2022
Helpful Neighbors: Leveraging Neighbors in Geographic Feature Pronunciation
Llion Jones
R. Sproat
Haruko Ishikawa
Alexander Gutkin
30
1
0
18 Oct 2022
Acoustic-aware Non-autoregressive Spell Correction with Mask Sample Decoding
Ruchao Fan
Guoli Ye
Yashesh Gaur
Jinyu Li
19
4
0
16 Oct 2022
A Policy-based Approach to the SpecAugment Method for Low Resource E2E ASR
Rui Li
Guodong Ma
Dexin Zhao
Ranran Zeng
Xiaoyu Li
Haolin Huang
29
2
0
16 Oct 2022
On Compressing Sequences for Self-Supervised Speech Models
Yen Meng
Hsuan-Jui Chen
Jiatong Shi
Shinji Watanabe
Paola García
Hung-yi Lee
Hao Tang
SSL
21
14
0
13 Oct 2022
An Experimental Study on Private Aggregation of Teacher Ensemble Learning for End-to-End Speech Recognition
Chao-Han Huck Yang
I-Fan Chen
A. Stolcke
Sabato Marco Siniscalchi
Chin-Hui Lee
32
2
0
11 Oct 2022
CTC Alignments Improve Autoregressive Translation
Brian Yan
Siddharth Dalmia
Yosuke Higuchi
Graham Neubig
Florian Metze
A. Black
Shinji Watanabe
46
33
0
11 Oct 2022
DeepPerform: An Efficient Approach for Performance Testing of Resource-Constrained Neural Networks
Simin Chen
Mirazul Haque
Cong Liu
Wei Yang
47
21
0
10 Oct 2022
JoeyS2T: Minimalistic Speech-to-Text Modeling with JoeyNMT
Mayumi Ohta
Julia Kreutzer
Stefan Riezler
19
0
0
05 Oct 2022
Relaxed Attention for Transformer Models
Timo Lohrenz
Björn Möller
Zhengyang Li
Tim Fingscheidt
KELM
29
11
0
20 Sep 2022
Watch What You Pretrain For: Targeted, Transferable Adversarial Examples on Self-Supervised Speech Recognition models
R. Olivier
H. Abdullah
Bhiksha Raj
AAML
26
1
0
17 Sep 2022
Parameter-Efficient Conformers via Sharing Sparsely-Gated Experts for End-to-End Speech Recognition
Ye Bai
Jie Li
W. Han
Hao Ni
Kaituo Xu
Zhuo Zhang
Cheng Yi
Xiaorui Wang
MoE
26
1
0
17 Sep 2022
Analysis of Self-Attention Head Diversity for Conformer-based Automatic Speech Recognition
Kartik Audhkhasi
Yinghui Huang
Bhuvana Ramabhadran
Pedro J. Moreno
24
3
0
13 Sep 2022
Non-autoregressive Error Correction for CTC-based ASR with Phone-conditioned Masked LM
Hayato Futami
Hirofumi Inaguma
Sei Ueno
Masato Mimura
S. Sakai
Tatsuya Kawahara
KELM
53
12
0
08 Sep 2022
Distilling the Knowledge of BERT for CTC-based ASR
Hayato Futami
Hirofumi Inaguma
Masato Mimura
S. Sakai
Tatsuya Kawahara
21
8
0
05 Sep 2022
Vision-Language Adaptive Mutual Decoder for OOV-STR
Jinshui Hu
Chenyu Liu
Qiandong Yan
Xuyang Zhu
Jiajia Wu
Feng Yu
Bing Yin
VLM
32
0
0
02 Sep 2022
Bayesian Neural Network Language Modeling for Speech Recognition
Boyang Xue
Shoukang Hu
Junhao Xu
Mengzhe Geng
Xunying Liu
Helen M. Meng
UQCV
BDL
44
14
0
28 Aug 2022
Interpretable Multimodal Emotion Recognition using Hybrid Fusion of Speech and Image Data
Puneet Kumar
Sarthak Malik
Balasubramanian Raman
CVBM
30
22
0
25 Aug 2022
Comparison and Analysis of New Curriculum Criteria for End-to-End ASR
Georgios Karakasidis
Tamás Grósz
M. Kurimo
22
2
0
10 Aug 2022
ASR Error Correction with Constrained Decoding on Operation Prediction
J. Yang
Rong-Zhi Li
Wei Peng
32
9
0
09 Aug 2022
Adversarial Attacks on ASR Systems: An Overview
Xiao Zhang
Hao Tan
Xuan Huang
Denghui Zhang
Keke Tang
Zhaoquan Gu
AAML
19
3
0
03 Aug 2022
VQ-T: RNN Transducers using Vector-Quantized Prediction Network States
Jiatong Shi
G. Saon
David Haws
Shinji Watanabe
Brian Kingsbury
32
3
0
03 Aug 2022
Pronunciation-aware unique character encoding for RNN Transducer-based Mandarin speech recognition
Peng Shen
Xugang Lu
Hisashi Kawai
19
2
0
29 Jul 2022
Improving Mandarin Speech Recogntion with Block-augmented Transformer
Xiaoming Ren
Huifeng Zhu
Liuwei Wei
Minghui Wu
Jie Hao
38
9
0
24 Jul 2022
Reducing Geographic Disparities in Automatic Speech Recognition via Elastic Weight Consolidation
V. Trinh
Pegah Ghahremani
Brian King
J. Droppo
A. Stolcke
Roland Maas
MoMe
19
5
0
16 Jul 2022
PoLyScriber: Integrated Fine-tuning of Extractor and Lyrics Transcriber for Polyphonic Music
Xiaoxue Gao
Chitralekha Gupta
Haizhou Li
33
7
0
15 Jul 2022
Branchformer: Parallel MLP-Attention Architectures to Capture Local and Global Context for Speech Recognition and Understanding
Yifan Peng
Siddharth Dalmia
Ian Lane
Shinji Watanabe
30
143
0
06 Jul 2022
Tree-constrained Pointer Generator with Graph Neural Network Encodings for Contextual Speech Recognition
Guangzhi Sun
C. Zhang
P. Woodland
22
12
0
02 Jul 2022
FitHuBERT: Going Thinner and Deeper for Knowledge Distillation of Speech Self-Supervised Learning
Yeonghyeon Lee
Kangwook Jang
Jahyun Goo
Youngmoon Jung
Hoi-Rim Kim
23
29
0
01 Jul 2022
Language-specific Characteristic Assistance for Code-switching Speech Recognition
Tongtong Song
Qiang Xu
Meng Ge
Longbiao Wang
Hao Shi
Yongjie Lv
Yuqin Lin
J. Dang
11
26
0
29 Jun 2022
Contextual Density Ratio for Language Model Biasing of Sequence to Sequence ASR Systems
Jesús Andrés-Ferrer
Dario Albesano
P. Zhan
Paul Vozila
16
6
0
29 Jun 2022
On Comparison of Encoders for Attention based End to End Speech Recognition in Standalone and Rescoring Mode
Raviraj Joshi
Subodh Kumar
36
2
0
26 Jun 2022
Towards Green ASR: Lossless 4-bit Quantization of a Hybrid TDNN System on the 300-hr Switchboard Corpus
Junhao Xu
Shoukang Hu
Xunying Liu
Helen M. Meng
MQ
19
5
0
23 Jun 2022
Two-pass Decoding and Cross-adaptation Based System Combination of End-to-end Conformer and Hybrid TDNN ASR Systems
Mingyu Cui
Jiajun Deng
Shoukang Hu
Xurong Xie
Tianzi Wang
Shujie Hu
Mengzhe Geng
Boyang Xue
Xunying Liu
Helen M. Meng
33
9
0
23 Jun 2022
Boosting Cross-Domain Speech Recognition with Self-Supervision
Hanjing Zhu
Gaofeng Cheng
Jindong Wang
Wenxin Hou
Pengyuan Zhang
Yonghong Yan
19
13
0
20 Jun 2022
Avoid Overfitting User Specific Information in Federated Keyword Spotting
Xin-Chun Li
Jin-Lin Tang
Shaoming Song
Bingshuai Li
Yinchuan Li
Yunfeng Shao
Le Gan
De-Chuan Zhan
FedML
AAML
30
9
0
17 Jun 2022
Paraformer: Fast and Accurate Parallel Transformer for Non-autoregressive End-to-End Speech Recognition
Zhifu Gao
Shiliang Zhang
Ian Mcloughlin
Zhijie Yan
14
92
0
16 Jun 2022