Listen, Attend and Spell

5 August 2015
William Chan
Navdeep Jaitly
Quoc V. Le
Oriol Vinyals
    RALM

Papers citing "Listen, Attend and Spell"

50 / 1,041 papers shown
FitHuBERT: Going Thinner and Deeper for Knowledge Distillation of Speech Self-Supervised Learning
Yeonghyeon Lee
Kangwook Jang
Jahyun Goo
Youngmoon Jung
Hoi-Rim Kim
121
33
0
01 Jul 2022
Language-specific Characteristic Assistance for Code-switching Speech Recognition
Tongtong Song
Qiang Xu
Meng Ge
Longbiao Wang
Hao Shi
Yongjie Lv
Yuqin Lin
Jianwu Dang
84
27
0
29 Jun 2022
Contextual Density Ratio for Language Model Biasing of Sequence to Sequence ASR Systems
Jesús Andrés-Ferrer
Dario Albesano
P. Zhan
Paul Vozila
58
6
0
29 Jun 2022
On Comparison of Encoders for Attention based End to End Speech Recognition in Standalone and Rescoring Mode
Raviraj Joshi
Subodh Kumar
77
2
0
26 Jun 2022
Towards Green ASR: Lossless 4-bit Quantization of a Hybrid TDNN System on the 300-hr Switchboard Corpus
Junhao Xu
Shoukang Hu
Xunying Liu
Helen M. Meng
MQ
77
5
0
23 Jun 2022
Two-pass Decoding and Cross-adaptation Based System Combination of End-to-end Conformer and Hybrid TDNN ASR Systems
Mingyu Cui
Jiajun Deng
Shoukang Hu
Xurong Xie
Tianzi Wang
Shujie Hu
Mengzhe Geng
Boyang Xue
Xunying Liu
Helen M. Meng
80
9
0
23 Jun 2022
Boosting Cross-Domain Speech Recognition with Self-Supervision
Hanjing Zhu
Gaofeng Cheng
Jindong Wang
Wenxin Hou
Pengyuan Zhang
Yonghong Yan
102
16
0
20 Jun 2022
Avoid Overfitting User Specific Information in Federated Keyword Spotting
Xin-Chun Li
Jin-Lin Tang
Shaoming Song
Bingshuai Li
Yinchuan Li
Yunfeng Shao
Le Gan
De-Chuan Zhan
FedML AAML
64
9
0
17 Jun 2022
Paraformer: Fast and Accurate Parallel Transformer for Non-autoregressive End-to-End Speech Recognition
Zhifu Gao
Shiliang Zhang
Ian Mcloughlin
Zhijie Yan
98
110
0
16 Jun 2022
Residual Language Model for End-to-end Speech Recognition
E. Tsunoo
Yosuke Kashiwagi
Chaitanya Narisetty
Shinji Watanabe
51
11
0
15 Jun 2022
LegoNN: Building Modular Encoder-Decoder Models
Siddharth Dalmia
Dmytro Okhonko
M. Lewis
Sergey Edunov
Shinji Watanabe
Florian Metze
Luke Zettlemoyer
Abdel-rahman Mohamed
AuLLM MoE
73
14
0
07 Jun 2022
Contextual Adapters for Personalized Speech Recognition in Neural Transducers
Kanthashree Mysore Sathyendra
Thejaswi Muniyappa
Feng-Ju Chang
Jing Liu
Jinru Su
Grant P. Strimel
Athanasios Mouchtaris
Siegfried Kunzmann
85
79
0
26 May 2022
Transcormer: Transformer for Sentence Scoring with Sliding Language Modeling
Kaitao Song
Yichong Leng
Xu Tan
Yicheng Zou
Tao Qin
Dongsheng Li
111
11
0
25 May 2022
Adaptive multilingual speech recognition with pretrained models
Ngoc-Quan Pham
A. Waibel
Jan Niehues
VLM
72
23
0
24 May 2022
Multi-Level Modeling Units for End-to-End Mandarin Speech Recognition
Yuting Yang
Binbin Du
Yuke Li
71
1
0
24 May 2022
Deep Learning for Visual Speech Analysis: A Survey
Changchong Sheng
Gangyao Kuang
L. Bai
Chen Hou
Y. Guo
Xin Xu
M. Pietikäinen
Li Liu
VLM
98
36
0
22 May 2022
Minimising Biasing Word Errors for Contextual ASR with the Tree-Constrained Pointer Generator
Guangzhi Sun
Chuxu Zhang
P. Woodland
103
14
0
18 May 2022
Evaluating Membership Inference Through Adversarial Robustness
Zhaoxi Zhang
L. Zhang
Xufei Zheng
Bilal Hussain Abbasi
Shengshan Hu
AAML
92
17
0
14 May 2022
Improved Consistency Training for Semi-Supervised Sequence-to-Sequence ASR via Speech Chain Reconstruction and Self-Transcribing
Heli Qi
Sashi Novitasari
S. Sakti
Satoshi Nakamura
AI4TS
144
2
0
14 May 2022
Personalized Adversarial Data Augmentation for Dysarthric and Elderly Speech Recognition
Zengrui Jin
Mengzhe Geng
Jiajun Deng
Tianzi Wang
Shujie Hu
Guinan Li
Xunying Liu
91
22
0
13 May 2022
Wav2Seq: Pre-training Speech-to-Text Encoder-Decoder Models Using Pseudo Languages
Felix Wu
Kwangyoun Kim
Shinji Watanabe
Kyu Jeong Han
Ryan T. McDonald
Kilian Q. Weinberger
Yoav Artzi
SyDa
105
39
0
02 May 2022
How does a spontaneously speaking conversational agent affect user behavior?
Takahisa Iizuka
H. Mori
29
3
0
02 May 2022
Bilingual End-to-End ASR with Byte-Level Subwords
Liuhui Deng
Roger Hsiao
Arnab Ghoshal
42
4
0
01 May 2022
Attention Mechanism in Neural Networks: Where it Comes and Where it Goes
Derya Soydaner
3DV
131
183
0
27 Apr 2022
Supervised Attention in Sequence-to-Sequence Models for Speech Recognition
Gene-Ping Yang
Hao Tang
64
3
0
25 Apr 2022
Efficient Training of Neural Transducer for Speech Recognition
Wei Zhou
Wilfried Michel
Ralf Schluter
Hermann Ney
AI4TS
104
24
0
22 Apr 2022
Cross-stitched Multi-modal Encoders
Karan Singla
Daniel Pressel
Ryan Price
Bhargav Srinivas Chinnari
Yeon-Jun Kim
S. Bangalore
55
0
0
20 Apr 2022
An Investigation of Monotonic Transducers for Large-Scale Automatic Speech Recognition
Niko Moritz
Frank Seide
Duc Le
Jay Mahadeokar
Christian Fuegen
94
8
0
19 Apr 2022
Self-critical Sequence Training for Automatic Speech Recognition
Chen Chen
Yuchen Hu
Nana Hou
Xiaofeng Qi
Heqing Zou
Chng Eng Siong
76
16
0
13 Apr 2022
Tokenwise Contrastive Pretraining for Finer Speech-to-BERT Alignment in End-to-End Speech-to-Intent Systems
Vishal Sunder
Eric Fosler-Lussier
Samuel Thomas
H. Kuo
Brian Kingsbury
83
7
0
11 Apr 2022
Adding Connectionist Temporal Summarization into Conformer to Improve Its Decoder Efficiency For Speech Recognition
N. J. Wang
Zongfeng Quan
Shaojun Wang
Jing Xiao
48
1
0
08 Apr 2022
A Complementary Joint Training Approach Using Unpaired Speech and Text for Low-Resource Automatic Speech Recognition
Ye Du
Jie Zhang
Qiu-shi Zhu
Lirong Dai
Ming Wu
Xin Fang
Zhouwang Yang
68
2
0
05 Apr 2022
Class-Incremental Learning by Knowledge Distillation with Adaptive Feature Consolidation
Minsoo Kang
Jaeyoo Park
Bohyung Han
CLL
109
191
0
02 Apr 2022
Leveraging Phone Mask Training for Phonetic-Reduction-Robust E2E Uyghur Speech Recognition
Guodong Ma
Pengfei Hu
Jian Kang
Shen Huang
Hao-Ming Huang
78
9
0
02 Apr 2022
Multi-task RNN-T with Semantic Decoder for Streamable Spoken Language Understanding
Xuandi Fu
Feng-Ju Chang
Martin H. Radfar
Kailin Wei
Jing Liu
Grant P. Strimel
Kanthashree Mysore Sathyendra
55
4
0
01 Apr 2022
Memory-Efficient Training of RNN-Transducer with Sampled Softmax
Jaesong Lee
Lukas Lee
Shinji Watanabe
108
8
0
31 Mar 2022
Open Source MagicData-RAMC: A Rich Annotated Mandarin Conversational(RAMC) Speech Dataset
Zehui Yang
Yifan Chen
Lei Luo
Runyan Yang
Lingxuan Ye
...
Yaohui Jin
Qingqing Zhang
Pengyuan Zhang
Lei Xie
Yonghong Yan
69
51
0
31 Mar 2022
NeuFA: Neural Network Based End-to-End Forced Alignment with Bidirectional Attention Mechanism
Jingbei Li
Yi Meng
Zhiyong Wu
Helen Meng
Qiao Tian
Yuping Wang
Yuxuan Wang
45
21
0
31 Mar 2022
CUSIDE: Chunking, Simulating Future Context and Decoding for Streaming ASR
Keyu An
Huahuan Zheng
Zhijian Ou
Hongyu Xiang
Ke Ding
Guanglu Wan
AI4TS
52
19
0
31 Mar 2022
Enhancing Zero-Shot Many to Many Voice Conversion with Self-Attention VAE
Ziang Long
Yunling Zheng
Meng Yu
Jack Xin
DRL
63
5
0
30 Mar 2022
Recent improvements of ASR models in the face of adversarial attacks
R. Olivier
Bhiksha Raj
AAML
126
14
0
29 Mar 2022
Streaming parallel transducer beam search with fast-slow cascaded encoders
Jay Mahadeokar
Yangyang Shi
Ke Li
Duc Le
Jiedan Zhu
Vikas Chandra
Ozlem Kalinli
M. Seltzer
78
16
0
29 Mar 2022
Integrating Lattice-Free MMI into End-to-End Speech Recognition
Jinchuan Tian
Jianwei Yu
Chao Weng
Yuexian Zou
Dong Yu
108
8
0
29 Mar 2022
WeNet 2.0: More Productive End-to-End Speech Recognition Toolkit
Binbin Zhang
Di Wu
Zhendong Peng
Xingcheng Song
Zhuoyuan Yao
Hang Lv
Linfu Xie
Chao Yang
Fuping Pan
Jianwei Niu
VLM
112
99
0
29 Mar 2022
Investigating Self-supervised Pretraining Frameworks for Pathological Speech Recognition
Lester Phillip Violeta
Wen-Chin Huang
Tomoki Toda
94
34
0
29 Mar 2022
Noise-robust Speech Recognition with 10 Minutes Unparalleled In-domain Data
Chen Chen
Nana Hou
Yuchen Hu
Shashank Shirol
Chng Eng Siong
NoLa
103
43
0
29 Mar 2022
Shifted Chunk Encoder for Transformer Based Streaming End-to-End ASR
Fangyuan Wang
Bo Xu
79
5
0
29 Mar 2022
Finnish Parliament ASR corpus - Analysis, benchmarks and statistics
A. Virkkunen
Aku Rouhe
Nhan Phan
M. Kurimo
104
4
0
28 Mar 2022
Dual-Path Style Learning for End-to-End Noise-Robust Speech Recognition
Yuchen Hu
Nana Hou
Chen Chen
Chng Eng Siong
99
15
0
28 Mar 2022
Joint Transformer/RNN Architecture for Gesture Typing in Indic Languages
Emil Biju
Anirudh Sriram
Mitesh M. Khapra
Pratyush Kumar
32
3
0
26 Mar 2022