ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1508.01211
  4. Cited By
Listen, Attend and Spell

Listen, Attend and Spell

5 August 2015
William Chan
Navdeep Jaitly
Quoc V. Le
Oriol Vinyals
    RALM
ArXivPDFHTML

Papers citing "Listen, Attend and Spell"

50 / 1,033 papers shown
Title
Residual Language Model for End-to-end Speech Recognition
Residual Language Model for End-to-end Speech Recognition
E. Tsunoo
Yosuke Kashiwagi
Chaitanya Narisetty
Shinji Watanabe
22
11
0
15 Jun 2022
LegoNN: Building Modular Encoder-Decoder Models
LegoNN: Building Modular Encoder-Decoder Models
Siddharth Dalmia
Dmytro Okhonko
M. Lewis
Sergey Edunov
Shinji Watanabe
Florian Metze
Luke Zettlemoyer
Abdel-rahman Mohamed
AuLLM
MoE
29
14
0
07 Jun 2022
Contextual Adapters for Personalized Speech Recognition in Neural
  Transducers
Contextual Adapters for Personalized Speech Recognition in Neural Transducers
Kanthashree Mysore Sathyendra
Thejaswi Muniyappa
Feng-Ju Chang
Jing Liu
Jinru Su
Grant P. Strimel
Athanasios Mouchtaris
Siegfried Kunzmann
19
75
0
26 May 2022
Transcormer: Transformer for Sentence Scoring with Sliding Language
  Modeling
Transcormer: Transformer for Sentence Scoring with Sliding Language Modeling
Kaitao Song
Yichong Leng
Xu Tan
Yicheng Zou
Tao Qin
Dongsheng Li
14
11
0
25 May 2022
Adaptive multilingual speech recognition with pretrained models
Adaptive multilingual speech recognition with pretrained models
Ngoc-Quan Pham
A. Waibel
Jan Niehues
VLM
17
23
0
24 May 2022
Multi-Level Modeling Units for End-to-End Mandarin Speech Recognition
Multi-Level Modeling Units for End-to-End Mandarin Speech Recognition
Yuting Yang
Binbin Du
Yuke Li
26
1
0
24 May 2022
Deep Learning for Visual Speech Analysis: A Survey
Deep Learning for Visual Speech Analysis: A Survey
Changchong Sheng
Gangyao Kuang
L. Bai
Chen Hou
Y. Guo
Xin Xu
M. Pietikäinen
Li Liu
VLM
29
33
0
22 May 2022
Minimising Biasing Word Errors for Contextual ASR with the
  Tree-Constrained Pointer Generator
Minimising Biasing Word Errors for Contextual ASR with the Tree-Constrained Pointer Generator
Guangzhi Sun
C. Zhang
P. Woodland
32
14
0
18 May 2022
Evaluating Membership Inference Through Adversarial Robustness
Evaluating Membership Inference Through Adversarial Robustness
Zhaoxi Zhang
L. Zhang
Xufei Zheng
Bilal Hussain Abbasi
Shengshan Hu
AAML
57
14
0
14 May 2022
Improved Consistency Training for Semi-Supervised Sequence-to-Sequence
  ASR via Speech Chain Reconstruction and Self-Transcribing
Improved Consistency Training for Semi-Supervised Sequence-to-Sequence ASR via Speech Chain Reconstruction and Self-Transcribing
Heli Qi
Sashi Novitasari
S. Sakti
Satoshi Nakamura
AI4TS
13
2
0
14 May 2022
Personalized Adversarial Data Augmentation for Dysarthric and Elderly
  Speech Recognition
Personalized Adversarial Data Augmentation for Dysarthric and Elderly Speech Recognition
Zengrui Jin
Mengzhe Geng
Jiajun Deng
Tianzi Wang
Shujie Hu
Guinan Li
Xunying Liu
25
19
0
13 May 2022
Wav2Seq: Pre-training Speech-to-Text Encoder-Decoder Models Using Pseudo
  Languages
Wav2Seq: Pre-training Speech-to-Text Encoder-Decoder Models Using Pseudo Languages
Felix Wu
Kwangyoun Kim
Shinji Watanabe
Kyu Jeong Han
Ryan T. McDonald
Kilian Q. Weinberger
Yoav Artzi
SyDa
48
37
0
02 May 2022
How does a spontaneously speaking conversational agent affect user
  behavior?
How does a spontaneously speaking conversational agent affect user behavior?
Takahisa Iizuka
H. Mori
13
2
0
02 May 2022
Bilingual End-to-End ASR with Byte-Level Subwords
Bilingual End-to-End ASR with Byte-Level Subwords
Liuhui Deng
Roger Hsiao
Arnab Ghoshal
18
4
0
01 May 2022
Attention Mechanism in Neural Networks: Where it Comes and Where it Goes
Attention Mechanism in Neural Networks: Where it Comes and Where it Goes
Derya Soydaner
3DV
44
149
0
27 Apr 2022
Supervised Attention in Sequence-to-Sequence Models for Speech
  Recognition
Supervised Attention in Sequence-to-Sequence Models for Speech Recognition
Gene-Ping Yang
Hao Tang
23
2
0
25 Apr 2022
Efficient Training of Neural Transducer for Speech Recognition
Efficient Training of Neural Transducer for Speech Recognition
Wei Zhou
Wilfried Michel
Ralf Schluter
Hermann Ney
AI4TS
24
22
0
22 Apr 2022
Cross-stitched Multi-modal Encoders
Cross-stitched Multi-modal Encoders
Karan Singla
Daniel Pressel
Ryan Price
Bhargav Srinivas Chinnari
Yeon-Jun Kim
S. Bangalore
21
0
0
20 Apr 2022
An Investigation of Monotonic Transducers for Large-Scale Automatic
  Speech Recognition
An Investigation of Monotonic Transducers for Large-Scale Automatic Speech Recognition
Niko Moritz
Frank Seide
Duc Le
Jay Mahadeokar
Christian Fuegen
23
8
0
19 Apr 2022
Self-critical Sequence Training for Automatic Speech Recognition
Self-critical Sequence Training for Automatic Speech Recognition
Chen Chen
Yuchen Hu
Nana Hou
Xiaofeng Qi
Heqing Zou
Chng Eng Siong
27
15
0
13 Apr 2022
Tokenwise Contrastive Pretraining for Finer Speech-to-BERT Alignment in
  End-to-End Speech-to-Intent Systems
Tokenwise Contrastive Pretraining for Finer Speech-to-BERT Alignment in End-to-End Speech-to-Intent Systems
Vishal Sunder
Eric Fosler-Lussier
Samuel Thomas
H. Kuo
Brian Kingsbury
23
7
0
11 Apr 2022
Adding Connectionist Temporal Summarization into Conformer to Improve
  Its Decoder Efficiency For Speech Recognition
Adding Connectionist Temporal Summarization into Conformer to Improve Its Decoder Efficiency For Speech Recognition
N. J. Wang
Zongfeng Quan
Shaojun Wang
Jing Xiao
23
1
0
08 Apr 2022
A Complementary Joint Training Approach Using Unpaired Speech and Text
  for Low-Resource Automatic Speech Recognition
A Complementary Joint Training Approach Using Unpaired Speech and Text for Low-Resource Automatic Speech Recognition
Ye Du
Jie Zhang
Qiu-shi Zhu
Lirong Dai
Ming Wu
Xin Fang
Zhouwang Yang
34
2
0
05 Apr 2022
Class-Incremental Learning by Knowledge Distillation with Adaptive
  Feature Consolidation
Class-Incremental Learning by Knowledge Distillation with Adaptive Feature Consolidation
Minsoo Kang
Jaeyoo Park
Bohyung Han
CLL
27
179
0
02 Apr 2022
Leveraging Phone Mask Training for Phonetic-Reduction-Robust E2E Uyghur
  Speech Recognition
Leveraging Phone Mask Training for Phonetic-Reduction-Robust E2E Uyghur Speech Recognition
Guodong Ma
Pengfei Hu
Jian Kang
Shen Huang
Hao-Ming Huang
26
9
0
02 Apr 2022
Multi-task RNN-T with Semantic Decoder for Streamable Spoken Language
  Understanding
Multi-task RNN-T with Semantic Decoder for Streamable Spoken Language Understanding
Xuandi Fu
Feng-Ju Chang
Martin H. Radfar
Kailin Wei
Jing Liu
Grant P. Strimel
Kanthashree Mysore Sathyendra
18
4
0
01 Apr 2022
Memory-Efficient Training of RNN-Transducer with Sampled Softmax
Memory-Efficient Training of RNN-Transducer with Sampled Softmax
Jaesong Lee
Lukas Lee
Shinji Watanabe
27
8
0
31 Mar 2022
Open Source MagicData-RAMC: A Rich Annotated Mandarin
  Conversational(RAMC) Speech Dataset
Open Source MagicData-RAMC: A Rich Annotated Mandarin Conversational(RAMC) Speech Dataset
Zehui Yang
Yifan Chen
Lei Luo
Runyan Yang
Lingxuan Ye
...
Yaohui Jin
Qingqing Zhang
Pengyuan Zhang
Lei Xie
Yonghong Yan
15
47
0
31 Mar 2022
NeuFA: Neural Network Based End-to-End Forced Alignment with
  Bidirectional Attention Mechanism
NeuFA: Neural Network Based End-to-End Forced Alignment with Bidirectional Attention Mechanism
Jingbei Li
Yi Meng
Zhiyong Wu
Helen Meng
Qiao Tian
Yuping Wang
Yuxuan Wang
15
21
0
31 Mar 2022
CUSIDE: Chunking, Simulating Future Context and Decoding for Streaming
  ASR
CUSIDE: Chunking, Simulating Future Context and Decoding for Streaming ASR
Keyu An
Huahuan Zheng
Zhijian Ou
Hongyu Xiang
Ke Ding
Guanglu Wan
AI4TS
28
17
0
31 Mar 2022
Enhancing Zero-Shot Many to Many Voice Conversion with Self-Attention
  VAE
Enhancing Zero-Shot Many to Many Voice Conversion with Self-Attention VAE
Ziang Long
Yunling Zheng
Meng Yu
Jack Xin
DRL
27
5
0
30 Mar 2022
Recent improvements of ASR models in the face of adversarial attacks
Recent improvements of ASR models in the face of adversarial attacks
R. Olivier
Bhiksha Raj
AAML
24
13
0
29 Mar 2022
Streaming parallel transducer beam search with fast-slow cascaded
  encoders
Streaming parallel transducer beam search with fast-slow cascaded encoders
Jay Mahadeokar
Yangyang Shi
Ke Li
Duc Le
Jiedan Zhu
Vikas Chandra
Ozlem Kalinli
M. Seltzer
35
15
0
29 Mar 2022
Integrating Lattice-Free MMI into End-to-End Speech Recognition
Integrating Lattice-Free MMI into End-to-End Speech Recognition
Jinchuan Tian
Jianwei Yu
Chao Weng
Yuexian Zou
Dong Yu
35
8
0
29 Mar 2022
WeNet 2.0: More Productive End-to-End Speech Recognition Toolkit
WeNet 2.0: More Productive End-to-End Speech Recognition Toolkit
Binbin Zhang
Di Wu
Zhendong Peng
Xingcheng Song
Zhuoyuan Yao
Hang Lv
Linfu Xie
Chao Yang
Fuping Pan
Jianwei Niu
VLM
29
94
0
29 Mar 2022
Investigating Self-supervised Pretraining Frameworks for Pathological
  Speech Recognition
Investigating Self-supervised Pretraining Frameworks for Pathological Speech Recognition
Lester Phillip Violeta
Wen-Chin Huang
T. Toda
22
31
0
29 Mar 2022
Noise-robust Speech Recognition with 10 Minutes Unparalleled In-domain
  Data
Noise-robust Speech Recognition with 10 Minutes Unparalleled In-domain Data
Chen Chen
Nana Hou
Yuchen Hu
Shashank Shirol
Chng Eng Siong
NoLa
14
43
0
29 Mar 2022
Shifted Chunk Encoder for Transformer Based Streaming End-to-End ASR
Shifted Chunk Encoder for Transformer Based Streaming End-to-End ASR
Fangyuan Wang
Bo Xu
21
4
0
29 Mar 2022
Finnish Parliament ASR corpus - Analysis, benchmarks and statistics
Finnish Parliament ASR corpus - Analysis, benchmarks and statistics
A. Virkkunen
Aku Rouhe
Nhan Phan
M. Kurimo
17
4
0
28 Mar 2022
Dual-Path Style Learning for End-to-End Noise-Robust Speech Recognition
Dual-Path Style Learning for End-to-End Noise-Robust Speech Recognition
Yuchen Hu
Nana Hou
Chen Chen
Chng Eng Siong
27
13
0
28 Mar 2022
Joint Transformer/RNN Architecture for Gesture Typing in Indic Languages
Joint Transformer/RNN Architecture for Gesture Typing in Indic Languages
Emil Biju
Anirudh Sriram
Mitesh M. Khapra
Pratyush Kumar
23
3
0
26 Mar 2022
Lahjoita puhetta -- a large-scale corpus of spoken Finnish with some
  benchmarks
Lahjoita puhetta -- a large-scale corpus of spoken Finnish with some benchmarks
Anssi Moisio
Dejan Porjazovski
Aku Rouhe
Yaroslav Getman
A. Virkkunen
Tamás Grósz
Krister Lindén
M. Kurimo
19
21
0
24 Mar 2022
Modality Competition: What Makes Joint Training of Multi-modal Network
  Fail in Deep Learning? (Provably)
Modality Competition: What Makes Joint Training of Multi-modal Network Fail in Deep Learning? (Provably)
Yu Huang
Junyang Lin
Chang Zhou
Hongxia Yang
Longbo Huang
19
91
0
23 Mar 2022
Transformer-based Streaming ASR with Cumulative Attention
Transformer-based Streaming ASR with Cumulative Attention
Mohan Li
Shucong Zhang
Catalin Zorila
R. Doddipatla
27
9
0
11 Mar 2022
aaeCAPTCHA: The Design and Implementation of Audio Adversarial CAPTCHA
aaeCAPTCHA: The Design and Implementation of Audio Adversarial CAPTCHA
Md. Imran Hossen
X. Hei
31
4
0
05 Mar 2022
Towards Contextual Spelling Correction for Customization of End-to-end
  Speech Recognition Systems
Towards Contextual Spelling Correction for Customization of End-to-end Speech Recognition Systems
Xiaoqiang Wang
Yanqing Liu
Jinyu Li
Veljko Miljanic
Sheng Zhao
H. Khalil
KELM
11
18
0
02 Mar 2022
A Brief Overview of Unsupervised Neural Speech Representation Learning
A Brief Overview of Unsupervised Neural Speech Representation Learning
Lasse Borgholt
Jakob Drachmann Havtorn
Joakim Edin
Lars Maaløe
Christian Igel
BDL
AI4TS
SSL
19
11
0
01 Mar 2022
Adversarial Attacks on Speech Recognition Systems for Mission-Critical
  Applications: A Survey
Adversarial Attacks on Speech Recognition Systems for Mission-Critical Applications: A Survey
Ngoc Dung Huynh
Mohamed Reda Bouadjenek
Imran Razzak
Kevin Lee
Chetan Arora
Ali Hassani
A. Zaslavsky
AAML
29
6
0
22 Feb 2022
Speaker Adaptation Using Spectro-Temporal Deep Features for Dysarthric
  and Elderly Speech Recognition
Speaker Adaptation Using Spectro-Temporal Deep Features for Dysarthric and Elderly Speech Recognition
Mengzhe Geng
Xurong Xie
Zi Ye
Tianzi Wang
Guinan Li
Shujie Hu
Xunying Liu
Helen Meng
22
28
0
21 Feb 2022
Learning Representations Robust to Group Shifts and Adversarial Examples
Learning Representations Robust to Group Shifts and Adversarial Examples
Ming-Chang Chiu
Xuezhe Ma
OOD
13
0
0
18 Feb 2022
Previous
123...678...192021
Next