Listen, Attend and Spell

5 August 2015
William Chan
Navdeep Jaitly
Quoc V. Le
Oriol Vinyals
    RALM

Papers citing "Listen, Attend and Spell"

50 / 1,041 papers shown
Sim-T: Simplify the Transformer Network by Multiplexing Technique for Speech Recognition
Guangyong Wei
Zhikui Duan
Shiren Li
Guangguang Yang
Xinmei Yu
Junhua Li
74
5
0
11 Apr 2023
Wav2code: Restore Clean Speech Representations via Codebook Lookup for Noise-Robust ASR
Yuchen Hu
Cheng Chen
Qiu-shi Zhu
Eng Siong Chng
130
16
0
11 Apr 2023
Dual-Attention Neural Transducers for Efficient Wake Word Spotting in Speech Recognition
Saumya Yashmohini Sahai
Jing Liu
Thejaswi Muniyappa
Kanthashree Mysore Sathyendra
Anastasios Alexandridis
...
Ross McGowan
Ariya Rastrow
Feng-Ju Chang
Athanasios Mouchtaris
Siegfried Kunzmann
88
5
0
03 Apr 2023
Dialog act guided contextual adapter for personalized speech recognition
Feng-Ju Chang
Thejaswi Muniyappa
Kanthashree Mysore Sathyendra
Kailin Wei
Grant P. Strimel
Ross McGowan
53
5
0
31 Mar 2023
PROCTER: PROnunciation-aware ConTextual adaptER for personalized speech recognition in neural transducers
R. Pandey
Roger Ren
Qi Luo
Jing Liu
Ariya Rastrow
Ankur Gandhe
Denis Filimonov
Grant P. Strimel
A. Stolcke
I. Bulyko
92
13
0
30 Mar 2023
AVFormer: Injecting Vision into Frozen Speech Models for Zero-Shot AV-ASR
Paul Hongsuck Seo
Arsha Nagrani
Cordelia Schmid
60
15
0
29 Mar 2023
Cross-utterance ASR Rescoring with Graph-based Label Propagation
Srinath Tankasala
Long Chen
A. Stolcke
A. Raju
Qianli Deng
Chander Chandak
Aparna Khare
Roland Maas
Venkatesh Ravichandran
55
0
0
27 Mar 2023
Pyramid Multi-branch Fusion DCNN with Multi-Head Self-Attention for Mandarin Speech Recognition
Kai Liu
Hailiang Xiong
Gangqiang Yang
Zhengfeng Du
Yewen Cao
D. Shah
146
0
0
23 Mar 2023
I3D: Transformer architectures with input-dependent dynamic depth for speech recognition
Yifan Peng
Jaesong Lee
Shinji Watanabe
67
25
0
14 Mar 2023
Probing neural representations of scene perception in a hippocampally dependent task using artificial neural networks
Markus Frey
Christian F. Doeller
Caswell Barry
65
4
0
11 Mar 2023
An Overview on Language Models: Recent Developments and Outlook
Chengwei Wei
Yun Cheng Wang
Bin Wang
C.-C. Jay Kuo
95
47
0
10 Mar 2023
MIXPGD: Hybrid Adversarial Training for Speech Recognition Systems
A. Huq
Weiyi Zhang
Xiaolin Hu
AAML
88
3
0
10 Mar 2023
End-to-End Speech Recognition: A Survey
Rohit Prabhavalkar
Takaaki Hori
Tara N. Sainath
Ralf Schlüter
Shinji Watanabe
VLM
91
172
0
03 Mar 2023
Google USM: Scaling Automatic Speech Recognition Beyond 100 Languages
Yu Zhang
Wei Han
James Qin
Yongqiang Wang
Ankur Bapna
...
Pedro J. Moreno
Chung-Cheng Chiu
J. Schalkwyk
Françoise Beaufays
Yonghui Wu
VLM
192
270
0
02 Mar 2023
Building High-accuracy Multilingual ASR with Gated Language Experts and Curriculum Training
Eric Sun
Jinyu Li
Yuxuan Hu
Yilun Zhu
Long Zhou
...
Peidong Wang
Linquan Liu
Shujie Liu
Ed Lin
Yifan Gong
90
6
0
01 Mar 2023
N-best T5: Robust ASR Error Correction using Multiple Input Hypotheses and Constrained Decoding Space
Rao Ma
Mark Gales
Kate Knill
Mengjie Qian
82
33
0
01 Mar 2023
MoLE : Mixture of Language Experts for Multi-Lingual Automatic Speech Recognition
Yoohwan Kwon
Soo-Whan Chung
MoE
81
18
0
27 Feb 2023
Efficient CTC Regularization via Coarse Labels for End-to-End Speech Translation
Biao Zhang
Barry Haddow
Rico Sennrich
78
3
0
21 Feb 2023
A Sidecar Separator Can Convert a Single-Talker Speech Recognition System to a Multi-Talker One
Lingwei Meng
Jiawen Kang
Mingyu Cui
Yuejiao Wang
Xixin Wu
Helen M. Meng
76
17
0
20 Feb 2023
Speaker and Language Change Detection using Wav2vec2 and Whisper
Tijn Berns
Nik Vaessen
David A. van Leeuwen
76
5
0
18 Feb 2023
Confidence Score Based Speaker Adaptation of Conformer Speech Recognition Systems
Jiajun Deng
Xurong Xie
Tianzi Wang
Mingyu Cui
Boyang Xue
Zengrui Jin
Guinan Li
Shujie Hu
Xunying Liu
63
6
0
15 Feb 2023
Knowledge Transfer from Pre-trained Language Models to Cif-based Speech Recognizers via Hierarchical Distillation
Minglun Han
Feilong Chen
Jing Shi
Shuang Xu
Bo Xu
VLM
100
13
0
30 Jan 2023
Achieving Timestamp Prediction While Recognizing with Non-Autoregressive End-to-End ASR Model
Xian Shi
Yanni Chen
Shiliang Zhang
Zhijie Yan
55
8
0
29 Jan 2023
Regeneration Learning: A Learning Paradigm for Data Generation
Xu Tan
Tao Qin
Jiang Bian
Tie-Yan Liu
Yoshua Bengio
GAN
64
15
0
21 Jan 2023
Neural Architecture Search: Insights from 1000 Papers
Colin White
Mahmoud Safari
R. Sukthanker
Binxin Ru
T. Elsken
Arber Zela
Debadeepta Dey
Frank Hutter
3DV
AI4CE
131
143
0
20 Jan 2023
Two Stage Contextual Word Filtering for Context bias in Unified Streaming and Non-streaming Transducer
Zhanheng Yang
Sining Sun
Xiong Wang
Yike Zhang
Long Ma
Linfu Xie
63
11
0
17 Jan 2023
BayesSpeech: A Bayesian Transformer Network for Automatic Speech Recognition
Will Rieger
BDL
UQCV
46
0
0
16 Jan 2023
SpeeChain: A Speech Toolkit for Large-Scale Machine Speech Chain
Heli Qi
Sashi Novitasari
Andros Tjandra
S. Sakti
Satoshi Nakamura
82
3
0
08 Jan 2023
Object Segmentation with Audio Context
Kaihui Zheng
Yuqing Ren
Zixin Shen
Tianxu Qin
VOS
59
0
0
04 Jan 2023
Sample-Efficient Unsupervised Domain Adaptation of Speech Recognition Systems A case study for Modern Greek
Georgios Paraskevopoulos
Theodoros Kouzelis
Georgios Rouvalis
Athanasios Katsamanis
Vassilis Katsouros
Alexandros Potamianos
VLM
80
9
0
31 Dec 2022
4D ASR: Joint modeling of CTC, Attention, Transducer, and Mask-Predict decoders
Yui Sudo
Muhammad Shakeel
Brian Yan
Jiatong Shi
Shinji Watanabe
51
10
0
21 Dec 2022
Attention as a Guide for Simultaneous Speech Translation
Sara Papi
Matteo Negri
Marco Turchi
93
31
0
15 Dec 2022
GAMMA: Generative Augmentation for Attentive Marine Debris Detection
Vaishnavi Khindkar
Janhavi Khindkar
ViT
64
1
0
07 Dec 2022
Lattice-Free Sequence Discriminative Training for Phoneme-Based Neural Transducers
Zijian Yang
Wei Zhou
Ralf Schlüter
Hermann Ney
70
4
0
07 Dec 2022
Learning the joint distribution of two sequences using little or no paired data
Soroosh Mariooryad
Matt Shannon
Siyuan Ma
Tom Bagby
David Kao
Daisy Stanton
Eric Battenberg
RJ Skerry-Ryan
89
2
0
06 Dec 2022
LMEC: Learnable Multiplicative Absolute Position Embedding Based Conformer for Speech Recognition
Yuguang Yang
Yu Pan
Jingjing Yin
Heng Lu
108
3
0
05 Dec 2022
Continual Learning for On-Device Speech Recognition using Disentangled Conformers
Anuj Diwan
Ching-Feng Yeh
Wei-Ning Hsu
Paden Tomasello
Eunsol Choi
David Harwath
Abdel-rahman Mohamed
CLL
BDL
81
8
0
02 Dec 2022
Neural Transducer Training: Reduced Memory Consumption with Sample-wise Computation
Stefan Braun
Erik McDermott
Roger Hsiao
66
1
0
29 Nov 2022
Unsupervised Model-based speaker adaptation of end-to-end lattice-free MMI model for speech recognition
Xurong Xie
Xunying Liu
Hui Chen
Hongan Wang
80
1
0
17 Nov 2022
Continuous Soft Pseudo-Labeling in ASR
Tatiana Likhomanenko
R. Collobert
Navdeep Jaitly
Samy Bengio
VLM
84
3
0
11 Nov 2022
Massively Multilingual ASR on 70 Languages: Tokenization, Architecture, and Generalization Capabilities
Andros Tjandra
Nayan Singhal
David C. Zhang
Ozlem Kalinli
Abdel-rahman Mohamed
Duc Le
M. Seltzer
99
13
0
10 Nov 2022
Adaptive Multi-Corpora Language Model Training for Speech Recognition
Yingyi Ma
Zhe Liu
Xuedong Zhang
74
2
0
09 Nov 2022
Peak-First CTC: Reducing the Peak Latency of CTC Models by Applying Peak-First Regularization
Zhengkun Tian
Hongyu Xiang
Min Li
Fei Lin
Ke Ding
Guanglu Wan
56
7
0
07 Nov 2022
Deliberation Networks and How to Train Them
Qingyun Dou
Mark Gales
63
0
0
06 Nov 2022
Multi-blank Transducers for Speech Recognition
Hainan Xu
Fei Jia
Somshubra Majumdar
Shinji Watanabe
Boris Ginsburg
89
11
0
04 Nov 2022
Once-for-All Sequence Compression for Self-Supervised Speech Models
Hsuan-Jui Chen
Yen Meng
Hung-yi Lee
64
6
0
04 Nov 2022
The ISCSLP 2022 Intelligent Cockpit Speech Recognition Challenge (ICSRC): Dataset, Tracks, Baseline and Results
Ao Zhang
F. Yu
Kaixun Huang
Linfu Xie
Longbiao Wang
Eng Siong Chng
Hui Bu
Binbin Zhang
Wei Chen
Xin Xu
96
5
0
03 Nov 2022
Losses Can Be Blessings: Routing Self-Supervised Speech Representations Towards Efficient Multilingual and Multitask Speech Processing
Yonggan Fu
Yang Zhang
Kaizhi Qian
Zhifan Ye
Zhongzhi Yu
Cheng-I Jeff Lai
Yingyan Lin
168
9
0
02 Nov 2022
Internal Language Model Estimation based Adaptive Language Model Fusion for Domain Adaptation
Rao Ma
Xiaobo Wu
Jin Qiu
Yanan Qin
Haihua Xu
Peihao Wu
Zejun Ma
70
2
0
02 Nov 2022
Fast-U2++: Fast and Accurate End-to-End Speech Recognition in Joint CTC/Attention Frames
Che-Yuan Liang
Xiao-Lei Zhang
BinBin Zhang
Di Wu
Shengqiang Li
Xingcheng Song
Zhendong Peng
Fuping Pan
45
9
0
02 Nov 2022