1508.01211
Cited By
Listen, Attend and Spell
5 August 2015
William Chan
Navdeep Jaitly
Quoc V. Le
Oriol Vinyals
RALM
Papers citing "Listen, Attend and Spell"
50 / 1,033 papers shown
Title
I3D: Transformer architectures with input-dependent dynamic depth for speech recognition
Yifan Peng
Jaesong Lee
Shinji Watanabe
27
19
0
14 Mar 2023
Probing neural representations of scene perception in a hippocampally dependent task using artificial neural networks
Markus Frey
Christian F. Doeller
Caswell Barry
31
4
0
11 Mar 2023
An Overview on Language Models: Recent Developments and Outlook
Chengwei Wei
Yun Cheng Wang
Bin Wang
C.-C. Jay Kuo
25
42
0
10 Mar 2023
MIXPGD: Hybrid Adversarial Training for Speech Recognition Systems
A. Huq
Weiyi Zhang
Xiaolin Hu
AAML
27
3
0
10 Mar 2023
End-to-End Speech Recognition: A Survey
Rohit Prabhavalkar
Takaaki Hori
Tara N. Sainath
Ralf Schlüter
Shinji Watanabe
VLM
26
150
0
03 Mar 2023
Google USM: Scaling Automatic Speech Recognition Beyond 100 Languages
Yu Zhang
Wei Han
James Qin
Yongqiang Wang
Ankur Bapna
...
Pedro J. Moreno
Chung-Cheng Chiu
J. Schalkwyk
Françoise Beaufays
Yonghui Wu
VLM
79
253
0
02 Mar 2023
Building High-accuracy Multilingual ASR with Gated Language Experts and Curriculum Training
Eric Sun
Jinyu Li
Yuxuan Hu
Yilun Zhu
Long Zhou
...
Peidong Wang
Linquan Liu
Shujie Liu
Ed Lin
Yifan Gong
29
6
0
01 Mar 2023
N-best T5: Robust ASR Error Correction using Multiple Input Hypotheses and Constrained Decoding Space
Rao Ma
Mark J. F. Gales
Kate Knill
Mengjie Qian
11
32
0
01 Mar 2023
MoLE : Mixture of Language Experts for Multi-Lingual Automatic Speech Recognition
Yoohwan Kwon
Soo-Whan Chung
MoE
24
16
0
27 Feb 2023
Efficient CTC Regularization via Coarse Labels for End-to-End Speech Translation
Biao Zhang
Barry Haddow
Rico Sennrich
17
3
0
21 Feb 2023
A Sidecar Separator Can Convert a Single-Talker Speech Recognition System to a Multi-Talker One
Lingwei Meng
Jiawen Kang
Mingyu Cui
Yuejiao Wang
Xixin Wu
Helen M. Meng
20
17
0
20 Feb 2023
Speaker and Language Change Detection using Wav2vec2 and Whisper
Tijn Berns
Nik Vaessen
David A. van Leeuwen
56
4
0
18 Feb 2023
Confidence Score Based Speaker Adaptation of Conformer Speech Recognition Systems
Jiajun Deng
Xurong Xie
Tianzi Wang
Mingyu Cui
Boyang Xue
Zengrui Jin
Guinan Li
Shujie Hu
Xunying Liu
26
5
0
15 Feb 2023
Knowledge Transfer from Pre-trained Language Models to Cif-based Speech Recognizers via Hierarchical Distillation
Minglun Han
Feilong Chen
Jing Shi
Shuang Xu
Bo Xu
VLM
46
11
0
30 Jan 2023
Achieving Timestamp Prediction While Recognizing with Non-Autoregressive End-to-End ASR Model
Xian Shi
Yanni Chen
Shiliang Zhang
Zhijie Yan
21
8
0
29 Jan 2023
Regeneration Learning: A Learning Paradigm for Data Generation
Xu Tan
Tao Qin
Jiang Bian
Tie-Yan Liu
Yoshua Bengio
GAN
38
15
0
21 Jan 2023
Neural Architecture Search: Insights from 1000 Papers
Colin White
Mahmoud Safari
R. Sukthanker
Binxin Ru
T. Elsken
Arber Zela
Debadeepta Dey
Frank Hutter
3DV
AI4CE
34
130
0
20 Jan 2023
Two Stage Contextual Word Filtering for Context bias in Unified Streaming and Non-streaming Transducer
Zhanheng Yang
Sining Sun
Xiong Wang
Yike Zhang
Long Ma
Linfu Xie
26
9
0
17 Jan 2023
BayesSpeech: A Bayesian Transformer Network for Automatic Speech Recognition
Will Rieger
BDL
UQCV
19
0
0
16 Jan 2023
SpeeChain: A Speech Toolkit for Large-Scale Machine Speech Chain
Heli Qi
Sashi Novitasari
Andros Tjandra
S. Sakti
Satoshi Nakamura
19
3
0
08 Jan 2023
Object Segmentation with Audio Context
Kaihui Zheng
Yuqing Ren
Zixin Shen
Tianxu Qin
VOS
27
0
0
04 Jan 2023
Sample-Efficient Unsupervised Domain Adaptation of Speech Recognition Systems: A Case Study for Modern Greek
Georgios Paraskevopoulos
Theodoros Kouzelis
Georgios Rouvalis
Athanasios Katsamanis
V. Katsouros
Alexandros Potamianos
VLM
25
7
0
31 Dec 2022
4D ASR: Joint modeling of CTC, Attention, Transducer, and Mask-Predict decoders
Yui Sudo
Muhammad Shakeel
Brian Yan
Jiatong Shi
Shinji Watanabe
27
10
0
21 Dec 2022
Attention as a Guide for Simultaneous Speech Translation
Sara Papi
Matteo Negri
Marco Turchi
26
30
0
15 Dec 2022
GAMMA: Generative Augmentation for Attentive Marine Debris Detection
Vaishnavi Khindkar
Janhavi Khindkar
ViT
19
1
0
07 Dec 2022
Lattice-Free Sequence Discriminative Training for Phoneme-Based Neural Transducers
Zijian Yang
Wei Zhou
Ralf Schlüter
Hermann Ney
32
4
0
07 Dec 2022
Learning the joint distribution of two sequences using little or no paired data
Soroosh Mariooryad
Matt Shannon
Siyuan Ma
Tom Bagby
David Kao
Daisy Stanton
Eric Battenberg
RJ Skerry-Ryan
22
2
0
06 Dec 2022
LMEC: Learnable Multiplicative Absolute Position Embedding Based Conformer for Speech Recognition
Yuguang Yang
Y. Pan
Jingjing Yin
Heng Lu
32
3
0
05 Dec 2022
Continual Learning for On-Device Speech Recognition using Disentangled Conformers
Anuj Diwan
Ching-Feng Yeh
Wei-Ning Hsu
Paden Tomasello
Eunsol Choi
David Harwath
Abdel-rahman Mohamed
CLL
BDL
25
7
0
02 Dec 2022
Neural Transducer Training: Reduced Memory Consumption with Sample-wise Computation
Stefan Braun
Erik McDermott
Roger Hsiao
40
1
0
29 Nov 2022
Unsupervised Model-based speaker adaptation of end-to-end lattice-free MMI model for speech recognition
Xurong Xie
Xunying Liu
Hui Chen
Hongan Wang
24
1
0
17 Nov 2022
Continuous Soft Pseudo-Labeling in ASR
Tatiana Likhomanenko
R. Collobert
Navdeep Jaitly
Samy Bengio
VLM
24
3
0
11 Nov 2022
Massively Multilingual ASR on 70 Languages: Tokenization, Architecture, and Generalization Capabilities
Andros Tjandra
Nayan Singhal
David C. Zhang
Ozlem Kalinli
Abdel-rahman Mohamed
Duc Le
M. Seltzer
37
12
0
10 Nov 2022
Adaptive Multi-Corpora Language Model Training for Speech Recognition
Yingyi Ma
Zhe Liu
Xuedong Zhang
33
2
0
09 Nov 2022
Peak-First CTC: Reducing the Peak Latency of CTC Models by Applying Peak-First Regularization
Zhengkun Tian
Hongyu Xiang
Min Li
Fei Lin
Ke Ding
Guanglu Wan
15
6
0
07 Nov 2022
Deliberation Networks and How to Train Them
Qingyun Dou
Mark J. F. Gales
24
0
0
06 Nov 2022
Multi-blank Transducers for Speech Recognition
Hainan Xu
Fei Jia
Somshubra Majumdar
Shinji Watanabe
Boris Ginsburg
25
10
0
04 Nov 2022
Once-for-All Sequence Compression for Self-Supervised Speech Models
Hsuan-Jui Chen
Yen Meng
Hung-yi Lee
30
5
0
04 Nov 2022
The ISCSLP 2022 Intelligent Cockpit Speech Recognition Challenge (ICSRC): Dataset, Tracks, Baseline and Results
Ao Zhang
F. Yu
Kaixun Huang
Linfu Xie
Longbiao Wang
E. Chng
Hui Bu
Binbin Zhang
Wei Chen
Xin Xu
32
4
0
03 Nov 2022
Losses Can Be Blessings: Routing Self-Supervised Speech Representations Towards Efficient Multilingual and Multitask Speech Processing
Yonggan Fu
Yang Zhang
Kaizhi Qian
Zhifan Ye
Zhongzhi Yu
Cheng-I Jeff Lai
Yingyan Lin
30
8
0
02 Nov 2022
Internal Language Model Estimation based Adaptive Language Model Fusion for Domain Adaptation
Rao Ma
Xiaobo Wu
Jin Qiu
Yanan Qin
Haihua Xu
Peihao Wu
Zejun Ma
32
2
0
02 Nov 2022
Fast-U2++: Fast and Accurate End-to-End Speech Recognition in Joint CTC/Attention Frames
Che-Yuan Liang
Xiao-Lei Zhang
BinBin Zhang
Di Wu
Shengqiang Li
Xingcheng Song
Zhendong Peng
Fuping Pan
18
8
0
02 Nov 2022
Conversation-oriented ASR with multi-look-ahead CBS architecture
Huaibo Zhao
S. Fujie
Tetsuji Ogawa
Jin Sakuma
Yusuke Kida
Tetsunori Kobayashi
19
3
0
02 Nov 2022
InterMPL: Momentum Pseudo-Labeling with Intermediate CTC Loss
Yosuke Higuchi
Tetsuji Ogawa
Tetsunori Kobayashi
Shinji Watanabe
32
0
0
02 Nov 2022
TrimTail: Low-Latency Streaming ASR with Simple but Effective Spectrogram-Level Length Penalty
Xingcheng Song
Di Wu
Zhiyong Wu
Binbin Zhang
Yuekai Zhang
Zhendong Peng
Wenpeng Li
Fuping Pan
Changbao Zhu
34
8
0
01 Nov 2022
Speech-text based multi-modal training with bidirectional attention for improved speech recognition
Yuhang Yang
Haihua Xu
Hao-Ming Huang
E. Chng
Sheng Li
47
7
0
01 Nov 2022
Joint Audio/Text Training for Transformer Rescorer of Streaming Speech Recognition
Suyoun Kim
Ke Li
Lucas Kabela
Rongqing Huang
Jiedan Zhu
Ozlem Kalinli
Duc Le
25
8
0
31 Oct 2022
Structured State Space Decoder for Speech Recognition and Synthesis
Koichi Miyazaki
Masato Murata
Tomoki Koriyama
34
12
0
31 Oct 2022
FusionFormer: Fusing Operations in Transformer for Efficient Streaming Speech Recognition
Xingcheng Song
Di Wu
Binbin Zhang
Zhiyong Wu
Wenpeng Li
...
Peng Zhang
Zhendong Peng
Fuping Pan
Changbao Zhu
Zhongqin Wu
27
2
0
31 Oct 2022
Modular Hybrid Autoregressive Transducer
Zhong Meng
Tongzhou Chen
Rohit Prabhavalkar
Yu Zhang
Gary Wang
...
Bhuvana Ramabhadran
Yifan Jiang
Ehsan Variani
Yinghui Huang
Pedro J. Moreno
34
20
0
31 Oct 2022