1712.01769
Cited By
State-of-the-art Speech Recognition With Sequence-to-Sequence Models
5 December 2017
Chung-Cheng Chiu
Tara N. Sainath
Yonghui Wu
Rohit Prabhavalkar
Patrick Nguyen
Zhifeng Chen
Anjuli Kannan
Ron J. Weiss
Kanishka Rao
Katya Gonina
Navdeep Jaitly
Bo Li
J. Chorowski
M. Bacchiani
AI4TS
Papers citing "State-of-the-art Speech Recognition With Sequence-to-Sequence Models"
50 / 501 papers shown
Transfer Learning Approaches for Streaming End-to-End Speech Recognition System
Vikas Joshi
Rui Zhao
Rupeshkumar Mehta
Kshitiz Kumar
Jinyu Li
30
22
0
12 Aug 2020
Audio- and Gaze-driven Facial Animation of Codec Avatars
Alexander Richard
Colin S. Lea
Shugao Ma
Juergen Gall
Fernando de la Torre
Yaser Sheikh
CVBM
23
81
0
11 Aug 2020
LRSpeech: Extremely Low-Resource Speech Synthesis and Recognition
Jin Xu
Xu Tan
Yi Ren
Tao Qin
Jian Li
Sheng Zhao
Tie-Yan Liu
VLM
18
90
0
09 Aug 2020
Federated Transfer Learning with Dynamic Gradient Aggregation
Dimitrios Dimitriadis
K. Kumatani
R. Gmyr
Yashesh Gaur
Sefik Emre Eskimez
FedML
24
15
0
06 Aug 2020
Self-attention encoding and pooling for speaker recognition
Pooyan Safari
Miquel India
Javier Hernando
ViT
22
81
0
03 Aug 2020
Developing RNN-T Models Surpassing High-Performance Hybrid Models with Customization Capability
Jinyu Li
Rui Zhao
Zhong Meng
Yanqing Liu
Wenning Wei
...
V. Mazalov
Zhenghao Wang
Lei He
Sheng Zhao
Yifan Gong
18
107
0
30 Jul 2020
Rethinking Recurrent Neural Networks and Other Improvements for Image Classification
N. H. Phong
B. Ribeiro
VLM
32
8
0
30 Jul 2020
Privacy-preserving Voice Analysis via Disentangled Representations
Ranya Aloufi
Hamed Haddadi
David E. Boyle
DRL
31
58
0
29 Jul 2020
Semi-Supervised Learning with Data Augmentation for End-to-End ASR
F. Weninger
F. Mana
R. Gemello
Jesús Andrés-Ferrer
P. Zhan
27
30
0
27 Jul 2020
Consistent Transcription and Translation of Speech
Matthias Sperber
Hendra Setiawan
Christian Gollan
Udhyakumar Nallasamy
Matthias Paulik
31
18
0
24 Jul 2020
CoVoST 2 and Massively Multilingual Speech-to-Text Translation
Changhan Wang
Anne Wu
J. Pino
SLR
31
72
0
20 Jul 2020
The ASRU 2019 Mandarin-English Code-Switching Speech Recognition Challenge: Open Datasets, Tracks, Methods and Results
Xian Shi
Qiangze Feng
Lei Xie
14
47
0
12 Jul 2020
Class LM and word mapping for contextual biasing in End-to-End ASR
Rongqing Huang
Ossama Abdel-Hamid
Xinwei Li
G. Evermann
31
47
0
10 Jul 2020
Gated Recurrent Context: Softmax-free Attention for Online Encoder-Decoder Speech Recognition
Hyeonseung Lee
Woohyun Kang
Sung Jun Cheon
Hyeongju Kim
N. Kim
34
3
0
10 Jul 2020
Uncertainty Prediction for Deep Sequential Regression Using Meta Models
Jirí Navrátil
Matthew Arnold
Benjamin Elder
BDL
UQCV
14
6
0
02 Jul 2020
GShard: Scaling Giant Models with Conditional Computation and Automatic Sharding
Dmitry Lepikhin
HyoukJoong Lee
Yuanzhong Xu
Dehao Chen
Orhan Firat
Yanping Huang
M. Krikun
Noam M. Shazeer
Zhifeng Chen
MoE
43
1,118
0
30 Jun 2020
Streaming Transformer ASR with Blockwise Synchronous Beam Search
E. Tsunoo
Yosuke Kashiwagi
Shinji Watanabe
22
11
0
25 Jun 2020
Boosting Active Learning for Speech Recognition with Noisy Pseudo-labeled Samples
Jihwan Bang
Heesu Kim
Y. Yoo
Jung-Woo Ha
11
2
0
19 Jun 2020
Multi-Encoder-Decoder Transformer for Code-Switching Speech Recognition
Xinyuan Zhou
Emre Yilmaz
Yanhua Long
Yijie Li
Haizhou Li
11
51
0
18 Jun 2020
Self-and-Mixed Attention Decoder with Deep Acoustic Structure for Transformer-based LVCSR
Xinyuan Zhou
Grandee Lee
Emre Yilmaz
Yanhua Long
Jiaen Liang
Haizhou Li
27
7
0
18 Jun 2020
Measuring Model Complexity of Neural Networks with Curve Activation Functions
X. Hu
Weiqing Liu
Jiang Bian
J. Pei
30
20
0
16 Jun 2020
Improving Cross-Lingual Transfer Learning for End-to-End Speech Recognition with Speech Translation
Changhan Wang
J. Pino
Jiatao Gu
31
30
0
09 Jun 2020
Learning to Count Words in Fluent Speech enables Online Speech Recognition
George Sterpu
Christian Saam
N. Harte
16
4
0
08 Jun 2020
Fusion Recurrent Neural Network
Yiwen Sun
Yulu Wang
Kun Fu
Zheng Wang
Changshui Zhang
Jieping Ye
18
1
0
07 Jun 2020
On the Comparison of Popular End-to-End Models for Large Scale Speech Recognition
Jinyu Li
Yu Wu
Yashesh Gaur
Chengyi Wang
Rui Zhao
Shujie Liu
17
133
0
28 May 2020
CAT: A CTC-CRF based ASR Toolkit Bridging the Hybrid and the End-to-end Approaches towards Data Efficiency and Low Latency
Keyu An
Hongyu Xiang
Zhijian Ou
14
18
0
27 May 2020
Insertion-Based Modeling for End-to-End Automatic Speech Recognition
Yuya Fujita
Shinji Watanabe
Motoi Omachi
Xuankai Chang
22
31
0
27 May 2020
End-to-End Far-Field Speech Recognition with Unified Dereverberation and Beamforming
Wangyou Zhang
Aswin Shanmugam Subramanian
Xuankai Chang
Shinji Watanabe
Y. Qian
6
27
0
21 May 2020
Simplified Self-Attention for Transformer-based End-to-End Speech Recognition
Haoneng Luo
Shiliang Zhang
Ming Lei
Lei Xie
40
33
0
21 May 2020
Enhancing Monotonic Multihead Attention for Streaming ASR
Hirofumi Inaguma
Masato Mimura
Tatsuya Kawahara
24
32
0
19 May 2020
Mask CTC: Non-Autoregressive End-to-End ASR with CTC and Mask Predict
Yosuke Higuchi
Shinji Watanabe
Nanxin Chen
Tetsuji Ogawa
Tetsunori Kobayashi
19
137
0
18 May 2020
Conformer: Convolution-augmented Transformer for Speech Recognition
Anmol Gulati
James Qin
Chung-Cheng Chiu
Niki Parmar
Yu Zhang
...
Wei Han
Shibo Wang
Zhengdong Zhang
Yonghui Wu
Ruoming Pang
134
3,044
0
16 May 2020
Streaming Transformer-based Acoustic Models Using Self-attention with Augmented Memory
Chunyang Wu
Yongqiang Wang
Yangyang Shi
Ching-Feng Yeh
Frank Zhang
RALM
31
60
0
16 May 2020
FaceFilter: Audio-visual speech separation using still images
Soo-Whan Chung
Soyeon Choe
Joon Son Chung
Hong-Goo Kang
CVBM
21
66
0
14 May 2020
Discriminative Multi-modality Speech Recognition
Bo Xu
Cheng Lu
Yandong Guo
Jacob Wang
26
98
0
12 May 2020
CTC-synchronous Training for Monotonic Attention Model
Hirofumi Inaguma
Masato Mimura
Tatsuya Kawahara
14
7
0
10 May 2020
RNN-T Models Fail to Generalize to Out-of-Domain Audio: Causes and Solutions
Chung-Cheng Chiu
A. Narayanan
Wei Han
Rohit Prabhavalkar
Yu Zhang
...
Ruoming Pang
Tara N. Sainath
Patrick Nguyen
Liangliang Cao
Yonghui Wu
29
42
0
07 May 2020
End-to-end Whispered Speech Recognition with Frequency-weighted Approaches and Pseudo Whisper Pre-training
Heng-Jui Chang
Alexander H. Liu
Hung-yi Lee
Lin-Shan Lee
16
2
0
05 May 2020
Streaming Object Detection for 3-D Point Clouds
Wei Han
Zhengdong Zhang
Benjamin Caine
Brandon Yang
Christoph Sprunk
O. Alsharif
Jiquan Ngiam
Vijay Vasudevan
Jonathon Shlens
Zhifeng Chen
3DPC
27
26
0
04 May 2020
Exploring Pre-training with Alignments for RNN Transducer based End-to-End Speech Recognition
Hu Hu
Rui Zhao
Jinyu Li
Liang Lu
Yifan Gong
27
27
0
01 May 2020
Multi-head Monotonic Chunkwise Attention For Online Speech Recognition
Baiji Liu
Songjun Cao
Sining Sun
Weibin Zhang
Long Ma
31
9
0
01 May 2020
Progressive Transformers for End-to-End Sign Language Production
Ben Saunders
Necati Cihan Camgöz
Richard Bowden
SLR
29
128
0
30 Apr 2020
Systematic Evaluation of Backdoor Data Poisoning Attacks on Image Classifiers
Loc Truong
Chace Jones
Brian Hutchinson
Andrew August
Brenda Praggastis
Robert J. Jasper
Nicole Nichols
Aaron Tuor
AAML
11
49
0
24 Apr 2020
Curriculum Pre-training for End-to-End Speech Translation
Chengyi Wang
Yu Wu
Shujie Liu
Ming Zhou
Zhenglu Yang
29
108
0
21 Apr 2020
Classifying CMB time-ordered data through deep neural networks
F. Rojas
L. Maurin
R. Dünner
K. Pichara
9
4
0
13 Apr 2020
Minimum Latency Training Strategies for Streaming Sequence-to-Sequence ASR
Hirofumi Inaguma
Yashesh Gaur
Liang Lu
Jinyu Li
Yifan Gong
AI4TS
27
46
0
10 Apr 2020
Direct Speech-to-image Translation
Jiguo Li
Xinfeng Zhang
Chuanmin Jia
Jizheng Xu
Li Zhang
Y. Wang
Siwei Ma
Wen Gao
36
29
0
07 Apr 2020
A Streaming On-Device End-to-End Model Surpassing Server-Side Conventional Model Quality and Latency
Tara N. Sainath
Yanzhang He
Bo Li
A. Narayanan
Ruoming Pang
...
Trevor Strohman
Mirkó Visontai
Yonghui Wu
Yu Zhang
Ding Zhao
25
215
0
28 Mar 2020
Caption Generation of Robot Behaviors based on Unsupervised Learning of Action Segments
Koichiro Yoshino
Kohei Wakimoto
Yuta Nishimura
Satoshi Nakamura
6
8
0
23 Mar 2020
High Performance Sequence-to-Sequence Model for Streaming Speech Recognition
T. Nguyen
Ngoc-Quan Pham
S. Stueker
A. Waibel
11
7
0
22 Mar 2020