Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2210.16663
Cited By
v1
v2 (latest)
BERT Meets CTC: New Formulation of End-to-End Speech Recognition with Pre-trained Masked Language Model
29 October 2022
Yosuke Higuchi
Brian Yan
Siddhant Arora
Tetsuji Ogawa
Tetsunori Kobayashi
Shinji Watanabe
Re-assign community
ArXiv (abs)
PDF
HTML
Papers citing
"BERT Meets CTC: New Formulation of End-to-End Speech Recognition with Pre-trained Masked Language Model"
50 / 65 papers shown
Title
A Non-autoregressive Model for Joint STT and TTS
Vishal Sunder
Brian Kingsbury
G. Saon
Samuel Thomas
Slava Shechtman Hagai Aronowitz
Hagai Aronowitz
Eric Fosler-Lussier
Luis A. Lastras
119
0
0
15 Jan 2025
Effect and Analysis of Large-scale Language Model Rescoring on Competitive ASR Systems
Takuma Udagawa
Masayuki Suzuki
Gakuto Kurata
N. Itoh
G. Saon
96
24
0
01 Apr 2022
Memory-Efficient Training of RNN-Transducer with Sampled Softmax
Jaesong Lee
Lukas Lee
Shinji Watanabe
83
8
0
31 Mar 2022
Knowledge Transfer from Large-scale Pretrained Language Models to End-to-end Speech Recognizers
Yotaro Kubo
Shigeki Karita
M. Bacchiani
38
27
0
16 Feb 2022
Improving non-autoregressive end-to-end speech recognition with pre-trained acoustic and language models
Keqi Deng
Zehui Yang
Shinji Watanabe
Yosuke Higuchi
Gaofeng Cheng
Pengyuan Zhang
51
23
0
25 Jan 2022
A Study of Transducer based End-to-End ASR with ESPnet: Architecture, Auxiliary Loss and Decoding Strategies
Florian Boyer
Yusuke Shinohara
Takaaki Ishii
Hirofumi Inaguma
Shinji Watanabe
67
35
0
14 Jan 2022
Improving Hybrid CTC/Attention End-to-end Speech Recognition with Pretrained Acoustic and Language Model
Keqi Deng
Songjun Cao
Yike Zhang
Long Ma
VLM
39
31
0
14 Dec 2021
ESPnet-SLU: Advancing Spoken Language Understanding through ESPnet
Siddhant Arora
Siddharth Dalmia
Pavel Denisov
Xuankai Chang
Yushi Ueda
...
Karthik Ganesan
Brian Yan
Ngoc Thang Vu
A. Black
Shinji Watanabe
VLM
75
75
0
29 Nov 2021
A Comparative Study on Non-Autoregressive Modelings for Speech-to-Text Generation
Yosuke Higuchi
Nanxin Chen
Yuya Fujita
Hirofumi Inaguma
Tatsuya Komatsu
Jaesong Lee
Jumon Nozaki
Tianzi Wang
Shinji Watanabe
49
42
0
11 Oct 2021
Hierarchical Conditional End-to-End ASR with CTC and Multi-Granular Subword Units
Yosuke Higuchi
Keita Karube
Tetsuji Ogawa
Tetsunori Kobayashi
47
23
0
08 Oct 2021
ASR Rescoring and Confidence Estimation with ELECTRA
Hayato Futami
Hirofumi Inaguma
Masato Mimura
S. Sakai
Tatsuya Kawahara
KELM
94
20
0
05 Oct 2021
Wav-BERT: Cooperative Acoustic and Linguistic Representation Learning for Low-Resource Speech Recognition
Guolin Zheng
Yubei Xiao
Ke Gong
Pan Zhou
Xiaodan Liang
Liang Lin
67
26
0
19 Sep 2021
Layer Pruning on Demand with Intermediate CTC
Jaesong Lee
Jingu Kang
Shinji Watanabe
36
18
0
17 Jun 2021
Integration of Pre-trained Networks with Continuous Token Interface for End-to-End Spoken Language Understanding
S. Seo
Donghyun Kwak
Bowon Lee
64
33
0
15 Apr 2021
Innovative Bert-based Reranking Language Models for Speech Recognition
Shih-Hsuan Chiu
Berlin Chen
38
45
0
11 Apr 2021
Non-autoregressive Transformer-based End-to-end ASR using BERT
Fu-Hao Yu
Kuan-Yu Chen
39
23
0
10 Apr 2021
Relaxing the Conditional Independence Assumption of CTC-based ASR by Conditioning on Intermediate Predictions
Jumon Nozaki
Tatsuya Komatsu
62
74
0
06 Apr 2021
Fast End-to-End Speech Recognition via Non-Autoregressive Models and Cross-Modal Knowledge Transferring from BERT
Ye Bai
Jiangyan Yi
J. Tao
Zhengkun Tian
Zhengqi Wen
Shuai Zhang
RALM
83
51
0
15 Feb 2021
Intermediate Loss Regularization for CTC-based Speech Recognition
Jaesong Lee
Shinji Watanabe
132
138
0
05 Feb 2021
Speech Recognition by Simply Fine-tuning BERT
Wen-Chin Huang
Chia-Hua Wu
Shang-Bao Luo
Kuan-Yu Chen
Hsin-Min Wang
Tomoki Toda
111
28
0
30 Jan 2021
Efficiently Fusing Pretrained Acoustic and Linguistic Encoders for Low-resource Speech Recognition
Cheng Yi
Shiyu Zhou
Bo Xu
92
40
0
17 Jan 2021
SLURP: A Spoken Language Understanding Resource Package
E. Bastianelli
Andrea Vanzo
P. Swietojanski
Verena Rieser
VLM
88
229
0
26 Nov 2020
Multitask Learning and Joint Optimization for Transformer-RNN-Transducer Speech Recognition
J. Jeon
Eesung Kim
29
13
0
02 Nov 2020
Recent Developments on ESPnet Toolkit Boosted by Conformer
Pengcheng Guo
Florian Boyer
Xuankai Chang
Tomoki Hayashi
Yosuke Higuchi
...
Jing Shi
Shinji Watanabe
Kun Wei
Wangyou Zhang
Yuekai Zhang
75
263
0
26 Oct 2020
Improved Mask-CTC for Non-Autoregressive End-to-End ASR
Yosuke Higuchi
Hirofumi Inaguma
Shinji Watanabe
Tetsuji Ogawa
Tetsunori Kobayashi
56
61
0
26 Oct 2020
Align-Refine: Non-Autoregressive Speech Recognition via Iterative Realignment
Ethan A. Chi
Julian Salazar
Katrin Kirchhoff
AI4TS
70
51
0
24 Oct 2020
CharacterBERT: Reconciling ELMo and BERT for Word-Level Open-Vocabulary Representations From Characters
Hicham El Boukkouri
Olivier Ferret
Thomas Lavergne
Hiroshi Noji
Pierre Zweigenbaum
Junichi Tsujii
110
161
0
20 Oct 2020
Incorporating BERT into Parallel Sequence Decoding with Adapters
Junliang Guo
Zhirui Zhang
Linli Xu
Hao-Ran Wei
Boxing Chen
Enhong Chen
85
69
0
13 Oct 2020
Distilling the Knowledge of BERT for Sequence-to-Sequence ASR
Hayato Futami
Hirofumi Inaguma
Sei Ueno
Masato Mimura
S. Sakai
Tatsuya Kawahara
65
53
0
09 Aug 2020
Language Models are Few-Shot Learners
Tom B. Brown
Benjamin Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
...
Christopher Berner
Sam McCandlish
Alec Radford
Ilya Sutskever
Dario Amodei
BDL
795
42,055
0
28 May 2020
Insertion-Based Modeling for End-to-End Automatic Speech Recognition
Yuya Fujita
Shinji Watanabe
Motoi Omachi
Xuankai Chan
64
31
0
27 May 2020
Mask CTC: Non-Autoregressive End-to-End ASR with CTC and Mask Predict
Yosuke Higuchi
Shinji Watanabe
Nanxin Chen
Tetsuji Ogawa
Tetsunori Kobayashi
55
138
0
18 May 2020
Conformer: Convolution-augmented Transformer for Speech Recognition
Anmol Gulati
James Qin
Chung-Cheng Chiu
Niki Parmar
Yu Zhang
...
Wei Han
Shibo Wang
Zhengdong Zhang
Yonghui Wu
Ruoming Pang
223
3,139
0
16 May 2020
Imputer: Sequence Modelling via Imputation and Dynamic Programming
William Chan
Chitwan Saharia
Geoffrey E. Hinton
Mohammad Norouzi
Navdeep Jaitly
BDL
AI4TS
70
115
0
20 Feb 2020
PyTorch: An Imperative Style, High-Performance Deep Learning Library
Adam Paszke
Sam Gross
Francisco Massa
Adam Lerer
James Bradbury
...
Sasank Chilamkurthy
Benoit Steiner
Lu Fang
Junjie Bai
Soumith Chintala
ODL
514
42,449
0
03 Dec 2019
Deja-vu: Double Feature Presentation and Iterated Loss in Deep Transformer Networks
Andros Tjandra
Chunxi Liu
Frank Zhang
Xiaohui Zhang
Yongqiang Wang
Gabriel Synnaeve
Satoshi Nakamura
Geoffrey Zweig
ViT
65
45
0
23 Oct 2019
Transformer ASR with Contextual Block Processing
E. Tsunoo
Yosuke Kashiwagi
Toshiyuki Kumakura
Shinji Watanabe
84
64
0
16 Oct 2019
DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter
Victor Sanh
Lysandre Debut
Julien Chaumond
Thomas Wolf
232
7,520
0
02 Oct 2019
ALBERT: A Lite BERT for Self-supervised Learning of Language Representations
Zhenzhong Lan
Mingda Chen
Sebastian Goodman
Kevin Gimpel
Piyush Sharma
Radu Soricut
SSL
AIMat
368
6,455
0
26 Sep 2019
Two-Pass End-to-End Speech Recognition
Tara N. Sainath
Ruoming Pang
David Rybach
Yanzhang He
Rohit Prabhavalkar
...
Qiao Liang
Trevor Strohman
Yonghui Wu
Ian McGraw
Chung-Cheng Chiu
77
148
0
29 Aug 2019
Levenshtein Transformer
Jiatao Gu
Changhan Wang
Jake Zhao
116
359
0
27 May 2019
BERT Rediscovers the Classical NLP Pipeline
Ian Tenney
Dipanjan Das
Ellie Pavlick
MILM
SSeg
138
1,476
0
15 May 2019
SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition
Daniel S. Park
William Chan
Yu Zhang
Chung-Cheng Chiu
Barret Zoph
E. D. Cubuk
Quoc V. Le
VLM
177
3,461
0
18 Apr 2019
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM
SSL
SSeg
1.8K
94,891
0
11 Oct 2018
Hierarchical Multi Task Learning With CTC
Ramon Sanabria
Florian Metze
62
50
0
18 Jul 2018
Subword Regularization: Improving Neural Network Translation Models with Multiple Subword Candidates
Taku Kudo
223
1,169
0
29 Apr 2018
ESPnet: End-to-End Speech Processing Toolkit
Shinji Watanabe
Takaaki Hori
Shigeki Karita
Tomoki Hayashi
Jiro Nishitoba
...
Jahn Heymann
Sanjeev Khudanpur
Nanxin Chen
Adithya Renduchintala
Tsubasa Ochiai
VLM
109
1,507
0
30 Mar 2018
An analysis of incorporating an external language model into a sequence-to-sequence model
Anjuli Kannan
Yonghui Wu
Patrick Nguyen
Tara N. Sainath
Zhiwen Chen
Rohit Prabhavalkar
72
247
0
06 Dec 2017
State-of-the-art Speech Recognition With Sequence-to-Sequence Models
Chung-Cheng Chiu
Tara N. Sainath
Yonghui Wu
Rohit Prabhavalkar
Patrick Nguyen
...
Katya Gonina
Navdeep Jaitly
Yue Liu
J. Chorowski
M. Bacchiani
AI4TS
89
1,153
0
05 Dec 2017
Non-Autoregressive Neural Machine Translation
Jiatao Gu
James Bradbury
Caiming Xiong
Victor O.K. Li
R. Socher
105
796
0
07 Nov 2017
1
2
Next