2005.08100
Cited By
Conformer: Convolution-augmented Transformer for Speech Recognition
16 May 2020
Anmol Gulati
James Qin
Chung-Cheng Chiu
Niki Parmar
Yu Zhang
Jiahui Yu
Wei Han
Shibo Wang
Zhengdong Zhang
Yonghui Wu
Ruoming Pang
Papers citing
"Conformer: Convolution-augmented Transformer for Speech Recognition"
50 / 1,749 papers shown
Title
Improved Conformer-based End-to-End Speech Recognition Using Neural Architecture Search
Yukun Liu
Ta Li
Pengyuan Zhang
Yonghong Yan
AI4TS
19
6
0
12 Apr 2021
A Toolbox for Construction and Analysis of Speech Datasets
Evelina Bakhturina
Vitaly Lavrukhin
Boris Ginsburg
22
12
0
11 Apr 2021
Boundary and Context Aware Training for CIF-based Non-Autoregressive End-to-end ASR
Fan Yu
Haoneng Luo
Pengcheng Guo
Yuhao Liang
Zhuoyuan Yao
Lei Xie
Yingying Gao
Leijing Hou
Shilei Zhang
11
11
0
10 Apr 2021
Lookup-Table Recurrent Language Models for Long Tail Speech Recognition
Yifan Jiang
Tara N. Sainath
Cal Peyser
Shankar Kumar
David Rybach
Trevor Strohman
RALM
LMTD
28
5
0
09 Apr 2021
On Architectures and Training for Raw Waveform Feature Extraction in ASR
Peter Vieting
Christoph Lüscher
Wilfried Michel
Ralf Schlüter
Hermann Ney
24
9
0
09 Apr 2021
Layer Reduction: Accelerating Conformer-Based Self-Supervised Model via Layer Consistency
Jinchuan Tian
Rongzhi Gu
Helin Wang
Yuexian Zou
26
0
0
08 Apr 2021
WNARS: WFST based Non-autoregressive Streaming End-to-End Speech Recognition
Zhichao Wang
Wenwen Yang
Pan Zhou
Wei Chen
RALM
32
17
0
08 Apr 2021
Emotion Recognition from Speech Using Wav2vec 2.0 Embeddings
L. Pepino
Pablo Riera
Luciana Ferrer
16
349
0
08 Apr 2021
Pushing the Limits of Non-Autoregressive Speech Recognition
Edwin G. Ng
Chung-Cheng Chiu
Yu Zhang
William Chan
VLM
8
27
0
07 Apr 2021
Librispeech Transducer Model with Internal Language Model Prior Correction
Albert Zeyer
André Merboldt
Wilfried Michel
Ralf Schlüter
Hermann Ney
13
28
0
07 Apr 2021
S2VC: A Framework for Any-to-Any Voice Conversion with Self-Supervised Pretrained Representations
Jheng-hao Lin
Yist Y. Lin
C. Chien
Hung-yi Lee
30
56
0
07 Apr 2021
Darts-Conformer: Towards Efficient Gradient-Based Neural Architecture Search For End-to-End ASR
Xian Shi
Pan Zhou
Wei Chen
Lei Xie
22
17
0
07 Apr 2021
Interpreting A Pre-trained Model Is A Key For Model Architecture Optimization: A Case Study On Wav2Vec 2.0
Liu Chen
Meysam Asgari
19
1
0
07 Apr 2021
Exploring Targeted Universal Adversarial Perturbations to End-to-end ASR Models
Zhiyun Lu
Wei Han
Yu Zhang
Liangliang Cao
AAML
54
16
0
06 Apr 2021
Relaxing the Conditional Independence Assumption of CTC-based ASR by Conditioning on Intermediate Predictions
Jumon Nozaki
Tatsuya Komatsu
18
71
0
06 Apr 2021
Non-autoregressive Mandarin-English Code-switching Speech Recognition
Shun-Po Chuang
Heng-Jui Chang
Sung-Feng Huang
Hung-yi Lee
18
15
0
06 Apr 2021
Dissecting User-Perceived Latency of On-Device E2E Speech Recognition
Yuan Shangguan
Rohit Prabhavalkar
Hang Su
Jay Mahadeokar
Yangyang Shi
...
Chunyang Wu
Duc Le
Ozlem Kalinli
Christian Fuegen
M. Seltzer
28
27
0
06 Apr 2021
Contextualized Streaming End-to-End Speech Recognition with Trie-Based Deep Biasing and Shallow Fusion
Duc Le
Mahaveer Jain
Gil Keren
Suyoun Kim
Yangyang Shi
...
Yuan Shangguan
Christian Fuegen
Ozlem Kalinli
Yatharth Saraf
M. Seltzer
27
90
0
05 Apr 2021
Dynamic Encoder Transducer: A Flexible Solution For Trading Off Accuracy For Latency
Yangyang Shi
Varun K. Nagaraja
Chunyang Wu
Jay Mahadeokar
Duc Le
...
Ching-Feng Yeh
Julian Chan
Christian Fuegen
Ozlem Kalinli
M. Seltzer
27
15
0
05 Apr 2021
SpeechStew: Simply Mix All Available Speech Recognition Data to Train One Large Neural Network
William Chan
Daniel S. Park
Chris A. Lee
Yu Zhang
Quoc V. Le
Mohammad Norouzi
AI4TS
37
136
0
05 Apr 2021
End-to-End Speaker-Attributed ASR with Transformer
Naoyuki Kanda
Guoli Ye
Yashesh Gaur
Xiaofei Wang
Zhong Meng
Zhuo Chen
Takuya Yoshioka
19
47
0
05 Apr 2021
SPGISpeech: 5,000 hours of transcribed financial audio for fully formatted end-to-end speech recognition
Patrick K. O’Neill
Vitaly Lavrukhin
Somshubra Majumdar
Vahid Noroozi
Yuekai Zhang
...
Keenan Freyberg
Michael D. Shulman
Boris Ginsburg
Shinji Watanabe
Georg Kucsko
AI4TS
28
59
0
05 Apr 2021
AST: Audio Spectrogram Transformer
Yuan Gong
Yu-An Chung
James R. Glass
ViT
28
830
0
05 Apr 2021
Towards Lifelong Learning of End-to-end ASR
Heng-Jui Chang
Hung-yi Lee
Lin-Shan Lee
KELM
CLL
35
34
0
04 Apr 2021
Robust wav2vec 2.0: Analyzing Domain Shift in Self-Supervised Pre-Training
Wei-Ning Hsu
Anuroop Sriram
Alexei Baevski
Tatiana Likhomanenko
Qiantong Xu
...
Jacob Kahn
Ann Lee
R. Collobert
Gabriel Synnaeve
Michael Auli
SSL
22
236
0
02 Apr 2021
Keyword Transformer: A Self-Attention Model for Keyword Spotting
Axel Berg
Mark O'Connor
M. T. Cruz
24
132
0
01 Apr 2021
Evaluating Neural Word Embeddings for Sanskrit
Kevin Qinghong Lin
Om Adideva
Digumarthi Komal
Laxmidhar Behera
Pawan Goyal
29
12
0
01 Apr 2021
Compressing 1D Time-Channel Separable Convolutions using Sparse Random Ternary Matrices
Gonçalo Mordido
Matthijs Van Keirsbilck
A. Keller
32
6
0
31 Mar 2021
Integer-only Zero-shot Quantization for Efficient Speech Recognition
Sehoon Kim
A. Gholami
Z. Yao
Nicholas Lee
Patrick Wang
Aniruddha Nrusimha
Bohan Zhai
Tianren Gao
Michael W. Mahoney
Kurt Keutzer
MQ
23
23
0
31 Mar 2021
Large-Scale Pre-Training of End-to-End Multi-Talker ASR for Meeting Transcription with Single Distant Microphone
Naoyuki Kanda
Guoli Ye
Yu-Huan Wu
Yashesh Gaur
Xiaofei Wang
Zhong Meng
Zhuo Chen
Takuya Yoshioka
31
41
0
31 Mar 2021
CvT: Introducing Convolutions to Vision Transformers
Haiping Wu
Bin Xiao
Noel Codella
Mengchen Liu
Xiyang Dai
Lu Yuan
Lei Zhang
ViT
63
1,878
0
29 Mar 2021
Scaling sparsemax based channel selection for speech recognition with ad-hoc microphone arrays
Junqi Chen
Xiao-Lei Zhang
11
10
0
29 Mar 2021
A Practical Survey on Faster and Lighter Transformers
Quentin Fournier
G. Caron
Daniel Aloise
14
93
0
26 Mar 2021
Transformer-based ASR Incorporating Time-reduction Layer and Fine-tuning with Self-Knowledge Distillation
Md. Akmal Haidar
Chao Xing
Mehdi Rezagholizadeh
27
7
0
17 Mar 2021
Dynamic Acoustic Unit Augmentation With BPE-Dropout for Low-Resource End-to-End Speech Recognition
A. Laptev
A. Andrusenko
Ivan Podluzhny
Anton Mitrofanov
Ivan Medennikov
Yuri N. Matveev
VLM
26
14
0
12 Mar 2021
Learning Word-Level Confidence For Subword End-to-End ASR
David Qiu
Qiujia Li
Yanzhang He
Yu Zhang
Bo-wen Li
...
Deepti Bhatia
Wei Li
Ke Hu
Tara N. Sainath
Ian McGraw
32
32
0
11 Mar 2021
Alignment Knowledge Distillation for Online Streaming Attention-based Speech Recognition
Hirofumi Inaguma
Tatsuya Kawahara
16
13
0
28 Feb 2021
When Attention Meets Fast Recurrence: Training Language Models with Reduced Compute
Tao Lei
RALM
VLM
59
47
0
24 Feb 2021
Conditional Positional Encodings for Vision Transformers
Xiangxiang Chu
Zhi Tian
Bo-Wen Zhang
Xinlong Wang
Chunhua Shen
ViT
36
605
0
22 Feb 2021
The Accented English Speech Recognition Challenge 2020: Open Datasets, Tracks, Baselines, Results and Methods
Xian Shi
Fan Yu
Yizhou Lu
Yuhao Liang
Qiangze Feng
Daliang Wang
Y. Qian
Lei Xie
24
66
0
20 Feb 2021
Echo State Speech Recognition
H. Shrivastava
Ankush Garg
Yuan Cao
Yu Zhang
Tara N. Sainath
50
22
0
18 Feb 2021
End-to-end Audio-visual Speech Recognition with Conformers
Pingchuan Ma
Stavros Petridis
M. Pantic
84
225
0
12 Feb 2021
Intermediate Loss Regularization for CTC-based Speech Recognition
Jaesong Lee
Shinji Watanabe
118
135
0
05 Feb 2021
Mind the Gap: Assessing Temporal Generalization in Neural Language Models
Angeliki Lazaridou
A. Kuncoro
E. Gribovskaya
Devang Agrawal
Adam Liska
...
Sebastian Ruder
Dani Yogatama
Kris Cao
Susannah Young
Phil Blunsom
VLM
41
207
0
03 Feb 2021
WeNet: Production oriented Streaming and Non-streaming End-to-End Speech Recognition Toolkit
Zhuoyuan Yao
Di Wu
Xiong Wang
Binbin Zhang
Fan Yu
Chao Yang
Zhendong Peng
Xiaoyu Chen
Lei Xie
X. Lei
19
260
0
02 Feb 2021
The Hitachi-JHU DIHARD III System: Competitive End-to-End Neural Diarization and X-Vector Clustering Systems Combined by DOVER-Lap
Shota Horiguchi
Nelson Yalta
Leibny Paola García-Perera
Yuki Takashima
Yawen Xue
Desh Raj
Zili Huang
Yusuke Fujita
Shinji Watanabe
Sanjeev Khudanpur
BDL
19
36
0
02 Feb 2021
Leveraging End-to-End ASR for Endangered Language Documentation: An Empirical Study on Yoloxóchitl Mixtec
Jiatong Shi
Jonathan D. Amith
Rey Castillo García
Esteban Guadalupe Sierra
Kevin Duh
Shinji Watanabe
25
46
0
26 Jan 2021
Tiny Transducer: A Highly-efficient Speech Recognition Model on Edge Devices
Yuekai Zhang
Sining Sun
Long Ma
27
28
0
18 Jan 2021
Fast offline Transformer-based end-to-end automatic speech recognition for real-world applications
Y. Oh
Kiyoung Park
Jeongue Park
OffRL
22
5
0
14 Jan 2021
A Four-Stage Data Augmentation Approach to ResNet-Conformer Based Acoustic Modeling for Sound Event Localization and Detection
Qing Wang
Jun Du
Hua-Xin Wu
Jia Pan
Feng Ma
Chin-Hui Lee
13
79
0
08 Jan 2021