arXiv:2005.08100
Conformer: Convolution-augmented Transformer for Speech Recognition
16 May 2020
Anmol Gulati
James Qin
Chung-Cheng Chiu
Niki Parmar
Yu Zhang
Jiahui Yu
Wei Han
Shibo Wang
Zhengdong Zhang
Yonghui Wu
Ruoming Pang
Papers citing
"Conformer: Convolution-augmented Transformer for Speech Recognition"
50 / 1,750 papers shown
A Language Agnostic Multilingual Streaming On-Device ASR System
Bo-wen Li
Tara N. Sainath
Ruoming Pang
Shuo-yiin Chang
Qiumin Xu
...
Qiao Liang
Heguang Liu
Yanzhang He
Parisa Haghani
Sameer Bidichandani
AuLLM
36
11
0
29 Aug 2022
Streaming Intended Query Detection using E2E Modeling for Continued Conversation
Shuo-yiin Chang
Guru Prakash
Zelin Wu
Qiao Liang
Tara N. Sainath
Bo-wen Li
Adam Stambler
Shyam Upadhyay
Manaal Faruqui
Trevor Strohman
42
5
0
29 Aug 2022
Turn-Taking Prediction for Natural Conversational Speech
Shuo-yiin Chang
Bo-wen Li
Tara N. Sainath
Chaoyang Zhang
Trevor Strohman
Qiao Liang
Yanzhang He
43
19
0
29 Aug 2022
Towards Disentangled Speech Representations
Cal Peyser
Ronny Huang
Andrew Rosenberg
Tara N. Sainath
M. Picheny
Kyunghyun Cho
DRL
19
7
0
28 Aug 2022
Effectiveness of Mining Audio and Text Pairs from Public Data for Improving ASR Systems for Low-Resource Languages
Kaushal Bhogale
A. Raman
Tahir Javed
Sumanth Doddapaneni
Anoop Kunchukuttan
Pratyush Kumar
Mitesh M. Khapra
36
22
0
26 Aug 2022
Combining AI and AM - Improving Approximate Matching through Transformer Networks
Frieder Uhlig
Lukas Struppek
Dominik Hintersdorf
Thomas Gobel
Harald Baier
Kristian Kersting
21
7
0
24 Aug 2022
Transfer Ranking in Finance: Applications to Cross-Sectional Momentum with Data Scarcity
Daniel Poh
Stephen J. Roberts
S. Zohren
31
7
0
21 Aug 2022
Boosting Distributed Training Performance of the Unpadded BERT Model
Jinle Zeng
Min Li
Zhihua Wu
Jiaqi Liu
Yuang Liu
Dianhai Yu
Yanjun Ma
25
10
0
17 Aug 2022
Uconv-Conformer: High Reduction of Input Sequence Length for End-to-End Speech Recognition
A. Andrusenko
R. Nasretdinov
A. Romanenko
20
18
0
16 Aug 2022
Exploiting Multiple Sequence Lengths in Fast End to End Training for Image Captioning
J. Hu
Roberto Cavicchioli
Alessandro Capotondi
29
21
0
13 Aug 2022
USB: A Unified Semi-supervised Learning Benchmark for Classification
Yidong Wang
Hao Chen
Yue Fan
Wangbin Sun
R. Tao
...
T. Shinozaki
Bernt Schiele
Jindong Wang
Xingxu Xie
Yue Zhang
32
113
0
12 Aug 2022
DDSP-based Singing Vocoders: A New Subtractive-based Synthesizer and A Comprehensive Evaluation
Da-Yi Wu
Wen-Yi Hsiao
Fu-Rong Yang
Oscar D. Friedman
Warren Jackson
Scott Bruzenak
Yi-Wen Liu
Yi-Hsuan Yang
DiffM
39
24
0
09 Aug 2022
A High-Quality and Large-Scale Dataset for English-Vietnamese Speech Translation
L. T. Nguyen
Nguyen Luong Tran
Long Doan
Manh Luong
Dat Quoc Nguyen
29
4
0
08 Aug 2022
The SJTU System for Short-duration Speaker Verification Challenge 2021
Bing Han
Zhengyang Chen
Zhikai Zhou
Y. Qian
12
6
0
03 Aug 2022
OLLIE: Derivation-based Tensor Program Optimizer
Liyan Zheng
Haojie Wang
Jidong Zhai
Muyan Hu
Zixuan Ma
Tuowei Wang
Shizhi Tang
Lei Xie
Kezhao Huang
Zhihao Jia
46
3
0
02 Aug 2022
Unified Normalization for Accelerating and Stabilizing Transformers
Qiming Yang
Kai Zhang
Chaoxiang Lan
Zhi Yang
Zheyang Li
Wenming Tan
Jun Xiao
Shiliang Pu
23
8
0
02 Aug 2022
DENT-DDSP: Data-efficient noisy speech generator using differentiable digital signal processors for explicit distortion modelling and noise-robust speech recognition
Zixun Guo
C. Chen
Chng Eng Siong
30
5
0
01 Aug 2022
Pronunciation-aware unique character encoding for RNN Transducer-based Mandarin speech recognition
Peng Shen
Xugang Lu
Hisashi Kawai
21
2
0
29 Jul 2022
Learning Phone Recognition from Unpaired Audio and Phone Sequences Based on Generative Adversarial Network
Da-Rong Liu
Po-Chun Hsu
Yi-Chen Chen
Sung-Feng Huang
Shun-Po Chuang
Da-Yi Wu
Hung-yi Lee
GAN
31
7
0
29 Jul 2022
Is Attention All That NeRF Needs?
T. MukundVarma
Peihao Wang
Xuxi Chen
Tianlong Chen
Subhashini Venugopalan
Zhangyang Wang
ViT
38
107
0
27 Jul 2022
Learning a Dual-Mode Speech Recognition Model via Self-Pruning
Chunxi Liu
Yuan Shangguan
Haichuan Yang
Yangyang Shi
Raghuraman Krishnamoorthi
Ozlem Kalinli
SSL
34
7
0
25 Jul 2022
Improving Mandarin Speech Recogntion with Block-augmented Transformer
Xiaoming Ren
Huifeng Zhu
Liuwei Wei
Minghui Wu
Jie Hao
38
9
0
24 Jul 2022
Weakly Supervised Object Localization via Transformer with Implicit Spatial Calibration
Haotian Bai
Ruimao Zhang
Jiong Wang
Xiang Wan
WSOL
39
35
0
21 Jul 2022
ILASR: Privacy-Preserving Incremental Learning for Automatic Speech Recognition at Production Scale
Gopinath Chennupati
Milind Rao
Gurpreet Chadha
Aaron Eakin
A. Raju
...
Andrew Oberlin
Buddha Nandanoor
Prahalad Venkataramanan
Zheng Wu
Pankaj Sitpure
CLL
27
8
0
19 Jul 2022
Knowledge Transfer and Distillation from Autoregressive to Non-Autoregressive Speech Recognition
Xun Gong
Zhikai Zhou
Y. Qian
20
3
0
15 Jul 2022
Direction-Aware Joint Adaptation of Neural Speech Enhancement and Recognition in Real Multiparty Conversational Environments
Yicheng Du
Aditya Arie Nugraha
Kouhei Sekiguchi
Yoshiaki Bando
Mathieu Fontaine
Kazuyoshi Yoshii
22
0
0
15 Jul 2022
Two-Pass Low Latency End-to-End Spoken Language Understanding
Siddhant Arora
Siddharth Dalmia
Xuankai Chang
Brian Yan
A. Black
Shinji Watanabe
VLM
30
19
0
14 Jul 2022
Visual Context-driven Audio Feature Enhancement for Robust End-to-End Audio-Visual Speech Recognition
Joanna Hong
Minsu Kim
Daehun Yoo
Y. Ro
26
21
0
13 Jul 2022
Wayformer: Motion Forecasting via Simple & Efficient Attention Networks
Nigamaa Nayakanti
Rami Al-Rfou
Aurick Zhou
Kratarth Goel
Khaled S. Refaat
Benjamin Sapp
AI4TS
44
237
0
12 Jul 2022
End-to-end speech recognition modeling from de-identified data
M. Flechl
Shou-Chun Yin
Junho Park
Peter Skala
17
4
0
12 Jul 2022
pMCT: Patched Multi-Condition Training for Robust Speech Recognition
Pablo Peso Parada
A. Dobrowolska
Karthikeyan P. Saravanan
Mete Ozay
40
6
0
11 Jul 2022
PoeticTTS -- Controllable Poetry Reading for Literary Studies
Julia Koch
Florian Lux
Nadja Schauffler
T. Bernhart
Felix Dieterle
Jonas Kuhn
Sandra Richter
Gabriel Viehhauser
Ngoc Thang Vu
24
5
0
11 Jul 2022
Speaker Anonymization with Phonetic Intermediate Representations
Sarina Meyer
Florian Lux
Pavel Denisov
Julia Koch
Pascal Tilli
Ngoc Thang Vu
34
27
0
11 Jul 2022
Intermediate-layer output Regularization for Attention-based Speech Recognition with Shared Decoder
Jicheng Zhang
Yizhou Peng
Haihua Xu
Yi He
Chng Eng Siong
Hao-Ming Huang
AuLLM
28
6
0
09 Jul 2022
Internal Language Model Estimation based Language Model Fusion for Cross-Domain Code-Switching Speech Recognition
Yizhou Peng
Yufei Liu
Jicheng Zhang
Haihua Xu
Yi He
Hao-Ming Huang
Chng Eng Siong
16
9
0
09 Jul 2022
Training Transformers Together
Alexander Borzunov
Max Ryabinin
Tim Dettmers
Quentin Lhoest
Lucile Saulnier
Michael Diskin
Yacine Jernite
Thomas Wolf
ViT
31
8
0
07 Jul 2022
Exploring the sequence length bottleneck in the Transformer for Image Captioning
Jiapeng Hu
Roberto Cavicchioli
Alessandro Capotondi
ViT
38
3
0
07 Jul 2022
Branchformer: Parallel MLP-Attention Architectures to Capture Local and Global Context for Speech Recognition and Understanding
Yifan Peng
Siddharth Dalmia
Ian Lane
Shinji Watanabe
30
143
0
06 Jul 2022
Astroconformer: Inferring Surface Gravity of Stars from Stellar Light Curves with Transformer
Jiashu Pan
Y. Ting 丁
Jie Yu
20
3
0
06 Jul 2022
Ultra-Low-Bitrate Speech Coding with Pretrained Transformers
Ali Siahkoohi
Michael Chinen
Tom Denton
W. Kleijn
Jan Skoglund
27
8
0
05 Jul 2022
Compute Cost Amortized Transformer for Streaming ASR
Yifan Xie
J. Macoskey
Martin H. Radfar
Feng-Ju Chang
Brian King
Ariya Rastrow
Athanasios Mouchtaris
Grant P. Strimel
30
7
0
05 Jul 2022
Leveraging Acoustic Contextual Representation by Audio-textual Cross-modal Learning for Conversational ASR
Kun Wei
Yike Zhang
Sining Sun
Lei Xie
Long Ma
26
9
0
03 Jul 2022
M-Adapter: Modality Adaptation for End-to-End Speech-to-Text Translation
Jinming Zhao
Haomiao Yang
Ehsan Shareghi
Gholamreza Haffari
56
19
0
03 Jul 2022
Improving Transformer-based Conversational ASR by Inter-Sentential Attention Mechanism
Kun Wei
Pengcheng Guo
Ning Jiang
56
11
0
02 Jul 2022
Tree-constrained Pointer Generator with Graph Neural Network Encodings for Contextual Speech Recognition
Guangzhi Sun
C. Zhang
P. Woodland
22
12
0
02 Jul 2022
UserLibri: A Dataset for ASR Personalization Using Only Text
Theresa Breiner
Swaroop Indra Ramaswamy
Ehsan Variani
Shefali Garg
Rajiv Mathews
K. Sim
Kilol Gupta
Mingqing Chen
Lara McConnaughey
35
16
0
02 Jul 2022
Measuring Forgetting of Memorized Training Examples
Matthew Jagielski
Om Thakkar
Florian Tramèr
Daphne Ippolito
Katherine Lee
...
Eric Wallace
Shuang Song
Abhradeep Thakurta
Nicolas Papernot
Chiyuan Zhang
TDI
75
102
0
30 Jun 2022
Sub-8-Bit Quantization Aware Training for 8-Bit Neural Network Accelerator with On-Device Speech Recognition
Kai Zhen
Hieu Duy Nguyen
Ravi Chinta
Nathan Susanj
Athanasios Mouchtaris
Tariq Afzal
Ariya Rastrow
MQ
30
11
0
30 Jun 2022
FeaRLESS: Feature Refinement Loss for Ensembling Self-Supervised Learning Features in Robust End-to-end Speech Recognition
Szu-Jui Chen
Jiamin Xie
John H. L. Hansen
45
8
0
30 Jun 2022
Improving Deliberation by Text-Only and Semi-Supervised Training
Ke Hu
Tara N. Sainath
Yanzhang He
Rohit Prabhavalkar
Trevor Strohman
S. Mavandadi
Weiran Wang
39
12
0
29 Jun 2022