2005.03191
ContextNet: Improving Convolutional Neural Networks for Automatic Speech Recognition with Global Context
7 May 2020
Wei Han
Zhengdong Zhang
Yu Zhang
Jiahui Yu
Chung-Cheng Chiu
James Qin
Anmol Gulati
Ruoming Pang
Yonghui Wu
Papers citing
"ContextNet: Improving Convolutional Neural Networks for Automatic Speech Recognition with Global Context"
50 / 131 papers shown
Title
Multi-blank Transducers for Speech Recognition
Hainan Xu
Fei Jia
Somshubra Majumdar
Shinji Watanabe
Boris Ginsburg
33
11
0
04 Nov 2022
Structured State Space Decoder for Speech Recognition and Synthesis
Koichi Miyazaki
Masato Murata
Tomoki Koriyama
34
12
0
31 Oct 2022
A Compact End-to-End Model with Local and Global Context for Spoken Language Identification
Fei Jia
Nithin Rao Koluguri
Jagadeesh Balam
Boris Ginsburg
33
3
0
27 Oct 2022
LW-ISP: A Lightweight Model with ISP and Deep Learning
Hongyang Chen
Kaisheng Ma
VLM
24
1
0
08 Oct 2022
A Comparison of Transformer, Convolutional, and Recurrent Neural Networks on Phoneme Recognition
Kyuhong Shim
Wonyong Sung
27
2
0
01 Oct 2022
E-Branchformer: Branchformer with Enhanced merging for speech recognition
Kwangyoun Kim
Felix Wu
Yifan Peng
Jing Pan
Prashant Sridhar
Kyu Jeong Han
Shinji Watanabe
61
105
0
30 Sep 2022
ConvRNN-T: Convolutional Augmented Recurrent Neural Network Transducers for Streaming Speech Recognition
Martin H. Radfar
Rohit Barnwal
R. Swaminathan
Feng-Ju Chang
Grant P. Strimel
Nathan Susanj
Athanasios Mouchtaris
34
13
0
29 Sep 2022
Attention Enhanced Citrinet for Speech Recognition
Xianchao Wu
13
1
0
01 Sep 2022
Learning Phone Recognition from Unpaired Audio and Phone Sequences Based on Generative Adversarial Network
Da-Rong Liu
Po-Chun Hsu
Yi-Chen Chen
Sung-Feng Huang
Shun-Po Chuang
Da-Yi Wu
Hung-yi Lee
GAN
31
7
0
29 Jul 2022
Branchformer: Parallel MLP-Attention Architectures to Capture Local and Global Context for Speech Recognition and Understanding
Yifan Peng
Siddharth Dalmia
Ian Lane
Shinji Watanabe
30
143
0
06 Jul 2022
On the Prediction Network Architecture in RNN-T for ASR
Dario Albesano
Jesús Andrés-Ferrer
Nicola Ferri
Puming Zhan
AI4TS
24
0
0
29 Jun 2022
Squeezeformer: An Efficient Transformer for Automatic Speech Recognition
Sehoon Kim
A. Gholami
Albert Eaton Shaw
Nicholas Lee
K. Mangalam
Jitendra Malik
Michael W. Mahoney
Kurt Keutzer
32
99
0
02 Jun 2022
Easter2.0: Improving convolutional models for handwritten text recognition
Kartik Chaudhary
Raghav Bali
36
9
0
30 May 2022
Wav2Seq: Pre-training Speech-to-Text Encoder-Decoder Models Using Pseudo Languages
Felix Wu
Kwangyoun Kim
Shinji Watanabe
Kyu Jeong Han
Ryan T. McDonald
Kilian Q. Weinberger
Yoav Artzi
SyDa
48
38
0
02 May 2022
Efficient Training of Neural Transducer for Speech Recognition
Wei Zhou
Wilfried Michel
Ralf Schluter
Hermann Ney
AI4TS
24
22
0
22 Apr 2022
ASR in German: A Detailed Error Analysis
John M. Wirth
René Peinl
26
5
0
12 Apr 2022
Exploiting Hidden Representations from a DNN-based Speech Recogniser for Speech Intelligibility Prediction in Hearing-impaired Listeners
Zehai Tu
Ning Ma
Jon Barker
11
14
0
08 Apr 2022
Auditory-Based Data Augmentation for End-to-End Automatic Speech Recognition
Zehai Tu
Jack Deadman
Ning Ma
Jon Barker
35
4
0
08 Apr 2022
A Wav2vec2-Based Experimental Study on Self-Supervised Learning Methods to Improve Child Speech Recognition
Rishabh Jain
Andrei Barcovschi
Mariam Yiwere
Dan Bigioi
Peter Corcoran
H. Cucu
28
31
0
06 Apr 2022
Towards End-to-end Unsupervised Speech Recognition
Alexander H. Liu
Wei-Ning Hsu
Michael Auli
Alexei Baevski
SSL
31
74
0
05 Apr 2022
Lip to Speech Synthesis with Visual Context Attentional GAN
Minsu Kim
Joanna Hong
Y. Ro
33
51
0
04 Apr 2022
Combination of Time-domain, Frequency-domain, and Cepstral-domain Acoustic Features for Speech Commands Classification
Yikang Wang
Hiromitsu Nishizaki
36
1
0
30 Mar 2022
Locality Matters: A Locality-Biased Linear Attention for Automatic Speech Recognition
J. Sun
Guiping Zhong
Dinghao Zhou
Baoxiang Li
Yiran Zhong
33
7
0
29 Mar 2022
Spatial Processing Front-End For Distant ASR Exploiting Self-Attention Channel Combinator
D. Sharma
Rong Gong
James Fosburgh
S. Kruchinin
Patrick A. Naylor
Ljubomir Milanović
42
6
0
25 Mar 2022
MuSE-SVS: Multi-Singer Emotional Singing Voice Synthesizer that Controls Emotional Intensity
Sungjae Kim
Y.E. Kim
Jewoo Jun
Injung Kim
31
14
0
02 Mar 2022
A Survey of Multilingual Models for Automatic Speech Recognition
Hemant Yadav
Sunayana Sitaram
24
35
0
25 Feb 2022
SPIRAL: Self-supervised Perturbation-Invariant Representation Learning for Speech Pre-Training
Wenyong Huang
Zhenhe Zhang
Y. Yeung
Xin Jiang
Qun Liu
38
23
0
25 Jan 2022
MHTTS: Fast multi-head text-to-speech for spontaneous speech with imperfect transcription
Dabiao Ma
Yitong Zhang
Meng Li
Feng Ye
19
1
0
19 Jan 2022
Are E2E ASR models ready for an industrial usage?
Valentin Vielzeuf
G. Antipov
26
8
0
09 Dec 2021
Soft-Sensing ConFormer: A Curriculum Learning-based Convolutional Transformer
Jaswanth K. Yella
Chao Zhang
Sergei Petrov
Yu Huang
Xiaoye Qian
A. Minai
Sthitie Bom
33
7
0
12 Nov 2021
Conformer-based Hybrid ASR System for Switchboard Dataset
Mohammad Zeineldeen
Jingjing Xu
Christoph Luscher
Wilfried Michel
Alexander Gerstenberger
Ralf Schluter
Hermann Ney
27
24
0
05 Nov 2021
WavLM: Large-Scale Self-Supervised Pre-Training for Full Stack Speech Processing
Sanyuan Chen
Chengyi Wang
Zhengyang Chen
Yu-Huan Wu
Shujie Liu
...
Yao Qian
Jian Wu
Michael Zeng
Xiangzhan Yu
Furu Wei
SSL
132
1,721
0
26 Oct 2021
Multi-Modal Pre-Training for Automated Speech Recognition
David M. Chan
Shalini Ghosh
D. Chakrabarty
Björn Hoffmeister
SSL
30
16
0
12 Oct 2021
SRU++: Pioneering Fast Recurrence with Attention for Speech Recognition
Jing Pan
Tao Lei
Kwangyoun Kim
Kyu Jeong Han
Shinji Watanabe
VLM
34
9
0
11 Oct 2021
TitaNet: Neural Model for speaker representation with 1D Depth-wise separable convolutions and global context
Nithin Rao Koluguri
Taejin Park
Boris Ginsburg
ViT
36
94
0
08 Oct 2021
Spell my name: keyword boosted speech recognition
Namkyu Jung
Geon-min Kim
Joon Son Chung
51
13
0
06 Oct 2021
Performance-Efficiency Trade-offs in Unsupervised Pre-training for Speech Recognition
Felix Wu
Kwangyoun Kim
Jing Pan
Kyu Jeong Han
Kilian Q. Weinberger
Yoav Artzi
27
71
0
14 Sep 2021
Self-Attention Channel Combinator Frontend for End-to-End Multichannel Far-field Speech Recognition
Rong Gong
Carl Quillen
D. Sharma
Andrew Goderre
José Laínez
Ljubomir Milanović
39
13
0
10 Sep 2021
Real World Robustness from Systematic Noise
Yan Wang
Yuhang Li
Ruihao Gong
36
7
0
02 Sep 2021
Efficient conformer: Progressive downsampling and grouped attention for automatic speech recognition
Maxime Burchi
Valentin Vielzeuf
37
84
0
31 Aug 2021
CarneliNet: Neural Mixture Model for Automatic Speech Recognition
A. Kalinov
Somshubra Majumdar
Jagadeesh Balam
Boris Ginsburg
MoE
24
3
0
22 Jul 2021
PARP: Prune, Adjust and Re-Prune for Self-Supervised Speech Recognition
Cheng-I Jeff Lai
Yang Zhang
Alexander H. Liu
Shiyu Chang
Yi-Lun Liao
Yung-Sung Chuang
Kaizhi Qian
Sameer Khurana
David D. Cox
James R. Glass
VLM
75
70
0
10 Jun 2021
SpeechBrain: A General-Purpose Speech Toolkit
Mirco Ravanelli
Titouan Parcollet
Peter William VanHarn Plantinga
Aku Rouhe
Samuele Cornell
...
William Aris
Hwidong Na
Yan Gao
R. Mori
Yoshua Bengio
24
752
0
08 Jun 2021
A Neural Acoustic Echo Canceller Optimized Using An Automatic Speech Recognizer And Large Scale Synthetic Data
N. Howard
Alex Park
T. Shabestary
A. Gruenstein
Rohit Prabhavalkar
13
15
0
01 Jun 2021
Unsupervised Speech Recognition
Alexei Baevski
Wei-Ning Hsu
Alexis Conneau
Michael Auli
SSL
28
271
0
24 May 2021
Mondegreen: A Post-Processing Solution to Speech Recognition Error Correction for Voice Search Queries
Sukhdeep S. Sodhi
E. Chio
Ambarish Jash
Santiago Ontañón
Ajit Apte
...
Tameen Khan
Amol Wankhede
M. Alzantot
Allen Wu
Tushar Chandra
25
9
0
20 May 2021
Scaling End-to-End Models for Large-Scale Multilingual ASR
Bo-wen Li
Ruoming Pang
Tara N. Sainath
Anmol Gulati
Yu Zhang
James Qin
Parisa Haghani
Yifan Jiang
Min Ma
Junwen Bai
CLL
34
76
0
30 Apr 2021
Bridging the gap between streaming and non-streaming ASR systems by distilling ensembles of CTC and RNN-T models
Thibault Doutre
Wei Han
Chung-Cheng Chiu
Ruoming Pang
Olivier Siohan
Liangliang Cao
38
5
0
25 Apr 2021
Efficient conformer-based speech recognition with linear attention
Shengqiang Li
Menglong Xu
Xiao-Lei Zhang
24
20
0
14 Apr 2021
Improved Conformer-based End-to-End Speech Recognition Using Neural Architecture Search
Yukun Liu
Ta Li
Pengyuan Zhang
Yonghong Yan
AI4TS
27
6
0
12 Apr 2021