Conformer: Convolution-augmented Transformer for Speech Recognition

16 May 2020
Anmol Gulati
James Qin
Chung-Cheng Chiu
Niki Parmar
Yu Zhang
Jiahui Yu
Wei Han
Shibo Wang
Zhengdong Zhang
Yonghui Wu
Ruoming Pang

Papers citing "Conformer: Convolution-augmented Transformer for Speech Recognition"

50 / 1,750 papers shown
Perception and Semantic Aware Regularization for Sequential Confidence Calibration
Zhenghua Peng
Yuanmao Luo
Tianshui Chen
Keke Xu
Shuangping Huang
AI4TS
35
2
0
31 May 2023
Pseudo-Siamese Network based Timbre-reserved Black-box Adversarial Attack in Speaker Identification
Qing Wang
Jixun Yao
Ziqian Wang
Pengcheng Guo
Linfu Xie
AAML
39
1
0
30 May 2023
Client: Cross-variable Linear Integrated Enhanced Transformer for Multivariate Long-Term Time Series Forecasting
Jiaxin Gao
Wenbo Hu
Yuntian Chen
AI4TS
27
13
0
30 May 2023
Graph Neural Networks for Contextual ASR with the Tree-Constrained Pointer Generator
Guangzhi Sun
Chuxu Zhang
P. Woodland
20
4
0
30 May 2023
HyperConformer: Multi-head HyperMixer for Efficient Speech Recognition
Florian Mai
Juan Pablo Zuluaga
Titouan Parcollet
P. Motlíček
36
10
0
29 May 2023
Exploration of Efficient End-to-End ASR using Discretized Input from Self-Supervised Learning
Xuankai Chang
Brian Yan
Yuya Fujita
Takashi Maekaku
Shinji Watanabe
32
38
0
29 May 2023
DeCoR: Defy Knowledge Forgetting by Predicting Earlier Audio Codes
Xilin Jiang
Yinghao Aaron Li
N. Mesgarani
CLL
29
1
0
29 May 2023
Retraining-free Customized ASR for Enharmonic Words Based on a Named-Entity-Aware Model and Phoneme Similarity Estimation
Yui Sudo
K. Hata
K. Nakadai
31
2
0
29 May 2023
RASR2: The RWTH ASR Toolkit for Generic Sequence-to-sequence Speech Recognition
Wei Zhou
Eugen Beck
Simon Berger
Ralf Schlüter
Hermann Ney
VLM
42
4
0
28 May 2023
Translatotron 3: Speech to Speech Translation with Monolingual Data
Eliya Nachmani
Alon Levkovitch
Yi-Yang Ding
Chulayutsh Asawaroengchai
Heiga Zen
Michelle Tadmor Ramanovich
41
14
0
27 May 2023
CIF-PT: Bridging Speech and Text Representations for Spoken Language Understanding via Continuous Integrate-and-Fire Pre-Training
Linhao Dong
Zhecheng An
Peihao Wu
Jun Zhang
Lu Lu
Zejun Ma
24
6
0
27 May 2023
Bridging the Granularity Gap for Acoustic Modeling
Chen Xu
Yuhao Zhang
Chengbo Jiao
Xiaoqian Liu
Chi Hu
Xin Zeng
Tong Xiao
Anxiang Ma
Huizhen Wang
JingBo Zhu
34
6
0
27 May 2023
TranSFormer: Slow-Fast Transformer for Machine Translation
Bei Li
Yi Jing
Xu Tan
Zhen Xing
Tong Xiao
Jingbo Zhu
49
7
0
26 May 2023
Robustness of Multi-Source MT to Transcription Errors
Dominik Macháček
Peter Polák
Ondrej Bojar
Raj Dabre
36
4
0
26 May 2023
Svarah: Evaluating English ASR Systems on Indian Accents
Tahir Javed
Sakshi Joshi
Vignesh Nagarajan
Sairam Sundaresan
J. Nawale
A. Raman
Kaushal Bhogale
Pratyush Kumar
Mitesh M. Khapra
25
8
0
25 May 2023
Efficient Neural Music Generation
Max W. Y. Lam
Qiao Tian
Tang-Chun Li
Zongyu Yin
Siyuan Feng
...
Mingbo Ma
Xuchen Song
Jitong Chen
Yuping Wang
Yuxuan Wang
DiffM
MGen
34
49
0
25 May 2023
Mixture-of-Expert Conformer for Streaming Multilingual ASR
Ke Hu
Yue Liu
Tara N. Sainath
Yu Zhang
F. Beaufays
MoE
41
14
0
25 May 2023
RAND: Robustness Aware Norm Decay For Quantized Seq2seq Models
David Qiu
David Rim
Shaojin Ding
Oleg Rybakov
Yanzhang He
MQ
37
4
0
24 May 2023
AV-TranSpeech: Audio-Visual Robust Speech-to-Speech Translation
Rongjie Huang
Huadai Liu
Xize Cheng
Yi Ren
Lin Li
...
Jinzheng He
Lichao Zhang
Jinglin Liu
Xiaoyue Yin
Zhou Zhao
78
8
0
24 May 2023
Vistaar: Diverse Benchmarks and Training Sets for Indian Language ASR
Kaushal Bhogale
Sairam Sundaresan
A. Raman
Tahir Javed
Mitesh M. Khapra
Pratyush Kumar
VLM
41
10
0
24 May 2023
Spoken Question Answering and Speech Continuation Using Spectrogram-Powered LLM
Eliya Nachmani
Alon Levkovitch
Roy Hirsch
Julián Salazar
Chulayutsh Asawaroengchai
Soroosh Mariooryad
Ehud Rivlin
RJ Skerry-Ryan
Michelle Tadmor Ramanovich
AuLLM
39
35
0
24 May 2023
Iteratively Improving Speech Recognition and Voice Conversion
Mayank Singh
Naoya Takahashi
Ono Naoyuki
20
4
0
24 May 2023
InterFormer: Interactive Local and Global Features Fusion for Automatic Speech Recognition
Zhibing Lai
Tianren Zhang
Qi Liu
Xinyuan Qian
Li-Fang Wei
Songlu Chen
Feng Chen
Xu-Cheng Yin
35
2
0
24 May 2023
P-vectors: A Parallel-Coupled TDNN/Transformer Network for Speaker Verification
Xiyuan Wang
Fangyuan Wang
Bo Xu
Liang Xu
Jing Xiao
21
6
0
24 May 2023
Rethinking Speech Recognition with A Multimodal Perspective via Acoustic and Semantic Cooperative Decoding
Tianren Zhang
Haibo Qin
Zhibing Lai
Songlu Chen
Qi Liu
Feng Chen
Xinyuan Qian
Xu-Cheng Yin
38
0
0
23 May 2023
Detection of Cross-Dataset Fake Audio Based on Prosodic and Pronunciation Features
Chenglong Wang
Jiangyan Yi
J. Tao
Chuyuan Zhang
Shuai Zhang
Xun Chen
26
16
0
23 May 2023
Cross-lingual Knowledge Transfer and Iterative Pseudo-labeling for Low-Resource Speech Recognition with Transducers
J. Silovský
Liuhui Deng
Arturo Argueta
Tresi Arvizo
Roger Hsiao
Sasha Kuznietsov
Yiu-Chang Lin
Xiaoqiang Xiao
Yuanyuan Zhang
30
2
0
23 May 2023
FluentSpeech: Stutter-Oriented Automatic Speech Editing with Context-Aware Diffusion Models
Ziyue Jiang
Qiang Yang
Jia-li Zuo
Zhe Ye
Rongjie Huang
Yixiang Ren
Zhou Zhao
DiffM
70
14
0
23 May 2023
Modular Domain Adaptation for Conformer-Based Streaming ASR
Qiujia Li
Yue Liu
DongSeon Hwang
Tara N. Sainath
P. M. Mengibar
41
12
0
22 May 2023
CopyNE: Better Contextual ASR by Copying Named Entities
Shilin Zhou
Zhenghua Li
Yu Hong
Hao Fei
Zhefeng Wang
Baoxing Huai
28
6
0
22 May 2023
GNCformer Enhanced Self-attention for Automatic Speech Recognition
Junlong Li
Z. Duan
S. Li
X. Yu
G. Yang
20
1
0
22 May 2023
Exploring Energy-based Language Models with Different Architectures and Training Methods for Speech Recognition
Hong Liu
Z. Lv
Zhijian Ou
Wenbo Zhao
Qing Xiao
26
0
0
22 May 2023
Duplex Diffusion Models Improve Speech-to-Speech Translation
Xianchao Wu
DiffM
27
4
0
22 May 2023
Hystoc: Obtaining word confidences for fusion of end-to-end ASR systems
Karel Beneš
M. Kocour
L. Burget
40
2
0
21 May 2023
Multi-Head State Space Model for Speech Recognition
Yassir Fathullah
Chunyang Wu
Yuan Shangguan
Junteng Jia
Wenhan Xiong
...
Chunxi Liu
Yangyang Shi
Ozlem Kalinli
M. Seltzer
Mark Gales
34
13
0
21 May 2023
CASA-ASR: Context-Aware Speaker-Attributed ASR
Mohan Shi
Zhihao Du
Qian Chen
Fan Yu
Yangze Li
Shiliang Zhang
Jie Zhang
Lirong Dai
36
8
0
21 May 2023
Semantic VAD: Low-Latency Voice Activity Detection for Speech Interaction
Mohan Shi
Yuchun Shu
Lingyun Zuo
Qiang Chen
Shiliang Zhang
Jie Zhang
Lirong Dai
VLM
40
3
0
21 May 2023
EE-TTS: Emphatic Expressive TTS with Linguistic Information
Yifan Zhong
Chen Zhang
Xule Liu
Chenxi Sun
Weishan Deng
Haifeng Hu
Zhongqian Sun
26
3
0
20 May 2023
A New Benchmark of Aphasia Speech Recognition and Detection Based on E-Branchformer and Multi-task Learning
Jiyang Tang
William Chen
Xuankai Chang
Shinji Watanabe
B. MacWhinney
29
10
0
19 May 2023
Recycle-and-Distill: Universal Compression Strategy for Transformer-based Speech SSL Models with Attention Map Reusing and Masking Distillation
Kangwook Jang
Sungnyun Kim
Se-Young Yun
Hoi-Rim Kim
37
5
0
19 May 2023
Blank-regularized CTC for Frame Skipping in Neural Transducer
Yifan Yang
Xiaoyu Yang
Liyong Guo
Zengwei Yao
Wei Kang
Fangjun Kuang
Long Lin
Xie Chen
Daniel Povey
18
9
0
19 May 2023
AlignAtt: Using Attention-based Audio-Translation Alignments as a Guide for Simultaneous Speech Translation
Sara Papi
Marco Turchi
Matteo Negri
40
20
0
19 May 2023
A Comparative Study on E-Branchformer vs Conformer in Speech Recognition, Translation, and Understanding Tasks
Yifan Peng
Kwangyoun Kim
Felix Wu
Brian Yan
Siddhant Arora
William Chen
Jiyang Tang
Suwon Shon
Prashant Sridhar
Shinji Watanabe
39
17
0
18 May 2023
FunASR: A Fundamental End-to-End Speech Recognition Toolkit
Zhifu Gao
Zerui Li
Jiaming Wang
Haoneng Luo
Xian Shi
...
Yabin Li
Lingyun Zuo
Zhihao Du
Zhangyu Xiao
Shiliang Zhang
39
54
0
18 May 2023
RMSSinger: Realistic-Music-Score based Singing Voice Synthesis
Jinzheng He
Jinglin Liu
Zhenhui Ye
Rongjie Huang
Chenye Cui
Huadai Liu
Zhou Zhao
DiffM
22
19
0
18 May 2023
Use of Speech Impairment Severity for Dysarthric Speech Recognition
Mengzhe Geng
Zengrui Jin
Tianzi Wang
Shujie Hu
Jiajun Deng
Mingyu Cui
Guinan Li
Jianwei Yu
Xurong Xie
Xunying Liu
28
9
0
18 May 2023
EENED: End-to-End Neural Epilepsy Detection based on Convolutional Transformer
Chenyu Liu
Xin-qiu Zhou
Yang Liu
ViT
MedIm
26
1
0
17 May 2023
Restoring Images Captured in Arbitrary Hybrid Adverse Weather Conditions in One Go
Yecong Wan
Mingzhen Shao
Yuanshuo Cheng
YueQin Liu
Zhipeng Bao
26
5
0
17 May 2023
SoundStorm: Efficient Parallel Audio Generation
Zalan Borsos
Matthew Sharifi
Damien Vincent
Eugene Kharitonov
Neil Zeghidour
Marco Tagliasacchi
28
98
0
16 May 2023
Cross-Modal Global Interaction and Local Alignment for Audio-Visual Speech Recognition
Yuchen Hu
Ruizhe Li
Chen Chen
Heqing Zou
Qiu-shi Zhu
Eng Siong Chng
39
7
0
16 May 2023