ResearchTrend.AI
Conformer: Convolution-augmented Transformer for Speech Recognition

16 May 2020
Anmol Gulati
James Qin
Chung-Cheng Chiu
Niki Parmar
Yu Zhang
Jiahui Yu
Wei Han
Shibo Wang
Zhengdong Zhang
Yonghui Wu
Ruoming Pang

Papers citing "Conformer: Convolution-augmented Transformer for Speech Recognition"

50 / 1,750 papers shown
Title
Toward Joint Language Modeling for Speech Units and Text
Ju-Chieh Chou
Chung-Ming Chien
Wei-Ning Hsu
Karen Livescu
Arun Babu
Alexis Conneau
Alexei Baevski
Michael Auli
VLM
28
20
0
12 Oct 2023
Lag-Llama: Towards Foundation Models for Probabilistic Time Series Forecasting
Kashif Rasul
Arjun Ashok
Andrew Robert Williams
Hena Ghonia
Rishika Bhagwatkar
...
Nicolas Chapados
Alexandre Drouin
Valentina Zantedeschi
Yuriy Nevmyvaka
Irina Rish
AI4TS
BDL
34
47
0
12 Oct 2023
Fast Word Error Rate Estimation Using Self-Supervised Representations for Speech and Text
Chanho Park
Chengsong Lu
Mingjie Chen
Thomas Hain
36
3
0
12 Oct 2023
On the Relevance of Phoneme Duration Variability of Synthesized Training Data for Automatic Speech Recognition
Nick Rossenbach
Benedikt Hilmes
Ralf Schlüter
26
3
0
12 Oct 2023
DASpeech: Directed Acyclic Transformer for Fast and High-quality Speech-to-Speech Translation
Qingkai Fang
Yan Zhou
Yangzhou Feng
45
7
0
11 Oct 2023
Investigating the Effect of Language Models in Sequence Discriminative Training for Neural Transducers
Zijian Yang
Wei Zhou
Ralf Schlüter
Hermann Ney
33
0
0
11 Oct 2023
Acoustic Model Fusion for End-to-end Speech Recognition
Zhihong Lei
Mingbin Xu
Shiyi Han
Leo Liu
Zhen Huang
...
Yuanyuan Zhang
Ernest Pusateri
Mirko Hannemann
Yaqiao Deng
Man-Hung Siu
29
5
0
10 Oct 2023
No Pitch Left Behind: Addressing Gender Unbalance in Automatic Speech Recognition through Pitch Manipulation
Dennis Fucci
Marco Gaido
Matteo Negri
Mauro Cettolo
L. Bentivogli
36
5
0
10 Oct 2023
On Time Domain Conformer Models for Monaural Speech Separation in Noisy Reverberant Acoustic Environments
William Ravenscroft
Stefan Goetze
Thomas Hain
38
7
0
09 Oct 2023
Leveraging Multilingual Self-Supervised Pretrained Models for Sequence-to-Sequence End-to-End Spoken Language Understanding
Pavel Denisov
Ngoc Thang Vu
31
1
0
09 Oct 2023
Tailoring Self-Attention for Graph via Rooted Subtrees
Siyuan Huang
Yunchong Song
Jiayue Zhou
Zhouhan Lin
35
8
0
08 Oct 2023
Enhancing Representations through Heterogeneous Self-Supervised Learning
Zhongyu Li
Bo-Wen Yin
Yongxiang Liu
Li Liu
Ming-Ming Cheng
SSL
30
2
0
08 Oct 2023
SA-Paraformer: Non-autoregressive End-to-End Speaker-Attributed ASR
Yangze Li
Fan Yu
Yuhao Liang
Pengcheng Guo
Mohan Shi
Zhihao Du
Shiliang Zhang
Lei Xie
24
3
0
07 Oct 2023
U-Style: Cascading U-nets with Multi-level Speaker and Style Modeling for Zero-Shot Voice Cloning
Tao Li
Zhichao Wang
Xinfa Zhu
Jian Cong
Qiao Tian
Yuping Wang
Lei Xie
DiffM
40
3
0
06 Oct 2023
Challenges and Insights: Exploring 3D Spatial Features and Complex Networks on the MISP Dataset
Yiwen Shao
23
0
0
05 Oct 2023
Neural Language Model Pruning for Automatic Speech Recognition
Leonardo Emili
Thiago Fraga-Silva
Ernest Pusateri
M. Nußbaum-Thom
Youssef Oualil
43
1
0
05 Oct 2023
LibriSpeech-PC: Benchmark for Evaluation of Punctuation and Capitalization Capabilities of end-to-end ASR Models
Aleksandr Meister
Matvei Novikov
Nikolay Karpov
Evelina Bakhturina
Vitaly Lavrukhin
Boris Ginsburg
22
12
0
04 Oct 2023
BA-MoE: Boundary-Aware Mixture-of-Experts Adapter for Code-Switching Speech Recognition
Peikun Chen
Fan Yu
Yuhao Liang
Hongfei Xue
Xucheng Wan
Naijun Zheng
Huan Zhou
Lei Xie
MoE
32
7
0
04 Oct 2023
ResidualTransformer: Residual Low-Rank Learning with Weight-Sharing for Transformer Layers
Yiming Wang
Jinyu Li
20
4
0
03 Oct 2023
AfriSpeech-200: Pan-African Accented Speech Dataset for Clinical and General Domain ASR
Tobi Olatunji
Tejumade Afonja
Aditya Yadavalli
Chris C. Emezue
Sahib Singh
...
Joanne I. Osuchukwu
Salomey Osei
A. Tonja
Naome A. Etori
Clinton Mbataku
40
16
0
30 Sep 2023
Contextual Biasing with the Knuth-Morris-Pratt Matching Algorithm
Weiran Wang
Zelin Wu
D. Caseiro
Tsendsuren Munkhdalai
K. Sim
...
Rohit Prabhavalkar
Zhong Meng
Ding Zhao
Tara N. Sainath
P. M. Mengibar
58
5
0
29 Sep 2023
The Gift of Feedback: Improving ASR Model Quality by Learning from User Corrections through Federated Learning
Lillian Zhou
Yuxin Ding
Mingqing Chen
Harry Zhang
Rohit Prabhavalkar
Dhruv Guliani
Giovanni Motta
Rajiv Mathews
8
1
0
29 Sep 2023
Federated Learning with Differential Privacy for End-to-End Speech Recognition
Martin Pelikan
Sheikh Shams Azam
Vitaly Feldman
Jan Honza Silovsky
Kunal Talwar
Tatiana Likhomanenko
58
7
0
29 Sep 2023
Improving Audio Captioning Models with Fine-grained Audio Features, Text Embedding Supervision, and LLM Mix-up Augmentation
Shih-Lun Wu
Xuankai Chang
Gordon Wichern
Jee-weon Jung
François G. Germain
Jonathan Le Roux
Shinji Watanabe
18
17
0
29 Sep 2023
Wiki-En-ASR-Adapt: Large-scale synthetic dataset for English ASR Customization
Alexandra Antonova
58
0
0
29 Sep 2023
Enhancing Code-switching Speech Recognition with Interactive Language Biases
Hexin Liu
Leibny Paola García
Jingze Lu
Wenchao Wang
Sanjeev Khudanpur
28
11
0
29 Sep 2023
Astroconformer: The Prospects of Analyzing Stellar Light Curves with Transformer-Based Deep Learning Models
Kishankumar Bhimani
Yuan-Sen Ting
Jie Yu
24
4
0
28 Sep 2023
PP-MeT: a Real-world Personalized Prompt based Meeting Transcription System
Xiang Lyu
Yuhang Cao
Qing Wang
Jingjing Yin
Yuguang Yang
Pengpeng Zou
G. Zachmann
Heng Lu
VLM
39
3
0
28 Sep 2023
LAE-ST-MoE: Boosted Language-Aware Encoder Using Speech Translation Auxiliary Task for E2E Code-switching ASR
Guodong Ma
Wenxuan Wang
Yuke Li
Yuting Yang
Binbin Du
Haoran Fu
31
5
0
28 Sep 2023
Hierarchical Cross-Modality Knowledge Transfer with Sinkhorn Attention for CTC-based ASR
Ambar Pal
Jeremias Sulam
Yu Tsao
Rene Vidal
23
2
0
28 Sep 2023
Exploring Speech Recognition, Translation, and Understanding with Discrete Speech Units: A Comparative Study
Xuankai Chang
Brian Yan
Kwanghee Choi
Jee-weon Jung
Yichen Lu
...
Pengcheng Guo
Yao-Fei Cheng
Pavel Denisov
Kohei Saijo
Hsiu-Hsuan Wang
40
38
0
27 Sep 2023
HyPoradise: An Open Baseline for Generative Speech Recognition with Large Language Models
Cheng Chen
Yuchen Hu
Chao-Han Huck Yang
Sabato Marco Siniscalchi
Pin-Yu Chen
Eng Siong Chng
34
42
0
27 Sep 2023
Enhancing End-to-End Conversational Speech Translation Through Target Language Context Utilization
A. Hussein
Brian Yan
Antonios Anastasopoulos
Shinji Watanabe
Sanjeev Khudanpur
46
3
0
27 Sep 2023
Speech collage: code-switched audio generation by collaging monolingual corpora
A. Hussein
Dorsa Zeinali
Ondřej Klejch
Sanjeev Khudanpur
Brian Yan
Shammur A. Chowdhury
Ahmed M. Ali
Shinji Watanabe
Sanjeev Khudanpur
27
1
0
27 Sep 2023
Generative Speech Recognition Error Correction with Large Language Models and Task-Activating Prompting
Chao-Han Huck Yang
Yile Gu
Yi-Chieh Liu
Shalini Ghosh
I. Bulyko
A. Stolcke
KELM
LRM
46
40
0
27 Sep 2023
Direct Models for Simultaneous Translation and Automatic Subtitling: FBK@IWSLT2023
Sara Papi
Marco Gaido
Matteo Negri
48
7
0
27 Sep 2023
DualVC 2: Dynamic Masked Convolution for Unified Streaming and Non-Streaming Voice Conversion
Ziqian Ning
Yuepeng Jiang
Pengcheng Zhu
Shuai Wang
Jixun Yao
Linfu Xie
Mengxiao Bi
28
5
0
27 Sep 2023
DefectHunter: A Novel LLM-Driven Boosted-Conformer-based Code Vulnerability Detection Mechanism
Jin Wang
Zishan Huang
Hengli Liu
Nianyi Yang
Yinhao Xiao
38
17
0
27 Sep 2023
Unsupervised Pre-Training for Vietnamese Automatic Speech Recognition in the HYKIST Project
Khai Le-Duc
17
2
0
26 Sep 2023
Updated Corpora and Benchmarks for Long-Form Speech Recognition
Jennifer Drexler Fox
Desh Raj
Natalie Delworth
Quinn Mcnamara
Corey Miller
Miguel Jetté
AuLLM
43
7
0
26 Sep 2023
Segment-Level Vectorized Beam Search Based on Partially Autoregressive Inference
Masao Someki
N. Eng
Yosuke Higuchi
Shinji Watanabe
21
0
0
26 Sep 2023
Learning from Flawed Data: Weakly Supervised Automatic Speech Recognition
Dongji Gao
Hainan Xu
Desh Raj
Leibny Paola García Perera
Daniel Povey
Sanjeev Khudanpur
38
4
0
26 Sep 2023
Exploring RWKV for Memory Efficient and Low Latency Streaming ASR
Keyu An
Shiliang Zhang
36
4
0
26 Sep 2023
DeWave: Discrete EEG Waves Encoding for Brain Dynamics to Text Translation
Yiqun Duan
Jinzhao Zhou
Zhen Wang
Yu-Kai Wang
Ching-Teng Lin
30
31
0
25 Sep 2023
AutoPrep: An Automatic Preprocessing Framework for In-the-Wild Speech Data
Jianwei Yu
Hangting Chen
Yanyao Bian
Xiang Li
Yimin Luo
Jinchuan Tian
Mengyang Liu
Jiayi Jiang
Shuai Wang
VLM
28
12
0
25 Sep 2023
Reproducing Whisper-Style Training Using an Open-Source Toolkit and Publicly Available Data
Yifan Peng
Jinchuan Tian
Brian Yan
Dan Berrebbi
Xuankai Chang
...
Yui Sudo
Muhammad Shakeel
Jee-weon Jung
Soumi Maiti
Shinji Watanabe
VLM
44
36
0
25 Sep 2023
Statistical Perspective of Top-K Sparse Softmax Gating Mixture of Experts
Huy Nguyen
Pedram Akbarian
Fanqi Yan
Nhat Ho
MoE
46
16
0
25 Sep 2023
Cross-modal Alignment with Optimal Transport for CTC-based ASR
Xugang Lu
Peng Shen
Yu Tsao
Hisashi Kawai
39
5
0
24 Sep 2023
The second multi-channel multi-party meeting transcription challenge (M2MeT 2.0): A benchmark for speaker-attributed ASR
Yuhao Liang
Mohan Shi
Fan Yu
Yangze Li
Shiliang Zhang
...
Jian Wu
Zhuo Chen
Kong Aik Lee
Zhijie Yan
Hui Bu
35
5
0
24 Sep 2023
Human Transcription Quality Improvement
Jian Gao
Hanbo Sun
Cheng Cao
Zheng Du
48
2
0
24 Sep 2023