ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2210.00077
  4. Cited By
E-Branchformer: Branchformer with Enhanced merging for speech
  recognition

E-Branchformer: Branchformer with Enhanced merging for speech recognition

30 September 2022
Kwangyoun Kim
Felix Wu
Yifan Peng
Jing Pan
Prashant Sridhar
Kyu Jeong Han
Shinji Watanabe
ArXivPDFHTML

Papers citing "E-Branchformer: Branchformer with Enhanced merging for speech recognition"

50 / 62 papers shown
Title
Unveiling the Best Practices for Applying Speech Foundation Models to Speech Intelligibility Prediction for Hearing-Impaired People
Unveiling the Best Practices for Applying Speech Foundation Models to Speech Intelligibility Prediction for Hearing-Impaired People
Haoshuai Zhou
Boxuan Cao
Changgeng Mo
Linkai Li
Shan Xiang Wang
AI4CE
31
0
0
13 May 2025
A Survey on Cross-Modal Interaction Between Music and Multimodal Data
A Survey on Cross-Modal Interaction Between Music and Multimodal Data
Sifei Li
Mining Tan
Feier Shen
Minyan Luo
Zijiao Yin
Fan Tang
W. Dong
Changsheng Xu
69
0
0
17 Apr 2025
Dolphin: A Large-Scale Automatic Speech Recognition Model for Eastern Languages
Dolphin: A Large-Scale Automatic Speech Recognition Model for Eastern Languages
Yangyang Meng
Jinpeng Li
Guodong Lin
Yu Pu
G. Wang
Hu Du
Zhiming Shao
Yukai Huang
Ke Li
Wei-Qiang Zhang
ObjD
99
0
0
26 Mar 2025
SeniorTalk: A Chinese Conversation Dataset with Rich Annotations for Super-Aged Seniors
SeniorTalk: A Chinese Conversation Dataset with Rich Annotations for Super-Aged Seniors
Yang Chen
Hui Wang
Shiyao Wang
Jianfei Chen
Jiabei He
Jiaming Zhou
Xi Yang
Yali Wang
Yonghua Lin
Yong Qin
38
0
0
20 Mar 2025
CR-CTC: Consistency regularization on CTC for improved speech recognition
CR-CTC: Consistency regularization on CTC for improved speech recognition
Zengwei Yao
Wei Kang
Xiaoyu Yang
Fangjun Kuang
Liyong Guo
Han Zhu
Zengrui Jin
Zhaoqing Li
Long Lin
Daniel Povey
53
0
0
17 Feb 2025
Aligner-Encoders: Self-Attention Transformers Can Be Self-Transducers
Aligner-Encoders: Self-Attention Transformers Can Be Self-Transducers
Adam Stooke
Rohit Prabhavalkar
K. Sim
P. M. Mengibar
39
0
0
06 Feb 2025
Summary of the NOTSOFAR-1 Challenge: Highlights and Learnings
Summary of the NOTSOFAR-1 Challenge: Highlights and Learnings
Igor Abramovski
Alon Vinnikov
Shalev Shaer
Naoyuki Kanda
Xiaofei Wang
Amir Ivry
Eyal Krupka
39
0
0
28 Jan 2025
Complexity boosted adaptive training for better low resource ASR
  performance
Complexity boosted adaptive training for better low resource ASR performance
Hongxuan Lu
Shenjian Wang
Biao Li
72
0
0
01 Dec 2024
STCON System for the CHiME-8 Challenge
STCON System for the CHiME-8 Challenge
Anton Mitrofanov
Tatiana Prisyach
Tatiana Timofeeva
Sergei Novoselov
M. Korenevsky
...
Dmitriy Miroshnichenko
Nikita Mamaev
Ilya Odegov
Olga Rudnitskaya
A. Romanenko
26
1
0
17 Oct 2024
FastAdaSP: Multitask-Adapted Efficient Inference for Large Speech
  Language Model
FastAdaSP: Multitask-Adapted Efficient Inference for Large Speech Language Model
Yichen Lu
Jiaqi Song
Chao-Han Huck Yang
Shinji Watanabe
21
0
0
03 Oct 2024
The Conformer Encoder May Reverse the Time Dimension
The Conformer Encoder May Reverse the Time Dimension
Robin Schmitt
Albert Zeyer
Mohammad Zeineldeen
Ralf Schluter
Hermann Ney
31
0
0
01 Oct 2024
Robust Audiovisual Speech Recognition Models with Mixture-of-Experts
Robust Audiovisual Speech Recognition Models with Mixture-of-Experts
Yihan Wu
Yifan Peng
Yichen Lu
Xuankai Chang
Ruihua Song
Shinji Watanabe
46
2
0
19 Sep 2024
Exploring the Impact of Data Quantity on ASR in Extremely Low-resource
  Languages
Exploring the Impact of Data Quantity on ASR in Extremely Low-resource Languages
Yao-Fei Cheng
Li-Wei Chen
Hung-Shin Lee
Hsin-Min Wang
21
0
0
13 Sep 2024
Findings of the 2024 Mandarin Stuttering Event Detection and Automatic
  Speech Recognition Challenge
Findings of the 2024 Mandarin Stuttering Event Detection and Automatic Speech Recognition Challenge
Hongfei Xue
Rong Gong
Mingchen Shao
Xin Xu
L. xilinx Wang
...
Yong Qin
Jun Du
Ming Li
Binbin Zhang
Bin Jia
23
1
0
09 Sep 2024
Tailored Design of Audio-Visual Speech Recognition Models using Branchformers
Tailored Design of Audio-Visual Speech Recognition Models using Branchformers
David Gimeno-Gómez
Carlos David Martínez Hinarejos
88
2
0
09 Jul 2024
Ternary Spike-based Neuromorphic Signal Processing System
Ternary Spike-based Neuromorphic Signal Processing System
Shuai Wang
Dehao Zhang
A. Belatreche
Yichen Xiao
Hongyu Qing
Wenjie We
Malu Zhang
Yang Yang
40
7
0
07 Jul 2024
Finetuning End-to-End Models for Estonian Conversational Spoken Language
  Translation
Finetuning End-to-End Models for Estonian Conversational Spoken Language Translation
Tiia Sildam
Andra Velve
Tanel Alumäe
40
0
0
04 Jul 2024
Multi-Convformer: Extending Conformer with Multiple Convolution Kernels
Multi-Convformer: Extending Conformer with Multiple Convolution Kernels
Darshan Prabhu
Yifan Peng
P. Jyothi
Shinji Watanabe
39
0
0
04 Jul 2024
Towards Robust Speech Representation Learning for Thousands of Languages
Towards Robust Speech Representation Learning for Thousands of Languages
William Chen
Wangyou Zhang
Yifan Peng
Xinjian Li
Jinchuan Tian
Jiatong Shi
Xuankai Chang
Soumi Maiti
Karen Livescu
Shinji Watanabe
ELM
42
6
0
30 Jun 2024
Streaming Decoder-Only Automatic Speech Recognition with Discrete Speech
  Units: A Pilot Study
Streaming Decoder-Only Automatic Speech Recognition with Discrete Speech Units: A Pilot Study
Peikun Chen
Sining Sun
Changhao Shan
Qing Yang
Lei Xie
40
2
0
27 Jun 2024
Exploring the Capability of Mamba in Speech Applications
Exploring the Capability of Mamba in Speech Applications
Koichi Miyazaki
Yoshiki Masuyama
Masato Murata
Mamba
40
12
0
24 Jun 2024
Children's Speech Recognition through Discrete Token Enhancement
Children's Speech Recognition through Discrete Token Enhancement
Vrunda N. Sukhadia
Shammur A. Chowdhury
40
1
0
19 Jun 2024
CNVSRC 2023: The First Chinese Continuous Visual Speech Recognition
  Challenge
CNVSRC 2023: The First Chinese Continuous Visual Speech Recognition Challenge
Chen Chen
Zehua Liu
Xiaolou Li
Lantian Li
D. Wang
35
2
0
14 Jun 2024
DiscreteSLU: A Large Language Model with Self-Supervised Discrete Speech
  Units for Spoken Language Understanding
DiscreteSLU: A Large Language Model with Self-Supervised Discrete Speech Units for Spoken Language Understanding
Suwon Shon
Kwangyoun Kim
Yi-Te Hsu
Prashant Sridhar
Shinji Watanabe
Karen Livescu
AuLLM
46
2
0
13 Jun 2024
On the Effects of Heterogeneous Data Sources on Speech-to-Text
  Foundation Models
On the Effects of Heterogeneous Data Sources on Speech-to-Text Foundation Models
Jinchuan Tian
Yifan Peng
William Chen
Kwanghee Choi
Karen Livescu
Shinji Watanabe
26
5
0
13 Jun 2024
ML-SUPERB 2.0: Benchmarking Multilingual Speech Models Across Modeling
  Constraints, Languages, and Datasets
ML-SUPERB 2.0: Benchmarking Multilingual Speech Models Across Modeling Constraints, Languages, and Datasets
Jiatong Shi
Shih-Heng Wang
William Chen
Martijn Bartelds
Vanya Bannihatti Kumar
...
Xuankai Chang
Dan Jurafsky
Karen Livescu
Hung-yi Lee
Shinji Watanabe
AuLLM
77
5
0
12 Jun 2024
Neural Blind Source Separation and Diarization for Distant Speech
  Recognition
Neural Blind Source Separation and Diarization for Distant Speech Recognition
Yoshiaki Bando
Tomohiko Nakamura
Shinji Watanabe
BDL
31
5
0
12 Jun 2024
The Interspeech 2024 Challenge on Speech Processing Using Discrete Units
The Interspeech 2024 Challenge on Speech Processing Using Discrete Units
Xuankai Chang
Jiatong Shi
Jinchuan Tian
Yuning Wu
Yuxun Tang
Yihan Wu
Shinji Watanabe
Yossi Adi
Xie Chen
Qin Jin
45
15
0
11 Jun 2024
Denoising LM: Pushing the Limits of Error Correction Models for Speech
  Recognition
Denoising LM: Pushing the Limits of Error Correction Models for Speech Recognition
Zijin Gu
Tatiana Likhomanenko
Richard He Bai
Erik McDermott
R. Collobert
Navdeep Jaitly
AuLLM
48
2
0
24 May 2024
Low-resource speech recognition and dialect identification of Irish in a
  multi-task framework
Low-resource speech recognition and dialect identification of Irish in a multi-task framework
Liam Lonergan
Mengjie Qian
Neasa Ní Chiaráin
Christer Gobl
A. N. Chasaide
43
2
0
02 May 2024
EfficientASR: Speech Recognition Network Compression via Attention
  Redundancy and Chunk-Level FFN Optimization
EfficientASR: Speech Recognition Network Compression via Attention Redundancy and Chunk-Level FFN Optimization
Jianzong Wang
Ziqi Liang
Xulong Zhang
Ning Cheng
Jing Xiao
35
0
0
30 Apr 2024
Teaching a Multilingual Large Language Model to Understand Multilingual
  Speech via Multi-Instructional Training
Teaching a Multilingual Large Language Model to Understand Multilingual Speech via Multi-Instructional Training
Pavel Denisov
Ngoc Thang Vu
38
2
0
16 Apr 2024
Enhancing Lip Reading with Multi-Scale Video and Multi-Encoder
Enhancing Lip Reading with Multi-Scale Video and Multi-Encoder
He Wang
Pengcheng Guo
Xucheng Wan
Huan Zhou
Lei Xie
18
2
0
08 Apr 2024
Alternating Weak Triphone/BPE Alignment Supervision from Hybrid Model
  Improves End-to-End ASR
Alternating Weak Triphone/BPE Alignment Supervision from Hybrid Model Improves End-to-End ASR
Jintao Jiang
Yingbo Gao
Mohammad Zeineldeen
Zoltán Tüske
34
0
0
23 Feb 2024
OWSM-CTC: An Open Encoder-Only Speech Foundation Model for Speech
  Recognition, Translation, and Language Identification
OWSM-CTC: An Open Encoder-Only Speech Foundation Model for Speech Recognition, Translation, and Language Identification
Yifan Peng
Yui Sudo
Muhammad Shakeel
Shinji Watanabe
VLM
37
17
0
20 Feb 2024
OWSM v3.1: Better and Faster Open Whisper-Style Speech Models based on
  E-Branchformer
OWSM v3.1: Better and Faster Open Whisper-Style Speech Models based on E-Branchformer
Yifan Peng
Jinchuan Tian
William Chen
Siddhant Arora
Brian Yan
...
Kwanghee Choi
Jiatong Shi
Xuankai Chang
Jee-weon Jung
Shinji Watanabe
VLM
OSLM
31
40
0
30 Jan 2024
The NPU-ASLP-LiAuto System Description for Visual Speech Recognition in
  CNVSRC 2023
The NPU-ASLP-LiAuto System Description for Visual Speech Recognition in CNVSRC 2023
He Wang
Pengcheng Guo
Wei Chen
Pan Zhou
Lei Xie
18
2
0
07 Jan 2024
MLCA-AVSR: Multi-Layer Cross Attention Fusion based Audio-Visual Speech
  Recognition
MLCA-AVSR: Multi-Layer Cross Attention Fusion based Audio-Visual Speech Recognition
He Wang
Pengcheng Guo
Pan Zhou
Lei Xie
18
12
0
07 Jan 2024
Automatic channel selection and spatial feature integration for
  multi-channel speech recognition across various array topologies
Automatic channel selection and spatial feature integration for multi-channel speech recognition across various array topologies
Bingshen Mu
Pengcheng Guo
Dake Guo
Pan Zhou
Wei Chen
Lei Xie
30
2
0
15 Dec 2023
Weak Alignment Supervision from Hybrid Model Improves End-to-end ASR
Weak Alignment Supervision from Hybrid Model Improves End-to-end ASR
Jintao Jiang
Yingbo Gao
Zoltán Tüske
21
1
0
24 Nov 2023
Loss Masking Is Not Needed in Decoder-only Transformer for
  Discrete-token-based ASR
Loss Masking Is Not Needed in Decoder-only Transformer for Discrete-token-based ASR
Qian Chen
Wen Wang
Qinglin Zhang
Siqi Zheng
Shiliang Zhang
Chong Deng
Yukun Ma
Hai Yu
Jiaqing Liu
Chong Zhang
13
8
0
08 Nov 2023
Zipformer: A faster and better encoder for automatic speech recognition
Zipformer: A faster and better encoder for automatic speech recognition
Zengwei Yao
Liyong Guo
Xiaoyu Yang
Wei Kang
Fangjun Kuang
Yifan Yang
Zengrui Jin
Long Lin
Daniel Povey
VLM
25
65
0
17 Oct 2023
Cross-Modal Multi-Tasking for Speech-to-Text Translation via Hard
  Parameter Sharing
Cross-Modal Multi-Tasking for Speech-to-Text Translation via Hard Parameter Sharing
B. Grimstad
Xuankai Chang
Antonios Anastasopoulos
Yuya Fujita
Shinji Watanabe
23
2
0
27 Sep 2023
Exploring Speech Recognition, Translation, and Understanding with
  Discrete Speech Units: A Comparative Study
Exploring Speech Recognition, Translation, and Understanding with Discrete Speech Units: A Comparative Study
Xuankai Chang
Brian Yan
Kwanghee Choi
Jee-weon Jung
Yichen Lu
...
Pengcheng Guo
Yao-Fei Cheng
Pavel Denisov
Kohei Saijo
Hsiu-Hsuan Wang
28
36
0
27 Sep 2023
Joint Prediction and Denoising for Large-scale Multilingual
  Self-supervised Learning
Joint Prediction and Denoising for Large-scale Multilingual Self-supervised Learning
William Chen
Jiatong Shi
Brian Yan
Dan Berrebbi
Wangyou Zhang
Yifan Peng
Xuankai Chang
Soumi Maiti
Shinji Watanabe
24
8
0
26 Sep 2023
Segment-Level Vectorized Beam Search Based on Partially Autoregressive
  Inference
Segment-Level Vectorized Beam Search Based on Partially Autoregressive Inference
Masao Someki
N. Eng
Yosuke Higuchi
Shinji Watanabe
13
0
0
26 Sep 2023
Reproducing Whisper-Style Training Using an Open-Source Toolkit and
  Publicly Available Data
Reproducing Whisper-Style Training Using an Open-Source Toolkit and Publicly Available Data
Yifan Peng
Jinchuan Tian
Brian Yan
Dan Berrebbi
Xuankai Chang
...
Yui Sudo
Muhammad Shakeel
Jee-weon Jung
Soumi Maiti
Shinji Watanabe
VLM
36
35
0
25 Sep 2023
Unimodal Aggregation for CTC-based Speech Recognition
Unimodal Aggregation for CTC-based Speech Recognition
Ying Fang
Xiaofei Li
15
1
0
15 Sep 2023
Voxtlm: unified decoder-only models for consolidating speech
  recognition/synthesis and speech/text continuation tasks
Voxtlm: unified decoder-only models for consolidating speech recognition/synthesis and speech/text continuation tasks
Soumi Maiti
Yifan Peng
Shukjae Choi
Jee-weon Jung
Xuankai Chang
Shinji Watanabe
VLM
AuLLM
16
56
0
14 Sep 2023
SummaryMixing: A Linear-Complexity Alternative to Self-Attention for
  Speech Recognition and Understanding
SummaryMixing: A Linear-Complexity Alternative to Self-Attention for Speech Recognition and Understanding
Titouan Parcollet
Rogier van Dalen
Shucong Zhang
S. Bhattacharya
26
6
0
12 Jul 2023
12
Next