ResearchTrend.AI
Mamba in Speech: Towards an Alternative to Self-Attention


21 May 2024
Xiangyu Zhang
Qiquan Zhang
Hexin Liu
Tianyi Xiao
Xinyuan Qian
Beena Ahmed
E. Ambikairajah
Haizhou Li
Julien Epps
    Mamba

Papers citing "Mamba in Speech: Towards an Alternative to Self-Attention"

31 / 31 papers shown
Title
Active Speech Enhancement: Active Speech Denoising Decliping and Deveraberation
Ofir Yaish
Yehuda Mishaly
Eliya Nachmani
142
0
0
22 May 2025
CleanUMamba: A Compact Mamba Network for Speech Denoising using Channel Pruning
Sjoerd Groot
Qinyu Chen
Jan C. van Gemert
Chang Gao
Mamba
360
0
0
14 Oct 2024
Joint Fine-tuning and Conversion of Pretrained Speech and Language Models towards Linear Complexity
Mutian He
Philip N. Garner
140
0
0
09 Oct 2024
DocMamba: Efficient Document Pre-training with State Space Model
Pengfei Hu
Zhenrong Zhang
Jiefeng Ma
Shuhang Liu
Jun Du
Jianshu Zhang
Mamba
56
1
0
18 Sep 2024
An Investigation of Incorporating Mamba for Speech Enhancement
Rong-Yu Chao
Wen-Huang Cheng
Moreno La Quatra
Sabato Marco Siniscalchi
Chao-Han Huck Yang
Szu-Wei Fu
Yu Tsao
Mamba
80
30
0
10 May 2024
Reducing Language confusion for Code-switching Speech Recognition with Token-level Language Diarization
Hexin Liu
Haihua Xu
Leibny Paola García
Andy W. H. Khong
Yi He
Sanjeev Khudanpur
37
24
0
26 Oct 2022
CMGAN: Conformer-Based Metric-GAN for Monaural Speech Enhancement
Sherif Abdulatif
Ru Cao
Bin Yang
46
72
0
22 Sep 2022
Branchformer: Parallel MLP-Attention Architectures to Capture Local and Global Context for Speech Recognition and Understanding
Yifan Peng
Siddharth Dalmia
Ian Lane
Shinji Watanabe
63
147
0
06 Jul 2022
On the Parameterization and Initialization of Diagonal State Space Models
Albert Gu
Ankit Gupta
Karan Goel
Christopher Ré
71
314
0
23 Jun 2022
Diagonal State Spaces are as Effective as Structured State Spaces
Ankit Gupta
Albert Gu
Jonathan Berant
105
305
0
27 Mar 2022
Speech Denoising in the Waveform Domain with Self-Attention
Zhifeng Kong
Ming-Yu Liu
Ambrish Dantrey
Bryan Catanzaro
39
62
0
15 Feb 2022
WavLM: Large-Scale Self-Supervised Pre-Training for Full Stack Speech Processing
Sanyuan Chen
Chengyi Wang
Zhengyang Chen
Yu-Huan Wu
Shujie Liu
...
Yao Qian
Jian Wu
Michael Zeng
Xiangzhan Yu
Furu Wei
SSL
206
1,846
0
26 Oct 2021
DCCRN+: Channel-wise Subband DCCRN with SNR Estimation for Speech Enhancement
Shubo Lv
Yanxin Hu
Shimin Zhang
Lei Xie
43
93
0
16 Jun 2021
MetricGAN+: An Improved Version of MetricGAN for Speech Enhancement
Szu-Wei Fu
Cheng Yu
Tsun-An Hsieh
Peter William VanHarn Plantinga
Mirco Ravanelli
Xugang Lu
Yu Tsao
61
216
0
08 Apr 2021
ViViT: A Video Vision Transformer
Anurag Arnab
Mostafa Dehghani
G. Heigold
Chen Sun
Mario Lucic
Cordelia Schmid
ViT
185
2,137
0
29 Mar 2021
Swin Transformer: Hierarchical Vision Transformer using Shifted Windows
Ze Liu
Yutong Lin
Yue Cao
Han Hu
Yixuan Wei
Zheng Zhang
Stephen Lin
B. Guo
ViT
400
21,281
0
25 Mar 2021
GPT Understands, Too
Xiao Liu
Yanan Zheng
Zhengxiao Du
Ming Ding
Yujie Qian
Zhilin Yang
Jie Tang
VLM
155
1,173
0
18 Mar 2021
Speech Enhancement Using Multi-Stage Self-Attentive Temporal Convolutional Networks
Ju Lin
A. Wijngaarden
Kuang-Ching Wang
M. C. Smith
44
51
0
24 Feb 2021
HiPPO: Recurrent Memory with Optimal Polynomial Projections
Albert Gu
Tri Dao
Stefano Ermon
Atri Rudra
Christopher Ré
100
512
0
17 Aug 2020
DCCRN: Deep Complex Convolution Recurrent Network for Phase-Aware Speech Enhancement
Yanxin Hu
Yun Liu
Shubo Lv
Mengtao Xing
Shimin Zhang
Yihui Fu
Jian Wu
Bihong Zhang
Lei Xie
50
591
0
01 Aug 2020
The ASRU 2019 Mandarin-English Code-Switching Speech Recognition Challenge: Open Datasets, Tracks, Methods and Results
Xian Shi
Qiangze Feng
Lei Xie
30
47
0
12 Jul 2020
Real Time Speech Enhancement in the Waveform Domain
Alexandre Défossez
Gabriel Synnaeve
Yossi Adi
69
457
0
23 Jun 2020
HiFi-GAN: High-Fidelity Denoising and Dereverberation Based on Speech Deep Features in Adversarial Networks
Jiaqi Su
Zeyu Jin
Adam Finkelstein
62
139
0
10 Jun 2020
Conformer: Convolution-augmented Transformer for Speech Recognition
Anmol Gulati
James Qin
Chung-Cheng Chiu
Niki Parmar
Yu Zhang
...
Wei Han
Shibo Wang
Zhengdong Zhang
Yonghui Wu
Ruoming Pang
210
3,119
0
16 May 2020
Speech Enhancement using Self-Adaptation and Multi-Head Self-Attention
Yuma Koizumi
Kohei Yatabe
Marc Delcroix
Yoshiki Masuyama
Daiki Takeuchi
43
125
0
14 Feb 2020
Improving GANs for Speech Enhancement
Huy P Phan
Ian Mcloughlin
L. D. Pham
Oliver Y. Chén
P. Koch
M. D. Vos
Alfred Mertins
54
115
0
15 Jan 2020
PHASEN: A Phase-and-Harmonics-Aware Speech Enhancement Network
Dacheng Yin
Chong Luo
Zhiwei Xiong
Wenjun Zeng
61
316
0
12 Nov 2019
MetricGAN: Generative Adversarial Networks based Black-box Metric Scores Optimization for Speech Enhancement
Szu-Wei Fu
Chien-Feng Liao
Yu Tsao
Shou-De Lin
43
331
0
13 May 2019
Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context
Zihang Dai
Zhilin Yang
Yiming Yang
J. Carbonell
Quoc V. Le
Ruslan Salakhutdinov
VLM
190
3,721
0
09 Jan 2019
ESPnet: End-to-End Speech Processing Toolkit
Shinji Watanabe
Takaaki Hori
Shigeki Karita
Tomoki Hayashi
Jiro Nishitoba
...
Jahn Heymann
Sanjeev Khudanpur
Nanxin Chen
Adithya Renduchintala
Tsubasa Ochiai
VLM
93
1,501
0
30 Mar 2018
SEGAN: Speech Enhancement Generative Adversarial Network
Santiago Pascual
Antonio Bonafonte
Joan Serrà
GAN
76
1,143
0
28 Mar 2017