Can Mamba Learn How to Learn? A Comparative Study on In-Context Learning Tasks

6 February 2024
Jongho Park
Jaeseung Park
Zheyang Xiong
Nayoung Lee
Jaewoong Cho
Samet Oymak
Kangwook Lee
Dimitris Papailiopoulos

Papers citing "Can Mamba Learn How to Learn? A Comparative Study on In-Context Learning Tasks"

50 / 52 papers shown
Turbo-ICL: In-Context Learning-Based Turbo Equalization
Zihang Song
Matteo Zecchin
Bipin Rajendran
Osvaldo Simeone
39
0
0
09 May 2025
Mixture of Sparse Attention: Content-Based Learnable Sparse Attention via Expert-Choice Routing
Piotr Piekos
Róbert Csordás
Jürgen Schmidhuber
MoE
VLM
96
1
0
01 May 2025
Understanding the Skill Gap in Recurrent Language Models: The Role of the Gather-and-Aggregate Mechanism
Aviv Bick
Eric P. Xing
Albert Gu
RALM
88
0
0
22 Apr 2025
Gating is Weighting: Understanding Gated Linear Attention through In-context Learning
Yingcong Li
Davoud Ataee Tarzanagh
A. S. Rawat
Maryam Fazel
Samet Oymak
25
0
0
06 Apr 2025
Attention Mamba: Time Series Modeling with Adaptive Pooling Acceleration and Receptive Field Enhancements
Sijie Xiong
Shuqing Liu
Cheng Tang
Fumiya Okubo
Haoling Xiong
Atsushi Shimada
Mamba
AI4TS
60
0
0
02 Apr 2025
Resona: Improving Context Copying in Linear Recurrence Models with Retrieval
Xihuai Wang
Linrui Ma
Jerry Huang
Peng Lu
Prasanna Parthasarathi
Xiao-Wen Chang
Boxing Chen
Yufei Cui
KELM
45
1
0
28 Mar 2025
VideoMAP: Toward Scalable Mamba-based Video Autoregressive Pretraining
Yunze Liu
Peiran Wu
C. Liang
Junxiao Shen
Limin Wang
Li Yi
Mamba
53
0
0
16 Mar 2025
In-Context Learning with Hypothesis-Class Guidance
Ziqian Lin
Shubham Kumar Bharti
Kangwook Lee
76
0
0
27 Feb 2025
Merging Context Clustering with Visual State Space Models for Medical Image Segmentation
Yun Zhu
Dong Zhang
Yi-Mou Lin
Yifei Feng
Jinhui Tang
Mamba
27
1
0
03 Jan 2025
Marconi: Prefix Caching for the Era of Hybrid LLMs
Rui Pan
Zhuang Wang
Zhen Jia
Can Karakus
Luca Zancato
Tri Dao
Ravi Netravali
Yida Wang
95
4
0
28 Nov 2024
Hymba: A Hybrid-head Architecture for Small Language Models
Xin Dong
Y. Fu
Shizhe Diao
Wonmin Byeon
Zijia Chen
...
Min-Hung Chen
Yoshi Suhara
Y. Lin
Jan Kautz
Pavlo Molchanov
Mamba
100
21
0
20 Nov 2024
Can Custom Models Learn In-Context? An Exploration of Hybrid Architecture Performance on In-Context Learning Tasks
Ryan Campbell
Nelson Lojo
Kesava Viswanadha
Christoffer Grondal Tryggestad
Derrick Han Sun
Sriteja Vijapurapu
August Rolfsen
Anant Sahai
31
0
0
06 Nov 2024
DiMSUM: Diffusion Mamba -- A Scalable and Unified Spatial-Frequency Method for Image Generation
Hao Phung
Quan Dao
T. Dao
Hoang Phan
Dimitris Metaxas
Anh Tran
Mamba
64
4
0
06 Nov 2024
Revealing and Mitigating the Local Pattern Shortcuts of Mamba
Wangjie You
Zecheng Tang
Juntao Li
Lili Yao
Min Zhang
Mamba
24
0
0
21 Oct 2024
Mamba4Cast: Efficient Zero-Shot Time Series Forecasting with State Space Models
Sathya Kamesh Bhethanabhotla
Omar Swelam
Julien N. Siems
David Salinas
Frank Hutter
Mamba
AI4TS
AI4CE
43
3
0
12 Oct 2024
Parameter-Efficient Fine-Tuning of State Space Models
Kevin Galim
Wonjun Kang
Yuchen Zeng
H. Koo
Kangwook Lee
29
4
0
11 Oct 2024
A Survey: Collaborative Hardware and Software Design in the Era of Large Language Models
Cong Guo
Feng Cheng
Zhixu Du
James Kiessling
Jonathan Ku
...
Qilin Zheng
Guanglei Zhou
Hai
Li-Wei Li
Yiran Chen
31
7
0
08 Oct 2024
Everything Everywhere All at Once: LLMs can In-Context Learn Multiple Tasks in Superposition
Zheyang Xiong
Ziyang Cai
John Cooper
Albert Ge
Vasilis Papageorgiou
...
Saurabh Agarwal
Grigorios G Chrysos
Samet Oymak
Kangwook Lee
Dimitris Papailiopoulos
LRM
35
1
0
08 Oct 2024
Task Diversity Shortens the ICL Plateau
Jaeyeon Kim
Sehyun Kwon
Joo Young Choi
Jongho Park
Jaewoong Cho
Jason D. Lee
Ernest K. Ryu
MoMe
31
2
0
07 Oct 2024
Can Mamba Always Enjoy the "Free Lunch"?
Ruifeng Ren
Zhicong Li
Yong Liu
44
1
0
04 Oct 2024
Training Nonlinear Transformers for Chain-of-Thought Inference: A Theoretical Generalization Analysis
Hongkang Li
Meng Wang
Songtao Lu
Xiaodong Cui
Pin-Yu Chen
LRM
27
5
0
03 Oct 2024
A Little Goes a Long Way: Efficient Long Context Training and Inference with Partial Contexts
Suyu Ge
Xihui Lin
Yunan Zhang
Jiawei Han
Hao Peng
31
4
0
02 Oct 2024
Mitigating Copy Bias in In-Context Learning through Neuron Pruning
Ameen Ali
Lior Wolf
Ivan Titov
36
2
0
02 Oct 2024
Integration of Mamba and Transformer -- MAT for Long-Short Range Time Series Forecasting with Application to Weather Dynamics
Wenqing Zhang
Junming Huang
Ruotong Wang
Changsong Wei
Wenqian Huang
Yuxin Qiao
Mamba
32
10
0
13 Sep 2024
Shaking Up VLMs: Comparing Transformers and Structured State Space Models for Vision & Language Modeling
Georgios Pantazopoulos
Malvina Nikandrou
Alessandro Suglia
Oliver Lemon
Arash Eshghi
Mamba
45
1
0
09 Sep 2024
DualKanbaFormer: An Efficient Selective Sparse Framework for Multimodal Aspect-based Sentiment Analysis
A. Lawan
Juhua Pu
Haruna Yunusa
Muhammad Lawan
Aliyu Umar
Adamu Sani Yahya
Mahmoud Basi
Mamba
39
0
0
27 Aug 2024
Simplified Mamba with Disentangled Dependency Encoding for Long-Term Time Series Forecasting
Zixuan Weng
Jindong Han
Wenzhao Jiang
Hao Liu
Mamba
AI4TS
33
2
0
22 Aug 2024
DyG-Mamba: Continuous State Space Modeling on Dynamic Graphs
Dongyuan Li
Shiyin Tan
Ying Zhang
Ming Jin
Shirui Pan
Manabu Okumura
Renhe Jiang
Mamba
34
3
0
13 Aug 2024
A Survey of Mamba
Shuwei Shi
Shibing Chu
Rui An
Wenqi Fan
Yuee Xie
Hui Liu
Yuanping Chen
Qing Li
AI4CE
40
26
0
02 Aug 2024
Fine-grained Analysis of In-context Linear Estimation: Data, Architecture, and Beyond
Yingcong Li
A. S. Rawat
Samet Oymak
25
6
0
13 Jul 2024
How Well Can a Long Sequence Model Model Long Sequences? Comparing Architechtural Inductive Biases on Long-Context Abilities
Jerry Huang
52
7
0
11 Jul 2024
On the Power of Convolution Augmented Transformer
Mingchen Li
Xuechen Zhang
Yixiao Huang
Samet Oymak
32
0
0
08 Jul 2024
Venturing into Uncharted Waters: The Navigation Compass from Transformer to Mamba
Yuchen Zou
Yineng Chen
Zuchao Li
Lefei Zhang
Hai Zhao
52
1
0
24 Jun 2024
XLand-100B: A Large-Scale Multi-Task Dataset for In-Context Reinforcement Learning
Alexander Nikulin
Ilya Zisman
Alexey Zemtsov
Viacheslav Sinii
107
4
0
13 Jun 2024
An Empirical Study of Mamba-based Language Models
R. Waleffe
Wonmin Byeon
Duncan Riach
Brandon Norick
V. Korthikanti
...
Vartika Singh
Jared Casper
Jan Kautz
M. Shoeybi
Bryan Catanzaro
61
64
0
12 Jun 2024
Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling
Liliang Ren
Yang Liu
Yadong Lu
Yelong Shen
Chen Liang
Weizhu Chen
Mamba
74
56
0
11 Jun 2024
Dimba: Transformer-Mamba Diffusion Models
Zhengcong Fei
Mingyuan Fan
Changqian Yu
Debang Li
Youqiang Zhang
Junshi Huang
Mamba
62
16
0
03 Jun 2024
Zamba: A Compact 7B SSM Hybrid Model
Paolo Glorioso
Quentin G. Anthony
Yury Tokpanov
James Whittington
Jonathan Pilault
Adam Ibrahim
Beren Millidge
30
45
0
26 May 2024
PoinTramba: A Hybrid Transformer-Mamba Framework for Point Cloud Analysis
Zicheng Wang
Zhen Chen
Yiming Wu
Zhen Zhao
Luping Zhou
Dong Xu
Mamba
45
14
0
24 May 2024
PointRWKV: Efficient RWKV-Like Model for Hierarchical Point Cloud Learning
Qingdong He
Jiangning Zhang
Jinlong Peng
Haoyang He
Yabiao Wang
Chengjie Wang
3DPC
40
12
0
24 May 2024
Visual Mamba: A Survey and New Outlooks
Rui Xu
Shu Yang
Yihui Wang
Yu Cai
Bo Du
Hao Chen
Mamba
42
26
0
29 Apr 2024
Mamba-360: Survey of State Space Models as Transformer Alternative for Long Sequence Modelling: Methods, Applications, and Challenges
Badri N. Patro
Vijay Srinivas Agneeswaran
Mamba
46
38
0
24 Apr 2024
A Survey on Efficient Inference for Large Language Models
Zixuan Zhou
Xuefei Ning
Ke Hong
Tianyu Fu
Jiaming Xu
...
Shengen Yan
Guohao Dai
Xiao-Ping Zhang
Yuhan Dong
Yu-Xiang Wang
46
83
0
22 Apr 2024
State Space Model for New-Generation Network Alternative to Transformers: A Survey
Xiao Wang
Shiao Wang
Yuhe Ding
Yuehang Li
Wentao Wu
...
Bowei Jiang
Chenglong Li
Yaowei Wang
Yonghong Tian
Jin Tang
Mamba
33
49
0
15 Apr 2024
Jamba: A Hybrid Transformer-Mamba Language Model
Opher Lieber
Barak Lenz
Hofit Bata
Gal Cohen
Jhonathan Osin
...
Nir Ratner
N. Rozen
Erez Shwartz
Mor Zusman
Y. Shoham
26
208
0
28 Mar 2024
Heracles: A Hybrid SSM-Transformer Model for High-Resolution Image and Time-Series Analysis
Badri N. Patro
Suhas Ranganath
Vinay P. Namboodiri
Vijay Srinivas Agneeswaran
43
2
0
26 Mar 2024
The Hidden Attention of Mamba Models
Ameen Ali
Itamar Zimerman
Lior Wolf
Mamba
39
58
0
03 Mar 2024
Is Mamba Capable of In-Context Learning?
Riccardo Grazzi
Julien N. Siems
Simon Schrodi
Thomas Brox
Frank Hutter
24
40
0
05 Feb 2024
Repeat After Me: Transformers are Better than State Space Models at Copying
Samy Jelassi
David Brandfonbrener
Sham Kakade
Eran Malach
97
78
0
01 Feb 2024
Zoology: Measuring and Improving Recall in Efficient Language Models
Simran Arora
Sabri Eyuboglu
Aman Timalsina
Isys Johnson
Michael Poli
James Zou
Atri Rudra
Christopher Ré
64
66
0
08 Dec 2023