2402.03170
Cited By
Is Mamba Capable of In-Context Learning?
5 February 2024
Riccardo Grazzi
Julien N. Siems
Simon Schrodi
Thomas Brox
Frank Hutter
Papers citing "Is Mamba Capable of In-Context Learning?"
33 / 33 papers shown
Title
Gating is Weighting: Understanding Gated Linear Attention through In-context Learning
Yingcong Li
Davoud Ataee Tarzanagh
A. S. Rawat
Maryam Fazel
Samet Oymak
25
0
0
06 Apr 2025
Resona: Improving Context Copying in Linear Recurrence Models with Retrieval
Xinyu Wang
Linrui Ma
Jerry Huang
Peng Lu
Prasanna Parthasarathi
Xiao-Wen Chang
Boxing Chen
Yufei Cui
KELM
45
1
0
28 Mar 2025
From Markov to Laplace: How Mamba In-Context Learns Markov Chains
Marco Bondaschi
Nived Rajaraman
Xiuying Wei
Kannan Ramchandran
Razvan Pascanu
Çağlar Gülçehre
Michael C. Gastpar
Ashok Vardhan Makkuva
63
0
0
17 Feb 2025
Mamba-Based Graph Convolutional Networks: Tackling Over-smoothing with Selective State Space
Xin He
Yixuan Wang
Wenqi Fan
Xu Shen
Xin Juan
Rui Miao
Xin Wang
68
0
0
26 Jan 2025
FACTS: A Factored State-Space Framework For World Modelling
Li Nanbo
Firas Laakom
Yucheng Xu
Wenyi Wang
Jürgen Schmidhuber
AI4TS
151
0
0
28 Oct 2024
Variable Aperture Bokeh Rendering via Customized Focal Plane Guidance
Kang Chen
Shijun Yan
Aiwen Jiang
Han Li
Zhifeng Wang
45
0
0
18 Oct 2024
State-space models can learn in-context by gradient descent
Neeraj Mohan Sushma
Yudou Tian
Harshvardhan Mestha
Nicolo Colombo
David Kappel
Anand Subramoney
41
3
0
15 Oct 2024
Mamba4Cast: Efficient Zero-Shot Time Series Forecasting with State Space Models
Sathya Kamesh Bhethanabhotla
Omar Swelam
Julien N. Siems
David Salinas
Frank Hutter
Mamba
AI4TS
AI4CE
43
4
0
12 Oct 2024
Everything Everywhere All at Once: LLMs can In-Context Learn Multiple Tasks in Superposition
Zheyang Xiong
Ziyang Cai
John Cooper
Albert Ge
Vasilis Papageorgiou
...
Saurabh Agarwal
Grigorios G Chrysos
Samet Oymak
Kangwook Lee
Dimitris Papailiopoulos
LRM
35
1
0
08 Oct 2024
Task Diversity Shortens the ICL Plateau
Jaeyeon Kim
Sehyun Kwon
Joo Young Choi
Jongho Park
Jaewoong Cho
Jason D. Lee
Ernest K. Ryu
MoMe
31
2
0
07 Oct 2024
GAMformer: In-Context Learning for Generalized Additive Models
Andreas Mueller
Julien N. Siems
Harsha Nori
David Salinas
Arber Zela
Rich Caruana
Frank Hutter
AI4CE
33
3
0
06 Oct 2024
Mitigating Copy Bias in In-Context Learning through Neuron Pruning
Ameen Ali
Lior Wolf
Ivan Titov
36
2
0
02 Oct 2024
"Oh LLM, I'm Asking Thee, Please Give Me a Decision Tree": Zero-Shot Decision Tree Induction and Embedding with Large Language Models
Ricardo Knauer
Mario Koddenbrock
Raphael Wallsberger
Nicholas M. Brisson
Georg N. Duda
Deborah Falla
David W. Evans
Erik Rodner
35
0
0
27 Sep 2024
Gated Slot Attention for Efficient Linear-Time Sequence Modeling
Yu Zhang
Songlin Yang
Ruijie Zhu
Yue Zhang
Leyang Cui
...
Freda Shi
Bailin Wang
Wei Bi
P. Zhou
Guohong Fu
65
17
0
11 Sep 2024
Shaking Up VLMs: Comparing Transformers and Structured State Space Models for Vision & Language Modeling
Georgios Pantazopoulos
Malvina Nikandrou
Alessandro Suglia
Oliver Lemon
Arash Eshghi
Mamba
45
1
0
09 Sep 2024
EDCSSM: Edge Detection with Convolutional State Space Model
Q. Hong
Haoyou Jiang
Pingdan Xiao
Sichun Du
Tao Li
35
0
0
03 Sep 2024
MTMamba++: Enhancing Multi-Task Dense Scene Understanding via Mamba-Based Decoders
Baijiong Lin
Weisen Jiang
Pengguang Chen
Shu Liu
Ying-Cong Chen
Mamba
40
1
0
27 Aug 2024
Improving VTE Identification through Language Models from Radiology Reports: A Comparative Study of Mamba, Phi-3 Mini, and BERT
Jamie Deng
Yusen Wu
Yelena Yesha
Phuong Nguyen
16
0
0
16 Aug 2024
Fine-grained Analysis of In-context Linear Estimation: Data, Architecture, and Beyond
Yingcong Li
A. S. Rawat
Samet Oymak
25
6
0
13 Jul 2024
MTMamba: Enhancing Multi-Task Dense Scene Understanding by Mamba-Based Decoders
Baijiong Lin
Weisen Jiang
Pengguang Chen
Yu Zhang
Shu Liu
Ying-Cong Chen
Mamba
46
9
0
02 Jul 2024
XLand-100B: A Large-Scale Multi-Task Dataset for In-Context Reinforcement Learning
Alexander Nikulin
Ilya Zisman
Alexey Zemtsov
Viacheslav Sinii
110
4
0
13 Jun 2024
Zamba: A Compact 7B SSM Hybrid Model
Paolo Glorioso
Quentin G. Anthony
Yury Tokpanov
James Whittington
Jonathan Pilault
Adam Ibrahim
Beren Millidge
30
45
0
26 May 2024
Visual Mamba: A Survey and New Outlooks
Rui Xu
Shu Yang
Yihui Wang
Yu Cai
Bo Du
Hao Chen
Mamba
42
26
0
29 Apr 2024
Mamba-360: Survey of State Space Models as Transformer Alternative for Long Sequence Modelling: Methods, Applications, and Challenges
Badri N. Patro
Vijay Srinivas Agneeswaran
Mamba
46
38
0
24 Apr 2024
State Space Model for New-Generation Network Alternative to Transformers: A Survey
Tianlin Li
Shiao Wang
Yuhe Ding
Yuehang Li
Wentao Wu
...
Bowei Jiang
Chenglong Li
Yaowei Wang
Yonghong Tian
Jin Tang
Mamba
33
49
0
15 Apr 2024
Locating and Editing Factual Associations in Mamba
Arnab Sen Sharma
David Atkinson
David Bau
KELM
73
28
0
04 Apr 2024
Is Mamba Effective for Time Series Forecasting?
Zihan Wang
Fanheng Kong
Shi Feng
Ming Wang
Xiaocui Yang
Han Zhao
Daling Wang
Yifei Zhang
Mamba
AI4TS
32
56
0
17 Mar 2024
The Hidden Attention of Mamba Models
Ameen Ali
Itamar Zimerman
Lior Wolf
Mamba
39
58
0
03 Mar 2024
Repeat After Me: Transformers are Better than State Space Models at Copying
Samy Jelassi
David Brandfonbrener
Sham Kakade
Eran Malach
100
78
0
01 Feb 2024
U-Mamba: Enhancing Long-range Dependency for Biomedical Image Segmentation
Jun Ma
Feifei Li
Bo Wang
Mamba
82
331
0
09 Jan 2024
ForecastPFN: Synthetically-Trained Zero-Shot Forecasting
Samuel Dooley
Gurnoor Singh Khurana
Chirag Mohapatra
Siddartha Naidu
Colin White
AI4TS
86
59
0
03 Nov 2023
Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation
Ofir Press
Noah A. Smith
M. Lewis
253
695
0
27 Aug 2021
The Pile: An 800GB Dataset of Diverse Text for Language Modeling
Leo Gao
Stella Biderman
Sid Black
Laurence Golding
Travis Hoppe
...
Horace He
Anish Thite
Noa Nabeshima
Shawn Presser
Connor Leahy
AIMat
256
1,996
0
31 Dec 2020