Reformer: The Efficient Transformer
Nikita Kitaev, Lukasz Kaiser, Anselm Levskaya
13 January 2020. arXiv: 2001.04451.
Papers citing "Reformer: The Efficient Transformer" (showing 50 of 494)
- Learning to Dissipate Energy in Oscillatory State-Space Models. Jared Boyer, T. Konstantin Rusch, Daniela Rus. 17 May 2025.
- GeoMaNO: Geometric Mamba Neural Operator for Partial Differential Equations. Xi Han, Jingwei Zhang, Dimitris Samaras, Fei Hou, Hong Qin. 17 May 2025. [AI4CE]
- Bi-directional Recurrence Improves Transformer in Partially Observable Markov Decision Processes. Ashok Arora, Neetesh Kumar. 16 May 2025.
- Hierarchical Sparse Attention Framework for Computationally Efficient Classification of Biological Cells. Elad Yoshai, Dana Yagoda-Aharoni, Eden Dotan, N. Shaked. 12 May 2025.
- OLinear: A Linear Model for Time Series Forecasting in Orthogonally Transformed Domain. Wenzhen Yue, Yong-Jin Liu, Haoxuan Li, Hao Wang, Xianghua Ying, Ruohao Guo, Bowei Xing, Ji Shi. 12 May 2025. [AI4TS, OOD]
- Graph Laplacian Wavelet Transformer via Learnable Spectral Decomposition. Andrew Kiruluta, Eric Lundy, Priscilla Burity. 09 May 2025.
- Accurate and Efficient Multivariate Time Series Forecasting via Offline Clustering. Yiming Niu, Jinliang Deng, L. Zhang, Zimu Zhou, Yongxin Tong. 09 May 2025. [AI4TS]
- Image Recognition with Online Lightweight Vision Transformer: A Survey. Zherui Zhang, Rongtao Xu, Jie Zhou, Changwei Wang, Xingtian Pei, ..., Jiguang Zhang, Li Guo, Longxiang Gao, Wenyuan Xu, Shibiao Xu. 06 May 2025. [ViT]
- SCFormer: Structured Channel-wise Transformer with Cumulative Historical State for Multivariate Time Series Forecasting. Shiwei Guo, Z. Chen, Yupeng Ma, Yunfei Han, Yi Wang. 05 May 2025. [AI4TS]
- Mixture of Sparse Attention: Content-Based Learnable Sparse Attention via Expert-Choice Routing. Piotr Piekos, Róbert Csordás, Jürgen Schmidhuber. 01 May 2025. [MoE, VLM]
- Scalable Meta-Learning via Mixed-Mode Differentiation. Iurii Kemaev, Dan A Calian, Luisa M Zintgraf, Gregory Farquhar, H. V. Hasselt. 01 May 2025.
- SFi-Former: Sparse Flow Induced Attention for Graph Transformer. ZeLin Li, J. Q. Shi, Xinming Zhang, Miao Zhang, B. Li. 29 Apr 2025.
- From Attention to Atoms: Spectral Dictionary Learning for Fast, Interpretable Language Models. Andrew Kiruluta. 29 Apr 2025.
- Multimodal Conditioned Diffusive Time Series Forecasting. Chen Su, Yuanhe Tian, Yan Song. 28 Apr 2025. [DiffM, AI4TS]
- CANet: ChronoAdaptive Network for Enhanced Long-Term Time Series Forecasting under Non-Stationarity. Mert Sonmezer, Seyda Ertekin. 24 Apr 2025. [AI4TS]
- Pets: General Pattern Assisted Architecture For Time Series Analysis. Xiangkai Ma, Xiaobin Hong, Wenzhong Li, Sanglu Lu. 19 Apr 2025. [AI4TS]
- Cognitive Memory in Large Language Models. Lianlei Shan, Shixian Luo, Zezhou Zhu, Yu Yuan, Yong Wu. 03 Apr 2025. [LLMAG, KELM]
- Predicting Team Performance from Communications in Simulated Search-and-Rescue. Ali Jalal-Kamali, Nikolos Gurney, David Pynadath. 05 Mar 2025. [AI4TS]
- Attention Condensation via Sparsity Induced Regularized Training. Eli Sason, Darya Frolova, Boris Nazarov, Felix Goldberd. 03 Mar 2025.
- PFformer: A Position-Free Transformer Variant for Extreme-Adaptive Multivariate Time Series Forecasting. Yanhong Li, D. Anastasiu. 27 Feb 2025. [AI4TS]
- Low-Rank Thinning. Annabelle Michael Carrell, Albert Gong, Abhishek Shetty, Raaz Dwivedi, Lester W. Mackey. 17 Feb 2025.
- Vision-Enhanced Time Series Forecasting via Latent Diffusion Models. Weilin Ruan, Siru Zhong, Haomin Wen, Keli Zhang. 16 Feb 2025. [AI4TS]
- Enhancing Video Understanding: Deep Neural Networks for Spatiotemporal Analysis. Amir Hosein Fadaei, M. Dehaqani. 11 Feb 2025.
- LCIRC: A Recurrent Compression Approach for Efficient Long-form Context and Query Dependent Modeling in LLMs. Sumin An, Junyoung Sung, Wonpyo Park, Chanjun Park, Paul Hongsuck Seo. 10 Feb 2025.
- LV-XAttn: Distributed Cross-Attention for Long Visual Inputs in Multimodal Large Language Models. Tzu-Tao Chang, Shivaram Venkataraman. 04 Feb 2025. [VLM]
- ZETA: Leveraging Z-order Curves for Efficient Top-k Attention. Qiuhao Zeng, Jerry Huang, Peng Lu, Gezheng Xu, Boxing Chen, Charles Ling, Boyu Wang. 24 Jan 2025.
- Unified CNNs and transformers underlying learning mechanism reveals multi-head attention modus vivendi. Ella Koresh, Ronit D. Gross, Yuval Meir, Yarden Tzach, Tal Halevi, Ido Kanter. 22 Jan 2025. [ViT]
- Harnessing the Potential of Large Language Models in Modern Marketing Management: Applications, Future Directions, and Strategic Recommendations. Raha Aghaei, Ali A. Kiaei, Mahnaz Boush, Javad Vahidi, Mohammad Zavvar, Zeynab Barzegar, Mahan Rofoosheh. 18 Jan 2025. [OffRL]
- DistPred: A Distribution-Free Probabilistic Inference Method for Regression and Forecasting. Daojun Liang, Haixia Zhang, Dongfeng Yuan. 08 Jan 2025. [UQCV]
- TOTEM: TOkenized Time Series EMbeddings for General Time Series Analysis. Sabera Talukder, Yisong Yue, Georgia Gkioxari. 03 Jan 2025. [AI4TS]
- RetrievalAttention: Accelerating Long-Context LLM Inference via Vector Retrieval. Di Liu, Meng Chen, Baotong Lu, Huiqiang Jiang, Zhenhua Han, ..., Kaipeng Zhang, Chong Chen, Fan Yang, Yuqing Yang, Lili Qiu. 03 Jan 2025.
- A Simple Recipe for Contrastively Pre-training Video-First Encoders Beyond 16 Frames. Pinelopi Papalampidi, Skanda Koppula, Shreya Pathak, Justin T Chiu, Joseph Heyward, Viorica Patraucean, Jiajun Shen, Antoine Miech, Andrew Zisserman, Aida Nematzdeh. 31 Dec 2024. [VLM]
- DRDM: A Disentangled Representations Diffusion Model for Synthesizing Realistic Person Images. Enbo Huang, Yuan Zhang, Faliang Huang, Guangyu Zhang, Yong-Jin Liu. 25 Dec 2024. [DiffM]
- Does Self-Attention Need Separate Weights in Transformers? Md. Kowsher, Nusrat Jahan Prottasha, Chun-Nam Yu, O. Garibay, Niloofar Yousefi. 30 Nov 2024.
- MAS-Attention: Memory-Aware Stream Processing for Attention Acceleration on Resource-Constrained Edge Devices. Mohammadali Shakerdargah, Shan Lu, Chao Gao, Di Niu. 20 Nov 2024.
- PSformer: Parameter-efficient Transformer with Segment Attention for Time Series Forecasting. Yanlong Wang, J. Xu, Fei Ma, Shao-Lun Huang, Danny Dongning Sun, Xiao-Ping Zhang. 03 Nov 2024. [AI4TS]
- RAM: Replace Attention with MLP for Efficient Multivariate Time Series Forecasting. Suhan Guo, Jiahong Deng, Yi Wei, Hui Dou, Furao Shen, Jian Zhao. 31 Oct 2024. [AI4TS]
- Do Robot Snakes Dream like Electric Sheep? Investigating the Effects of Architectural Inductive Biases on Hallucination. Jerry Huang, Prasanna Parthasarathi, Mehdi Rezagholizadeh, Boxing Chen, Sarath Chandar. 22 Oct 2024.
- LiNo: Advancing Recursive Residual Decomposition of Linear and Nonlinear Patterns for Robust Time Series Forecasting. Guoqi Yu, Yaoming Li, Xiaoyu Guo, Dayu Wang, Zirui Liu, Shujun Wang, Tong Yang. 22 Oct 2024. [AI4TS]
- TimeMixer++: A General Time Series Pattern Machine for Universal Predictive Analysis. Shiyu Wang, Jiawei Li, Xiaoming Shi, Zhou Ye, Baichuan Mo, Wenze Lin, Shengtong Ju, Zhixuan Chu, Ming Jin. 21 Oct 2024. [AI4TS]
- Rethinking Transformer for Long Contextual Histopathology Whole Slide Image Analysis. Honglin Li, Yunlong Zhang, Pingyi Chen, Zhongyi Shui, Chenglu Zhu, Lin Yang. 18 Oct 2024. [MedIm]
- SeerAttention: Learning Intrinsic Sparse Attention in Your LLMs. Yizhao Gao, Zhichen Zeng, Dayou Du, Shijie Cao, Hayden Kwok-Hay So, ..., Junjie Lai, Mao Yang, Ting Cao, Fan Yang, M. Yang. 17 Oct 2024.
- In-context KV-Cache Eviction for LLMs via Attention-Gate. Zihao Zeng, Bokai Lin, Tianqi Hou, Hao Zhang, Zhijie Deng. 15 Oct 2024.
- HART: Efficient Visual Generation with Hybrid Autoregressive Transformer. Haotian Tang, Yecheng Wu, Shang Yang, Enze Xie, Junsong Chen, Junyu Chen, Zhuoyang Zhang, Han Cai, Yaojie Lu, Song Han. 14 Oct 2024.
- Token Pruning using a Lightweight Background Aware Vision Transformer. Sudhakar Sah, Ravish Kumar, Honnesh Rohmetra, Ehsan Saboori. 12 Oct 2024. [ViT]
- Joint Fine-tuning and Conversion of Pretrained Speech and Language Models towards Linear Complexity. Mutian He, Philip N. Garner. 09 Oct 2024.
- Large Language Model Inference Acceleration: A Comprehensive Hardware Perspective. Jinhao Li, Jiaming Xu, Shan Huang, Yonghua Chen, Wen Li, ..., Jiayi Pan, Li Ding, Hao Zhou, Yu Wang, Guohao Dai. 06 Oct 2024.
- S7: Selective and Simplified State Space Layers for Sequence Modeling. Taylan Soydan, Nikola Zubić, Nico Messikommer, Siddhartha Mishra, Davide Scaramuzza. 04 Oct 2024.
- Local Attention Mechanism: Boosting the Transformer Architecture for Long-Sequence Time Series Forecasting. Ignacio Aguilera-Martos, Andrés Herrera-Poyatos, Julián Luengo, Francisco Herrera. 04 Oct 2024. [AI4TS]
- Oscillatory State-Space Models. T. Konstantin Rusch, Daniela Rus. 04 Oct 2024. [AI4TS]