
TwinFormer: A Dual-Level Transformer for Long-Sequence Time-Series Forecasting

Mahima Kumavat
Aditya Maheshwari
11 pages (main) + 3 pages (bibliography), 2 figures, 5 tables
Abstract

TwinFormer is a hierarchical Transformer for long-sequence time-series forecasting. It divides the input into non-overlapping temporal patches and processes them in two stages: (1) a Local Informer with top-$k$ Sparse Attention models intra-patch dynamics, followed by mean pooling; (2) a Global Informer captures long-range inter-patch dependencies using the same top-$k$ attention. A lightweight GRU aggregates the globally contextualized patch tokens for direct multi-horizon prediction. The resulting architecture achieves linear $O(kLd)$ time and memory complexity. On eight real-world benchmark datasets from six domains (weather, stock prices, temperature, power consumption, electricity, and disease) and forecasting horizons of 96-720, TwinFormer places in the top two in 27 of 34 settings: it achieves the best MAE and RMSE in 17 settings and the second-best in 10, consistently outperforming PatchTST, iTransformer, FEDformer, Informer, and the vanilla Transformer. Ablations confirm the superiority of top-$k$ Sparse Attention over ProbSparse attention and the effectiveness of GRU-based aggregation. Code is available at this https URL.
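
The two-stage pipeline described in the abstract (patch, attend locally, mean-pool, attend globally, aggregate with a GRU) can be illustrated with a minimal PyTorch-style sketch. This is a hypothetical reconstruction under stated assumptions, not the authors' released code: the module names (TopKSparseAttention, TwinFormerSketch), the single-head attention variant, and all hyperparameters (patch_len, d_model, k, horizon) are illustrative choices.

```python
# Hypothetical sketch of the dual-level pipeline; shapes and names are assumptions.
import torch
import torch.nn as nn

class TopKSparseAttention(nn.Module):
    """Single-head self-attention keeping only the top-k scores per query."""
    def __init__(self, d_model: int, k: int):
        super().__init__()
        self.qkv = nn.Linear(d_model, 3 * d_model)
        self.proj = nn.Linear(d_model, d_model)
        self.k = k

    def forward(self, x):                                        # x: (B, T, d)
        q, key, v = self.qkv(x).chunk(3, dim=-1)
        scores = q @ key.transpose(-2, -1) / x.size(-1) ** 0.5   # (B, T, T)
        k = min(self.k, scores.size(-1))
        vals, idx = scores.topk(k, dim=-1)
        # Mask everything except the top-k entries per query, then renormalize.
        sparse = torch.full_like(scores, float("-inf")).scatter_(-1, idx, vals)
        return self.proj(sparse.softmax(dim=-1) @ v)

class TwinFormerSketch(nn.Module):
    """Patch -> local attention -> mean-pool -> global attention -> GRU -> forecast."""
    def __init__(self, c_in=1, patch_len=24, d_model=64, k=8, horizon=96):
        super().__init__()
        self.patch_len, self.c_in, self.horizon = patch_len, c_in, horizon
        self.embed = nn.Linear(c_in, d_model)
        self.local_attn = TopKSparseAttention(d_model, k)   # intra-patch dynamics
        self.global_attn = TopKSparseAttention(d_model, k)  # inter-patch dependencies
        self.gru = nn.GRU(d_model, d_model, batch_first=True)
        self.head = nn.Linear(d_model, horizon * c_in)

    def forward(self, x):                          # x: (B, L, c_in), L % patch_len == 0
        B, L, _ = x.shape
        n = L // self.patch_len
        z = self.embed(x).reshape(B * n, self.patch_len, -1)
        z = self.local_attn(z).mean(dim=1)         # mean-pool each patch to one token
        z = self.global_attn(z.reshape(B, n, -1))  # long-range links across patch tokens
        _, h = self.gru(z)                         # lightweight aggregation of patch tokens
        return self.head(h[-1]).reshape(B, self.horizon, self.c_in)

# Example: forecast 96 steps from a 336-step univariate window (illustrative sizes).
y = TwinFormerSketch()(torch.randn(8, 336, 1))     # -> (8, 96, 1)
```

Because each query attends to only k keys, the attention cost grows linearly with the sequence length, which is the source of the O(kLd) complexity claimed above.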
