Time-VLM: Exploring Multimodal Vision-Language Models for Augmented Time Series Forecasting

6 February 2025
Siru Zhong, Weilin Ruan, Ming Jin, Huan Li, Qingsong Wen, Yuxuan Liang
Communities: VLM, AI4TS
Main: 8 pages · Appendix: 9 pages · Bibliography: 3 pages · 10 figures · 16 tables
Abstract

Recent advancements in time series forecasting have explored augmenting models with text or vision modalities to improve accuracy. While text provides contextual understanding, it often lacks fine-grained temporal details. Conversely, vision captures intricate temporal patterns but lacks semantic context, limiting the complementary potential of these modalities. To address this, we propose Time-VLM, a novel multimodal framework that leverages pre-trained Vision-Language Models (VLMs) to bridge temporal, visual, and textual modalities for enhanced forecasting. Our framework comprises three key components: (1) a Retrieval-Augmented Learner, which extracts enriched temporal features through memory bank interactions; (2) a Vision-Augmented Learner, which encodes time series as informative images; and (3) a Text-Augmented Learner, which generates contextual textual descriptions. These components collaborate with frozen pre-trained VLMs to produce multimodal embeddings, which are then fused with temporal features for final prediction. Extensive experiments across diverse datasets demonstrate that Time-VLM achieves superior performance, particularly in few-shot and zero-shot scenarios, thereby establishing a new direction for multimodal time series forecasting.
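The abstract describes a three-branch design: a memory-based temporal learner, a vision branch that renders the series as images, and a text branch that supplies contextual descriptions, all routed through a frozen pre-trained VLM and fused with temporal features for the final forecast. The sketch below only illustrates that data flow under stated assumptions; every concrete choice (linear placeholder encoders, a 32-slot memory bank, concatenation fusion, a frozen linear stub standing in for the real VLM, and the TimeVLMSketch name itself) is hypothetical and not the authors' implementation.

import torch
import torch.nn as nn


class TimeVLMSketch(nn.Module):
    """Hypothetical sketch of the three-branch flow described in the abstract.

    All module choices here (linear encoders, memory size, fusion by
    concatenation, a frozen linear stub in place of the pre-trained VLM)
    are placeholder assumptions, not the paper's implementation.
    """

    def __init__(self, seq_len: int, pred_len: int, d_model: int = 64):
        super().__init__()
        # (1) Retrieval-Augmented Learner: a learnable memory bank that the
        #     input series attends over to produce enriched temporal features.
        self.memory = nn.Parameter(torch.randn(32, d_model))
        self.query_proj = nn.Linear(seq_len, d_model)

        # (2) Vision-Augmented Learner: stand-in for "encode the series as an
        #     informative image"; here simply a linear map into a shared space.
        self.vision_proj = nn.Linear(seq_len, d_model)

        # (3) Text-Augmented Learner: stand-in for embeddings of generated
        #     contextual text descriptions.
        self.text_proj = nn.Linear(seq_len, d_model)

        # Frozen "VLM" stub: parameters are never updated during training.
        self.frozen_vlm = nn.Linear(d_model, d_model)
        for p in self.frozen_vlm.parameters():
            p.requires_grad = False

        # Fusion head: combine multimodal embeddings with temporal features
        # and project to the forecast horizon.
        self.head = nn.Linear(3 * d_model, pred_len)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len), a univariate series for simplicity.
        q = self.query_proj(x)                           # temporal query
        attn = torch.softmax(q @ self.memory.T, dim=-1)  # memory interaction
        retrieved = attn @ self.memory                   # enriched temporal features

        vis = self.frozen_vlm(self.vision_proj(x))       # "image" branch
        txt = self.frozen_vlm(self.text_proj(x))         # "text" branch

        fused = torch.cat([retrieved, vis, txt], dim=-1) # multimodal fusion
        return self.head(fused)                          # (batch, pred_len)


if __name__ == "__main__":
    model = TimeVLMSketch(seq_len=96, pred_len=24)
    forecast = model(torch.randn(8, 96))
    print(forecast.shape)  # torch.Size([8, 24])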

View on arXiv: https://arxiv.org/abs/2502.04395
@article{zhong2025_2502.04395,
  title={Time-VLM: Exploring Multimodal Vision-Language Models for Augmented Time Series Forecasting},
  author={Siru Zhong and Weilin Ruan and Ming Jin and Huan Li and Qingsong Wen and Yuxuan Liang},
  journal={arXiv preprint arXiv:2502.04395},
  year={2025}
}