Towards Low-Latency Event Stream-based Visual Object Tracking: A Slow-Fast Approach

Existing tracking algorithms typically rely on low-frame-rate RGB cameras coupled with computationally intensive deep neural network architectures to achieve effective tracking. However, such frame-based methods inherently struggle to achieve low-latency performance and often fail in resource-constrained environments. Visual object tracking using bio-inspired event cameras has emerged as a promising research direction in recent years, offering distinct advantages for low-latency applications. In this paper, we propose a novel Slow-Fast Tracking paradigm, termed SFTrack, that flexibly adapts to different operational requirements. The proposed framework supports two complementary modes: a high-precision slow tracker for scenarios with sufficient computational resources, and an efficient fast tracker tailored for latency-aware, resource-constrained environments. Specifically, our framework first performs graph-based representation learning from high-temporal-resolution event streams, and then integrates the learned graph-structured information into two FlashAttention-based vision backbones, yielding the slow and fast trackers, respectively. The fast tracker achieves low latency through a lightweight network design and by producing multiple bounding box outputs in a single forward pass. Finally, we seamlessly combine both trackers via supervised fine-tuning and further enhance the fast tracker's performance through a knowledge distillation strategy. Extensive experiments on public benchmarks, including FE240, COESOT, and EventVOT, demonstrate the effectiveness and efficiency of our proposed method across different real-world scenarios. The source code has been released at this https URL.
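The sketch below illustrates, at a very high level, how the slow-fast design described above could be wired together: a graph encoder over event nodes, two attention backbones of different depth (slow vs. fast), a multi-box output head for the fast branch, and a distillation term from the slow to the fast tracker. All module names, dimensions, the single-step dense-adjacency graph layer, and the MSE distillation loss are illustrative assumptions, not the authors' released implementation; FlashAttention is approximated via PyTorch's scaled_dot_product_attention, which dispatches to fused kernels when available.

```python
# Minimal slow-fast tracking sketch (assumed structure, not the official SFTrack code).
import torch
import torch.nn as nn
import torch.nn.functional as F


class EventGraphEncoder(nn.Module):
    """Toy one-step message passing over event nodes: a stand-in for the
    paper's graph-based representation learning on event streams."""
    def __init__(self, in_dim=4, hid_dim=128):
        super().__init__()
        self.proj = nn.Linear(in_dim, hid_dim)
        self.msg = nn.Linear(hid_dim, hid_dim)

    def forward(self, nodes, adj):
        # nodes: (B, N, 4) event features (x, y, t, polarity); adj: (B, N, N) normalized adjacency
        h = F.relu(self.proj(nodes))
        h = h + F.relu(self.msg(torch.bmm(adj, h)))   # one propagation step
        return h                                      # (B, N, hid_dim)


class AttnBackbone(nn.Module):
    """Attention backbone fusing image tokens with graph tokens.
    `depth` controls capacity: larger for the slow tracker, smaller for the fast one;
    the fast tracker sets num_boxes > 1 to emit several boxes in one forward pass."""
    def __init__(self, dim=128, depth=4, heads=4, num_boxes=1):
        super().__init__()
        self.heads, self.num_boxes = heads, num_boxes
        self.qkv = nn.ModuleList([nn.Linear(dim, 3 * dim) for _ in range(depth)])
        self.ffn = nn.ModuleList([nn.Sequential(nn.Linear(dim, dim), nn.GELU(),
                                                nn.Linear(dim, dim)) for _ in range(depth)])
        self.head = nn.Linear(dim, 4 * num_boxes)

    def forward(self, img_tokens, graph_tokens):
        x = torch.cat([img_tokens, graph_tokens], dim=1)          # (B, T, dim)
        B, T, D = x.shape
        for qkv, ffn in zip(self.qkv, self.ffn):
            q, k, v = qkv(x).chunk(3, dim=-1)
            q = q.view(B, T, self.heads, -1).transpose(1, 2)
            k = k.view(B, T, self.heads, -1).transpose(1, 2)
            v = v.view(B, T, self.heads, -1).transpose(1, 2)
            attn = F.scaled_dot_product_attention(q, k, v)        # fused kernel when available
            x = x + attn.transpose(1, 2).reshape(B, T, D)
            x = x + ffn(x)
        boxes = self.head(x.mean(dim=1))                          # (B, 4 * num_boxes)
        return boxes.view(B, self.num_boxes, 4)


# Slow tracker: deeper backbone, one box per pass.
# Fast tracker: shallow backbone, multiple boxes per pass for lower latency.
graph_enc = EventGraphEncoder()
slow = AttnBackbone(depth=8, num_boxes=1)
fast = AttnBackbone(depth=2, num_boxes=4)

B, N, T = 2, 64, 49
events = torch.rand(B, N, 4)
adj = torch.softmax(torch.rand(B, N, N), dim=-1)
img_tokens = torch.rand(B, T, 128)

g = graph_enc(events, adj)
slow_boxes = slow(img_tokens, g)                                  # (B, 1, 4)
fast_boxes = fast(img_tokens, g)                                  # (B, 4, 4)

# Simple distillation term: pull the fast tracker's first box toward the
# (detached) slow tracker's prediction; an assumed loss, not the paper's exact objective.
kd_loss = F.mse_loss(fast_boxes[:, :1], slow_boxes.detach())
```

In this sketch the two trackers share the same graph features, so the latency gap comes purely from backbone depth and from the fast branch amortizing several box predictions over a single forward pass; the actual SFTrack training recipe (supervised fine-tuning followed by distillation) is only hinted at by the final loss term.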
@article{wang2025_2505.12903,
  title   = {Towards Low-Latency Event Stream-based Visual Object Tracking: A Slow-Fast Approach},
  author  = {Shiao Wang and Xiao Wang and Liye Jin and Bo Jiang and Lin Zhu and Lan Chen and Yonghong Tian and Bin Luo},
  journal = {arXiv preprint arXiv:2505.12903},
  year    = {2025}
}