ResearchTrend.AI
Fine-Grained Motion Compression and Selective Temporal Fusion for Neural B-Frame Video Coding

9 June 2025
Xihua Sheng
Peilin Chen
Meng Wang
Li Zhang
Shiqi Wang
Dapeng Oliver Wu
Main: 8 pages · 11 figures · 1 table · Bibliography: 2 pages
Abstract

With the remarkable progress in neural P-frame video coding, neural B-frame coding has recently emerged as a critical research direction. However, most existing neural B-frame codecs directly adopt P-frame coding tools without adequately addressing the unique challenges of B-frame compression, leading to suboptimal performance. To bridge this gap, we propose novel enhancements for motion compression and temporal fusion for neural B-frame coding. First, we design a fine-grained motion compression method. This method incorporates an interactive dual-branch motion auto-encoder with per-branch adaptive quantization steps, which enables fine-grained compression of bi-directional motion vectors while accommodating their asymmetric bitrate allocation and reconstruction quality requirements. Furthermore, this method involves an interactive motion entropy model that exploits correlations between bi-directional motion latent representations by interactively leveraging partitioned latent segments as directional priors. Second, we propose a selective temporal fusion method that predicts bi-directional fusion weights to achieve discriminative utilization of bi-directional multi-scale temporal contexts with varying qualities. Additionally, this method introduces a hyperprior-based implicit alignment mechanism for contextual entropy modeling. By treating the hyperprior as a surrogate for the contextual latent representation, this mechanism implicitly mitigates the misalignment in the fused bi-directional temporal priors. Extensive experiments demonstrate that our proposed codec outperforms state-of-the-art neural B-frame codecs and achieves comparable or even superior compression performance to the H.266/VVC reference software under random-access configurations.
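Two of the abstract's core ideas can be illustrated in miniature: per-branch adaptive quantization of motion latents (a finer step preserves more detail at a higher bitrate, letting the two directional branches trade off asymmetrically), and selective temporal fusion, where predicted weights blend forward and backward contexts position by position. The sketch below is an assumption-laden toy, not the paper's method: the real codec uses learned auto-encoders and CNN-predicted multi-scale weight maps, while here `quantize` and `selective_fusion` are hypothetical names operating on flat lists of floats.

```python
import math

def quantize(latents, step):
    """Uniform quantization with a per-branch adaptive step size.
    A smaller step keeps more motion detail but costs more bits,
    so each directional branch can use its own step."""
    return [round(v / step) * step for v in latents]

def selective_fusion(ctx_fwd, ctx_bwd, logits_fwd, logits_bwd):
    """Blend forward/backward temporal contexts with predicted
    per-position weights, normalized by a two-way softmax."""
    fused = []
    for cf, cb, lf, lb in zip(ctx_fwd, ctx_bwd, logits_fwd, logits_bwd):
        ef, eb = math.exp(lf), math.exp(lb)
        w_fwd = ef / (ef + eb)          # weight on the forward context
        fused.append(w_fwd * cf + (1.0 - w_fwd) * cb)
    return fused
```

With equal logits the fusion degenerates to a plain average; in the paper the weights instead vary with the quality of each directional context, down-weighting the less reliable direction.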

@article{sheng2025_2506.07709,
  title={Fine-Grained Motion Compression and Selective Temporal Fusion for Neural B-Frame Video Coding},
  author={Xihua Sheng and Peilin Chen and Meng Wang and Li Zhang and Shiqi Wang and Dapeng Oliver Wu},
  journal={arXiv preprint arXiv:2506.07709},
  year={2025}
}