VibeFlow: Versatile Video Chroma-Lux Editing through Self-Supervised Learning

15 April 2026

Yifan Li

Pei Cheng

Bin Fu

Shuai Yang

Jiaying Liu

VGen


Main:7 Pages

12 Figures

Bibliography:2 Pages

1 Table

Appendix:3 Pages

Abstract

Video chroma-lux editing, which aims to modify illumination and color while preserving structural and temporal fidelity, remains a significant challenge. Existing methods typically rely on expensive supervised training with synthetic paired data. This paper proposes VibeFlow, a novel self-supervised framework that exploits the intrinsic physical understanding of pre-trained video generation models. Instead of learning color and light transitions from scratch, we introduce a disentangled data perturbation pipeline that forces the model to adaptively recombine structure from source videos with color-illumination cues from reference images, enabling robust disentanglement in a self-supervised manner. Furthermore, to rectify the discretization errors inherent in flow-based models, we introduce Residual Velocity Fields alongside a Structural Distortion Consistency Regularization, ensuring rigorous structural preservation and temporal coherence. Our framework eliminates the need for costly training resources and generalizes in a zero-shot manner to diverse applications, including video relighting, recoloring, low-light enhancement, day-night translation, and object-specific color editing. Extensive experiments demonstrate that VibeFlow achieves impressive visual quality with significantly reduced computational overhead. Our project is publicly available at this https URL.
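
The abstract does not spell out the perturbation pipeline, so the following is only a minimal sketch of the general idea of building self-supervised training triplets from a single unlabeled clip. Every function name, perturbation choice (random gamma, per-channel gain, patch shuffling), and parameter range here is an assumption made for illustration, not the authors' implementation. The source input keeps the clip's structure but loses its original color and lighting, the reference keeps the color statistics but loses the layout, and the untouched clip serves as the reconstruction target.

```python
import numpy as np

def perturb_chroma_lux(video, rng):
    """Change color and brightness while keeping structure (assumed perturbation).

    video: float array in [0, 1] with shape (T, H, W, 3).
    """
    gamma = rng.uniform(0.5, 2.0)          # one brightness curve for the whole clip
    gain = rng.uniform(0.6, 1.4, size=3)   # one per-channel color cast for the whole clip
    return np.clip(video ** gamma * gain, 0.0, 1.0)

def perturb_structure(frame, rng, grid=4):
    """Scramble layout while keeping color statistics by shuffling a grid of patches.

    frame: array with shape (H, W, 3); H and W must be divisible by grid.
    """
    h, w, c = frame.shape
    ph, pw = h // grid, w // grid
    # Split into (grid * grid) patches of shape (ph, pw, c).
    patches = frame.reshape(grid, ph, grid, pw, c).transpose(0, 2, 1, 3, 4)
    patches = patches.reshape(grid * grid, ph, pw, c)
    patches = patches[rng.permutation(grid * grid)]
    # Reassemble the shuffled patches into an image of the original size.
    patches = patches.reshape(grid, grid, ph, pw, c).transpose(0, 2, 1, 3, 4)
    return patches.reshape(h, w, c)

def make_training_triplet(video, rng):
    """Return (source, reference, target) built from one unlabeled clip."""
    source = perturb_chroma_lux(video, rng)                 # structure cue only
    frame = video[rng.integers(len(video))]                 # random frame of the clip
    reference = perturb_structure(frame, rng)               # color-illumination cue only
    return source, reference, video                         # target is the original clip

rng = np.random.default_rng(0)
clip = rng.random((5, 8, 8, 3))          # stand-in for a real video clip
source, reference, target = make_training_triplet(clip, rng)
```

Under these assumptions, a model can reconstruct the target only by taking structure from the source and color and illumination from the reference, which is the recombination behavior the abstract describes.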
