SVLTA: Benchmarking Vision-Language Temporal Alignment via Synthetic Video Situation

Papers citing "SVLTA: Benchmarking Vision-Language Temporal Alignment via Synthetic Video Situation"

No papers