
HERO: Hierarchical Encoder for Video+Language Omni-representation Pre-training
Papers citing "HERO: Hierarchical Encoder for Video+Language Omni-representation Pre-training"
50 / 328 papers shown
Title |
---|
End-to-end Open-vocabulary Video Visual Relationship Detection using Multi-modal Prompting. Yongqi Wang, Xinxiao Wu, Shuo Yang, Jiebo Luo |
LVBench: An Extreme Long Video Understanding Benchmark. Weihan Wang, Zehai He, Wenyi Hong, Yean Cheng, Xiaohan Zhang, ... Shiyu Huang, Bin Xu, Yuxiao Dong, Ming Ding, Jie Tang |