
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
Papers citing "Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision"
50 / 212 papers shown
Title |
---|
End-to-end Open-vocabulary Video Visual Relationship Detection using Multi-modal Prompting. Yongqi Wang, Xinxiao Wu, Shuo Yang, Jiebo Luo |