
InViG: Benchmarking Interactive Visual Grounding with 500K Human-Robot Interactions
Papers citing "InViG: Benchmarking Interactive Visual Grounding with 500K Human-Robot Interactions"
Title |
---|
Grounded Language-Image Pre-training. Liunian Harold Li, Pengchuan Zhang, Haotian Zhang, Jianwei Yang, Chunyuan Li, ... Lu Yuan, Lei Zhang, Lei Li, Kai-Wei Chang, Jianfeng Gao |