Title |
---|
![]() Long Context Transfer from Language to Vision Peiyuan Zhang Kaichen Zhang Bo Li Guangtao Zeng Jingkang Yang Yuanhan Zhang Ziyue Wang Haoran Tan Chunyuan Li Ziwei Liu |
![]() Prism: A Framework for Decoupling and Assessing the Capabilities of VLMs Yuxuan Qiao Haodong Duan Xinyu Fang Junming Yang Lin Chen Songyang Zhang Jiaqi Wang Dahua Lin Kai Chen |
![]() GUI Action Narrator: Where and When Did That Action Take Place? Qinchen Wu Difei Gao Kevin Qinghong Lin Zhuoyu Wu Xiangwu Guo Peiran Li Weichen Zhang Hengxu Wang Mike Zheng Shou |