Title |
---|
![]() Robo-MUTUAL: Robotic Multimodal Task Specification via Unimodal Learning Jianxiong Li Zhihao Wang Jinliang Zheng Xiaoai Zhou Guanming Wang ...Yu Liu Jingjing Liu Ya-Qin Zhang Junzhi Yu Xianyuan Zhan |
![]() Probing Mechanical Reasoning in Large Vision Language Models Haoran Sun Qingying Gao Haiyun Lyu Dezhi Luo Yijiang Li Hokin Deng |
![]() Vision Language Models See What You Want but not What You See Qingying Gao Yijiang Li Haiyun Lyu Haoran Sun Dezhi Luo Hokin Deng |
![]() MM1.5: Methods, Analysis & Insights from Multimodal LLM Fine-tuning Haotian Zhang Mingfei Gao Zhe Gan Philipp Dufter Nina Wenzel ...Haoxuan You Zirui Wang Afshin Dehghan Peter Grasch Yinfei Yang |