Title |
---|
VLMEvalKit: An Open-Source Toolkit for Evaluating Large Multi-Modality Models. Haodong Duan, Junming Yang, Xinyu Fang, Lin Chen, ..., Yuhang Zang, Pan Zhang, Jiaqi Wang, Dahua Lin, Kai Chen |
CMMaTH: A Chinese Multi-modal Math Skill Evaluation Benchmark for Foundation Models. Zhong-Zhi Li, Ming-Liang Zhang, Fei Yin, Zhi-Long Ji, Jin-Feng Bai, Zhen-Ru Pan, Fan-Hu Zeng, Jian Xu, Jia-Xin Zhang, Cheng-Lin Liu |
Prism: A Framework for Decoupling and Assessing the Capabilities of VLMs. Yuxuan Qiao, Haodong Duan, Xinyu Fang, Junming Yang, Lin Chen, Songyang Zhang, Jiaqi Wang, Dahua Lin, Kai Chen |
PIN: A Knowledge-Intensive Dataset for Paired and Interleaved Multimodal Documents. Junjie Wang, Yin Zhang, Yatai Ji, Yuxiang Zhang, Chunyang Jiang, ..., Bei Chen, Qunshu Lin, Minghao Liu, Ge Zhang, Wenhu Chen |
OlympicArena: Benchmarking Multi-discipline Cognitive Reasoning for Superintelligent AI. Zhen Huang, Zengzhi Wang, Shijie Xia, Xuefeng Li, Haoyang Zou, ..., Yuxiang Zheng, Shaoting Zhang, Dahua Lin, Yu Qiao, Pengfei Liu |