Title |
---|
CMMaTH: A Chinese Multi-modal Math Skill Evaluation Benchmark for Foundation Models. Zhong-Zhi Li, Ming-Liang Zhang, Fei Yin, Zhi-Long Ji, Jin-Feng Bai, Zhen-Ru Pan, Fan-Hu Zeng, Jian Xu, Jia-Xin Zhang, Cheng-Lin Liu |
Chain-of-Probe: Examining the Necessity and Accuracy of CoT Step-by-Step. Zezhong Wang, Xingshan Zeng, Weiwen Liu, Yufei Wang, Liangyou Li, Yasheng Wang, Lifeng Shang, Xin Jiang, Qun Liu, Kam-Fai Wong |
OlympicArena: Benchmarking Multi-discipline Cognitive Reasoning for Superintelligent AI. Zhen Huang, Zengzhi Wang, Shijie Xia, Xuefeng Li, Haoyang Zou, ..., Yuxiang Zheng, Shaoting Zhang, Dahua Lin, Yu Qiao, Pengfei Liu |
Scaling Large Language Model-based Multi-Agent Collaboration. Chen Qian, Zihao Xie, YiFei Wang, Wei Liu, Yufan Dang, ..., Zhuoyun Du, Weize Chen, Cheng Yang, Zhiyuan Liu, Maosong Sun |