Title | Authors |
---|---|
LabSafety Bench: Benchmarking LLMs on Safety Issues in Scientific Labs | Yujun Zhou Jingdong Yang Kehan Guo Pin-Yu Chen Tian Gao ...Tian Gao Werner Geyer Nuno Moniz Nitesh V Chawla Xiangliang Zhang |
IW-Bench: Evaluating Large Multimodal Models for Converting Image-to-Web | Hongcheng Guo Wei Zhang Junhao Chen Yaonan Gu Jian Yang ...Binyuan Hui Tianyu Liu Jianxin Ma Chang Zhou Zhoujun Li |
OlympicArena: Benchmarking Multi-discipline Cognitive Reasoning for Superintelligent AI | Zhen Huang Zengzhi Wang Shijie Xia Xuefeng Li Haoyang Zou ...Yuxiang Zheng Shaoting Zhang Dahua Lin Yu Qiao Pengfei Liu |