Title |
---|
![]() Chain-of-Probe: Examining the Necessity and Accuracy of CoT Step-by-Step Zezhong Wang Xingshan Zeng Weiwen Liu Yufei Wang Liangyou Li Yasheng Wang Lifeng Shang Xin Jiang Qun Liu Kam-Fai Wong |
![]() OlympicArena: Benchmarking Multi-discipline Cognitive Reasoning for Superintelligent AI Zhen Huang Zengzhi Wang Shijie Xia Xuefeng Li Haoyang Zou ...Yuxiang Zheng Shaoting Zhang Dahua Lin Yu Qiao Pengfei Liu |
![]() The BiGGen Bench: A Principled Benchmark for Fine-grained Evaluation of Language Models with Language Models Seungone Kim Juyoung Suk Ji Yong Cho Shayne Longpre Chaeeun Kim ...Sean Welleck Graham Neubig Moontae Lee Kyungjae Lee Minjoon Seo |
![]() SELF: Self-Evolution with Language Feedback Jianqiao Lu Wanjun Zhong Wenyong Huang Yufei Wang Qi Zhu ...Weichao Wang Xingshan Zeng Lifeng Shang Xin Jiang Qun Liu |
![]() FLM-101B: An Open LLM and How to Train It with Xiang Li Yiqun Yao Xin Jiang Xuezhi Fang Xuying Meng ...Li Du Bowen Qin Zheng-Wei Zhang Aixin Sun Yequan Wang |