Evaluating Mathematical Reasoning of Large Language Models: A Focus on Error Identification and Correction

2 June 2024

Papers citing "Evaluating Mathematical Reasoning of Large Language Models: A Focus on Error Identification and Correction"

Title
Mis-prompt: Benchmarking Large Language Models for Proactive Error Handling | Jiayi Zeng, Yizhe Feng, Mengliang He, Wenhui Lei, Wei Zhang, Zeming Liu, Xiaoming Shi, Aimin Zhou | LRM | 28 0 0 | 29 May 2025
When Do LLMs Admit Their Mistakes? Understanding the Role of Model Belief in Retraction | Yuqing Yang, Robin Jia | KELM, LRM | 122 1 0 | 22 May 2025
MathAgent: Leveraging a Mixture-of-Math-Agent Framework for Real-World Multimodal Mathematical Error Detection | Yibo Yan, Shen Wang, Jiahao Huo, Philip S. Yu, Xuming Hu, Qingsong Wen | | 352 8 0 | 23 Mar 2025
LEMMA: Learning from Errors for MatheMatical Advancement in LLMs | Zhuoshi Pan, Yu Li, Honglin Lin, Qizhi Pei, Zinan Tang, Wei Wu, Chenlin Ming, H. Vicky Zhao, Zeang Sheng, Lijun Wu | LRM | 155 6 0 | 21 Mar 2025
PRMBench: A Fine-grained and Challenging Benchmark for Process-Level Reward Models | Mingyang Song, Zhaochen Su, Xiaoye Qu, Jiawei Zhou, Yu Cheng | LRM | 168 40 0 | 06 Jan 2025
Number Cookbook: Number Understanding of Language Models and How to Improve It | Haotong Yang, Yi Hu, Shijia Kang, Zhouchen Lin, Muhan Zhang | LRM | 112 8 0 | 06 Nov 2024