2501.14275
Cited By
Leveraging Online Olympiad-Level Math Problems for LLMs Training and Contamination-Resistant Evaluation
24 January 2025
Sadegh Mahdavi
Muchen Li
Kaiwen Liu
Christos Thrampoulidis
Leonid Sigal
Renjie Liao
LRM
Papers citing
"Leveraging Online Olympiad-Level Math Problems for LLMs Training and Contamination-Resistant Evaluation"
8 / 8 papers shown
Title
ThinkSwitcher: When to Think Hard, When to Think Fast
Guosheng Liang
Longguang Zhong
Ziyi Yang
Xiaojun Quan
LRM
65
1
0
20 May 2025
Think Only When You Need with Large Hybrid-Reasoning Models
Lingjie Jiang
Xun Wu
Shaohan Huang
Qingxiu Dong
Zewen Chi
Li Dong
Xingxing Zhang
Tengchao Lv
Lei Cui
Furu Wei
OffRL
LRM
151
5
0
20 May 2025
MPS-Prover: Advancing Stepwise Theorem Proving by Multi-Perspective Search and Data Curation
Zhenwen Liang
Linfeng Song
Yang Li
Tao Yang
Feng Zhang
Haitao Mi
Dong Yu
LRM
97
2
0
16 May 2025
Towards Contamination Resistant Benchmarks
Rahmatullah Musawi
Sheng Lu
130
0
0
13 May 2025
AIMO-2 Winning Solution: Building State-of-the-Art Mathematical Reasoning Models with OpenMathReasoning dataset
Ivan Moshkov
Darragh Hanley
Ivan Sorokin
Shubham Toshniwal
Christof Henkel
Benedikt Schifferer
Wei Du
Igor Gitman
ReLM
LRM
88
16
0
23 Apr 2025
An Empirical Study on Eliciting and Improving R1-like Reasoning Models
Zhongfu Chen
Yingqian Min
Beichen Zhang
Jie Chen
Jinhao Jiang
...
Xu Miao
Yaojie Lu
Lei Fang
Zhongyuan Wang
Ji-Rong Wen
ReLM
OffRL
LRM
146
37
0
06 Mar 2025
Recent Advances in Large Language Model Benchmarks against Data Contamination: From Static to Dynamic Evaluation
Simin Chen
Yiming Chen
Zexin Li
Yifan Jiang
Zhongwei Wan
...
Dezhi Ran
Tianle Gu
Haoyang Li
Tao Xie
Baishakhi Ray
95
6
0
23 Feb 2025
CodeElo: Benchmarking Competition-level Code Generation of LLMs with Human-comparable Elo Ratings
Shanghaoran Quan
Jiaxi Yang
Bowen Yu
Jian Xu
Dayiheng Liu
...
Zeyu Cui
Yang Fan
Yanzhe Zhang
Binyuan Hui
Junyang Lin
ALM
ELM
LRM
122
36
0
02 Jan 2025