2505.16160
Cited By
EduBench: A Comprehensive Benchmarking Dataset for Evaluating Large Language Models in Diverse Educational Scenarios
22 May 2025
Bin Xu
Yu Bai
Huashan Sun
Yiguan Lin
Siming Liu
Xinyue Liang
Yaolin Li
Yang Gao
Heyan Huang
AI4Ed
ELM
Papers citing "EduBench: A Comprehensive Benchmarking Dataset for Evaluating Large Language Models in Diverse Educational Scenarios"
21 / 21 papers shown
Title
Inference-Time Scaling for Generalist Reward Modeling
Zijun Liu
P. Wang
Ran Xu
Shirong Ma
Chong Ruan
Ziwei Sun
Yang Liu
Y. Wu
OffRL
LRM
81
30
0
03 Apr 2025
Can Language Models Evaluate Human Written Text? Case Study on Korean Student Writing for Education
Seungyoon Kim
Seungone Kim
AI4Ed
53
1
0
24 Jul 2024
Enhancing Computer Programming Education with LLMs: A Study on Effective Prompt Engineering for Python Code Generation
Tianyu Wang
Nianjun Zhou
Zhixiong Chen
74
10
0
07 Jul 2024
Simulating Classroom Education with LLM-Empowered Agents
Zheyuan Zhang
Daniel Zhang-Li
Jifan Yu
Linlu Gong
Jinchang Zhou
Zhiyuan Liu
Lei Hou
Juanzi Li
LLMAG
61
57
0
27 Jun 2024
Generative AI for Enhancing Active Learning in Education: A Comparative Study of GPT-3.5 and GPT-4 in Crafting Customized Test Questions
Hamidreza Rouzegar
Masoud Makrehchi
19
7
0
20 Jun 2024
Generating Educational Materials with Different Levels of Readability using LLMs
Chieh-Yang Huang
Jing Wei
Ting-Hao 'Kenneth' Huang
123
9
0
18 Jun 2024
Evaluating Contextually Personalized Programming Exercises Created with Generative AI
E. Logacheva
Arto Hellas
James Prather
Sami Sarsa
Juho Leinonen
54
11
0
11 Jun 2024
LLMs can Find Mathematical Reasoning Mistakes by Pedagogical Chain-of-Thought
Zhuoxuan Jiang
Haoyuan Peng
Shanshan Feng
Fan Li
Dongsheng Li
KELM
LRM
61
14
0
09 May 2024
Large Language Models for Education: A Survey and Outlook
Shen Wang
Tianlong Xu
Hang Li
Chaoli Zhang
Joleen Liang
Jiliang Tang
Philip S. Yu
Qingsong Wen
AI4Ed
76
103
0
26 Mar 2024
Large Language Models in Education: Vision and Opportunities
Wensheng Gan
Zhenlian Qi
Jiayang Wu
Chun-Wei Lin
AI4Ed
76
74
0
22 Nov 2023
GPQA: A Graduate-Level Google-Proof Q&A Benchmark
David Rein
Betty Li Hou
Asa Cooper Stickland
Jackson Petty
Richard Yuanzhe Pang
Julien Dirani
Julian Michael
Samuel R. Bowman
AI4MH
ELM
64
584
0
20 Nov 2023
Learning gain differences between ChatGPT and human tutor generated algebra hints
Z. Pardos
Shreya Bhandari
19
111
0
14 Feb 2023
Constitutional AI: Harmlessness from AI Feedback
Yuntao Bai
Saurav Kadavath
Sandipan Kundu
Amanda Askell
John Kernion
...
Dario Amodei
Nicholas Joseph
Sam McCandlish
Tom B. Brown
Jared Kaplan
SyDa
MoMe
148
1,552
0
15 Dec 2022
Experiences from Using Code Explanations Generated by Large Language Models in a Web Software Development E-Book
Stephen MacNeil
Andrew Tran
Arto Hellas
Joanne Kim
Sami Sarsa
Paul Denny
Seth Bernstein
Juho Leinonen
65
181
0
04 Nov 2022
Question Generation for Reading Comprehension Assessment by Modeling How and What to Ask
Bilal Ghanem
Lauren Lutz Coleman
Julia Rivard Dexter
Spencer McIntosh von der Ohe
Alona Fyshe
AI4Ed
36
28
0
06 Apr 2022
Training Verifiers to Solve Math Word Problems
K. Cobbe
V. Kosaraju
Mohammad Bavarian
Mark Chen
Heewoo Jun
...
Jerry Tworek
Jacob Hilton
Reiichiro Nakano
Christopher Hesse
John Schulman
ReLM
OffRL
LRM
183
4,175
0
27 Oct 2021
EQG-RACE: Examination-Type Question Generation
Xin Jia
Wenjie Zhou
Xu Sun
Yunfang Wu
AI4Ed
33
40
0
11 Dec 2020
Measuring Massive Multitask Language Understanding
Dan Hendrycks
Collin Burns
Steven Basart
Andy Zou
Mantas Mazeika
D. Song
Jacob Steinhardt
ELM
RALM
135
4,222
0
07 Sep 2020
SuperGlue: Learning Feature Matching with Graph Neural Networks
Paul-Edouard Sarlin
Daniel DeTone
Tomasz Malisiewicz
Andrew Rabinovich
3DPC
OffRL
71
1,907
0
26 Nov 2019
A Multi-language Platform for Generating Algebraic Mathematical Word Problems
Vijini Liyanage
Surangika Ranathunga
30
8
0
19 Nov 2019
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
594
7,080
0
20 Apr 2018