Evaluating Language Models for Generating and Judging Programming Feedback
Charles Koutcheme, Nicola Dainese, Arto Hellas, Sami Sarsa, Juho Leinonen, Syed Ashraf, Paul Denny
arXiv:2407.04873 · 5 July 2024 · Tags: ELM

Papers citing "Evaluating Language Models for Generating and Judging Programming Feedback" (8 of 8 shown):

Rubric Is All You Need: Enhancing LLM-based Code Evaluation With Question-Specific Rubrics (31 Mar 2025)
Aditya Pathak, Rachit Gandhi, Vaibhav Uttam, Devansh, Yashwanth Nakka, ..., Aditya Mittal, Aashna Ased, Chirag Khatri, Jagat Sesh Challa, Dhruv Kumar
42 · 0 · 0

Howzat? Appealing to Expert Judgement for Evaluating Human and AI Next-Step Hints for Novice Programmers (27 Nov 2024)
Neil C. C. Brown, Pierre Weill-Tessier, Juho Leinonen, Paul Denny, Michael Kölling
77 · 0 · 0

AI-TA: Towards an Intelligent Question-Answer Teaching Assistant using Open-Source LLMs (5 Nov 2023)
Yann Hicke, Anmol Agarwal, Qianou Ma, Paul Denny
AI4Ed · 34 · 24 · 0

JudgeLM: Fine-tuned Large Language Models are Scalable Judges (26 Oct 2023)
Lianghui Zhu, Xinggang Wang, Xinlong Wang
ELM, ALM · 56 · 108 · 0

Can Large Language Models Be an Alternative to Human Evaluations? (3 May 2023)
Cheng-Han Chiang, Hung-yi Lee
ALM, LM&MA · 224 · 572 · 0

Practical and Ethical Challenges of Large Language Models in Education: A Systematic Scoping Review (17 Mar 2023)
Lixiang Yan, Lele Sha, Linxuan Zhao, Yuheng Li, Roberto Martínez-Maldonado, Guanliang Chen, Xinyu Li, Yueqiao Jin, D. Gašević
SyDa, ELM, AI4Ed · 59 · 268 · 0

Large Language Models are Zero-Shot Reasoners (24 May 2022)
Takeshi Kojima, S. Gu, Machel Reid, Yutaka Matsuo, Yusuke Iwasawa
ReLM, LRM · 328 · 4,077 · 0

Measuring Coding Challenge Competence With APPS (20 May 2021)
Dan Hendrycks, Steven Basart, Saurav Kadavath, Mantas Mazeika, Akul Arora, ..., Collin Burns, Samir Puranik, Horace He, D. Song, Jacob Steinhardt
ELM, AIMat, ALM · 208 · 624 · 0