2404.15845
Cited By
Exploring LLM Prompting Strategies for Joint Essay Scoring and Feedback Generation
24 April 2024
Maja Stahl
Leon Biermann
Andreas Nehring
Henning Wachsmuth
Papers citing
"Exploring LLM Prompting Strategies for Joint Essay Scoring and Feedback Generation"
16 / 16 papers shown
Title
SAS-Bench: A Fine-Grained Benchmark for Evaluating Short Answer Scoring with Large Language Models
Peichao Lai
Kaipeng Zhang
Yi Lin
L. Zhang
Feiyang Ye
...
Zifei Shan
Conghui He
Yue Wang
Wentao Zhang
Bin Cui
ELM
LRM
47
0
0
12 May 2025
Does the Prompt-based Large Language Model Recognize Students' Demographics and Introduce Bias in Essay Scoring?
Kaixun Yang
Mladen Raković
D. Gašević
Guanliang Chen
44
0
0
30 Apr 2025
Leveraging LLMs as Meta-Judges: A Multi-Agent Framework for Evaluating LLM Judgments
Y. Li
Jama Hussein Mohamud
Chongren Sun
Di Wu
Benoit Boulet
LLMAG
ELM
72
0
0
23 Apr 2025
Teach-to-Reason with Scoring: Self-Explainable Rationale-Driven Multi-Trait Essay Scoring
Heejin Do
Sangwon Ryu
Gary Geunbae Lee
LRM
55
0
0
28 Feb 2025
Which Contributions Deserve Credit? Perceptions of Attribution in Human-AI Co-Creation
Jessica He
Stephanie Houde
Justin D. Weisz
70
1
0
25 Feb 2025
Improve LLM-based Automatic Essay Scoring with Linguistic Features
Zhaoyi Joey Hou
Alejandro Ciuba
Xiang Lorraine Li
57
1
0
13 Feb 2025
Adaptive Prompting: Ad-hoc Prompt Composition for Social Bias Detection
Maximilian Spliethöver
Tim Knebler
Fabian Fumagalli
Maximilian Muschalik
Barbara Hammer
Eyke Hüllermeier
Henning Wachsmuth
105
1
0
10 Feb 2025
Validity Arguments For Constructed Response Scoring Using Generative Artificial Intelligence Applications
Jodi M. Casabianca
Daniel F. McCaffrey
Matthew S. Johnson
Naim Alper
Vladimir Zubenko
37
0
0
04 Jan 2025
Can AI grade your essays? A comparative analysis of large language models and teacher ratings in multidimensional essay scoring
Kathrin Seßler
Maurice Fürstenberg
B. Bühler
Enkelejda Kasneci
AI4Ed
ELM
73
3
0
25 Nov 2024
To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning
Zayne Sprague
Fangcong Yin
Juan Diego Rodriguez
Dongwei Jiang
Manya Wadhwa
Prasann Singhal
Xinyu Zhao
Xi Ye
Kyle Mahowald
Greg Durrett
ReLM
LRM
119
86
0
18 Sep 2024
Grammatical Error Feedback: An Implicit Evaluation Approach
Stefano Bannò
Kate Knill
Mark J. F. Gales
28
0
0
18 Aug 2024
Development of REGAI: Rubric Enabled Generative Artificial Intelligence
Zach Johnson
Jeremy Straub
41
1
0
05 Aug 2024
Automated Text Scoring in the Age of Generative AI for the GPU-poor
C. Ormerod
Alexander Kwako
46
2
0
02 Jul 2024
Human-AI Collaborative Essay Scoring: A Dual-Process Framework with LLMs
Changrong Xiao
Wenxing Ma
Qingping Song
Sean Xin Xu
Kunpeng Zhang
Yufang Wang
Qi Fu
AI4Ed
29
15
0
12 Jan 2024
Can Large Language Models Be an Alternative to Human Evaluations?
Cheng-Han Chiang
Hung-yi Lee
ALM
LM&MA
226
572
0
03 May 2023
Large Language Models are Zero-Shot Reasoners
Takeshi Kojima
S. Gu
Machel Reid
Yutaka Matsuo
Yusuke Iwasawa
ReLM
LRM
328
4,077
0
24 May 2022