LLMs in Automated Essay Evaluation: A Case Study

Authors

  • Milan Kostic, University of Camerino
  • Hans Friedrich Witschel, FHNW University of Applied Sciences and Arts Northwestern Switzerland
  • Knut Hinkelmann, FHNW University of Applied Sciences and Arts Northwestern Switzerland; University of Camerino
  • Maja Spahic-Bogdanovic, FHNW University of Applied Sciences and Arts Northwestern Switzerland; University of Camerino

DOI:

https://doi.org/10.1609/aaaiss.v3i1.31193

Keywords:

Large Language Models, Automatic Essay Evaluation, Assignment Evaluation, Higher Education

Abstract

This study examines the application of large language models (LLMs), such as ChatGPT-4, to the automated evaluation of student essays, focusing on a case study conducted at the Swiss Institute of Business Administration. It explores the effectiveness of LLMs in assessing German-language student transfer assignments and contrasts their performance with traditional evaluations by human lecturers. The primary findings highlight the challenges LLMs face in accurately grading complex texts against predefined categories and in providing detailed feedback. The research illuminates the gap between the capabilities of LLMs and the nuanced requirements of student essay evaluation. The conclusion emphasizes the need for ongoing research and development in LLM technology to improve the accuracy, reliability, and consistency of automated essay assessments in educational contexts.
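For illustration only, the sketch below shows the general kind of rubric-based LLM grading call the abstract describes: an essay is sent to a chat model together with predefined evaluation categories, and the model returns per-category grades with feedback. The rubric categories, prompt wording, grading scale, and model name are assumptions for this example, not the authors' actual setup, which the abstract does not specify.

```python
# Illustrative sketch only: rubric-based essay grading via the OpenAI API.
# Categories, prompt wording, and model choice are assumptions, not the
# study's actual configuration.
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

# Hypothetical rubric categories for a transfer assignment.
RUBRIC = ["argument quality", "structure", "use of sources", "language"]

def grade_essay(essay_text: str) -> str:
    """Ask the model for a grade per category (1-6, Swiss scale) plus feedback."""
    criteria = "\n".join(f"- {c}" for c in RUBRIC)
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system",
             "content": "You are a lecturer grading a German-language "
                        "student transfer assignment."},
            {"role": "user",
             "content": "Grade the following essay from 1 to 6 on each "
                        "criterion, with a short justification per "
                        f"criterion:\n{criteria}\n\nEssay:\n{essay_text}"},
        ],
        temperature=0,  # reduce run-to-run variance in the assigned grades
    )
    return response.choices[0].message.content

print(grade_essay("Der digitale Wandel verändert ..."))
```

Pinning the temperature to 0 is a common way to reduce the run-to-run inconsistency the abstract flags, though it does not by itself close the gap to human grading that the study reports.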

Published

2024-05-20

Issue

Vol. 3 No. 1 (2024)

Section

Empowering Machine Learning and Large Language Models with Domain and Commonsense Knowledge