Better Peer Grading through Bayesian Inference
DOI:
https://doi.org/10.1609/aaai.v37i5.25757
Keywords:
HAI: Crowdsourcing, APP: Education, GTEP: Applications, GTEP: Mechanism Design
Abstract
Peer grading systems aggregate noisy reports from multiple students to approximate a "true" grade as closely as possible. Most current systems take either the mean or the median of the reported grades; others aim to estimate students' grading accuracy under a probabilistic model. This paper extends the state of the art in the latter approach in three key ways: (1) recognizing that students can behave strategically (e.g., reporting grades close to the class average without doing the work); (2) appropriately handling the censored data that arises from discrete-valued grading rubrics; and (3) using mixed integer programming to improve the interpretability of the grades assigned to students. We demonstrate how to make Bayesian inference practical in this model and evaluate our approach on both synthetic and real-world data obtained by using our implemented system in four large classes. These extensive experiments show that grade aggregation using our model accurately estimates true grades, students' likelihood of submitting uninformative grades, and the variation in their inherent grading error; we also characterize our model's robustness.
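To make the abstract's modeling ideas concrete, below is a minimal, self-contained Python sketch of the kind of generative process it describes: graders either report strategically (near the class average) or report the true grade plus noise, and all reports are censored onto a discrete rubric before being aggregated. This is an illustration only, not the authors' model, code, or parameter choices; every name and value here (e.g., p_strategic, grader_sd, the rubric spacing, the simple likelihood weighting) is an assumption made for the example.

```python
# Illustrative sketch (not the paper's model or implementation): a toy peer-grading
# generative process with strategic reports, per-grader noise, and rubric censoring,
# plus a crude model-based aggregate compared against mean/median baselines.
import numpy as np

rng = np.random.default_rng(0)

n_submissions, graders_per_submission = 200, 4
rubric = np.arange(0, 101, 5)      # assumed discrete rubric: 0, 5, ..., 100
class_average = 75.0
p_strategic = 0.15                 # assumed chance a grader skips the work
grader_sd = 6.0                    # assumed inherent grading error (std. dev.)

true_grades = rng.normal(class_average, 10.0, n_submissions)

reports = np.empty((n_submissions, graders_per_submission))
for i, g in enumerate(true_grades):
    strategic = rng.random(graders_per_submission) < p_strategic
    continuous = np.where(
        strategic,
        rng.normal(class_average, 2.0, graders_per_submission),  # uninformative report
        rng.normal(g, grader_sd, graders_per_submission),        # honest but noisy report
    )
    # Censoring: each continuous opinion is snapped to the nearest rubric level.
    reports[i] = rubric[np.abs(continuous[:, None] - rubric[None, :]).argmin(axis=1)]

# Baselines used by most current systems.
mean_est = reports.mean(axis=1)
median_est = np.median(reports, axis=1)

# A crude model-based aggregate: weight each report by the posterior probability
# that it was informative rather than strategic (ignoring censoring for brevity,
# and using the per-submission median as a rough stand-in for the true grade).
def normal_pdf(x, mu, sd):
    return np.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))

lik_informative = normal_pdf(reports, median_est[:, None], grader_sd)
lik_strategic = normal_pdf(reports, class_average, 2.0)
w = (1.0 - p_strategic) * lik_informative
w = w / (w + p_strategic * lik_strategic)
model_est = (w * reports).sum(axis=1) / w.sum(axis=1)

def rmse(est):
    return np.sqrt(np.mean((est - true_grades) ** 2))

print(f"mean RMSE:   {rmse(mean_est):.2f}")
print(f"median RMSE: {rmse(median_est):.2f}")
print(f"model RMSE:  {rmse(model_est):.2f}")
```

Unlike this simplified weighting with fixed parameters, the paper performs Bayesian inference over the latent quantities jointly, estimating true grades, each student's likelihood of submitting uninformative grades, and their inherent grading error, while properly accounting for the censoring induced by the discrete rubric.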
Published
2023-06-26
How to Cite
Zarkoob, H., d’Eon, G., Podina, L., & Leyton-Brown, K. (2023). Better Peer Grading through Bayesian Inference. Proceedings of the AAAI Conference on Artificial Intelligence, 37(5), 6137-6144. https://doi.org/10.1609/aaai.v37i5.25757
Issue
Section
AAAI Technical Track on Humans and AI