Universal Automatic Short Answer Grading (ASAG) Model: A Comprehensive Approach
Abstract
Automatic Short Answer Grading (ASAG) plays a crucial role in modern e-learning systems by providing efficient, accurate, and consistent assessment of student responses. However, many existing ASAG models struggle to generalize across domains and question complexities, often facing challenges such as limited training data, high computational costs, and variation in the length of student answers (SA) relative to reference answers (RA). This paper introduces a Universal ASAG Model that combines multiple natural language processing (NLP) techniques, including Sentence-BERT (SBERT), transformer-based attention, BERT, LSTMs, and BM25-based term weighting. The model features a length-adaptive architecture that categorizes answers into five groups (very short, short, medium, long, and very long) based on their length relative to the RA (e.g., very short: 0–30% shorter than the RA). Each category undergoes customized processing to improve both accuracy and computational efficiency. We provide a comprehensive breakdown of the model's architecture, detailing its processing pipeline, pseudocode implementation, mathematical foundations, hyperparameter tuning strategies, and experimental evaluation on benchmark datasets such as SciEntsBank and SemEval-2013. Our model achieves state-of-the-art results, including an F1-score of 91.2%, a Pearson correlation of 0.90, and an RMSE of 0.18, outperforming existing approaches. Additionally, we review recent advancements in ASAG, discussing key contributions, ongoing challenges, and potential future directions.
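To make the length-adaptive routing concrete, the sketch below shows one way the five-way categorization could be implemented as a preprocessing step before category-specific processing. It is a minimal illustration, not the paper's implementation: the whitespace tokenization, the function name `length_category`, and all band boundaries except the "very short" example quoted in the abstract are assumptions.

```python
def length_category(student_answer: str, reference_answer: str) -> str:
    """Bucket a student answer (SA) by its length relative to the reference answer (RA)."""
    sa_len = len(student_answer.split())
    ra_len = max(len(reference_answer.split()), 1)  # guard against an empty RA
    ratio = sa_len / ra_len  # SA length as a fraction of RA length

    # Hypothetical band boundaries (assumed for illustration; only the
    # "very short" band is exemplified in the abstract):
    if ratio < 0.3:
        return "very short"
    elif ratio < 0.7:
        return "short"
    elif ratio < 1.3:
        return "medium"
    elif ratio < 2.0:
        return "long"
    return "very long"


if __name__ == "__main__":
    ra = "Photosynthesis converts light energy into chemical energy stored in glucose."
    sa = "Plants turn sunlight into sugar."
    print(length_category(sa, ra))  # -> "very short" under the assumed thresholds
```

Routing each bucket to a differently sized sub-model is what allows the architecture to spend less computation on very short answers while reserving heavier processing for long ones.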