Ensemble Model for Early Diabetes Prediction Using Machine Learning
Abstract
Early prediction of diabetes is critical for timely intervention and for preventing long-term complications, yet conventional diagnostic methods and single-model predictors often fail to capture the complex, multifactorial nature of the disease. This study proposes a multimodal, ensemble-based system for early diabetes prediction that integrates heterogeneous data sources, including demographic, clinical, anthropometric, and lifestyle-related variables. Multiple machine learning models are trained as base learners to capture diverse risk patterns, and their predictions are combined by a stacking ensemble, in which a meta-learner is trained on the base learners' outputs, to improve robustness and predictive accuracy. The proposed system is evaluated with multiple performance metrics and statistical validation techniques. Results show that the multimodal ensemble model consistently outperforms the individual classifiers, achieving higher accuracy, recall, and discriminative ability, all of which are essential for early screening applications. Visual analyses further confirm effective class separation and the model’s capacity to capture nonlinear relationships between key metabolic indicators and diabetes risk. Overall, the findings indicate that ensemble learning combined with multimodal data integration is a reliable and scalable approach to early diabetes prediction, with strong potential for deployment in clinical decision-support and population-level screening systems.
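The stacking strategy described in the abstract can be illustrated with a minimal sketch. The abstract does not name the base learners, the meta-learner, or the dataset, so everything concrete below is an assumption made for illustration: the base learners are one-feature threshold rules, the meta-learner is a small logistic regression, the data are synthetic, and the feature names (glucose, BMI, age) and helper functions are hypothetical. The sketch only shows the general mechanics of stacking, in which out-of-fold base predictions become the training inputs of the meta-learner.

```python
import math
import random

def make_synthetic(n, seed):
    """Generate hypothetical (glucose, BMI, age) rows; NOT the study's data."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = int(rng.random() < 0.5)
        X.append([rng.gauss(140 if label else 100, 20),   # glucose
                  rng.gauss(32 if label else 26, 4),      # BMI
                  rng.gauss(52 if label else 42, 10)])    # age
        y.append(label)
    return X, y

def fit_stump(xs, ys):
    """Base learner (assumed): threshold t that best fits 'predict 1 if x > t'."""
    best_t, best_acc = xs[0], -1.0
    for t in sorted(set(xs)):
        acc = sum((x > t) == bool(label) for x, label in zip(xs, ys)) / len(ys)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t

def predict_proba(w, row):
    """Logistic model: w[0] is the bias, w[1:] are the feature weights."""
    z = w[0] + sum(wi * v for wi, v in zip(w[1:], row))
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(rows, ys, lr=0.5, epochs=300):
    """Meta-learner (assumed): logistic regression via batch gradient descent."""
    w = [0.0] * (len(rows[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for row, label in zip(rows, ys):
            err = predict_proba(w, row) - label
            grad[0] += err
            for j, v in enumerate(row):
                grad[j + 1] += err * v
        w = [wi - lr * g / len(ys) for wi, g in zip(w, grad)]
    return w

def fit_stacking(X, y, n_folds=5):
    """Train the meta-learner on out-of-fold base predictions, then refit bases."""
    n, d = len(X), len(X[0])
    meta = [[0.0] * d for _ in range(n)]
    for k in range(n_folds):
        train = [i for i in range(n) if i % n_folds != k]
        held_out = [i for i in range(n) if i % n_folds == k]
        for j in range(d):
            t = fit_stump([X[i][j] for i in train], [y[i] for i in train])
            for i in held_out:
                meta[i][j] = float(X[i][j] > t)
    meta_w = fit_logistic(meta, y)
    # Final base learners are refit on all training rows.
    thresholds = [fit_stump([row[j] for row in X], y) for j in range(d)]
    return thresholds, meta_w

def predict_stacking(model, row):
    """Combine the base learners' votes through the meta-learner."""
    thresholds, meta_w = model
    return predict_proba(meta_w, [float(v > t) for v, t in zip(row, thresholds)])

X_train, y_train = make_synthetic(300, seed=0)
X_test, y_test = make_synthetic(200, seed=1)
model = fit_stacking(X_train, y_train)
probs = [predict_stacking(model, row) for row in X_test]
preds = [int(p >= 0.5) for p in probs]
accuracy = sum(p == t for p, t in zip(preds, y_test)) / len(y_test)
recall = sum(p == 1 and t == 1 for p, t in zip(preds, y_test)) / sum(y_test)
print(f"accuracy={accuracy:.3f} recall={recall:.3f}")
```

Training the meta-learner on out-of-fold predictions, rather than on predictions for rows the base learners have already seen, keeps it from simply trusting whichever base learner overfits most. Recall is reported alongside accuracy because the abstract singles it out as important for screening. The numbers printed here describe only the synthetic data and say nothing about the study's results.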