Efficient Lightweight PCA-KNN-Based Model for Annual Dengue Risk Mapping in Urban Indonesia
Abstract
Introduction: Dengue Hemorrhagic Fever (DHF) remains a critical public health concern in tropical urban environments, particularly those constrained by limited resources. As part of Southeast Asia’s endemic dengue belt, Indonesia experiences recurrent seasonal outbreaks requiring timely, scalable, and data-driven risk stratification strategies.
Objectives: This study aims to develop an efficient and interpretable machine learning framework for annual dengue risk mapping in urban Indonesia. The model enables binary classification of outbreak severity, supporting early warning systems and guiding public health interventions.
Methods: A hybrid approach was implemented, combining Principal Component Analysis (PCA) for dimensionality reduction with K-Nearest Neighbors (KNN) for binary classification. Semarang City, characterized by persistent transmission and pronounced interannual variability, was selected as the empirical case study. The dataset comprised morbidity and mortality records from 2020 to 2025, enriched with epidemiological, climatological, and demographic indicators. PCA was applied to extract the most informative components, and KNN then classified each year as high-risk or low-risk. Model performance was evaluated using Leave-One-Out Cross-Validation (LOOCV).
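The pipeline described in the Methods can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name `pca_knn_loocv`, the number of retained components, the value of k, and the demonstration data are all assumptions, since the abstract does not report them. The feature matrix below is synthetic and stands in for the Semarang indicators only to show the mechanics. Standardization and PCA are refit inside each fold so that the held-out year does not influence the projection.

```python
import numpy as np


def pca_knn_loocv(X, y, n_components=2, k=1):
    """Return leave-one-out predictions from a PCA + KNN classifier.

    n_components and k are placeholder values (not reported in the abstract).
    y must contain non-negative integer class labels.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=int)
    n = len(y)
    preds = np.empty(n, dtype=int)

    for i in range(n):
        train = np.arange(n) != i          # hold out observation i
        X_tr, y_tr = X[train], y[train]

        # Standardize with training-fold statistics only.
        mu = X_tr.mean(axis=0)
        sd = X_tr.std(axis=0)
        sd[sd == 0] = 1.0                  # guard against constant columns
        Z_tr = (X_tr - mu) / sd
        z_te = (X[i] - mu) / sd

        # PCA via SVD of the centered training data:
        # rows of Vt are the principal directions.
        _, _, Vt = np.linalg.svd(Z_tr, full_matrices=False)
        W = Vt[:n_components].T
        P_tr = Z_tr @ W
        p_te = z_te @ W

        # KNN: majority vote among the k nearest training observations.
        dist = np.linalg.norm(P_tr - p_te, axis=1)
        nearest = np.argsort(dist)[:k]
        preds[i] = np.bincount(y_tr[nearest]).argmax()

    return preds


# Synthetic placeholder data: 6 "years" x 4 hypothetical indicators.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(6, 4))
y_demo = np.array([0, 1, 0, 0, 1, 0])      # hypothetical labels, 1 = high-risk

preds = pca_knn_loocv(X_demo, y_demo)
accuracy = float((preds == y_demo).mean())
```

With only six annual observations, each fold trains on five years, so the choice of k and of the number of components is tightly constrained by the sample size.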
Results: Under LOOCV, the PCA-KNN model achieved an accuracy of 83.33%, a precision of 66.7%, a recall of 100%, and an F1-score of 80%, demonstrating robustness across temporal variations. Its lightweight architecture and minimal computational demands underscore its suitability for deployment in resource-constrained settings.
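The reported metrics can be cross-checked with a short calculation. The confusion counts used here are an inference, not figures stated in the abstract: assuming one observation per year for 2020 to 2025 (six in total), the only counts consistent with the reported values are 2 true positives, 1 false positive, 0 false negatives, and 3 true negatives. The helper `binary_metrics` is a hypothetical name introduced for illustration.

```python
def binary_metrics(tp, fp, fn, tn):
    """Compute standard binary classification metrics from confusion counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
    }


# Inferred counts (an assumption): six annual observations, high-risk = positive.
metrics = binary_metrics(tp=2, fp=1, fn=0, tn=3)

for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

Under these counts, accuracy is 5/6, precision is 2/3, recall is 2/2, and the F1-score is 0.8, which matches the reported figures up to rounding. One misclassified year therefore accounts for the entire gap from perfect accuracy.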
Conclusions: This study presents a replicable and pragmatic annual dengue risk stratification framework. The model’s computational efficiency, interpretability, and operational relevance highlight its potential utility in epidemic preparedness, vector control planning, and public health surveillance, particularly in urban regions with limited infrastructure and high disease burden.