Texture-Based Biomedical Anomaly Detection Using Supervised Learning Techniques and GLCM
Abstract
Anomaly detection in biomedical imaging plays a vital role in early disease diagnosis and treatment planning. However, manual interpretation of images is labour-intensive, subjective, and often inconsistent, especially for complex textures or low-contrast anomalies. This paper presents a structured machine learning (ML) approach for automatically classifying anomalies in biomedical images using texture features extracted via the Gray Level Co-occurrence Matrix (GLCM). The study considers three ML classifiers, Support Vector Machine (SVM), Random Forest (RF), and Logistic Regression (LR), and focuses its performance evaluation on Random Forest and Logistic Regression. Images are first preprocessed through resizing and grayscale conversion, after which four GLCM texture features are extracted: contrast, correlation, energy, and homogeneity. The data is then split into training and testing subsets, and the Random Forest and Logistic Regression classifiers are trained to perform binary classification, distinguishing normal images from anomalous ones using labels derived from the image filenames. Model performance is assessed with confusion matrices, ROC curves and AUC scores, accuracy scores, and classification reports. Visualizations such as heatmaps, prediction distributions, and bar charts are provided to help interpret the results. Initial results indicate that both classifiers struggle to distinguish between the classes, yielding accuracy values only slightly above random chance (51%–53%) and AUC scores around 0.5. After fine-tuning, however, Random Forest substantially outperforms Logistic Regression, achieving an AUC of 0.92 and higher accuracy. The feature importance plot from the Random Forest model shows that all four GLCM features contribute nearly equally to predictions, with correlation being the most influential.
Overall, the study confirms that GLCM-based features provide a viable yet limited basis for anomaly detection in biomedical images. While Random Forest shows stronger generalization than Logistic Regression, results suggest that these traditional models alone are insufficient. The paper concludes with recommendations to enhance performance using deep learning embeddings, improved preprocessing techniques, or hybrid models that integrate multiple feature types for more robust detection capabilities.
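The four GLCM texture features named in the abstract can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the function names `glcm` and `glcm_features`, the single horizontal pixel offset, and the symmetric normalization are all assumptions made for illustration. Energy is taken here as the square root of the angular second moment, which is one common convention (other tools report the unrooted sum), and correlation is set to 1.0 for a constant-intensity matrix, which is also an assumed convention.

```python
import math


def glcm(image, levels, dx=1, dy=0, symmetric=True):
    """Normalized gray-level co-occurrence matrix for one pixel offset.

    `image` is a list of rows of integer gray levels in [0, levels).
    The offset (dx, dy) and the symmetric counting are illustrative choices.
    """
    counts = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1
                if symmetric:
                    counts[j][i] += 1
    total = sum(sum(row) for row in counts)
    if total == 0:
        raise ValueError("image too small for the requested offset")
    return [[v / total for v in row] for row in counts]


def glcm_features(p):
    """Contrast, correlation, energy, and homogeneity of a normalized GLCM."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    contrast = sum(v * (i - j) ** 2 for i, j, v in cells)
    homogeneity = sum(v / (1 + (i - j) ** 2) for i, j, v in cells)
    # Assumed convention: energy = sqrt(angular second moment).
    energy = math.sqrt(sum(v * v for _, _, v in cells))
    mu_i = sum(i * v for i, _, v in cells)
    mu_j = sum(j * v for _, j, v in cells)
    var_i = sum(v * (i - mu_i) ** 2 for i, _, v in cells)
    var_j = sum(v * (j - mu_j) ** 2 for _, j, v in cells)
    cov = sum(v * (i - mu_i) * (j - mu_j) for i, j, v in cells)
    denom = math.sqrt(var_i * var_j)
    # Assumed convention: a constant image has correlation 1.0.
    correlation = cov / denom if denom > 0 else 1.0
    return {
        "contrast": contrast,
        "correlation": correlation,
        "energy": energy,
        "homogeneity": homogeneity,
    }


if __name__ == "__main__":
    # Tiny 2-level image with vertical stripes; real images would first be
    # resized, converted to grayscale, and quantized to `levels` gray levels.
    stripes = [[0, 1], [0, 1]]
    print(glcm_features(glcm(stripes, levels=2)))
```

In a pipeline like the one described, the resulting four-value feature vector per image would be the input to the Random Forest and Logistic Regression classifiers; a library routine would normally replace these loops on full-size images.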