Application of Machine Learning Algorithmic Models for the Authentication of Albanian Mono-Cultivar Olive Oils
Abstract
Valued for its nutritional benefits and high economic value, olive oil is a frequent target of adulteration and fraud. As production increases, the need to identify olive oil by specific cultivar and region has become more pressing. Analyzing chemical data related to geographical origin and olive cultivar can support advanced quality control and authentication practices for Albanian olive oil, enhancing its competitiveness as an organic product. While olive oil fraud detection and quality evaluation have traditionally relied on empirical methods, this study pioneers a modern approach that uses machine learning algorithms to differentiate authentic products from counterfeits. Establishing effective mechanisms and best practices for tracing product origin and quality indicators will also raise awareness of the risks that adulteration poses to consumers' health and to the broader food industry. Data pre-processing is crucial to the precision of origin predictions: in particular, normalizing the independent features after separating them from the target variable improves the accuracy of distance-based algorithms such as k-Nearest Neighbors (kNN). Performance metrics were evaluated for all algorithms (k-Nearest Neighbors, Logistic Regression, and Support Vector Machines), hyperparameter tuning techniques were applied, and the best-performing model was identified. Applying supervised machine learning methods to classify Albanian olive oils (OO) by their chemical composition aids in identifying their geographical and cultivar origin. Our results indicate an accuracy of 88.88%, a figure constrained by the limitations of the current dataset, which we intend to expand in future work.
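The effect of normalization on a distance-based classifier, together with a simple form of hyperparameter tuning, can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the tooling, the scaling method, or the validation scheme, so min-max scaling, leave-one-out validation, the candidate values of k, and every function name below are assumptions. The eight samples are invented for illustration (a small-scale feature that separates two hypothetical cultivars and a large-scale feature that does not) and are not measurements from the study. Only kNN is covered; Logistic Regression and Support Vector Machines are omitted.

```python
import math
from collections import Counter

# Invented toy data: (small-scale feature, large-scale feature) per sample.
# The first feature separates the classes; the second dominates raw distances.
X = [[0.20, 100.0], [0.22, 300.0], [0.24, 500.0], [0.26, 700.0],
     [0.60, 110.0], [0.62, 310.0], [0.64, 510.0], [0.66, 690.0]]
y = ["A", "A", "A", "A", "B", "B", "B", "B"]  # hypothetical cultivar labels


def min_max_fit(X_train):
    """Learn per-feature minimum and range from the training features only."""
    columns = list(zip(*X_train))
    mins = [min(c) for c in columns]
    ranges = [(max(c) - min(c)) or 1.0 for c in columns]  # guard constant columns
    return mins, ranges


def min_max_apply(X_any, mins, ranges):
    """Rescale features with statistics learned on the training set."""
    return [[(v - m) / r for v, m, r in zip(row, mins, ranges)] for row in X_any]


def knn_predict(X_train, y_train, x, k):
    """Majority vote among the k training samples closest to x (Euclidean)."""
    neighbours = sorted(
        (math.dist(row, x), label) for row, label in zip(X_train, y_train)
    )
    votes = Counter(label for _, label in neighbours[:k])
    return votes.most_common(1)[0][0]


def loo_accuracy(X_all, y_all, k, normalize=True):
    """Leave-one-out accuracy: each sample is predicted from all the others."""
    correct = 0
    for i in range(len(X_all)):
        X_train = X_all[:i] + X_all[i + 1:]
        y_train = y_all[:i] + y_all[i + 1:]
        x_test = [X_all[i]]
        if normalize:
            # Fit the scaler on the training fold only, to avoid leakage.
            mins, ranges = min_max_fit(X_train)
            X_train = min_max_apply(X_train, mins, ranges)
            x_test = min_max_apply(x_test, mins, ranges)
        correct += knn_predict(X_train, y_train, x_test[0], k) == y_all[i]
    return correct / len(X_all)


# Hyperparameter tuning: choose k by leave-one-out accuracy on normalized data.
candidate_ks = (1, 3, 5)
best_k = max(candidate_ks, key=lambda k: loo_accuracy(X, y, k))

print("raw features, k=1:       ", loo_accuracy(X, y, 1, normalize=False))
print("normalized features, k=1:", loo_accuracy(X, y, 1, normalize=True))
print("selected k:              ", best_k)
```

Without scaling, the large-scale feature dominates every Euclidean distance, so the nearest neighbour is chosen almost entirely by that feature. Fitting the scaler inside each validation fold keeps the held-out sample from influencing the scaling statistics.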