Optimizing Biogas Production with Machine Learning: A Comparative Study of Predictive Models
Abstract
Predicting biogas production is essential for optimizing operating conditions, improving process efficiency, and supporting sustainable energy systems. Traditional yield prediction methods struggle to capture the complex, nonlinear interactions among influential factors such as feedstock composition, temperature, pH, and retention time. Machine learning (ML) models offer a promising alternative: they learn patterns from historical data to make accurate, data-driven predictions. This study evaluates six ML models, namely Linear Regression (LR), Decision Trees (DT), Random Forests (RF), Support Vector Machines (SVM), k-Nearest Neighbors (k-NN), and Artificial Neural Networks (ANNs), for predicting biogas production, using a dataset from an experiment conducted over more than five years, from January 1, 2019, to October 30, 2024. Each model was assessed with common regression metrics, including Mean Absolute Error (MAE), Mean Squared Error (MSE), and the coefficient of determination (R²), to compare accuracy, robustness, and suitability for biogas data, which often involve nonlinear relationships and multivariate interactions. The findings show that DT and RF outperform simpler approaches in accuracy, with R² scores of 0.999 and 0.998, respectively, making them well suited to complex biogas prediction tasks. This study underscores the potential of ML models for optimizing biogas production systems and contributes to the development of efficient, scalable solutions for renewable energy management.
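As an illustration of the three evaluation metrics named in the abstract, the following minimal Python sketch computes MAE, MSE, and R² from their standard definitions. The function names and the `observed` and `predicted` values are hypothetical and chosen only for this example; they are not taken from the study's dataset or code.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between observed and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_squared_error(y_true, y_pred):
    """Average squared difference; penalizes large errors more than MAE."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - (residual SS / total SS)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical biogas yields (illustrative numbers, not from the study).
observed = [10.0, 12.0, 14.0, 16.0]
predicted = [11.0, 12.0, 13.0, 16.0]

mae = mean_absolute_error(observed, predicted)
mse = mean_squared_error(observed, predicted)
r2 = r2_score(observed, predicted)

print(f"MAE={mae:.3f}  MSE={mse:.3f}  R2={r2:.3f}")
```

Lower MAE and MSE indicate smaller prediction errors, while an R² close to 1 indicates that the model explains nearly all of the variance in the observed values.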