A Multi-Class Classification-Based Machine Learning Approach for Predicting Liver Cirrhosis Outcomes
Abstract
This study presents a machine learning framework for predicting liver cirrhosis outcomes from clinical data obtained from Kaggle's open-access repository. Using 18 biomarkers, including Bilirubin, Albumin, Prothrombin time, and Platelets, across 2,500 patient records, we developed an optimized Random Forest classifier for three-class outcome prediction (Alive, Deceased, Transplant). The model achieved 98.92% overall accuracy, with strong class-specific metrics: Alive (precision = 0.99, recall = 0.99), Deceased (precision = 0.99, recall = 0.99), and Transplant (precision = 0.99, recall = 0.97, F1-score = 0.98). Feature importance analysis identified Bilirubin (mean decrease in impurity = 0.175), Copper (0.125), and Albumin (0.100) as the top predictors, consistent with known clinical biomarkers of hepatic dysfunction. The consistent performance across all outcome categories, including Transplant cases, suggests potential for clinical decision support. Preprocessing included median imputation of missing values and SMOTE for class balancing, and the use of an open dataset supports reproducibility. These results indicate that machine learning models trained on publicly available clinical data can predict cirrhosis outcomes with high accuracy, with implications for resource allocation and treatment planning in hepatology practice.
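The two preprocessing steps named in the abstract, median imputation and SMOTE, can be illustrated with a minimal standard-library sketch. This is not the authors' code: the function names, the toy feature values, and the parameters `k` and `seed` are assumptions made for illustration. A real pipeline would more likely use library implementations of both steps and feed the result to a Random Forest classifier.

```python
import random
from statistics import median


def impute_median(rows):
    """Replace None entries in each column with that column's median."""
    n_cols = len(rows[0])
    medians = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        medians.append(median(observed))
    return [
        [medians[j] if r[j] is None else r[j] for j in range(n_cols)]
        for r in rows
    ]


def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic minority samples.

    Each synthetic point is a random interpolation between a minority
    sample and one of its k nearest minority-class neighbours.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority points by squared Euclidean distance.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Hypothetical toy records: [bilirubin, albumin], with None marking a gap.
rows = [[1.1, 4.0], [None, 3.5], [3.0, None], [14.5, 2.6], [0.9, 4.2]]
imputed = impute_median(rows)

# Hypothetical minority-class samples (already imputed) to oversample.
minority = [[3.2, 3.1], [4.0, 2.9], [3.6, 3.4], [5.1, 2.7]]
synthetic = smote(minority, n_new=4)
```

Because each synthetic sample lies on the line segment between two existing minority samples, oversampling adds no values outside the observed range of that class. Oversampling is normally applied to the training split only, so that synthetic points do not leak into the evaluation data.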