Analyzing Technology Access Inequality in California’s Homeless Population Using Machine Learning Techniques

Md Refadul Hoque, Jannat Ara, Lisa Chambugong, Arifa Siddiqua, Samina Ahmed, Iftekhar Hossain, Nasrin Akter Tohfa

Abstract

This study analyzes county-level disparities in technology access across California using an interpretable machine-learning framework that combines supervised classification, explainable AI, and unsupervised clustering. Using publicly available broadband and socioeconomic variables from the U.S. Broadband Availability dataset for California's 58 counties, it models a binary outcome (lower-access versus higher-access counties) and evaluates several algorithms: logistic regression, SVM with an RBF kernel, random forest, gradient boosting, and extra trees. The comparative analysis indicates that logistic regression performs best overall (accuracy about 0.611, F1-score about 0.462), with moderate discriminative ability (ROC-AUC about 0.59) and the best precision-recall performance (average precision about 0.66). Reliability is also assessed through calibration and threshold-sensitivity analysis to highlight trade-offs relevant to policy targeting. Explainability analyses (feature importance, SHAP, and partial dependence) show that structural socioeconomic factors, particularly poverty, unemployment, population dynamics, and dependence on social assistance, are the main contributors to broadband exclusion, whereas direct household-level technology indicators play a secondary role. Finally, K-means clustering (K = 3) separates counties into three distinct broadband-access profiles (well-connected, transitional, and severely underserved), which can inform more tailored intervention strategies.
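
The workflow summarized in the abstract (comparing several classifiers on a binary access label, checking threshold sensitivity, then clustering with K = 3) might be sketched roughly as follows. This is an illustrative sketch only, not the authors' code: it uses synthetic data from scikit-learn in place of the U.S. Broadband Availability dataset, and the feature count, cross-validation scheme, hyperparameters, and the calibrated wrapper around the SVM are all assumptions, so its scores will not match the reported ones.

```python
import numpy as np
from sklearn.calibration import CalibratedClassifierCV
from sklearn.cluster import KMeans
from sklearn.datasets import make_classification
from sklearn.ensemble import (ExtraTreesClassifier, GradientBoostingClassifier,
                              RandomForestClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, average_precision_score,
                             f1_score, precision_score, recall_score,
                             roc_auc_score)
from sklearn.model_selection import StratifiedKFold, cross_val_predict
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for 58 counties (assumption: NOT the paper's data).
X, y = make_classification(n_samples=58, n_features=8, n_informative=4,
                           random_state=0)

# The five model families named in the abstract; settings are assumptions.
models = {
    "logistic_regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)),
    # Sigmoid calibration is used here only to obtain probabilities from the SVM.
    "svm_rbf": make_pipeline(
        StandardScaler(), CalibratedClassifierCV(SVC(kernel="rbf"), cv=3)),
    "random_forest": RandomForestClassifier(n_estimators=200, random_state=0),
    "gradient_boosting": GradientBoostingClassifier(random_state=0),
    "extra_trees": ExtraTreesClassifier(n_estimators=200, random_state=0),
}

# Out-of-fold probabilities so every county is scored by a model
# that did not see it during training.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
results, probas = {}, {}
for name, model in models.items():
    proba = cross_val_predict(model, X, y, cv=cv, method="predict_proba")[:, 1]
    pred = (proba >= 0.5).astype(int)
    probas[name] = proba
    results[name] = {
        "accuracy": accuracy_score(y, pred),
        "f1": f1_score(y, pred, zero_division=0),
        "roc_auc": roc_auc_score(y, proba),
        "average_precision": average_precision_score(y, proba),
    }

# Threshold sensitivity for one model: precision and recall as the cutoff moves.
threshold_table = []
for t in (0.3, 0.4, 0.5, 0.6, 0.7):
    pred_t = (probas["logistic_regression"] >= t).astype(int)
    threshold_table.append((t,
                            precision_score(y, pred_t, zero_division=0),
                            recall_score(y, pred_t, zero_division=0)))

# Unsupervised profiles: K-means with K = 3 on standardized features.
cluster_labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(
    StandardScaler().fit_transform(X))

for name, scores in results.items():
    print(name, {k: round(v, 3) for k, v in scores.items()})
print("cluster sizes:", np.bincount(cluster_labels))
```

With only 58 rows, out-of-fold scoring of this kind is noisy, which is one reason a threshold table can be more informative for policy targeting than a single accuracy figure.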
