A novel equitable machine-learning framework for chronic disease prediction in underserved U.S. communities

Justine Aku Azigi 1, * and Abdullahi Abdulkareem 2

1 Department of Computer Science, University of Ghana, Ghana.
2 Department of Agriculture, University of Ilorin, Nigeria.
 
Research Article
World Journal of Advanced Research and Reviews, 2024, 24(01), 2858-2866
Article DOI: 10.30574/wjarr.2024.24.1.2995
 
Publication history: 
Received on 20 August 2024; revised on 21 October 2024; accepted on 28 October 2024
 
Abstract: 
Chronic diseases such as diabetes, heart disease, and chronic respiratory illness remain leading causes of morbidity and mortality in the United States, and they disproportionately affect low-income and minority communities. This study develops and evaluates equitable machine-learning models to predict chronic disease risk using the 2020 Behavioral Risk Factor Surveillance System (BRFSS) dataset, which contains over 300,000 adult health records across all 50 states. After data cleaning and feature engineering, we trained logistic regression, random forest, and XGBoost classifiers to predict diabetes as a proxy for chronic disease. Model performance was assessed using accuracy, F1 score, and area under the ROC curve (AUC), alongside fairness metrics disaggregated by race, income, and education. The random forest model achieved high predictive performance (AUC ≈ 0.80) while revealing notable disparities in predicted risk across demographic groups. To address these disparities, we implemented fairness-aware post-processing that reduced bias without significantly reducing accuracy. The findings demonstrate that equitable AI systems can enhance early chronic-disease prediction, promote health equity, and support data-driven public-health initiatives, in line with national efforts toward responsible public-health analytics and data modernization.
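
As an illustration of what fairness metrics disaggregated by a demographic attribute can look like, the following is a minimal sketch and not the authors' implementation. The abstract does not specify which fairness metrics or which post-processing method were used, so the choice of selection rate and true-positive rate, the function names `group_rates` and `parity_gap`, the group labels, and the toy data are all assumptions made for this example.

```python
from collections import defaultdict


def group_rates(y_true, y_pred, groups):
    """Per-group selection rate and true-positive rate for binary labels.

    Hypothetical helper for illustration; not taken from the study.
    """
    counts = defaultdict(lambda: {"n": 0, "pred_pos": 0, "pos": 0, "true_pos": 0})
    for yt, yp, g in zip(y_true, y_pred, groups):
        c = counts[g]
        c["n"] += 1             # records in this group
        c["pred_pos"] += yp     # predicted positive (e.g. flagged as at risk)
        c["pos"] += yt          # actually positive
        c["true_pos"] += yt * yp
    rates = {}
    for g, c in counts.items():
        rates[g] = {
            "selection_rate": c["pred_pos"] / c["n"],
            # TPR is undefined when a group has no positive cases.
            "tpr": c["true_pos"] / c["pos"] if c["pos"] else None,
        }
    return rates


def parity_gap(rates, key="selection_rate"):
    """Largest difference in the chosen rate between any two groups."""
    values = [r[key] for r in rates.values() if r[key] is not None]
    return max(values) - min(values)


# Toy data (assumed): 1 = has the condition / predicted at risk.
y_true = [1, 0, 1, 0, 1, 0, 0, 1]
y_pred = [1, 0, 1, 1, 0, 0, 0, 1]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

rates = group_rates(y_true, y_pred, groups)
selection_gap = parity_gap(rates, "selection_rate")
tpr_gap = parity_gap(rates, "tpr")
```

In this toy example the two groups differ in both how often they are flagged and how often true cases are caught. Gaps of this kind are what a post-processing step, for instance group-specific decision thresholds, would aim to shrink.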
 
Keywords: 
Diabetes Prediction; Feature Engineering; Preventive Healthcare; Chronic Conditions; Machine Learning; Health Indicators
 