Introduction
The purpose of the present study is to demonstrate the importance of accounting for health inequities in orthopaedic machine learning (ML) models through the development of a racially equitable Random Forest (RF) algorithm that predicts overnight stay for knee arthroscopy patients who self-identified as White, Black or African American, or Asian. We hypothesize that a racially equitable ML model will exhibit high levels of performance irrespective of self-identified patient race.
Methods
This retrospective study queried the National Surgical Quality Improvement Program (NSQIP) database for patients ≥18 years old who underwent knee arthroscopy between 2011 and 2020. The study population consisted of patients who self-identified as White, Black or African American, or Asian and had complete demographic, intraoperative, and total hospital length of stay (LOS) data. Patients with an LOS ≥1 day formed the overnight stay cohort, and those with an LOS <1 day formed the same-day discharge cohort. Predictive analysis was performed using an RF algorithm, and model performance was assessed using area under the receiver operating characteristic curve (AUC), accuracy, and F1-score. Feature selection, hyperparameter tuning, the synthetic minority over-sampling technique (SMOTE), and undersampling were employed to optimize performance.
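The stratified evaluation described above can be illustrated with a minimal sketch that computes AUC, accuracy, and F1-score separately for each group. It is not the study's code: the function names, the 0.5 decision threshold, the group labels, and the toy data are all assumptions made for illustration, and the sketch uses only the Python standard library. In practice, a library such as scikit-learn (`roc_auc_score`, `accuracy_score`, `f1_score`) would likely be used instead.

```python
from collections import defaultdict


def auc_score(y_true, y_score):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a win. This pairwise form is O(n_pos * n_neg),
    which is fine for a sketch but slow on large cohorts.
    """
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        return float("nan")  # AUC is undefined without both classes
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred):
    """F1 for the positive class (label 1): 2TP / (2TP + FP + FN)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def metrics_by_group(groups, y_true, y_score, threshold=0.5):
    """Report n, AUC, accuracy, and F1 separately for each group label."""
    by_group = defaultdict(lambda: ([], []))
    for g, t, s in zip(groups, y_true, y_score):
        by_group[g][0].append(t)
        by_group[g][1].append(s)
    report = {}
    for g, (truth, scores) in by_group.items():
        # Assumed threshold: a score >= threshold predicts overnight stay.
        preds = [1 if s >= threshold else 0 for s in scores]
        report[g] = {
            "n": len(truth),
            "auc": auc_score(truth, scores),
            "accuracy": accuracy(truth, preds),
            "f1": f1_score(truth, preds),
        }
    return report


# Toy data (hypothetical, not from NSQIP): 1 = overnight stay, 0 = same-day.
groups = ["group_a"] * 4 + ["group_b"] * 4
y_true = [1, 0, 1, 0, 1, 0, 0, 1]
y_score = [0.9, 0.2, 0.6, 0.4, 0.8, 0.7, 0.1, 0.3]

report = metrics_by_group(groups, y_true, y_score)
for name, m in report.items():
    print(name, m)
```

Reporting each metric per group, rather than one pooled value, is what exposes a model that performs well overall but poorly for a smaller group.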
Results
Of the 73,771 patients who met inclusion criteria, 7.6% (n=5,593) required overnight stay. By race, 80.4% of patients self-identified as White, 14.0% as Black or African American, and 5.6% as Asian. The RF model showed high levels of performance by AUC (mean=0.95±0.03) and accuracy (mean=0.96±0.01) across all included self-identified races. The F1-score was excellent for Asian patients (0.92) and good for Black or African American (0.73) and White patients (0.72).
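For reference, the F1-score is the harmonic mean of precision and recall. The definitions below are the standard ones and assume that overnight stay is treated as the positive class, which is not stated explicitly here; TP, FP, and FN denote true positives, false positives, and false negatives.

```latex
\begin{align}
\text{Precision} &= \frac{TP}{TP + FP}, \qquad
\text{Recall} = \frac{TP}{TP + FN}, \\
F_1 &= \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}
     = \frac{2\,TP}{2\,TP + FP + FN}.
\end{align}
```

Because true negatives do not enter the F1-score, it can diverge from accuracy when the positive class is rare, which is one reason to report both metrics for each group.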
Conclusions
The present study demonstrates a racially equitable RF algorithm that maintains high accuracy for patients who self-identified as White, Black or African American, or Asian when performance is stratified by patient race. Stratifying ML performance by patient race is an essential step toward ensuring that prognostic orthopaedic models are racially equitable and perform accurately for all patients.