Machine Learning Can Mitigate Racial Inequities In Predictive Orthopaedic Models

Jonathan S. Lee, BA, UNITED STATES; Megan Wei, MS, UNITED STATES; Stephen M. Gillinov, AB, UNITED STATES; Bilal Siddiq, BS, UNITED STATES; Kieran Sinclair Dowley, BA, UNITED STATES; Nathan J. Cherian, MD, UNITED STATES; Jeffrey S Mun, BA, UNITED STATES; Srish S. Chenna, BSE, UNITED STATES; Scott D. Martin, MD, UNITED STATES

Massachusetts General Hospital, Boston, MA, UNITED STATES


2025 Congress ePoster Presentation

 

Summary: A racially equitable Random Forest algorithm can be created that performs with high accuracy for patients who self-identified as White, Black or African American, or Asian when performance is stratified by patient race.


Introduction

The purpose of the present study is to demonstrate the importance of accounting for health inequities in orthopaedic machine learning (ML) models through the development of a racially equitable Random Forest (RF) algorithm that predicts overnight stay for knee arthroscopy patients who self-identified as White, Black or African American, or Asian. We hypothesize that a racially equitable ML model will exhibit high levels of performance irrespective of self-identified patient race.

Methods

This retrospective study queried the National Surgical Quality Improvement Program (NSQIP) database for patients ≥18 years old who underwent knee arthroscopy between 2011 and 2020. The study population consisted of patients who self-identified as White, Black or African American, or Asian and had complete demographic, intraoperative, and total hospital length of stay (LOS) data. Patients with a LOS ≥1 day formed the overnight stay cohort, and patients with a LOS <1 day formed the same-day discharge cohort. Predictive analysis was performed using an RF algorithm, and model performance was assessed using area under the receiver operating characteristic curve (AUC), accuracy, and F1-score. Feature selection, hyperparameter tuning, the synthetic minority over-sampling technique (SMOTE), and undersampling were employed to optimize performance.
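The two resampling steps named above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not report the neighbour count, resampling ratios, or software used, so the function names, the value of `k`, the toy feature rows, and the target class sizes below are all hypothetical. SMOTE creates each synthetic minority-class row by interpolating between a real minority row and one of its nearest minority neighbours, and random undersampling keeps a random subset of the majority class. Resampling of this kind is normally applied to the training split only.

```python
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority rows by interpolating toward near neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority rows by squared Euclidean distance to `base`.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


def undersample(majority, n_keep, seed=0):
    """Randomly keep n_keep majority rows, without replacement."""
    return random.Random(seed).sample(majority, n_keep)


# Hypothetical toy feature rows (two scaled features per patient); not study data.
overnight = [[0.9, 0.8], [0.8, 0.9], [0.7, 0.7], [1.0, 0.6]]      # minority class
same_day = [[i / 50, (i * 7 % 50) / 50] for i in range(50)]       # majority class

new_rows = smote(overnight, n_new=16, k=3)
balanced_pos = overnight + new_rows          # 4 real + 16 synthetic rows
balanced_neg = undersample(same_day, n_keep=20)
```

Because every synthetic row lies on a segment between two real minority rows, the oversampled class stays inside the region the real overnight-stay rows already occupy.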

Results

Of the 73,771 patients who met inclusion criteria, 7.6% (n=5,593) required an overnight stay. When stratified by race, most patients self-identified as White (80.4%), followed by Black or African American (14.0%) and Asian (5.6%). The RF model showed high levels of performance by AUC (mean=0.95±0.03) and accuracy (mean=0.96±0.01) across all included self-identified races. The F1-score was excellent for Asian patients (0.92) and good for Black or African American (0.73) and White patients (0.72).
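Stratified evaluation of this kind can be sketched as follows. This is an illustrative sketch rather than the study's code: the helper names, the 0.5 decision threshold, the group labels `group_A` and `group_B`, and the toy labels and scores are all hypothetical. Only the two cohort counts are taken from the results above. They show why accuracy alone says little here: with 7.6% of patients staying overnight, a model that always predicts same-day discharge already reaches about 92% accuracy while its F1-score for overnight stay is 0.

```python
from collections import defaultdict


def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred):
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def auc(y_true, scores):
    """Chance that a random positive outscores a random negative (ties count 0.5).

    Assumes the group contains at least one positive and one negative case.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def stratified_report(groups, y_true, scores, threshold=0.5):
    """Compute AUC, accuracy, and F1 separately for each group label."""
    by_group = defaultdict(lambda: ([], []))
    for g, t, s in zip(groups, y_true, scores):
        by_group[g][0].append(t)
        by_group[g][1].append(s)
    report = {}
    for g, (ts, ss) in by_group.items():
        preds = [int(s >= threshold) for s in ss]
        report[g] = {"auc": auc(ts, ss),
                     "accuracy": accuracy(ts, preds),
                     "f1": f1_score(ts, preds)}
    return report


# Cohort counts reported in the results: an always-"same-day" baseline.
n_total, n_overnight = 73771, 5593
baseline_accuracy = (n_total - n_overnight) / n_total

# Hypothetical toy labels (1 = overnight stay) and model scores; not study data.
groups = ["group_A"] * 6 + ["group_B"] * 6
y_true = [1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.2, 0.1, 0.3, 0.2, 0.8, 0.4, 0.3, 0.2, 0.1, 0.6]
report = stratified_report(groups, y_true, scores)
```

Reporting the three metrics per group, rather than once for the pooled cohort, is what allows a gap between groups to be seen at all, since a pooled figure is dominated by the largest group.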

Conclusions

The present study demonstrates a racially equitable RF algorithm that performs with high accuracy for patients who self-identified as White, Black or African American, or Asian when stratified by patient race. Stratifying ML performance by patient race is an imperative step for ensuring that prognostic orthopaedic models are racially equitable and perform with high degrees of accuracy for all patients.