2025 ISAKOS Biennial Congress ePoster
A Machine Learning-Powered Decision Tool to Predict the Risk of Knee Arthroplasty in Patients with Osteoarthritis Using Plain Radiographs and Routine Clinical Data
Ahmed Elgebaly, MD, PhD, Ottershaw, Surrey UNITED KINGDOM
Atef Abdelrahman Hassan, MD, Cairo, Cairo EGYPT
Rawad K M Hammad, MD, Aylesbury, Buckinghamshire UNITED KINGDOM
Hassan Abdalla, MD, London UNITED KINGDOM
Mohamed A. Imam, MD, MSc, DSportMed, ELD (Oxon), PhD, FRCS, London UNITED KINGDOM
University of East London, London, London, UNITED KINGDOM
FDA Status Not Applicable
Summary
A machine learning-powered decision tool using clinical and imaging data accurately predicts the risk of knee replacement in osteoarthritis patients, with Random Forest and XGBoost models showing the best performance.
Abstract
Introduction
Osteoarthritis (OA) is the most prevalent degenerative joint condition, profoundly affecting patients' mobility and quality of life while imposing a considerable economic burden worldwide. As the global population ages and obesity rates increase, the incidence of OA is rising, and demand for knee arthroplasty (KA) is expected to surge. Although KA is highly effective, predicting which patients will require this surgery remains a significant challenge in clinical practice. Accurate prediction tools are needed to better inform treatment decisions and optimize patient outcomes. This study aimed to develop a practical machine learning (ML)-powered decision tool to accurately predict KA risk using a combination of routine clinical data and data from plain radiographs.
Methods
Data for 4,796 patients (12,813 knees) who were followed for at least five years were retrieved from the Osteoarthritis Initiative (OAI) dataset. A wide range of demographic and clinical variables (such as age, body mass index, functional scores, and the 12-Item Short Form Survey) and plain radiographic features (including Kellgren-Lawrence [K-L] grade and Osteoarthritis Research Society International [OARSI] grades) were selected. Missing values were handled by median imputation for numerical variables and mode imputation for categorical variables. Categorical variables were then label encoded to facilitate model training. Feature selection was performed using mutual information (MI) scores to identify the features most relevant to the target variable. Nine ML models were evaluated using a train-test split (80% training, 20% testing) to predict the risk of KA within five years. Class imbalance was addressed using the synthetic minority over-sampling technique (SMOTE). Model performance was assessed using precision and area under the receiver operating characteristic curve (AUC).
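The preprocessing steps described above can be illustrated with a minimal, standard-library-only Python sketch. It is not the authors' code: the abstract does not state which software was used, the toy columns are hypothetical (not OAI data), and the helper names `impute`, `label_encode`, and `smote` are invented for illustration. Applying the oversampling to the training split only is also an assumption, since the abstract does not say at which stage it was applied. In practice, libraries such as scikit-learn and imbalanced-learn provide tested versions of these steps.

```python
import random
from statistics import median, mode


def impute(column, numeric):
    """Fill None with the median (numeric) or the mode (categorical)."""
    observed = [v for v in column if v is not None]
    fill = median(observed) if numeric else mode(observed)
    return [fill if v is None else v for v in column]


def label_encode(column):
    """Map each category to an integer code, assigned in sorted order."""
    codes = {cat: i for i, cat in enumerate(sorted(set(column)))}
    return [codes[v] for v in column]


def smote(minority, n_new, k=2, seed=0):
    """SMOTE-style oversampling: each synthetic point lies on the segment
    between a minority sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Hypothetical toy columns (NOT OAI data); None marks a missing value.
age = impute([61, None, 70, 55, 66, 72, None, 58, 64, 69], numeric=True)
sex = label_encode(
    impute(["F", "M", None, "F", "F", "M", "F", "M", "F", "F"], numeric=False)
)
had_ka = [0, 0, 1, 0, 0, 1, 0, 0, 0, 1]  # 1 = knee arthroplasty within 5 years

# 80/20 train-test split on shuffled row indices.
rows = [[a, s] for a, s in zip(age, sex)]
order = list(range(len(rows)))
random.Random(42).shuffle(order)
cut = int(0.8 * len(order))
train_idx, test_idx = order[:cut], order[cut:]

# Assumption: oversample the training split only, so the test split
# contains no synthetic rows.
train_X = [rows[i] for i in train_idx]
train_y = [had_ka[i] for i in train_idx]
minority = [x for x, y in zip(train_X, train_y) if y == 1]
n_needed = train_y.count(0) - train_y.count(1)
if len(minority) >= 2:  # interpolation needs at least two minority rows
    new_rows = smote(minority, n_needed)
    train_X += new_rows
    train_y += [1] * len(new_rows)
```

Because SMOTE interpolates between feature vectors, label-encoded categorical columns can receive non-integer values in the synthetic rows. Variants such as SMOTE-NC exist to handle mixed numerical and categorical data.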
Results
The dataset comprised 12,813 entries with 52 variables, including clinical, demographic, and radiological features. Among the nine models, Random Forest (RF) and gradient boosting (XGBoost) were the most reliable for predicting KA. RF achieved an accuracy of 94.62%, a precision of 95.24%, and an AUC of 0.96 (95% confidence interval [CI] 0.94 to 0.97). XGBoost achieved a similar accuracy (92.9%) but a lower precision (73.3%), with an AUC of 0.89 (95% CI 0.88 to 0.91). Logistic Regression also achieved a high overall accuracy (92.74%), but its ability to detect knee replacement cases (the minority class) was limited, with a precision of 62.9% and an AUC of 0.84 (95% CI 0.81 to 0.88). The remaining models had lower performance metrics.
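The following Python sketch illustrates why a model can combine high accuracy with weaker performance on the minority class. The confusion-matrix counts and scores are hypothetical, chosen only for illustration. They are not taken from the study, which does not report confusion matrices or class prevalence in this abstract. The function names are likewise illustrative.

```python
def accuracy(tp, fp, fn, tn):
    """Share of all cases that were classified correctly."""
    return (tp + tn) / (tp + fp + fn + tn)


def precision(tp, fp):
    """Share of predicted positives that are true positives."""
    return tp / (tp + fp)


def recall(tp, fn):
    """Share of actual positives that were detected (sensitivity)."""
    return tp / (tp + fn)


def auc(labels, scores):
    """AUC as the probability that a randomly chosen positive case scores
    higher than a randomly chosen negative case (ties count as half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical counts: 1,000 knees, of which 80 (8%) had KA.
tp, fp, fn, tn = 20, 10, 60, 910

print(f"accuracy  = {accuracy(tp, fp, fn, tn):.3f}")  # 0.930
print(f"precision = {precision(tp, fp):.3f}")         # 0.667
print(f"recall    = {recall(tp, fn):.3f}")            # 0.250

# A trivial rule that never predicts KA is almost as "accurate".
print(f"all-negative accuracy = {accuracy(0, 0, 80, 920):.3f}")  # 0.920

# AUC depends only on how the scores rank positives against negatives.
print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

In this hypothetical case, accuracy is dominated by the majority class, which is why minority-class metrics and threshold-independent measures such as AUC are informative on imbalanced data.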
Conclusion
The ML-powered decision tool shows clinically relevant utility for predicting the risk of KA in patients with OA using routine clinical data and plain radiographs. RF and XGBoost were the most effective models for predicting KA, balancing sensitivity and specificity. The findings suggest that ML-based decision tools may aid the early identification of OA patients who are likely to require KA.