2023 ISAKOS Biennial Congress Paper
Ceiling Effect of the Combined Norwegian and Danish Knee Ligament Registers Limits Anterior Cruciate Ligament Reconstruction Outcome Prediction
R. Kyle Martin, MD, FRCSC, St. Cloud, MN UNITED STATES
Solvejg Wastvedt, BA, Minneapolis UNITED STATES
Ayoosh Pareek, MD, New York, NY UNITED STATES
Andreas Persson, MD, Oslo NORWAY
Havard Visnes, MD, PT, PhD, Kristiansand NORWAY
Anne Marie Fenstad, MSc, Bergen NORWAY
Gilbert Moatshe, MD, PhD, Oslo NORWAY
Julian Wolfson, PhD, Minneapolis, MN UNITED STATES
Martin Lind, MD, PhD, Prof., Aarhus N DENMARK
Lars Engebretsen, MD, PhD, Oslo NORWAY
University of Minnesota, Minneapolis, MN, UNITED STATES
FDA Status Not Applicable
Summary
Machine learning analysis of nearly 63,000 patients in the Norwegian and Danish knee ligament registers enabled prediction of revision ACL reconstruction risk with moderate accuracy. However, accuracy was similar to that of a previously developed model based on 25,000 patients, suggesting a ceiling effect of the current registers.
Abstract
Background
Clinical tools based on machine learning analysis now exist for outcome prediction following primary anterior cruciate ligament reconstruction (ACLR). Because these tools rely partly on data volume, a general principle is that more data may lead to improved model accuracy. The purpose of this study was to apply machine learning to a combined dataset comprising the Norwegian (NKLR) and Danish (DKLR) knee ligament registers, with the aim of producing an algorithm that can predict subsequent revision surgery with improved accuracy relative to a previously published model developed using only the NKLR. The hypothesis was that the additional patient data would result in a more accurate algorithm.
Methods
Machine learning analysis was performed on the combined DKLR and NKLR. The primary outcome was the probability of revision ACL reconstruction within 1, 2, and 5 years. Data were split randomly into training sets (75%) and test sets (25%). Four machine learning models intended for this type of data were tested: Cox Lasso, random survival forest, gradient boosted regression (GBM), and super learner. Model performance was evaluated by calculating concordance and calibration using methods adapted for censored data. Concordance measures the proportion of pairs of observations in which the predicted ranking of survival probabilities corresponds to the actual ranking. Calibration measures the accuracy of predicted probabilities by comparing expected to actual outcomes.
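To illustrate concordance for censored data, the following is a minimal sketch of Harrell's C-index in plain Python. The abstract does not state which concordance estimator was used, so the choice of Harrell's C, the function name harrell_c_index, and the toy data are assumptions for illustration only. A pair of patients is comparable only when the patient with the shorter follow-up actually had a revision; pairs with tied follow-up times are skipped for simplicity.

```python
from itertools import combinations


def harrell_c_index(times, events, risk_scores):
    """Harrell's C-index for right-censored data (illustrative sketch).

    times       -- follow-up time for each patient
    events      -- 1 if revision was observed, 0 if censored
    risk_scores -- predicted risk; higher means earlier revision expected
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # simplification: ignore pairs with tied times
        # 'a' is the patient with the shorter follow-up time
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter time is censored, so true ordering is unknown
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # higher predicted risk failed first
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: the third patient is censored at time 6.
toy_times = [2, 4, 6, 8]
toy_events = [1, 1, 0, 1]
toy_risks = [0.9, 0.7, 0.8, 0.1]
toy_c = harrell_c_index(toy_times, toy_events, toy_risks)  # 4 of 5 pairs concordant
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is why a value near 0.67 is described as moderate.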
Results
After data cleaning, the combined registry population consisted of 62,955 patients. Revision surgery occurred in 5.1% of patients during an average follow-up time of 7.6 ± 4.5 years. The three non-parametric models – random survival forest, GBM, and super learner – had concordance in the moderate range (0.67, 95% CI 0.64-0.70) at all follow-up times. All three were also well calibrated, except for the random survival forest at 5 years (p<0.001). The Cox Lasso performed more poorly. Analysis of multiply imputed data did not show notable differences from the complete case analysis.
Conclusion
Machine learning analysis of the combined registers enabled the prediction of subsequent revision surgery risk after primary ACLR with moderate accuracy. The most important finding of this study, however, was that this analysis of nearly 63,000 patients yielded prediction accuracy similar to that of a previous study of approximately 25,000 patients. This suggests that a so-called ceiling effect of the registries has been reached and that simply adding more patients to the database is unlikely to appreciably improve prediction accuracy. This information can inform the further evolution of data collection in the knee ligament registries: improving our ability to predict outcome from registry data would require a change in the variables collected. This represents a significant challenge, as the balance between optimal variable collection and surgeon compliance is a delicate one. Data collection must be streamlined to avoid survey fatigue, and any addition of variables to the registry must be carefully considered, weighing the added value against the additional burden on surgeons, which may affect compliance.
Level of Evidence: III