2023 ISAKOS Biennial Congress Paper
Unsupervised Machine Learning of the Combined Danish and Norwegian Knee Ligament Registers Identifies Five Discrete Patient Groups With Differing ACL Revision Rates
R. Kyle Martin, MD, FRCSC, St. Cloud, MN UNITED STATES
Solvejg Wastvedt, BA, Minneapolis UNITED STATES
Ayoosh Pareek, MD, New York, NY UNITED STATES
Andreas Persson, MD, Oslo NORWAY
Havard Visnes, MD, PT, PhD, Kristiansand NORWAY
Anne Marie Fenstad, MSc, Bergen NORWAY
Gilbert Moatshe, MD, PhD, Oslo NORWAY
Julian Wolfson, PhD, Minneapolis, MN UNITED STATES
Martin Lind, MD, PhD, Prof., Aarhus N DENMARK
Lars Engebretsen, MD, PhD, Oslo NORWAY
University of Minnesota, Minneapolis, MN, UNITED STATES
FDA Status Not Applicable
Summary
Unsupervised machine learning analysis of the combined Danish and Norwegian knee ligament registers enables quick risk stratification into high, medium, or low risk categories for future patients undergoing ACL reconstruction with hamstring tendon or patellar tendon (BTB) autograft.
Abstract
Purpose
Most of the machine learning applications within the orthopaedic literature to date have utilized a “supervised” learning approach aimed at making predictions and classifications based on labeled variables within a dataset. In contrast, “unsupervised” learning represents a machine learning technique that allows the computer to independently find patterns in a dataset without a pre-specified outcome. The purpose of this study was to apply unsupervised machine learning to the combined Danish and Norwegian Knee Ligament Registers (KLR) with the goal of detecting distinct subgroups within the dataset. The hypothesis was that this analysis would identify groups of patients with differing rates of subsequent anterior cruciate ligament reconstruction (ACLR) revision that could be used to categorize a future patient and quickly estimate their revision risk in the clinical setting.
Methods
A type of unsupervised learning known as clustering was performed on the complete case KLR data. Since the KLR contain both continuous and categorical variables, a clustering method known as “k-Prototypes” that accommodates both continuous and categorical data was used. As with many clustering methods, k-Prototypes requires pre-specification of the number of clusters. A combination of data-driven methods and domain knowledge was used to arrive at a target of five clusters. After performing the unsupervised learning, clinically relevant characteristics of each cluster were obtained using variable summaries and surgeons’ domain knowledge. Kaplan-Meier survival curves were created for the clusters.
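As an illustration of how k-Prototypes handles mixed data, the following minimal sketch shows the dissimilarity measure commonly used by the method: squared Euclidean distance on the continuous variables plus a weight (here called gamma) times the number of mismatched categorical variables. It is not the study's implementation. The function names, the gamma value, and the standardized toy prototypes are all assumptions made for illustration and are not taken from the registers.

```python
def mixed_dissimilarity(numeric_a, numeric_b, categorical_a, categorical_b, gamma):
    """k-Prototypes-style distance between two records.

    Squared Euclidean distance on the numeric features plus gamma times
    the number of categorical features whose values differ.
    """
    numeric_part = sum((x - y) ** 2 for x, y in zip(numeric_a, numeric_b))
    categorical_part = sum(1 for x, y in zip(categorical_a, categorical_b) if x != y)
    return numeric_part + gamma * categorical_part


def nearest_prototype(numeric, categorical, prototypes, gamma):
    """Return the index of the prototype closest to one patient record."""
    distances = [
        mixed_dissimilarity(numeric, proto_numeric, categorical, proto_categorical, gamma)
        for proto_numeric, proto_categorical in prototypes
    ]
    return distances.index(min(distances))


# Hypothetical, standardized toy prototypes (NOT derived from the KLR data).
# numeric = (age z-score, KOOS Sports z-score); categorical = (sex, graft).
toy_prototypes = [
    ((-1.0, 0.0), ("female", "HT")),
    ((1.0, 0.0), ("male", "HT")),
    ((0.0, -1.0), ("male", "BTB")),
]

# Hypothetical patient record in the same standardized units.
toy_patient_numeric = (-0.8, 0.1)
toy_patient_categorical = ("female", "HT")

# gamma = 0.5 is an arbitrary illustrative weight, not a value from the study.
assigned = nearest_prototype(
    toy_patient_numeric, toy_patient_categorical, toy_prototypes, gamma=0.5
)
```

In a full k-Prototypes run, this assignment step alternates with updating each prototype (means for continuous variables, modes for categorical variables) until the assignments stop changing. The choice of gamma controls how much a categorical mismatch counts relative to differences in the continuous variables.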
Results
Five patient clusters were identified with the following characteristics: Cluster 1 (revision rate = 9.8%) patients were young (22 ± 7 years), female (73%), with hamstring tendon (HT) autograft (89%). Cluster 2 (revision rate = 6.5%) patients were young (24 ± 9 years), male (67%), with HT (91%). Cluster 3 (revision rate = 4.3%) patients were older (38 ± 9 years) undergoing HT reconstruction (93%). Cluster 4 (revision rate = 4.2%) patients received patellar tendon (BTB) autograft (86%) with low baseline Knee Injury and Osteoarthritis Outcome Score (KOOS) Sports score (21.3 ± 14.8). Cluster 5 (revision rate = 4.7%) patients received BTB (85%) and had higher baseline KOOS Sports scores (66.8 ± 16.8).
Discussion
Unsupervised learning enabled the identification of five distinct patient subgroups among patients undergoing ACLR with either HT or BTB autograft. Each grouping was associated with its own rate of subsequent ACLR revision. Patients can be classified into one of the five clusters based on only four variables: sex, age, graft choice (HT or BTB autograft), and preoperative KOOS Sports subscale score. Patients receiving other graft choices do not fit into the current model, which was limited by the relatively low number of patients within the dataset who received alternative grafts. The resulting groupings will enable quick risk stratification for future patients undergoing ACLR with HT or BTB in the clinical setting. Patients in Cluster 1 are considered high risk (9.8%), Cluster 2 patients are medium risk (6.5%), while patients in Clusters 3-5 are considered low risk (4.2-4.7%) for experiencing subsequent revision ACLR.
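The risk categories described above can be expressed as a simple lookup, sketched below. The revision rates and risk labels are those reported in this abstract; the function and variable names are hypothetical. The sketch assumes the patient's cluster is already known, since the abstract does not state the exact rule for assigning a new patient to a cluster.

```python
# Revision rate (%) per cluster, as reported in the Results.
REVISION_RATE_BY_CLUSTER = {1: 9.8, 2: 6.5, 3: 4.3, 4: 4.2, 5: 4.7}

# Risk category per cluster, as stated in the Discussion.
RISK_BY_CLUSTER = {1: "high", 2: "medium", 3: "low", 4: "low", 5: "low"}


def risk_summary(cluster):
    """Return (risk category, reported revision rate in %) for a cluster 1-5."""
    if cluster not in RISK_BY_CLUSTER:
        raise ValueError(f"Unknown cluster: {cluster!r}; expected an integer 1-5")
    return RISK_BY_CLUSTER[cluster], REVISION_RATE_BY_CLUSTER[cluster]


# Example: look up the category and rate for a patient assigned to Cluster 2.
category, rate = risk_summary(2)
```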
Level of Evidence: III