2025 ISAKOS Biennial Congress In-Person Poster
Thickness at the Medial Side in Patellar Cutting Leads to Inferior Clinical Outcomes After Total Knee Arthroplasty
Sanna Haaland, Medical Student, Bergen NORWAY
Eivind Inderhaug, MD, PhD, MPH, Bergen NORWAY
Mario Hevesi, MD, PhD, Rochester, MN UNITED STATES
Yining Lu, MD, Rochester, Minnesota UNITED STATES
Jonathan D. Hughes, MD, PhD, Allison Park, Pennsylvania UNITED STATES
Duhyun Ro, MD, Prof., Jongno-Gu, Seoul KOREA, REPUBLIC OF
Sung Eun Kim, MD, Seoul KOREA, REPUBLIC OF
Anne Marie Fenstad, MSc, Bergen NORWAY
Linjun Yang, PhD, Rochester, Minnesota UNITED STATES
R. Kyle Martin, MD, FRCSC, St. Cloud, MN UNITED STATES
University of Minnesota, St Cloud, Minnesota, UNITED STATES
FDA Status Not Applicable
Summary
A comparative study evaluating the accuracy and agreement of two independent AI algorithms for automated posterior tibial slope measurement on lateral knee radiographs.
Abstract
Background
The posterior tibial slope (PTS) is a crucial factor in assessing knee joint function, especially in orthopaedic surgery and sports medicine. Accurate PTS measurement on lateral knee radiographs is essential for diagnosing knee pathology and planning procedures such as anterior cruciate ligament (ACL) reconstruction. PTS is traditionally measured manually, which is time-consuming and prone to variability. With advances in artificial intelligence (AI), automated PTS measurement methods have emerged, offering potential improvements in efficiency and consistency. However, these models have not yet been externally validated for clinical use. The purpose of this study was to assess the external validity of two novel AI models trained to automatically measure PTS by evaluating their accuracy and agreement against manual measurement. The hypothesis was that both models would show a high level of agreement with manual measurement (>90%).
Methods
Lateral knee radiographs from Norway and the USA were included in the study. Images that were short, malrotated, or of poor quality were excluded. All images were first measured manually by an experienced clinician using the two-circle method to determine the PTS. The images were then analyzed using two independent models for automated PTS measurement, one developed at Mayo Clinic, USA, and the other in Seoul, South Korea. The results were categorized by the difference between each model's PTS measurement and the manual measurement: less than 2.5° was classified as "agreement," 2.5° to 5° as "moderate agreement," and more than 5° as "disagreement." Each model was then compared with the manual measurements using the Wilcoxon signed-rank test, under the null hypothesis that the median difference between the manual measurements and the model's measurements was zero.
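The categorization described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the study's actual code: the function names are hypothetical, the difference is taken as an absolute value, the boundaries at exactly 2.5° and 5° are assigned to "moderate agreement", an image the model could not measure is represented as `None`, and percentages are taken over all included images.

```python
from collections import Counter
from typing import Dict, Optional, Sequence


def classify_agreement(manual_deg: float, model_deg: Optional[float]) -> str:
    """Categorize one image by the model-vs-manual PTS difference (degrees)."""
    if model_deg is None:
        # Assumption: None marks an image the model could not measure.
        return "unmeasurable"
    diff = abs(model_deg - manual_deg)  # assumption: absolute difference
    if diff < 2.5:
        return "agreement"
    if diff <= 5.0:
        return "moderate agreement"
    return "disagreement"


def agreement_rates(
    manual: Sequence[float], model: Sequence[Optional[float]]
) -> Dict[str, float]:
    """Percentage of images in each category, over all included images."""
    counts = Counter(classify_agreement(m, a) for m, a in zip(manual, model))
    total = sum(counts.values())
    return {category: 100.0 * n / total for category, n in counts.items()}


# Made-up example values, not study data.
manual = [10.0, 8.5, 12.0, 9.0]
model = [11.0, 12.0, None, 15.5]
rates = agreement_rates(manual, model)
```

With these four made-up images, each category receives one image, so every rate is 25%.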
Results
In total, 1,372 images were included in the study. The mean PTS (manual measurement) was 10.0° (SD 3.0°, range 1.0° – 23.5°). The comparison of PTS measurements between the manual method and the two AI-based models demonstrated varying degrees of agreement. The South Korean model showed the higher level of agreement (76.4%) and the lower level of disagreement (4.6%), and only 7.4% of images could not be measured. The Mayo model showed a lower level of agreement (39.4%) and a higher level of disagreement (27.5%), along with a higher rate of unmeasurable images (10.8%). The Wilcoxon signed-rank test found that the differences were statistically significant (p<0.05).
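For readers unfamiliar with the test, a paired Wilcoxon signed-rank comparison of manual and model measurements can be sketched as follows. This is a minimal pure-Python illustration and not necessarily the implementation used in the study: it assumes the large-sample normal approximation, applies no tie or continuity correction, drops zero differences, and expects pairs with an unmeasurable image to be removed beforehand. The function name is hypothetical.

```python
import math
from typing import Sequence, Tuple


def wilcoxon_signed_rank(
    x: Sequence[float], y: Sequence[float]
) -> Tuple[float, float, float]:
    """Two-sided Wilcoxon signed-rank test (normal approximation).

    Returns (W+, z, p), where W+ is the sum of ranks of positive differences.
    """
    # Paired differences; zero differences carry no sign and are dropped.
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")

    # Rank the absolute differences, giving tied values their average rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)

    # Under the null hypothesis, W+ has this mean and standard deviation.
    mean = n * (n + 1) / 4.0
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24.0)
    z = (w_plus - mean) / sd
    p_two_sided = math.erfc(abs(z) / math.sqrt(2.0))
    return w_plus, z, p_two_sided


# Made-up example values, not study data.
manual = [10.0, 8.5, 12.0, 9.0, 11.5]
model = [11.0, 10.5, 15.0, 13.0, 16.5]
w_plus, z, p = wilcoxon_signed_rank(model, manual)
```

The normal approximation is reasonable for a sample of more than a thousand images but is rough for a handful of pairs like the toy example; an exact or tie-corrected method would be preferable for small samples.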
Conclusion
This study found that while both the Mayo and South Korean AI models could automatically calculate the PTS on most of the lateral radiographs, the level of agreement with manual measurement was only 76% for the best-performing model. These results suggest that while AI models are promising for clinical application, further refinement is needed to ensure consistency with manual measurements, particularly in cases where precise PTS measurement is critical. This study also highlights the importance of appropriate external validation prior to the widespread adoption and clinical implementation of new AI models in orthopaedic surgery.