2025 ISAKOS Congress in Munich, Germany

2025 ISAKOS Biennial Congress Paper

 

External Validation Of A Novel Landmark-Based Deep Learning Automated Tibial Slope Measurement Algorithm Applied On Short Radiographs From ACL Injured Patients

R. Kyle Martin, MD, FRCSC, St. Cloud, MN UNITED STATES
Sanna Haaland, Medical Student, Bergen NORWAY
Andreas Persson, MD, PhD, Oslo NORWAY
Sung Eun Kim, MD, Seoul KOREA, REPUBLIC OF
ByeongYeong Ryu, MD, Seoul KOREA, REPUBLIC OF
Jun Woo Nam, B.Eng., Seoul KOREA, REPUBLIC OF
Duhyun Ro, MD, Prof., Jongno-Gu, Seoul KOREA, REPUBLIC OF
Eivind Inderhaug, MD, PhD, MPH, Bergen NORWAY

Helse Bergen, Bergen, NORWAY

FDA Status Not Applicable

Summary

The external validity of a novel deep learning tool that automatically calculates posterior tibial slope was assessed using a set of short lateral knee radiographs from ACL-deficient patients.

Abstract

Background

Deep learning algorithms can aid medical decision-making by performing routine tasks without human bias. Reading of standardized radiographs lends itself well to an automated approach. Posterior tibial slope is increasingly recognized as a factor in lower leg biomechanics that can affect outcomes of knee ligament and knee replacement surgery. Tibial slope measurements should therefore be easy to obtain and reproducible in the clinical setting. The current study aims to externally validate a novel deep learning model developed for posterior tibial slope readings by applying it to an independent dataset of radiographs from a different country. The reliability of the model was tested through comparison with human measurements. The hypothesis was that the computerized approach would yield reliability similar to that of human analyses.

Methods

A consecutive series of lateral knee radiographs from patients undergoing ACL reconstruction was eligible for inclusion. Two experienced clinicians independently measured the tibial slope on each radiograph to establish the reliability of the manual readings. All images were then processed by the newly developed model to obtain the automated readings. Intra-rater and inter-rater reliability were then established between the readers, and between the manual and the automated readings, using intraclass correlation coefficients (ICC). The time required for tibial slope measurement with each method was also compared. Extreme differences between the two methods were analyzed for potential errors.
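
The agreement statistic described above can be illustrated with a minimal sketch. The abstract does not state which ICC form was used, so the code below assumes ICC(2,1) (two-way random effects, absolute agreement, single measurement); the function name `icc_2_1` and the example slope values are hypothetical and are not study data.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    ratings: list of rows, one row per radiograph and one column per
    rater or method (e.g. manual reading vs. automated reading).
    The choice of ICC(2,1) is an assumption, not taken from the abstract.
    """
    n = len(ratings)        # number of radiographs
    k = len(ratings[0])     # number of raters / methods
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Sums of squares from a two-way ANOVA without replication
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_err = sum(
        (ratings[i][j] - row_means[i] - col_means[j] + grand) ** 2
        for i in range(n)
        for j in range(k)
    )

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))

    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )


# Hypothetical tibial slope readings in degrees: [manual, automated]
example = [[9.5, 9.9], [7.2, 7.0], [12.1, 11.4], [10.3, 10.8], [6.8, 7.5]]
print(round(icc_2_1(example), 3))
```

Because this form measures absolute agreement, a constant offset between the two methods lowers the ICC even when the readings are perfectly correlated, which is the relevant behavior when an automated reading is meant to replace a manual one.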

Results

Two hundred and eighty-nine radiographs were included in the study and analyzed by both the manual and the automated method. The manually measured mean tibial slope was 9.7° (SD 2.7°, range 3.0° to 19.1°). For the manual two-circle method, the inter-rater and intra-rater ICCs were 0.86 and 0.92, respectively. The intra-rater agreement of the deep learning model was 1.00, while the inter-rater ICC between the model and the manual measurements ranged from 0.73 to 0.8. A manual reading took 52.5 seconds, while an automated reading took 28.2 seconds.
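
The reported reading times can be put in relative terms with a short calculation. This sketch assumes the two figures are per-radiograph times that scale linearly across the 289 radiographs; the abstract does not state this, and the variable names are illustrative.

```python
MANUAL_SECONDS = 52.5      # reported time per manual reading
AUTOMATED_SECONDS = 28.2   # reported time per automated reading
N_RADIOGRAPHS = 289        # radiographs included in the study

# Absolute and relative time saved per reading
saved_seconds = MANUAL_SECONDS - AUTOMATED_SECONDS
relative_saving = saved_seconds / MANUAL_SECONDS

# Projected totals, assuming time scales linearly with the number of readings
total_manual_min = N_RADIOGRAPHS * MANUAL_SECONDS / 60
total_automated_min = N_RADIOGRAPHS * AUTOMATED_SECONDS / 60

print(f"Saved per reading: {saved_seconds:.1f} s ({relative_saving:.1%})")
print(f"Projected total: {total_manual_min:.0f} min manual, "
      f"{total_automated_min:.0f} min automated")
```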

Conclusion

In this external validation of a newly developed model for automated tibial slope measurement, the model demonstrated perfect intra-rater reliability and good inter-rater reliability. Although the model needs further refinement to match the gold-standard manual measurement, it eliminates human error between repeat readings and requires less time than manual measurement.