Summary
The developed artificial intelligence allowed for fully autonomous, comprehensive analysis of the leg alignment on long leg radiographs with high precision and a reliability comparable to that of orthopedic surgeons.
Abstract
Background
A comprehensive analysis of the leg alignment is paramount for determining an evidence-based treatment plan and for preoperative planning across a wide range of knee pathologies. A deep learning (DL) model that performs an automated analysis of the leg alignment on x-rays could accelerate the process currently performed by orthopedic surgeons (OS) and increase the accuracy and reliability of preoperative planning. The purpose of this study was to train and validate a DL model for the automated assessment of the leg alignment on anteroposterior (a.p.) long leg radiographs (LLR) and to compare its performance to that of OS in an internal validation study.
Materials and Methods
At the authors’ institution, a total of 594 patients (mean age 41.1 ± 13.2 years, 182 female, 388 left side) who underwent corrective osteotomy were enrolled. On a.p. LLRs, landmark placement and alignment analysis were performed by two OS (OS1 and OS2), serving as ground truth. Measurements included the mechanical femorotibial angle (mFA-mTA), lateral proximal femoral angle (mLPFA), lateral distal femoral angle (mLDFA), medial proximal tibia angle (mMPTA), lateral distal talus angle (mLDTA), joint line convergence angle (JLCA) and anatomical-mechanical angle (AMA). The data set was split 60% (n=399) / 10% (n=59) / 30% (n=136) for training, validation and hold-out testing. Twelve networks, each specialized in one anatomical region, were combined and the angles were calculated. The model was based on a COCO-pretrained Mask R-CNN with a ResNeXt-101 backbone, implemented in PyTorch. To evaluate the DL model, the mean difference of the individual angles and the interreader reliability, quantified by the intraclass correlation coefficient (ICC), between the DL model and the ground truth were measured in the hold-out test set and compared to the agreement between OS1 and OS2.
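As an illustration of how an alignment angle can be derived once landmarks have been placed, the following is a minimal sketch and not the authors' pipeline. The landmark names (`hip_center`, `knee_center`, `medial_condyle`, `lateral_condyle`), the pixel coordinates and the helper functions are hypothetical; the sketch assumes the mLDFA is taken as the angle between the femoral mechanical axis and the distal femoral joint line.

```python
import math

def angle_between(p1, p2, q1, q2):
    """Unsigned angle in degrees (0..180) between direction p1->p2 and q1->q2."""
    ux, uy = p2[0] - p1[0], p2[1] - p1[1]
    vx, vy = q2[0] - q1[0], q2[1] - q1[1]
    cross = ux * vy - uy * vx   # proportional to sin(angle)
    dot = ux * vx + uy * vy     # proportional to cos(angle)
    return math.degrees(math.atan2(abs(cross), dot))

def mldfa(hip_center, knee_center, medial_condyle, lateral_condyle):
    """Hypothetical helper: angle between the femoral mechanical axis
    (knee center -> hip center) and the distal femoral joint line
    (medial condyle -> lateral condyle)."""
    return angle_between(knee_center, hip_center, medial_condyle, lateral_condyle)

# Made-up landmark coordinates in image pixels (x to the right, y downward).
example_mldfa = mldfa(
    hip_center=(100.0, 0.0),
    knee_center=(100.0, 400.0),
    medial_condyle=(140.0, 400.0),
    lateral_condyle=(60.0, 400.0),
)
print(f"mLDFA (illustrative): {example_mldfa:.1f} degrees")
```

Using `atan2` on the cross and dot products avoids the loss of precision that `acos` shows for nearly parallel or nearly perpendicular lines, which matters when sub-degree differences are being compared.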
Results
The mean difference (in degrees) and the ICC between the DL model and the ground truth were 0.14° ± 0.11° and 1.0 [0.99, 1.0] for mFA-mTA, 0.66° ± 0.73° and 0.99 [0.98, 0.99] for mLPFA, 0.51° ± 0.81° and 0.95 [0.93, 0.97] for mLDFA, 0.63° ± 0.57° for mMPTA, 0.93° ± 0.86° and 0.95 [0.94, 0.97] for mLDTA, 0.86° ± 1.06° and 0.55 [0.42, 0.66] for JLCA, and 0.34° ± 0.51° and 0.87 [0.82, 0.91] for AMA, respectively. In comparison, the mean difference and the ICC between OS1 and OS2 were 0.07° ± 0.07° and 1.0 [1.0, 1.0] for mFA-mTA, 0.23° ± 0.28° and 0.97 [0.96, 0.98] for mLDFA, 0.19° ± 0.19° and 0.98 [0.98, 0.99] for mMPTA, 0.25° ± 0.25° and 0.98 [0.98, 0.99] for mLDTA, 0.64° ± 0.93° and 0.38 [0.22, 0.52] for JLCA, and 0.14° ± 0.14° and 0.94 [0.92, 0.96] for AMA, respectively. The DL model outperformed the OS in the time required for the analysis (22.4 ± 0.5 s vs. 91.0 ± 10.0 s).
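The two agreement statistics reported above can be sketched as follows. This is a hedged illustration: it assumes the mean difference is the mean absolute difference with its sample standard deviation, and it assumes a two-way random-effects, absolute-agreement, single-measurement ICC, often written ICC(2,1). The abstract does not state which ICC form was used, and the function names and sample values are hypothetical.

```python
from statistics import mean, stdev

def mean_abs_difference(a, b):
    """Mean and sample SD of the absolute paired differences |a_i - b_i|."""
    diffs = [abs(x - y) for x, y in zip(a, b)]
    return mean(diffs), stdev(diffs)

def icc_2_1(ratings):
    """ICC(2,1) from a table with one row per subject and one column per rater."""
    n, k = len(ratings), len(ratings[0])
    grand = mean([v for row in ratings for v in row])
    row_means = [mean(row) for row in ratings]
    col_means = [mean([row[j] for row in ratings]) for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between subjects
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between raters
    ss_total = sum((v - grand) ** 2 for row in ratings for v in row)
    ss_err = ss_total - ss_rows - ss_cols                     # residual
    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

# Made-up angle measurements in degrees for five legs (not study data).
model_angles = [87.1, 88.4, 86.0, 90.2, 89.3]
reader_angles = [87.4, 88.0, 86.5, 90.0, 89.9]

mad, sd = mean_abs_difference(model_angles, reader_angles)
icc = icc_2_1([list(pair) for pair in zip(model_angles, reader_angles)])
print(f"mean |diff| = {mad:.2f} deg, SD = {sd:.2f} deg, ICC(2,1) = {icc:.2f}")
```

An absolute-agreement ICC penalizes a systematic offset between raters, which is why it can be low for an angle with a narrow range of values, such as the JLCA, even when the mean difference is below one degree.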
Conclusion
The developed DL model allowed for fully autonomous, comprehensive analysis of the leg alignment on a.p. LLR with high precision and a reliability comparable to that of orthopedic surgeons, while the DL model significantly outperformed human raters in the time taken for assessment.