2025 ISAKOS Congress in Munich, Germany

2025 ISAKOS Biennial Congress ePoster


The Educational Potential of ChatGPT: Assessing ChatGPT Responses to Common Patient Hip Arthroscopy Questions

Yasir AlShehri, MD, Vancouver, BC CANADA
Mark Owen McConkey, MD, FRCSC, West Vancouver, BC CANADA
Parth Lodhia, MD, FRCSC, New Westminster, BC CANADA

Department of Orthopaedics, Faculty of Medicine, The University of British Columbia, Vancouver, BC, CANADA

FDA Status Not Applicable

Summary

ChatGPT can provide satisfactory but occasionally inaccurate answers to common patient hip arthroscopy questions. It has the potential to be a useful tool for patients in the future.

Abstract

Introduction

ChatGPT (Chat Generative Pre-trained Transformer) is a web-based artificial intelligence chatbot that can generate content with a natural conversational flow. It has gained immense popularity since its release, and there has been recent interest in its potential role in patient education. The purpose of this study was to assess the ability of ChatGPT to answer common patient questions regarding hip arthroscopy and to analyze the accuracy and appropriateness of its responses.

Methods

Ten questions were selected from well-known patient education websites, and ChatGPT (version 3.5) responses to these questions were graded by two fellowship-trained hip preservation surgeons. Responses were analyzed, compared to the current literature, and graded from A to D (A being the highest and D the lowest) on a grading scale based on the accuracy and completeness of the response. If the grading differed between the two surgeons, a consensus was reached. Inter-rater agreement was calculated. The readability of responses was also assessed using the Flesch Reading Ease Score (FRES) and the Flesch-Kincaid Grade Level (FKGL).
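The two readability measures above are standard closed-form formulas over counts of sentences, words, and syllables. A minimal sketch of how such scores can be computed is shown below; note that the syllable counter here is a simplified vowel-group heuristic (an assumption for illustration), whereas published readability tools use more elaborate rules.

```python
import re

def count_syllables(word: str) -> int:
    # Crude heuristic: count groups of consecutive vowels,
    # dropping a typical silent trailing "e"; every word
    # counts as at least one syllable.
    word = word.lower()
    if word.endswith("e") and not word.endswith(("le", "ee")):
        word = word[:-1]
    vowel_groups = re.findall(r"[aeiouy]+", word)
    return max(1, len(vowel_groups))

def readability(text: str) -> tuple[float, float]:
    # Sentences approximated by terminal punctuation runs.
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    wps = len(words) / sentences   # mean words per sentence
    spw = syllables / len(words)   # mean syllables per word
    # Flesch Reading Ease Score: higher = easier to read.
    fres = 206.835 - 1.015 * wps - 84.6 * spw
    # Flesch-Kincaid Grade Level: approximate US school grade.
    fkgl = 0.39 * wps + 11.8 * spw - 15.59
    return round(fres, 1), round(fkgl, 1)

fres, fkgl = readability("The cat sat on the mat. It was warm.")
```

Short, simple sentences score high on FRES and low on FKGL; long sentences with polysyllabic medical terminology push FRES down toward the college-graduate range reported in the Results.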

Results

Responses received the following consensus grades: A (50%, n=5), B (30%, n=3), C (10%, n=1), and D (10%, n=1) (Table 2). Inter-rater agreement on initial individual grading was 30%. The mean FRES was 28.2 (SD ± 9.2; range, 11.7 to 42.5), corresponding to a college-graduate reading level. The mean FKGL was 14.4 (SD ± 1.8; range, 12.1 to 18), indicating a college-student reading level.

Conclusion

ChatGPT can answer common patient questions regarding hip arthroscopy with satisfactory accuracy, as graded by two high-volume hip arthroscopists; however, incorrect information was identified in more than one instance. Caution must be observed when using ChatGPT for patient education related to hip arthroscopy.