Introduction: Interdisciplinary consultations are essential to decision-making for patients with congenital heart disease. The integration of artificial intelligence (AI) and natural language processing into medical practice is rapidly accelerating, opening new avenues for diagnosis and treatment. The main objective of this study was to consult the AI model Chat Generative Pre-Trained Transformer (ChatGPT) regarding cases discussed during a cardiovascular surgery conference (CSC) at a single tertiary center and to compare the ChatGPT suggestions with the CSC expert consensus results.
Methods: In total, 37 cases discussed at a single CSC were retrospectively identified. Clinical information comprised deidentified data from the last electrocardiogram, echocardiogram, intensive care unit progress note (or cardiology clinic note if outpatient), as well as a patient summary. The diagnosis was removed from the summary and possible treatment options were deleted from all notes. ChatGPT (version 4.0) was asked to summarize the case, identify diagnoses, and recommend surgical procedures and timing of surgery. The responses of ChatGPT were compared with the results of the CSC.
Results: Of the 37 cases uploaded to ChatGPT, 45.9% (n = 17) were considered less complex, with only 1 treatment option, and 54.1% (n = 20) were considered more complex, with several treatment options. ChatGPT correctly provided a detailed and systematically written summary for each case within 10 to 15 seconds. ChatGPT correctly identified the diagnoses in approximately 94.5% (n = 35) of cases. The surgical intervention plan matched the group decision in approximately 40.5% (n = 15) of cases; however, it differed in 27% of cases. In 23 of 37 cases, the timing of surgery was the same for the CSC group and ChatGPT. Overall, the match between ChatGPT responses and CSC decisions was 94.5% for diagnosis, 40.5% for surgical intervention, and 62.2% for timing of surgery. Within complex cases, however, agreement was 25% for surgical intervention and 67% for timing of surgery.
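As a check on the arithmetic, the overall percentages above can be recomputed from the reported counts. The following is a minimal sketch, not the authors' analysis code; the dictionary labels and the helper `pct` are illustrative names, and only counts stated in the Results are used. Note that 35 of 37 rounds to 94.6%, which is consistent with the "approximately 94.5%" reported.

```python
# Counts as reported in the Results; labels and the helper name are illustrative.
TOTAL_CASES = 37

reported_counts = {
    "less complex cases": 17,
    "more complex cases": 20,
    "diagnosis matched": 35,
    "surgical plan matched": 15,
    "timing matched": 23,
}


def pct(count, total=TOTAL_CASES):
    """Return count/total as a percentage rounded to one decimal place."""
    return round(100 * count / total, 1)


# Percentage of all 37 cases for each reported count.
percentages = {label: pct(n) for label, n in reported_counts.items()}

for label, value in percentages.items():
    print(f"{label}: {value}%")
```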
Conclusions: ChatGPT can be used as an augmentative tool for surgical conferences to systematically summarize large amounts of patient data from electronic health records and clinical notes in seconds. In addition, our study points to the potential of ChatGPT as an AI-based decision support tool in surgery, particularly for less complex cases. The discrepancy in complex cases emphasizes the need for caution when using ChatGPT for decision-making in complex pediatric cardiovascular surgery cases. There is little doubt that the public will soon use this comparative tool.