Objective: Literature reviews are essential to the scientific process and allow clinician researchers to advance general knowledge. The purpose of this study was to evaluate whether the artificial intelligence (AI) programs ChatGPT and Perplexity.AI can perform an orthopedic surgery literature review.
Materials and methods: Five search topics of varying specificity within orthopedic surgery were chosen for each search arm to investigate. A consolidated list of unique articles for each search topic was recorded for the experimental AI search arms and compared with the results of the control arm of two independent reviewers. Articles in the experimental arms were examined by the two independent reviewers for relevance and validity.
Results: ChatGPT identified a total of 61 unique articles. Four articles were not relevant to the search topic and 51 were deemed fraudulent, leaving 6 valid articles. Perplexity.AI identified a total of 43 unique articles. Nineteen were not relevant to the search topic, but all articles could be verified, leaving 24 valid articles. The control arm identified 132 articles. Success rates for ChatGPT and Perplexity.AI were 4.6% (6 of 132) and 18.2% (24 of 132), respectively.
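The counts reported above can be checked with a short calculation. The following is a minimal sketch, not the authors' analysis: the class and field names are hypothetical, and it assumes that the "not relevant" and "fraudulent" categories do not overlap and that the success rate is the number of valid articles divided by the control arm total.

```python
from dataclasses import dataclass


@dataclass
class SearchArm:
    """Hypothetical container for one experimental arm's article counts."""
    name: str
    unique: int        # unique articles returned by the AI program
    not_relevant: int  # judged not relevant to the search topic
    fraudulent: int    # could not be verified as real publications

    @property
    def valid(self) -> int:
        # Assumes the two exclusion categories are disjoint.
        return self.unique - self.not_relevant - self.fraudulent

    def success_rate(self, control_total: int) -> float:
        # Valid articles as a percentage of the control arm total.
        return 100 * self.valid / control_total


CONTROL_TOTAL = 132  # articles identified by the control arm

chatgpt = SearchArm("ChatGPT", unique=61, not_relevant=4, fraudulent=51)
perplexity = SearchArm("Perplexity.AI", unique=43, not_relevant=19, fraudulent=0)

for arm in (chatgpt, perplexity):
    rate = arm.success_rate(CONTROL_TOTAL)
    print(f"{arm.name}: {arm.valid} valid, {rate:.2f}% of control")
```

Under these assumptions the sketch reproduces the 6 and 24 valid articles and a rate of about 18.18% for Perplexity.AI. For ChatGPT, 6 of 132 works out to about 4.55%, marginally below the reported 4.6%, which may reflect a different rounding convention.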
Conclusion: The current iteration of ChatGPT cannot perform a reliable literature review, and Perplexity.AI can perform only a limited review of the medical literature. Any use of these open AI programs should be undertaken with caution and human quality assurance to promote responsible use and avoid the risk of using fabricated search results. [Orthopedics. 2024;47(3):e125-e130.]