Background
Artificial intelligence (AI)–powered large language models like ChatGPT are increasingly used by the public to access health information. These platforms may be particularly appealing for high-risk conditions such as substance use disorder (SUD), where anonymity and nonjudgmental responses are valued. Despite growing interest in AI-assisted health education, limited research has assessed the accuracy and completeness of ChatGPT’s content on complex behavioral health topics. This study evaluated the accuracy and clinical consistency of ChatGPT’s responses to SUD-related questions compared to national health guidelines.
Methods
This descriptive study used a content analysis approach to evaluate the responses of ChatGPT 3.5 and ChatGPT 5 to 14 clinically relevant SUD-related questions, drawn from more than 200 FAQs published by six leading U.S. health organizations and cross-referenced against the top SUD-related questions that U.S. adults ask ChatGPT. Each response was independently assessed by a multidisciplinary team for accuracy, clarity, and appropriateness using an evidence-informed rating system. Responses were categorized as excellent, satisfactory requiring minimal clarification, satisfactory requiring moderate clarification, or unsatisfactory. Discrepancies were resolved through consensus.
Results
Among the 14 responses, 3 were rated excellent, 9 satisfactory requiring minimal clarification, and 2 satisfactory requiring moderate clarification. None were rated unsatisfactory. ChatGPT responses were generally accurate for straightforward questions but lacked clinical nuance and specificity in more complex scenarios, particularly regarding individualized care recommendations, withdrawal management, and treatment planning.
Conclusion
As AI becomes more integrated into health information-seeking behaviors, continued evaluation of its role and potential impact in addiction medicine is essential.