The Goldilocks Zone: Finding the right balance of user and institutional risk for suicide-related generative AI queries

Anna R Van Meter, Michael G Wheaton, Victoria E Cosgrove, Katerina Andreadis, Ronald E Robertson

PLOS Digital Health, 4(1), e0000711. Published 2025-01-08 (eCollection 2025). DOI: 10.1371/journal.pdig.0000711
Open access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11709298/pdf/
Citations: 0
Abstract
Generative artificial intelligence (genAI) has the potential to improve healthcare by reducing clinician burden and expanding services, among other uses. There is a significant gap between the need for mental health care and the number of available clinicians in the United States; this makes mental health care an attractive target for improved efficiency through genAI. Among the most sensitive mental health topics is suicide, and demand for crisis intervention has grown in recent years. We aimed to evaluate the quality of genAI tool responses to suicide-related queries. We entered 10 suicide-related queries into five genAI tools: ChatGPT 3.5, GPT-4, a version of GPT-4 safe for protected health information, Gemini, and Bing Copilot. The response to each query was coded on seven metrics, including the presence of a suicide hotline number, content related to evidence-based suicide interventions, supportive content, and harmful content. Pooling across tools, most of the responses (79%) were supportive. Only 24% of responses included a crisis hotline number, and only 4% included content consistent with evidence-based suicide prevention interventions. Harmful content was rare (5%); all such instances were delivered by Bing Copilot. Our results suggest that genAI developers have taken a very conservative approach to suicide-related content and constrained their models' responses to suggest support-seeking, but little else. Providing much-needed, evidence-based mental health information without introducing excessive risk is within the capabilities of genAI developers. At this nascent stage of integrating genAI tools into healthcare systems, ensuring mental health parity should be the goal of genAI developers and healthcare organizations.