Enhancing AI Chatbot Responses in Health Care: The SMART Prompt Structure in Head and Neck Surgery
Luigi Angelo Vaira, Jerome R Lechien, Vincenzo Abbate, Guido Gabriele, Andrea Frosolini, Andrea De Vito, Antonino Maniaci, Miguel Mayo-Yáñez, Paolo Boscolo-Rizzo, Alberto Maria Saibene, Fabio Maglitto, Giovanni Salzano, Gianluigi Califano, Stefania Troise, Carlos Miguel Chiesa-Estomba, Giacomo De Riu
OTO Open, vol. 9, no. 1, e70075. Published 2025-01-16. DOI: 10.1002/oto2.70075. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11736147/pdf/
Citations: 0
Abstract
Objective: This study aims to evaluate the impact of prompt construction on the quality of artificial intelligence (AI) chatbot responses in the context of head and neck surgery.
Study design: Observational and evaluative study.
Setting: An international collaboration involving 16 researchers from 11 European centers specializing in head and neck surgery.
Methods: A total of 24 questions, divided into clinical scenarios, theoretical questions, and patient inquiries, were developed. These questions were entered into ChatGPT-4o both with and without the use of a structured prompt format, known as SMART (Seeker, Mission, AI Role, Register, Targeted Question). The AI-generated responses were evaluated by experienced head and neck surgeons using the Quality Analysis of Medical Artificial Intelligence instrument (QAMAI), which assesses accuracy, clarity, relevance, completeness, source quality, and usefulness.
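The five SMART components described above can be pictured as a simple template that is filled in before the question is submitted to the chatbot. The sketch below is only an illustration of that structure; the field contents and function name are hypothetical examples, not text taken from the study.

```python
# Minimal sketch of assembling a SMART-structured prompt
# (Seeker, Mission, AI Role, Register, Targeted Question).
# All field contents below are illustrative assumptions.

def build_smart_prompt(seeker, mission, ai_role, register, targeted_question):
    """Concatenate the five SMART components into one labeled prompt string."""
    return (
        f"Seeker: {seeker}\n"
        f"Mission: {mission}\n"
        f"AI Role: {ai_role}\n"
        f"Register: {register}\n"
        f"Targeted Question: {targeted_question}"
    )

prompt = build_smart_prompt(
    seeker="Head and neck surgeon",
    mission="Support clinical decision-making ahead of a tumor board",
    ai_role="Act as an evidence-based otolaryngology consultant",
    register="Formal, technical language with cited sources",
    targeted_question=(
        "What are the indications for elective neck dissection "
        "in early-stage oral cavity cancer?"
    ),
)
print(prompt)
```

The labeled-field layout makes the context explicit to the model, which is the contrast the study draws against entering the targeted question alone.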
Results: The responses generated using the SMART prompt scored significantly higher across all QAMAI dimensions than those generated without contextualized prompts. Median QAMAI scores were 27.5 (interquartile range [IQR] 25-29) for SMART prompts versus 24 (IQR 21.8-25) for unstructured prompts (P < .001). Clinical scenarios and patient inquiries showed the largest improvements; theoretical questions also benefited, though to a lesser extent. The quality of the AI's sources improved notably with the SMART prompt, particularly for theoretical questions.
Conclusion: This study suggests that the structured SMART prompt format significantly enhances the quality of AI chatbot responses in head and neck surgery. This approach improves the accuracy, relevance, and completeness of AI-generated information, underscoring the importance of well-constructed prompts in clinical applications. Further research is warranted to explore the applicability of SMART prompts across different medical specialties and AI platforms.