How Useful Are Current Chatbots Regarding Urology Patient Information? Comparison of the Ten Most Popular Chatbots' Responses About Female Urinary Incontinence.

Journal of Medical Systems · IF 3.5 · JCR Q1 (Health Care Sciences & Services) · CAS Tier 3 (Medicine) · Pub Date: 2024-11-13 · DOI: 10.1007/s10916-024-02125-4
Arzu Malak, Mehmet Fatih Şahin
Journal of Medical Systems, Volume 48(1): 102. Journal Article.
Citations: 0

Abstract


This study evaluates the readability and quality of patient information about female urinary incontinence (fUI) generated by ten popular artificial intelligence (AI)-supported chatbots. We used the most recent versions of ten widely used chatbots: OpenAI's GPT-4, Claude-3 Sonnet, Grok 1.5, Mistral Large 2, Google Palm 2, Meta's Llama 3, HuggingChat v0.8.4, Microsoft's Copilot, Gemini Advanced, and Perplexity. Prompts were created to generate texts about UI, stress-type UI, urge-type UI, and mixed-type UI. Quality was assessed with the modified Ensuring Quality Information for Patients (mEQIP) instrument and QUEST (Quality Evaluating Scoring Tool); readability was assessed with the Average Reading Level Consensus (ARLC), the mean of eight well-known readability formulas. Mean mEQIP and QUEST scores differed significantly across the ten chatbots (p = 0.049 and p = 0.018, respectively). Gemini received the highest mean mEQIP and QUEST scores, whereas Grok had the lowest. The chatbots also differed significantly in mean ARLC, word count, and sentence count (p = 0.047, p = 0.001, and p = 0.001, respectively). Grok produced the easiest-to-read texts, while Mistral's were the most complex to understand. AI-supported chatbot technology needs to improve both the readability and the quality of patient information regarding female UI.
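The ARLC measure described above averages several published readability formulas, each of which is computed from the same basic text statistics (words per sentence, syllables per word, share of polysyllabic words). As a minimal sketch of the idea — averaging just two such formulas (Flesch-Kincaid Grade Level and Gunning Fog) with a naive syllable counter; the study's actual eight-formula set and tooling are not specified in the abstract:

```python
import re

def count_syllables(word: str) -> int:
    # Naive heuristic: count groups of consecutive vowels.
    # (Real readability tools use dictionaries or better heuristics.)
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def grade_levels(text: str) -> dict:
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    complex_words = sum(1 for w in words if count_syllables(w) >= 3)
    wps = len(words) / len(sentences)   # words per sentence
    spw = syllables / len(words)        # syllables per word
    return {
        # Flesch-Kincaid Grade Level
        "flesch_kincaid": 0.39 * wps + 11.8 * spw - 15.59,
        # Gunning Fog index
        "gunning_fog": 0.4 * (wps + 100 * complex_words / len(words)),
    }

def average_grade(text: str) -> float:
    # Consensus-style score: the mean of the individual grade estimates.
    scores = grade_levels(text)
    return sum(scores.values()) / len(scores)
```

Averaging dampens the bias of any single formula, which is the rationale behind consensus measures like the ARLC.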

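The abstract reports p-values for score differences across the chatbots but does not name the statistical test used; a Kruskal-Wallis test is a common choice for comparing a score across several independent groups. Below is a stdlib-only sketch for exactly three groups (where df = 2 and the chi-square upper tail reduces to exp(-H/2)); the scores are illustrative, not the study's data:

```python
import math

def kruskal_wallis_3(a, b, c):
    """Kruskal-Wallis H test for exactly three independent samples,
    without tie correction. With k = 3 groups, df = k - 1 = 2, for which
    the chi-square survival function simplifies to exp(-H / 2)."""
    groups = [a, b, c]
    # Pool all observations and assign ranks (1 = smallest value).
    pooled = sorted((v, gi) for gi, g in enumerate(groups) for v in g)
    rank_sums = [0.0] * 3
    for rank, (_, gi) in enumerate(pooled, start=1):
        rank_sums[gi] += rank
    n = len(pooled)
    grand_mean_rank = (n + 1) / 2
    h = 12 / (n * (n + 1)) * sum(
        len(g) * (rank_sums[gi] / len(g) - grand_mean_rank) ** 2
        for gi, g in enumerate(groups)
    )
    p = math.exp(-h / 2)  # chi-square upper tail, df = 2
    return h, p

# Hypothetical mEQIP-style scores for three chatbots (illustrative only).
h, p = kruskal_wallis_3([82, 79, 85, 80], [55, 60, 52, 58], [70, 68, 74, 71])
print(f"H = {h:.3f}, p = {p:.4f}")
```

Because the test is rank-based, it makes no normality assumption about the score distributions, which suits small per-chatbot samples of quality scores.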
Source journal: Journal of Medical Systems (Medicine — Health Care Sciences)
CiteScore: 11.60
Self-citation rate: 1.90%
Articles per year: 83
Review time: 4.8 months
Journal overview: Journal of Medical Systems provides a forum for the presentation and discussion of the increasingly extensive applications of new systems techniques and methods in hospital, clinic, and physician's office administration; pathology, radiology, and pharmaceutical delivery systems; medical records storage and retrieval; and ancillary patient-support systems. The journal publishes informative articles, essays, and studies across the entire scale of medical systems, from large hospital programs to novel small-scale medical services. Education is an integral part of this amalgamation of sciences, and selected articles are published in this area. Since existing medical systems are constantly being modified to fit particular circumstances and to solve specific problems, the journal includes a special section devoted to status reports on current installations.