Comparison of the Performance of ChatGPT, Claude and Bard in Support of Myopia Prevention and Control.

Journal of Multidisciplinary Healthcare | IF 2.4 | Zone 3 (Medicine) | JCR Q2 (Health Care Sciences & Services) | Pub Date: 2024-08-13 | eCollection Date: 2024-01-01 | DOI: 10.2147/JMDH.S473680

Yan Wang, Lihua Liang, Ran Li, Yihua Wang, Changfu Hao

Journal of Multidisciplinary Healthcare, vol. 17, pp. 3917-3929. Publication type: Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11330241/pdf/

Citations: 0

Abstract

Comparison of the Performance of ChatGPT, Claude and Bard in Support of Myopia Prevention and Control.

Purpose: Chatbots based on large language models are increasingly being used in public health. However, the effectiveness of chatbot responses has been debated, and their performance in myopia prevention and control has not been fully explored. This study aimed to evaluate the effectiveness of three well-known chatbots (ChatGPT, Claude, and Bard) in responding to public health questions about myopia.

Methods: Nineteen public health questions about myopia, covering three topics (policy, basics, and measures), were answered individually by each of the three chatbots. After the order was shuffled, each chatbot response was independently rated by 4 raters for comprehensiveness, accuracy, and relevance.

Results: The study's questions underwent reliability testing. The word counts of the responses differed significantly among the 3 chatbots; from most to least, the order was ChatGPT, Bard, and Claude. All 3 chatbots had a composite score above 4 out of 5. ChatGPT scored the highest in all aspects of the assessment. However, all chatbots exhibited shortcomings, such as giving fabricated responses.

Conclusion: Chatbots have shown great potential in public health, with ChatGPT being the best. The future use of chatbots as a public health tool will require rapid development of standards for their use and monitoring, as well as continued research, evaluation and improvement of chatbots.
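
The rating procedure described in the Methods can be illustrated with a minimal Python sketch. It rests on assumptions the abstract does not confirm: that the composite score is the mean of the three criteria, each first averaged across raters, and that scores are on a 1-5 scale. The function names (blind_order, composite_score) and the ratings are hypothetical placeholders, not the authors' code or study data.

```python
import random
from statistics import mean

# The three criteria named in the abstract.
CRITERIA = ("comprehensiveness", "accuracy", "relevance")


def blind_order(response_ids, seed=0):
    """Return a shuffled copy so a rater cannot infer the source from position."""
    rng = random.Random(seed)  # fixed seed keeps the order reproducible
    shuffled = list(response_ids)
    rng.shuffle(shuffled)
    return shuffled


def composite_score(ratings):
    """Average each criterion across raters, then average the criteria.

    `ratings` holds one dict per rater, mapping criterion -> score (assumed 1-5).
    This aggregation rule is an assumption; the abstract gives no formula.
    """
    per_criterion = {c: mean(r[c] for r in ratings) for c in CRITERIA}
    return per_criterion, mean(per_criterion.values())


# Placeholder ratings from 4 hypothetical raters for a single response.
example = [
    {"comprehensiveness": 4, "accuracy": 5, "relevance": 5},
    {"comprehensiveness": 4, "accuracy": 4, "relevance": 5},
    {"comprehensiveness": 5, "accuracy": 4, "relevance": 4},
    {"comprehensiveness": 3, "accuracy": 5, "relevance": 4},
]
per_criterion, overall = composite_score(example)

# Hypothetical response identifiers, shuffled before being shown to raters.
ids = ["resp-01", "resp-02", "resp-03"]
presented = blind_order(ids)
```

Averaging per criterion before combining keeps the three dimensions visible alongside the overall score, which matches how the abstract reports both a composite and per-aspect results.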

Source journal

Journal of Multidisciplinary Healthcare (Nursing: General Nursing)

CiteScore: 4.60
Self-citation rate: 3.00%
Articles published: 287
Review time: 16 weeks

About the journal: The Journal of Multidisciplinary Healthcare (JMDH) aims to represent and publish research in healthcare areas delivered by practitioners of different disciplines. This includes studies and reviews conducted by multidisciplinary teams, as well as research that evaluates or reports the results or conduct of such teams or of healthcare processes in general. The journal covers a very wide range of areas and welcomes submissions from practitioners at all levels and from all over the world. Good healthcare is not bounded by person, place or time, and the journal aims to reflect this. The JMDH is published as an open-access journal so that this wide range of practical, patient-relevant research is available to practitioners immediately upon publication.