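Can artificial intelligence replace biochemists? A study comparing interpretation of thyroid function test results by ChatGPT and Google Bard to practising biochemists.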

Annals of Clinical Biochemistry (IF 2.1; Zone 4, Medicine; JCR Q3, Medical Laboratory Technology). Pub Date: 2024-03-01. Epub Date: 2023-09-20. DOI: 10.1177/00045632231203473
Emma Stevenson, Chelsey Walsh, Luke Hibberd
Pages: 143-149
Citations: 0

Abstract

Background: Public awareness of artificial intelligence (AI) is increasing and this novel technology is being used for a range of everyday tasks and more specialist clinical applications. On a background of increasing waits for GP appointments alongside patient access to laboratory test results through the NHS app, this study aimed to assess the accuracy and safety of two AI tools, ChatGPT and Google Bard, in providing interpretation of thyroid function test results as if posed by laboratory scientists or patients.

Methods: Fifteen fictional cases were presented to a team of clinicians and clinical scientists to produce a consensus opinion. The cases were then presented to ChatGPT and Google Bard as though from healthcare providers and from patients. The responses were categorized as correct, partially correct or incorrect compared to consensus opinion and the advice assessed for safety to patients.

Results: Of the 15 cases presented, ChatGPT and Google Bard correctly interpreted only 33.3% and 20.0% of cases, respectively. When queries were posed as a patient, 66.7% of ChatGPT responses were safe compared to 60.0% of Google Bard responses. Both AI tools were able to identify primary hypothyroidism and hyperthyroidism but failed to identify subclinical presentations, non-thyroidal illness or secondary hypothyroidism.

Conclusions: This study has demonstrated that AI tools do not currently have the capacity to generate consistently correct interpretation and safe advice to patients and should not be used as an alternative to a consultation with a qualified medical professional. Available AI in its current form cannot replace human clinical knowledge in this scenario.
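
The percentages in the Results can be read back as whole-case counts. The following is a minimal sketch of that arithmetic, not part of the study itself. It assumes that every reported percentage uses all 15 cases as its denominator; the abstract states this for the interpretation figures, but for the patient-posed safety figures it is an assumption. The names `reported` and `implied_count` are illustrative only.

```python
TOTAL_CASES = 15  # number of fictional cases stated in the abstract

# Percentages as reported in the Results. Using 15 as the denominator for
# the safety figures is an assumption, not something the abstract confirms.
reported = {
    ("ChatGPT", "correct interpretation"): 33.3,
    ("Google Bard", "correct interpretation"): 20.0,
    ("ChatGPT", "safe advice (patient query)"): 66.7,
    ("Google Bard", "safe advice (patient query)"): 60.0,
}


def implied_count(percent: float, total: int = TOTAL_CASES) -> int:
    """Return the whole number of cases whose share of `total` rounds to `percent`."""
    count = round(percent / 100 * total)
    # Sanity check: the whole-case count must reproduce the reported figure
    # when rounded to one decimal place.
    assert round(100 * count / total, 1) == percent
    return count


counts = {key: implied_count(pct) for key, pct in reported.items()}

for (tool, measure), n in counts.items():
    print(f"{tool:12s} {measure:30s} {n}/{TOTAL_CASES}")
```

Under that assumption, the reported figures correspond to 5 and 3 correctly interpreted cases, and 10 and 9 safe responses, for ChatGPT and Google Bard respectively. With so few cases, a single case shifts each percentage by about 6.7 points.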

Source journal
Annals of Clinical Biochemistry (Biochemistry, Genetics and Molecular Biology: Clinical Biochemistry)
CiteScore: 5.20
Self-citation rate: 4.50%
Articles published: 61
Journal description: Annals of Clinical Biochemistry is the fully peer reviewed international journal of the Association for Clinical Biochemistry and Laboratory Medicine. Annals of Clinical Biochemistry accepts papers that contribute to knowledge in all fields of laboratory medicine, especially those pertaining to the understanding, diagnosis and treatment of human disease. It publishes papers on clinical biochemistry, clinical audit, metabolic medicine, immunology, genetics, biotechnology, haematology, microbiology, computing and management where they have both biochemical and clinical relevance. Papers describing evaluation or implementation of commercial reagent kits or the performance of new analysers require substantial original information. Unless of exceptional interest and novelty, studies dealing with the redox status in various diseases are not generally considered within the journal's scope. Studies documenting the association of single nucleotide polymorphisms (SNPs) with particular phenotypes will not normally be considered, given the greater strength of genome wide association studies (GWAS). Research undertaken in non-human animals will not be considered for publication in the Annals. Annals of Clinical Biochemistry is also the official journal of NVKC (de Nederlandse Vereniging voor Klinische Chemie) and JSCC (Japan Society of Clinical Chemistry).