Artificial Intelligence for Urology Research: The Holy Grail of Data Science or Pandora's Box of Misinformation?

IF 2.9 · Q1 (Urology & Nephrology) · Journal of Endourology · Pub Date: 2024-08-01 · Epub Date: 2024-05-15 · DOI: 10.1089/end.2023.0703
Ryan M Blake, Johnathan A Khusid
Citations: 0

Abstract

Artificial Intelligence for Urology Research: The Holy Grail of Data Science or Pandora's Box of Misinformation?

Introduction: Artificial intelligence tools such as the large language models (LLMs) Bard and ChatGPT have generated significant research interest. Using these LLMs to study the epidemiology of a target population could benefit urologists. We investigated whether Bard and ChatGPT can perform a large-scale calculation of the incidence and prevalence of kidney stone disease. Materials and Methods: We obtained reference values from two published studies that used the National Health and Nutrition Examination Survey (NHANES) database to calculate the prevalence and incidence of kidney stone disease. We then tested the capability of Bard and ChatGPT to perform similar calculations using two different methods. First, we instructed the LLMs to access the data sets and independently perform the calculations. Second, we instructed the interfaces to generate customized computer code that could perform the calculations on downloaded data sets. Results: While ChatGPT denied the ability to access and perform calculations on the NHANES database, Bard intermittently claimed the ability to do so. Bard provided either accurate results or inaccurate and inconsistent results. For example, Bard's "calculations" for the incidence of kidney stones from 2015 to 2018 were 2.1% (95% CI 1.5-2.7), 1.75% (95% CI 1.6-1.9), and 0.8% (95% CI 0.7-0.9), while the published number was 2.1% (95% CI 1.5-2.7). Bard provided discrete mathematical details of its calculations; however, when prompted further, it admitted to having obtained the numbers from online sources, including our chosen reference articles, rather than from a de novo calculation. Both LLMs were able to produce code (Python) to use on the downloaded NHANES data sets; however, this code would not readily execute. Conclusions: ChatGPT and Bard are currently incapable of performing epidemiologic calculations and lack transparency and accountability. Caution should be used, particularly with Bard, as claims of its capabilities were convincingly misleading, and results were inconsistent.
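To illustrate the kind of calculation the authors asked the LLMs to code, the sketch below computes a point prevalence with a normal-approximation 95% confidence interval from simple counts. The counts shown are hypothetical, not actual NHANES figures, and a real NHANES analysis would additionally require survey weights and design-based variance estimation, which this unweighted sketch deliberately omits.

```python
import math

def prevalence_ci(cases, n, z=1.96):
    """Point prevalence with a normal-approximation 95% CI.

    cases: respondents reporting kidney stones (hypothetical here)
    n: total respondents
    Note: proper NHANES estimates need survey weights and
    design-based (e.g., Taylor-linearized) variance; this
    unweighted sketch is only the arithmetic skeleton.
    """
    p = cases / n
    se = math.sqrt(p * (1 - p) / n)  # binomial standard error
    return p, max(0.0, p - z * se), min(1.0, p + z * se)

# Hypothetical counts, chosen only for illustration
p, lo, hi = prevalence_ci(cases=900, n=9000)
print(f"prevalence {p:.1%} (95% CI {lo:.1%}-{hi:.1%})")
```

Reported confidence intervals like the 2.1% (95% CI 1.5-2.7) figure above come from this general form of estimate, after proper weighting for the complex survey design.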

Source Journal
Journal of Endourology (Medicine: Urology & Nephrology)
CiteScore: 5.50 · Self-citation rate: 14.80% · Articles per year: 254 · Review time: 1 month
About the Journal: Journal of Endourology, JE Case Reports, and Videourology are the leading peer-reviewed journal, case reports publication, and innovative videojournal companion covering all aspects of minimally invasive urology research, applications, and clinical outcomes. The leading journal of minimally invasive urology for over 30 years, Journal of Endourology is the essential publication for practicing surgeons who want to keep up with the latest surgical technologies in endoscopic, laparoscopic, robotic, and image-guided procedures as they apply to benign and malignant diseases of the genitourinary tract. This flagship journal includes the companion videojournal Videourology™ with every subscription. While Journal of Endourology remains focused on publishing rigorously peer-reviewed articles, Videourology accepts original videos containing material that has not been reported elsewhere, except in the form of an abstract or a conference presentation. Journal of Endourology coverage includes: the latest laparoscopic, robotic, endoscopic, and image-guided techniques for treating both benign and malignant conditions; pioneering research articles; controversial cases in endourology; techniques in endourology with accompanying videos; reviews and epochs in endourology; and an endourology survey section of relevant manuscripts published in other journals.
Latest Articles in This Journal
Comparative Analysis of Safety and Efficacy Between Anterior and Posterior Calyceal Entry in Supine Percutaneous Nephrolithotomy.
Does Incision Location Matter? Analysis of Single-Port Cosmesis in Urologic Reconstructive Surgery.
Bio of Pankaj N Maheshwari, MS, DNB, MCh, FRCS.
Digital Flexible Ureteroscope: Evaluating Factors Responsible for Damage and Implementing a Mandatory Certification Program for Usage.
Impact of Residual Stone Fragments on Risk of Unplanned Stone Events Following Percutaneous Nephrolithotomy.