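Artificial Intelligence to Automate Health Economic Modelling: A Case Study to Evaluate the Potential Application of Large Language Models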

PharmacoEconomics Open (IF 2, JCR Q2, Economics) | Pub Date: 2024-03-01 | Epub Date: 2024-02-10 | DOI: 10.1007/s41669-024-00477-8
Tim Reason, William Rawlinson, Julia Langham, Andy Gimblett, Bill Malcolm, Sven Klijn
Citations: 0

Artificial Intelligence to Automate Health Economic Modelling: A Case Study to Evaluate the Potential Application of Large Language Models.

Background: Current generation large language models (LLMs) such as Generative Pre-Trained Transformer 4 (GPT-4) have achieved human-level performance on many tasks including the generation of computer code based on textual input. This study aimed to assess whether GPT-4 could be used to automatically programme two published health economic analyses.

Methods: The two analyses were partitioned survival models evaluating interventions in non-small cell lung cancer (NSCLC) and renal cell carcinoma (RCC). We developed prompts which instructed GPT-4 to programme the NSCLC and RCC models in R, and which provided descriptions of each model's methods, assumptions and parameter values. The results of the generated scripts were compared to the published values from the original, human-programmed models. The models were replicated 15 times to capture variability in GPT-4's output.
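For readers unfamiliar with the model type named in the Methods, the following is a minimal sketch of a three-state partitioned survival model ending in an incremental cost-effectiveness ratio (ICER). It is written in Python for illustration, whereas the study's scripts were in R. Every function name and parameter value here is hypothetical and is not taken from the NSCLC or RCC models; exponential survival curves and the absence of a half-cycle correction are simplifying assumptions.

```python
import math

def exponential_survival(rate, t):
    """Survival probability at time t (years) under a constant hazard (assumed form)."""
    return math.exp(-rate * t)

def partitioned_survival(os_rate, pfs_rate, cycles, cycle_len):
    """Per-cycle state occupancy: (progression-free, progressed, dead)."""
    trace = []
    for k in range(cycles + 1):
        t = k * cycle_len
        os_t = exponential_survival(os_rate, t)
        # Progression-free survival is capped at overall survival.
        pfs_t = min(exponential_survival(pfs_rate, t), os_t)
        trace.append((pfs_t, os_t - pfs_t, 1.0 - os_t))
    return trace

def discounted_totals(trace, cost_pf, cost_pd, u_pf, u_pd, cycle_len, disc):
    """Discounted total cost and QALYs; costs and utilities are annual rates."""
    cost = qaly = 0.0
    for k, (pf, pd, _dead) in enumerate(trace):
        df = 1.0 / (1.0 + disc) ** (k * cycle_len)
        cost += (pf * cost_pf + pd * cost_pd) * cycle_len * df
        qaly += (pf * u_pf + pd * u_pd) * cycle_len * df
    return cost, qaly

def icer(cost_new, qaly_new, cost_old, qaly_old):
    """Incremental cost per QALY gained."""
    return (cost_new - cost_old) / (qaly_new - qaly_old)

# Hypothetical inputs: monthly cycles over a 10-year horizon, 3.5% discounting.
CYCLE_LEN, CYCLES, DISC = 1.0 / 12.0, 120, 0.035
trace_new = partitioned_survival(0.35, 0.70, CYCLES, CYCLE_LEN)
trace_old = partitioned_survival(0.50, 1.00, CYCLES, CYCLE_LEN)
c_new, q_new = discounted_totals(trace_new, 60000, 10000, 0.8, 0.6, CYCLE_LEN, DISC)
c_old, q_old = discounted_totals(trace_old, 20000, 10000, 0.8, 0.6, CYCLE_LEN, DISC)
print(f"Illustrative ICER: {icer(c_new, q_new, c_old, q_old):,.0f} per QALY")
```

The defining feature is that state occupancy is read directly from the overall and progression-free survival curves rather than from transition probabilities, which is what distinguishes a partitioned survival model from a Markov model.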

Results: GPT-4 fully replicated the NSCLC model with high accuracy: 100% (15/15) of the artificial intelligence (AI)-generated NSCLC models were error-free or contained a single minor error, and 93% (14/15) were completely error-free. GPT-4 closely replicated the RCC model, although human intervention was required to simplify an element of the model design (one of the model's fifteen input calculations) because it used too many sequential steps to be implemented in a single prompt. With this simplification, 87% (13/15) of the AI-generated RCC models were error-free or contained a single minor error, and 60% (9/15) were completely error-free. Error-free model scripts replicated the published incremental cost-effectiveness ratios to within 1%.
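The reported proportions and the "within 1%" criterion can be checked with a few lines of arithmetic. The sketch below is in Python and assumes that "within 1%" means a relative difference from the published ICER, which the abstract does not spell out. The function names and the example ICER values are hypothetical; only the run counts (15, 14, 13 and 9 out of 15) come from the Results.

```python
def within_tolerance(replicated, published, tol=0.01):
    """True if the replicated value is within tol (relative) of the published value."""
    return abs(replicated - published) <= tol * abs(published)

def percent(count, total):
    """Share of runs as a whole-number percentage."""
    return round(100 * count / total)

# Run counts reported in the Results (15 replications per model).
print(percent(15, 15), percent(14, 15))  # NSCLC: at most one minor error, error-free
print(percent(13, 15), percent(9, 15))   # RCC: at most one minor error, error-free

# Hypothetical ICERs, used only to show the tolerance check.
print(within_tolerance(50400.0, 50000.0))  # 0.8% away from the published value
print(within_tolerance(51000.0, 50000.0))  # 2% away from the published value
```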

Conclusion: This study provides a promising indication that GPT-4 can have practical applications in the automation of health economic model construction. Potential benefits include accelerated model development timelines and reduced costs of development. Further research is necessary to explore the generalisability of LLM-based automation across a larger sample of models.

Source journal
CiteScore: 3.50
Self-citation rate: 0.00%
Articles published: 64
Review time: 8 weeks
About the journal: PharmacoEconomics - Open focuses on applied research on the economic implications and health outcomes associated with drugs, devices and other healthcare interventions. The journal includes, but is not limited to, the following research areas: economic analysis of healthcare interventions; health outcomes research; cost-of-illness studies; and quality-of-life studies. Additional digital features (including animated abstracts, video abstracts, slide decks, audio slides, instructional videos, infographics, podcasts and animations) can be published with articles; these are designed to increase the visibility, readership and educational value of the journal's content. In addition, articles published in PharmacoEconomics - Open may be accompanied by plain language summaries to assist readers who have some knowledge of, but not in-depth expertise in, the area to understand important medical advances. All manuscripts are subject to peer review by international experts. Letters to the Editor are welcomed and will be considered for publication.