Evaluating misinformation regarding cardiovascular disease prevention obtained on a popular, publicly accessible artificial intelligence model (GPT-4)

IF 4.3 · Q1 (CARDIAC & CARDIOVASCULAR SYSTEMS) · American Journal of Preventive Cardiology · Pub Date: 2024-09-01 · DOI: 10.1016/j.ajpc.2024.100806
Citations: 0

Abstract

EVALUATING MISINFORMATION REGARDING CARDIOVASCULAR DISEASE PREVENTION OBTAINED ON A POPULAR, PUBLICLY ACCESSIBLE ARTIFICIAL INTELLIGENCE MODEL (GPT-4)

Therapeutic Area

Other: Artificial intelligence; Misinformation

Background

Misinformation regarding CVD prevention is prevalent on the internet and on social media. Chat-based artificial intelligence (AI) models such as ChatGPT have gained over 100 million users, are publicly accessible, and may provide appropriate information for simple CVD prevention topics. Whether these public AI models may propagate misinformation regarding CVD prevention is uncertain.

Methods

This study was performed in March 2024 using the subscription-based version of GPT-4 (OpenAI, USA). Prompts were posed on six CVD prevention topics: statin therapy in relation to muscle side effects, dementia, and liver disease; fish oil; supplements; and low-density lipoprotein cholesterol and heart disease. Prompts were framed in two tones: a neutral tone and a misinformation-prompting tone, the latter requesting specific arguments and scientific references to support misinformation. Each tone-topic combination was posed in a separate chatbot instance. Each response was reviewed by a board-certified cardiologist specializing in preventive cardiology at a tertiary care center. If a response contained multiple bullet points, each with its own scientific reference, each bullet point was graded separately. Responses were graded as appropriate (accurate content and references), borderline (minor inaccuracies or references published >20 years ago), or inappropriate (inaccurate content and/or references, including non-existent references).
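The study design above (two tones crossed with six topics, each pair posed in a fresh chat session, responses graded on a three-level scale) can be sketched as follows. This is a hypothetical illustration of the protocol, not the authors' actual code; the topic and grade labels are paraphrased from the abstract.

```python
from enum import Enum
from itertools import product

# Topics as described in the Methods (paraphrased; assumed labels).
TOPICS = [
    "statin therapy and muscle side effects",
    "statin therapy and dementia",
    "statin therapy and liver disease",
    "fish oil",
    "supplements",
    "LDL cholesterol and heart disease",
]
TONES = ["neutral", "misinformation-prompting"]

class Grade(Enum):
    APPROPRIATE = "accurate content and references"
    BORDERLINE = "minor inaccuracies or references published >20 years ago"
    INAPPROPRIATE = "inaccurate content and/or references"

# Each tone-topic combination is posed in a separate chatbot instance,
# yielding 2 x 6 = 12 independent sessions.
sessions = list(product(TONES, TOPICS))
assert len(sessions) == 12
```

Posing each combination in its own instance prevents earlier turns from biasing later responses, since chat models condition on the full conversation history.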

Results

For the six prompts posed with a neutral tone, all responses lacked scientific references and all were graded as appropriate (6/6, 100%). For the six prompts posed with a misinformation-prompting tone, each response consisted of multiple discrete bullet points, each supported by a scientific reference. Of the 31 bullet points obtained across the six topics with the misinformation-prompting tone, 32.2% (10/31) were graded as appropriate, 19.4% (6/31) as borderline, and 48.4% (15/31) as inappropriate.
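The reported proportions can be recomputed from the raw counts given above; this is a simple arithmetic check, not part of the study.

```python
# Raw grade counts for the 31 misinformation-prompted bullet points.
counts = {"appropriate": 10, "borderline": 6, "inappropriate": 15}
total = sum(counts.values())  # 31

# Percentage share of each grade, rounded to one decimal place.
shares = {k: round(100 * v / total, 1) for k, v in counts.items()}
# Note: 10/31 rounds to 32.3%; the abstract reports 32.2% (truncated).
```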

Conclusions

In this exploratory study, GPT-4 – a popular and publicly accessible chat-based AI model – was easily prompted to support CVD prevention misinformation. Nearly half (48.4%) of the misinformation-supporting arguments and their scientific references were graded inappropriate due to inaccurate content and/or references. Robust research efforts and policies are needed to study and prevent AI-enabled propagation of misinformation regarding CVD prevention.