{"title":"Comprehensive analysis of responses from ChatGPT to consumer inquiries regarding over-the-counter medications.","authors":"K Kiyomiya, T Aomori, H Ohtani","doi":"10.1691/ph.2024.3628","DOIUrl":null,"url":null,"abstract":"<p><p><i>Background:</i> The use of generative artificial intelligence (AI) applications such as ChatGPT is becoming increasingly popular. In Japan, consumers can purchase most over-the-counter (OTC) drugs without having to consult a pharmacist, so they may ask generative AI applications which OTC drugs they should purchase. This study aimed to systematically evaluate responses from ChatGPT to consumer inquiries about various OTC drugs. <i>Methods:</i> We selected 22 popular OTC drugs and 12 typical consumer characteristics, including physical and disease conditions and concomitant medications. We input a total of 264 questions (<i>i. e.</i>, all combinations of drugs and characteristics) to ChatGPT in Japanese, asking whether it is safe for consumers with each characteristic to take these OTC drugs. We used the generic name for 10 of the 22 drugs and the brand name for the remaining 12. Responses were evaluated based on the following three criteria: 1) coherence between the question and response, 2) scientific correctness, and 3) appropriateness of the instructed actions. When we received a response that satisfied all three criteria, we input the exact same question on a different day to assess reproducibility. <i>Results:</i> The proportions of ChatGPT's answers that satisfied criteria 1, 2, and 3 were 79.5%, 54.5%, and 49.6%, respectively. However, the proportion of responses that satisfied all three criteria was only 20.8% (55/264); 61.8% (34/55) of these responses were reproduced when the same question was input again on a different day. Compared with questions using generic names, those using brand names resulted in lower coherence and scientific correctness. Among the 12 characteristics, the appropriateness of the instructed actions tended to be lower in responses to questions about driving and concomitant medications. <i>Conclusions:</i> Our study revealed that ChatGPT was less accurate in its responses and less consistent in its instructed actions compared with the package inserts. Our findings suggest that Japanese consumers should not consult ChatGPT regarding OTC medications, especially when using brand names.</p>","PeriodicalId":20145,"journal":{"name":"Pharmazie","volume":"79 1","pages":"24-28"},"PeriodicalIF":1.5000,"publicationDate":"2024-02-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Pharmazie","FirstCategoryId":"3","ListUrlMain":"https://doi.org/10.1691/ph.2024.3628","RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q4","JCRName":"CHEMISTRY, MEDICINAL","Score":null,"Total":0}
Abstract
Background: The use of generative artificial intelligence (AI) applications such as ChatGPT is becoming increasingly popular. In Japan, consumers can purchase most over-the-counter (OTC) drugs without having to consult a pharmacist, so they may ask generative AI applications which OTC drugs they should purchase. This study aimed to systematically evaluate responses from ChatGPT to consumer inquiries about various OTC drugs. Methods: We selected 22 popular OTC drugs and 12 typical consumer characteristics, including physical and disease conditions and concomitant medications. We input a total of 264 questions (i.e., all combinations of drugs and characteristics) to ChatGPT in Japanese, asking whether it is safe for consumers with each characteristic to take these OTC drugs. We used the generic name for 10 of the 22 drugs and the brand name for the remaining 12. Responses were evaluated based on the following three criteria: 1) coherence between the question and response, 2) scientific correctness, and 3) appropriateness of the instructed actions. When we received a response that satisfied all three criteria, we input the exact same question on a different day to assess reproducibility. Results: The proportions of ChatGPT's answers that satisfied criteria 1, 2, and 3 were 79.5%, 54.5%, and 49.6%, respectively. However, the proportion of responses that satisfied all three criteria was only 20.8% (55/264); 61.8% (34/55) of these responses were reproduced when the same question was input again on a different day. Compared with questions using generic names, those using brand names resulted in lower coherence and scientific correctness. Among the 12 characteristics, the appropriateness of the instructed actions tended to be lower in responses to questions about driving and concomitant medications. Conclusions: Our study revealed that ChatGPT was less accurate in its responses and less consistent in its instructed actions compared with the package inserts. Our findings suggest that Japanese consumers should not consult ChatGPT regarding OTC medications, especially when using brand names.
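The study design reduces to simple combinatorics and proportions. The following Python sketch (not the authors' code; the drug and characteristic names are hypothetical placeholders) enumerates the 264 drug-characteristic combinations described in the Methods and recomputes the headline percentages reported in the Results.

# Minimal sketch of the study's question set and result arithmetic.
# Drug and characteristic names are placeholders; the paper used 22 real
# OTC products (10 generic names, 12 brand names) and 12 consumer characteristics.
from itertools import product

drugs = [f"OTC_drug_{i}" for i in range(1, 23)]                  # 22 drugs (placeholder names)
characteristics = [f"characteristic_{j}" for j in range(1, 13)]  # 12 characteristics (placeholder names)

# All drug x characteristic combinations -> 264 questions, as in the study
# (the actual questions were posed to ChatGPT in Japanese).
questions = [
    f"Is it safe for a consumer with {c} to take {d}?"
    for d, c in product(drugs, characteristics)
]
assert len(questions) == 264

# Arithmetic behind the reported proportions.
n_questions = len(questions)
n_all_three_criteria = 55   # responses satisfying criteria 1-3
n_reproduced = 34           # of those, reproduced on a different day

print(f"All three criteria: {n_all_three_criteria / n_questions:.1%}")  # 20.8%
print(f"Reproducible:       {n_reproduced / n_all_three_criteria:.1%}") # 61.8%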
Journal introduction:
The journal Die Pharmazie publishes reviews, experimental studies, letters to the editor, and book reviews.
The following fields of pharmacy are covered:
Pharmaceutical and medicinal chemistry;
Pharmaceutical analysis and drug control;
Pharmaceutical technology;
Biopharmacy (biopharmaceutics, pharmacokinetics, biotransformation);
Experimental and clinical pharmacology;
Pharmaceutical biology (pharmacognosy);
Clinical pharmacy;
History of pharmacy.