Zilma Silveira Nogueira Reis MD, PhD , Adriana Silvina Pagano MA, PhD , Isaias Jose Ramos de Oliveira MSc , Cristiane dos Santos Dias MD, PhD , Eura Martins Lage MD, PhD , Erico Franco Mineiro PhD , Glaucia Miranda Varella Pereira PhD , Igor de Carvalho Gomes MSc , Vinicius Araujo Basilio MS , Ricardo João Cruz-Correia PhD , Davi dos Reis de Jesus BCS , Antônio Pereira de Souza Júnior MS , Leonardo Chaves Dutra da Rocha PhD
Mayo Clinic Proceedings. Digital health, volume 2, issue 4, pages 632-644. Published 2024-10-19. DOI: 10.1016/j.mcpdig.2024.09.006. https://www.sciencedirect.com/science/article/pii/S2949761224001032
Evaluating Large Language Model–Supported Instructions for Medication Use: First Steps Toward a Comprehensive Model
Objective
To assess the support of large language models (LLMs) in generating clearer and more personalized medication instructions to enhance e-prescription.
Patients and Methods
We established patient-centered guidelines for adequate, acceptable, and personalized directions to enhance e-prescription. A dataset of 104 outpatient scenarios, covering an array of medications, administration routes, and patient conditions, was developed following the Brazilian national e-prescribing standard. Three prompts were submitted to a closed-source LLM: the first was a generic command, the second was calibrated for content enhancement and personalization, and the third requested bias mitigation. The third prompt was also submitted to an open-source LLM. Outputs were assessed using automated metrics and human evaluation. We conducted the study between March 1, 2024, and September 10, 2024.
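The three-prompt design described above can be sketched as follows. This is a minimal illustration, not the study's actual prompts: the prompt wording, the `Scenario` fields, the `build_prompts` helper, and the choice to make each tier build on the previous one are all assumptions, since the abstract does not give these details.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Scenario:
    """One outpatient prescription scenario (field names are hypothetical)."""
    medication: str
    route: str
    patient_condition: str


# Illustrative wording only; the study's real prompts are not in the abstract.
GENERIC = "Write instructions for use for the following prescription."
PERSONALIZE = (
    "Use plain language, state dose, timing, and duration, "
    "and adapt the text to the patient's condition."
)
MITIGATE_BIAS = "Use gender-neutral wording and avoid assumptions about the patient."


def build_prompts(s: Scenario) -> Dict[str, str]:
    """Return three prompt tiers for one scenario.

    Assumption: tiers are cumulative, so each one extends the previous.
    """
    details = (
        f"Medication: {s.medication}\n"
        f"Route: {s.route}\n"
        f"Patient condition: {s.patient_condition}"
    )
    return {
        "generic": f"{GENERIC}\n{details}",
        "personalized": f"{GENERIC} {PERSONALIZE}\n{details}",
        "bias_mitigated": f"{GENERIC} {PERSONALIZE} {MITIGATE_BIAS}\n{details}",
    }


if __name__ == "__main__":
    # Hypothetical scenario, used only to show the shape of the output.
    example = Scenario("amoxicillin 500 mg", "oral", "pregnancy")
    for tier, prompt in build_prompts(example).items():
        print(f"--- {tier} ---\n{prompt}\n")
```

Keeping the scenario details identical across tiers means that differences in the generated directions can be attributed to the prompt instructions rather than to the input data.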
Results
Adequacy scores for the closed-source LLM’s output were higher with the third prompt than with the first or second. With the third prompt, 94.3% of texts were fully or partially acceptable. Personalization was rated highly, especially with the second and third prompts. The 2 LLMs showed similar adequacy results. Lack of scientific evidence and factual errors were infrequent and unrelated to a particular prompt or LLM. The frequency of hallucinations differed between the 2 LLMs; hallucinations concerned prescriptions issued upon symptom manifestation and medications requiring dosage adjustment or involving intermittent use. Gender bias was found in the closed-source LLM’s output for the first and second prompts, whereas its output for the third prompt was bias-free. The open-source LLM’s output was also bias-free.
Conclusion
This study demonstrates the potential of LLM-supported generation to produce prescription directions and improve communication between health professionals and patients within the e-prescribing system.