JMIR Medical Informatics: Latest Articles

Exploring Impediments Imposed by the Medical Device Regulation EU 2017/745 on Software as a Medical Device.
IF 3.1 | Zone 3 (Medicine) | Q2 MEDICAL INFORMATICS | Pub Date: 2024-09-05 | DOI: 10.2196/58080
Liga Svempe

In light of rapid technological advancements, the health care sector is undergoing significant transformation with the continuous emergence of novel digital solutions. Consequently, regulatory frameworks must continuously adapt to fulfill their main goal of protecting patients. In 2017, the new Medical Device Regulation (EU) 2017/745 (MDR) came into force, bringing more complex requirements for development, launch, and postmarket surveillance. However, the updated regulation considerably impacts manufacturers, especially small- and medium-sized enterprises, and consequently the accessibility of medical devices in the European Union market, as many manufacturers decide to discontinue their products, postpone the launch of new innovative solutions, or leave the European Union market in favor of other regions such as the United States. This could lead to reduced health care quality and slower industry innovation. Effective policy calibration and collaborative efforts are essential to mitigate these effects and promote ongoing advancements in health care technologies in the European Union market. This paper is a narrative review that explores the factors introduced by the new regulation that hinder the development, launch, and marketing of software as a medical device. It focuses exclusively on the factors that create obstacles. Related regulations, directives, and proposals are discussed for comparison and further analysis.

Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11413540/pdf/
Citations: 0
Practical Applications of Large Language Models for Health Care Professionals and Scientists.
IF 3.1 | Zone 3 (Medicine) | Q2 MEDICAL INFORMATICS | Pub Date: 2024-09-05 | DOI: 10.2196/58478
Florian Reis, Christian Lenz, Manfred Gossen, Hans-Dieter Volk, Norman Michael Drzeniek

With the popularization of large language models (LLMs), strategies for their effective and safe usage in health care and research have become increasingly pertinent. Despite the growing interest and eagerness among health care professionals and scientists to exploit the potential of LLMs, initial attempts may yield suboptimal results due to a lack of user experience, thus complicating the integration of artificial intelligence (AI) tools into workplace routine. Focusing on scientists and health care professionals with limited LLM experience, this viewpoint article highlights and discusses 6 easy-to-implement use cases of practical relevance. These encompass customizing translations, refining text and extracting information, generating comprehensive overviews and specialized insights, compiling ideas into cohesive narratives, crafting personalized educational materials, and facilitating intellectual sparring. Additionally, we discuss general prompting strategies and precautions for the implementation of AI tools in biomedicine. Despite various hurdles and challenges, the integration of LLMs into daily routines of physicians and researchers promises heightened workplace productivity and efficiency.

Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11391657/pdf/
Citations: 0
Evaluating the Capabilities of Generative AI Tools in Understanding Medical Papers: Qualitative Study.
IF 3.1 | Zone 3 (Medicine) | Q2 MEDICAL INFORMATICS | Pub Date: 2024-09-04 | DOI: 10.2196/59258
Seyma Handan Akyon, Fatih Cagatay Akyon, Ahmet Sefa Camyar, Fatih Hızlı, Talha Sari, Şamil Hızlı

Background: Reading medical papers is a challenging and time-consuming task for doctors, especially when the papers are long and complex. A tool that can help doctors efficiently process and understand medical papers is needed.

Objective: This study aims to critically assess and compare the comprehension capabilities of large language models (LLMs) in accurately and efficiently understanding medical research papers using the STROBE (Strengthening the Reporting of Observational Studies in Epidemiology) checklist, which provides a standardized framework for evaluating key elements of observational study.

Methods: The study is a methodological type of research. The study aims to evaluate the understanding capabilities of new generative artificial intelligence tools in medical papers. A novel benchmark pipeline processed 50 medical research papers from PubMed, comparing the answers of 6 LLMs (GPT-3.5-Turbo, GPT-4-0613, GPT-4-1106, PaLM 2, Claude v1, and Gemini Pro) to the benchmark established by expert medical professors. Fifteen questions, derived from the STROBE checklist, assessed LLMs' understanding of different sections of a research paper.

Results: LLMs exhibited varying performance, with GPT-3.5-Turbo achieving the highest percentage of correct answers (n=3916, 66.9%), followed by GPT-4-1106 (n=3837, 65.6%), PaLM 2 (n=3632, 62.1%), Claude v1 (n=2887, 58.3%), Gemini Pro (n=2878, 49.2%), and GPT-4-0613 (n=2580, 44.1%). Statistical analysis revealed significant differences between LLMs (P<.001), with older models showing inconsistent performance compared to newer versions. LLMs performed differently on each question across different parts of a scholarly paper, with certain models such as PaLM 2 and GPT-3.5 showing remarkable versatility and depth in understanding.

Conclusions: This study is the first to evaluate the performance of different LLMs in understanding medical papers using the retrieval augmented generation method. The findings highlight the potential of LLMs to enhance medical research by improving efficiency and facilitating evidence-based decision-making. Further research is needed to address limitations such as the influence of question formats, potential biases, and the rapid evolution of LLM models.
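As a rough consistency check on the Results figures above, the following sketch divides each reported count of correct answers by its reported percentage to back out the implied number of graded answers per model. The abstract does not state these denominators, so the function name `implied_total` and the interpretation of the output are assumptions for illustration, not part of the study's benchmark pipeline.

```python
# Reported (correct answers, percent correct) per model, copied from the Results above.
reported = {
    "GPT-3.5-Turbo": (3916, 66.9),
    "GPT-4-1106": (3837, 65.6),
    "PaLM 2": (3632, 62.1),
    "Claude v1": (2887, 58.3),
    "Gemini Pro": (2878, 49.2),
    "GPT-4-0613": (2580, 44.1),
}


def implied_total(correct: int, percent: float) -> float:
    """Back out the denominator implied by a count and its percentage (hypothetical helper)."""
    return correct / (percent / 100.0)


# Round to whole answers; small deviations come from the rounded percentages.
implied = {model: round(implied_total(n, pct)) for model, (n, pct) in reported.items()}

for model, total in implied.items():
    print(f"{model}: about {total} graded answers implied")
```

With these inputs, five of the models imply roughly 5850 graded answers each, while the Claude v1 figures imply a smaller total. The abstract gives no explanation for this, so the full paper should be consulted before comparing the percentages directly.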

Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11411230/pdf/
Citations: 0
Transforming Health Care Through Chatbots for Medical History-Taking and Future Directions: Comprehensive Systematic Review.
IF 3.1 | Zone 3 (Medicine) | Q2 MEDICAL INFORMATICS | Pub Date: 2024-08-29 | DOI: 10.2196/56628
Michael Hindelang, Sebastian Sitaru, Alexander Zink

Background: The integration of artificial intelligence and chatbot technology in health care has attracted significant attention due to its potential to improve patient care and streamline history-taking. As artificial intelligence-driven conversational agents, chatbots offer the opportunity to revolutionize history-taking, necessitating a comprehensive examination of their impact on medical practice.

Objective: This systematic review aims to assess the role, effectiveness, usability, and patient acceptance of chatbots in medical history-taking. It also examines potential challenges and future opportunities for integration into clinical practice.

Methods: A systematic search included PubMed, Embase, MEDLINE (via Ovid), CENTRAL, Scopus, and Open Science and covered studies through July 2024. The inclusion and exclusion criteria for the studies reviewed were based on the PICOS (participants, interventions, comparators, outcomes, and study design) framework. The population included individuals using health care chatbots for medical history-taking. Interventions focused on chatbots designed to facilitate medical history-taking. The outcomes of interest were the feasibility, acceptance, and usability of chatbot-based medical history-taking. Studies not reporting on these outcomes were excluded. All study designs except conference papers were eligible for inclusion. Only English-language studies were considered. There were no specific restrictions on study duration. Key search terms included "chatbot*," "conversational agent*," "virtual assistant," "artificial intelligence chatbot," "medical history," and "history-taking." The quality of observational studies was classified using the STROBE (Strengthening the Reporting of Observational Studies in Epidemiology) criteria (eg, sample size, design, data collection, and follow-up). The RoB 2 (Risk of Bias) tool assessed areas and the levels of bias in randomized controlled trials (RCTs).

Results: The review included 15 observational studies and 3 RCTs and synthesized evidence from different medical fields and populations. Chatbots systematically collect information through targeted queries and data retrieval, improving patient engagement and satisfaction. The results show that chatbots have great potential for history-taking and that the efficiency and accessibility of the health care system can be improved by 24/7 automated data collection. Bias assessments revealed that of the 15 observational studies, 5 (33%) studies were of high quality, 5 (33%) studies were of moderate quality, and 5 (33%) studies were of low quality. Of the RCTs, 2 had a low risk of bias, while 1 had a high risk.

Conclusions: This systematic review provides critical insights into the potential benefits and challenges of using chatbots for medical history-taking. The included studies showed that chatbots can increase patient engagement, streamline data collection, and improve health care decision-making. For effective integration into clinical practice, it is crucial to design user-friendly interfaces, ensure robust data security, and maintain empathetic patient-physician interactions. Future research should focus on refining chatbot algorithms, improving their emotional intelligence, and extending their application to different health care settings to realize their full potential in modern medicine.

Trial registration: PROSPERO CRD42023410312; www.crd.york.ac.uk/prospero.
Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11393511/pdf/
Citations: 0
Impact of an Electronic Health Record-Based Interruptive Alert Among Patients With Headaches Seen in Primary Care: Cluster Randomized Controlled Trial.
IF 3.1 | Zone 3 (Medicine) | Q2 MEDICAL INFORMATICS | Pub Date: 2024-08-29 | DOI: 10.2196/58456
Apoorva Pradhan, Eric A Wright, Vanessa A Hayduk, Juliana Berhane, Mallory Sponenberg, Leeann Webster, Hannah Anderson, Siyeon Park, Jove Graham, Scott Friedenberg

Background: Headaches, including migraines, are one of the most common causes of disability and account for nearly 20%-30% of referrals from primary care to neurology. In primary care, electronic health record-based alerts offer a mechanism to influence health care provider behaviors, manage neurology referrals, and optimize headache care.

Objective: This project aimed to evaluate the impact of an electronic alert implemented in primary care on patients' overall headache management.

Methods: We conducted a stratified cluster-randomized study across 38 primary care clinic sites between December 2021 and December 2022 at a large integrated health care delivery system in the United States. Clinics were stratified into 6 blocks based on region and patient-to-health care provider ratios and then randomized 1:1 within each block to either the control or the intervention arm. Health care providers practicing at intervention clinics received an interruptive alert in the electronic health record. The primary end point was a change in headache burden, measured using the Headache Impact Test 6 scale, from baseline to 6 months. Secondary outcomes included changes in headache frequency and intensity, access to care, and resource use. We analyzed the difference-in-differences between the arms at follow-up at the individual patient level.

Results: We enrolled 203 adult patients with a confirmed headache diagnosis. At baseline, the average Headache Impact Test 6 scores in each arm were not significantly different (intervention: mean 63, SD 6.9; control: mean 61.8, SD 6.6; P=.21). We observed a significant reduction in the headache burden only in the intervention arm at follow-up (3.5 points; P=.009). The reduction in the headache burden was not statistically different between groups (difference-in-differences estimate -1.89, 95% CI -5 to 1.31; P=.25). Similarly, secondary outcomes were not significantly different between groups. Only 11.32% (303/2677) of alerts were acted upon.

Conclusions: The use of an interruptive electronic alert did not significantly improve headache outcomes. Low use of alerts by health care providers prompts future alterations of the alert and exploration of alternative approaches.
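To make the difference-in-differences idea from the Methods concrete, the sketch below computes the estimator in its simplest unadjusted form: the change in the intervention arm minus the change in the control arm. The four arm means passed to it are hypothetical toy values chosen for illustration, not trial data, and the trial's actual patient-level model may differ. The alert counts (303 of 2677) are taken from the Results.

```python
def difference_in_differences(
    treat_pre: float, treat_post: float, ctrl_pre: float, ctrl_post: float
) -> float:
    """Unadjusted estimate: (change in intervention arm) - (change in control arm)."""
    return (treat_post - treat_pre) - (ctrl_post - ctrl_pre)


# Hypothetical arm means on a headache-burden scale (NOT values from the trial).
toy_estimate = difference_in_differences(
    treat_pre=60.0, treat_post=56.0, ctrl_pre=60.0, ctrl_post=58.0
)
print(f"Toy difference-in-differences estimate: {toy_estimate:+.1f} points")

# Share of alerts acted upon, using the counts reported in the Results.
alerts_acted_on, alerts_fired = 303, 2677
alert_action_rate = alerts_acted_on / alerts_fired
print(f"Alerts acted upon: {alert_action_rate:.2%}")
```

A negative estimate means the intervention arm improved more than the control arm. In the trial, the reported estimate had a confidence interval spanning zero, which is why the between-group difference was not considered statistically significant.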

Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11376138/pdf/
Citations: 0
Viability of Open Large Language Models for Clinical Documentation in German Health Care: Real-World Model Evaluation Study.
IF 3.1 | Zone 3 (Medicine) | Q2 MEDICAL INFORMATICS | Pub Date: 2024-08-28 | DOI: 10.2196/59617
Felix Heilmeyer, Daniel Böhringer, Thomas Reinhard, Sebastian Arens, Lisa Lyssenko, Christian Haverkamp

Background: The use of large language models (LLMs) as writing assistance for medical professionals is a promising approach to reduce the time required for documentation, but there may be practical, ethical, and legal challenges in many jurisdictions complicating the use of the most powerful commercial LLM solutions.

Objective: In this study, we assessed the feasibility of using nonproprietary LLMs of the GPT variety as writing assistance for medical professionals in an on-premise setting with restricted compute resources, generating German medical text.

Methods: We trained four 7-billion-parameter models with 3 different architectures for our task and evaluated their performance using a powerful commercial LLM, namely Anthropic's Claude-v2, as a rater. Based on this, we selected the best-performing model and evaluated its practical usability with 2 independent human raters on real-world data.

Results: In the automated evaluation with Claude-v2, BLOOM-CLP-German, a model trained from scratch on German text, achieved the best results. In the manual evaluation by human experts, both raters judged 95 (93.1%) of the 102 reports generated by that model to be usable as is or with only minor changes.

Conclusions: The results show that even with restricted compute resources, it is possible to generate medical texts that are suitable for documentation in routine clinical practice. However, the target language should be considered in the model selection when processing non-English text.

Citation count: 0
Implementation of the World Health Organization Minimum Dataset for Emergency Medical Teams to Create Disaster Profiles for the Indonesian SATUSEHAT Platform Using Fast Healthcare Interoperability Resources: Development and Validation Study.
IF 3.1 Zone 3 Medicine Q2 MEDICAL INFORMATICS Pub Date: 2024-08-28 DOI: 10.2196/59651
Hiro Putra Faisal, Masaharu Nakayama

Background: The National Disaster Management Agency (Badan Nasional Penanggulangan Bencana) handles disaster management in Indonesia as a health cluster by collecting, storing, and reporting information on the state of survivors and their health from various sources during disasters. Data were collected on paper and transferred to Microsoft Excel spreadsheets. These activities are challenging because there are no standards for data collection. The World Health Organization (WHO) introduced a standard for health data collection during disasters for emergency medical teams (EMTs) in the form of a minimum dataset (MDS). Meanwhile, the Ministry of Health of Indonesia launched the SATUSEHAT platform to integrate all electronic medical records in Indonesia based on Fast Healthcare Interoperability Resources (FHIR).

Objective: This study aims to implement the WHO EMT MDS to create a disaster profile for the SATUSEHAT platform using FHIR.

Methods: We extracted variables from 2 EMT MDS medical records (the WHO and Association of Southeast Asian Nations [ASEAN] versions) and from the daily reporting form. We then performed a mapping process to match these variables with the FHIR resources and analyzed the gaps between the variables and base resources. Next, we conducted profiling to see if there were any changes in the selected resources and created extensions to fill the gap using the Forge application. Subsequently, the profile was implemented using an open-source FHIR server.

Results: The total numbers of variables extracted from the WHO EMT MDS, ASEAN EMT MDS, and daily reporting forms were 30, 32, and 46, with the percentage of variables matching FHIR resources being 100% (30/30), 97% (31/32), and 85% (39/46), respectively. From the 40 resources available in the FHIR ID core, we used 10, 14, and 9 for the WHO EMT MDS, ASEAN EMT MDS, and daily reporting form, respectively. Based on the gap analysis, we found 4 variables in the daily reporting form that were not covered by the resources. Thus, we created extensions to address this gap.

Conclusions: We successfully created a disaster profile that can be used as a disaster case for the SATUSEHAT platform. This profile may standardize health data collection during disasters.
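
The mapping and gap analysis described in the Methods can be sketched roughly as follows. This is a minimal illustration, not the authors' tooling: the variable names in `EXAMPLE_MAPPING` and the helper functions `gap_analysis` and `coverage_percent` are hypothetical, and only the counts passed to `coverage_percent` come from the abstract.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical form variables and FHIR targets, for illustration only.
# This is NOT the study's mapping table; None marks a variable with no
# matching element in the base resources (a candidate for an extension).
EXAMPLE_MAPPING: Dict[str, Optional[str]] = {
    "date_of_birth": "Patient.birthDate",
    "sex": "Patient.gender",
    "diagnosis": "Condition.code",
    "procedure": "Procedure.code",
    "hypothetical_unmapped_item": None,
}


def gap_analysis(mapping: Dict[str, Optional[str]]) -> Tuple[float, List[str]]:
    """Return (percentage of variables mapped, list of unmapped variables)."""
    if not mapping:
        raise ValueError("mapping must not be empty")
    gaps = [name for name, target in mapping.items() if target is None]
    coverage = 100 * (len(mapping) - len(gaps)) / len(mapping)
    return coverage, gaps


def coverage_percent(matched: int, total: int) -> int:
    """Whole-number percentage, in the style of the Results paragraph."""
    return round(100 * matched / total)


coverage, gaps = gap_analysis(EXAMPLE_MAPPING)
print(f"{coverage:.0f}% mapped; needs extension: {gaps}")

# Counts reported in the abstract: WHO 30/30, ASEAN 31/32, daily form 39/46.
for matched, total in [(30, 30), (31, 32), (39, 46)]:
    print(f"{matched}/{total} -> {coverage_percent(matched, total)}%")
```

Variables left in the gap list are the ones that would need an extension, which the authors report building with Forge.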

Citation count: 0
Assessing the Effect of Electronic Health Record Data Quality on Identifying Patients With Type 2 Diabetes: Cross-Sectional Study.
IF 3.1 Zone 3 Medicine Q2 MEDICAL INFORMATICS Pub Date: 2024-08-27 DOI: 10.2196/56734
Priyanka Dua Sood, Star Liu, Harold Lehmann, Hadi Kharrazi

Background: Increasing and substantial reliance on electronic health records (EHRs) and data types (ie, diagnosis, medication, and laboratory data) demands assessment of their data quality as a fundamental approach, especially since there is a need to identify appropriate denominator populations with chronic conditions, such as type 2 diabetes (T2D), using commonly available computable phenotype definitions (ie, phenotypes).

Objective: To bridge this gap, our study aims to assess how issues of EHR data quality, and variations and robustness (or lack thereof) in phenotypes, may affect the identification of denominator populations.

Methods: Approximately 208,000 patients with T2D were included in our study, which used retrospective EHR data from the Johns Hopkins Medical Institution (JHMI) during 2017-2019. Our assessment included 4 published phenotypes and 1 definition from a panel of experts at Hopkins. We conducted descriptive analyses of demographics (ie, age, sex, race, and ethnicity), use of health care (inpatient and emergency room visits), and the average Charlson Comorbidity Index score of each phenotype. We then used different methods to induce or simulate data quality issues of completeness, accuracy, and timeliness separately across each phenotype. For induced data incompleteness, our model randomly dropped diagnosis, medication, and laboratory codes independently at increments of 10%; for induced data inaccuracy, our model randomly replaced a diagnosis or medication code with another code of the same data type and induced 2% incremental change from -100% to +10% in laboratory result values; and lastly, for timeliness, data were modeled for induced incremental shift of date records by 30 days to 365 days.

Results: Less than a quarter (n=47,326, 23%) of the population overlapped across all phenotypes using EHRs. The population identified by each phenotype varied across all combinations of data types. Induced incompleteness identified fewer patients with each increment; for example, at 100% diagnostic incompleteness, the Chronic Conditions Data Warehouse phenotype identified zero patients, as its phenotypic characteristics included only diagnosis codes. Induced inaccuracy and timeliness similarly demonstrated variations in performance of each phenotype, therefore resulting in fewer patients being identified with each incremental change.

Conclusions: We used EHR data with diagnosis, medication, and laboratory data types from a large tertiary hospital system to understand T2D phenotypic differences and performance. We used induced data quality methods to learn how data quality issues may impact identification of the denominator populations upon which clinical (eg, clinical research and trials, population health evaluations) and financial or operational decisions are made. The novel results from our study may inform future efforts to develop a common T2D computable phenotype definition that can be applied to clinical informatics, chronic disease management, and other industry-wide efforts in health care.

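The induced-incompleteness step in the Methods (randomly dropping codes in 10% increments) can be sketched as below. This is an illustration under stated assumptions, not the study's model: the records are synthetic, `induce_incompleteness` and `identify_patients` are hypothetical helpers, and the toy diagnosis-only phenotype (any ICD-10 code beginning with E11, the category for type 2 diabetes mellitus) is a stand-in rather than any of the 5 phenotypes evaluated.

```python
import random
from typing import List, Set, Tuple

Record = Tuple[int, str]  # (patient_id, diagnosis_code)


def induce_incompleteness(records: List[Record], drop_fraction: float,
                          seed: int = 0) -> List[Record]:
    """Randomly drop the given fraction of records, keeping original order."""
    if not 0.0 <= drop_fraction <= 1.0:
        raise ValueError("drop_fraction must be between 0 and 1")
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n_keep = len(records) - round(len(records) * drop_fraction)
    kept_indices = sorted(rng.sample(range(len(records)), n_keep))
    return [records[i] for i in kept_indices]


def identify_patients(records: List[Record], prefix: str = "E11") -> Set[int]:
    """Toy diagnosis-only phenotype: patients with at least one matching code."""
    return {pid for pid, code in records if code.startswith(prefix)}


# Synthetic data: 100 patients with one T2D diagnosis code each.
records = [(pid, "E11.9") for pid in range(100)]

for step in range(11):  # 0%, 10%, ..., 100% of codes dropped
    fraction = step / 10
    remaining = induce_incompleteness(records, fraction)
    print(f"{fraction:.0%} dropped -> {len(identify_patients(remaining))} patients")
```

Because this toy phenotype relies on diagnosis codes alone, it identifies no patients once every diagnosis code is dropped, which mirrors the Chronic Conditions Data Warehouse example in the Results.
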
Citation count: 0
Extraction of Substance Use Information From Clinical Notes: Generative Pretrained Transformer-Based Investigation.
IF 3.1 Zone 3 Medicine Q2 MEDICAL INFORMATICS Pub Date: 2024-08-19 DOI: 10.2196/56243
Fatemeh Shah-Mohammadi, Joseph Finkelstein

Background: Understanding the multifaceted nature of health outcomes requires a comprehensive examination of the social, economic, and environmental determinants that shape individual well-being. Among these determinants, behavioral factors play a crucial role, particularly the consumption patterns of psychoactive substances, which have important implications for public health. The Global Burden of Disease Study shows a growing impact in disability-adjusted life years due to substance use. The successful identification of patients' substance use information equips clinical care teams to address substance-related issues more effectively, enabling targeted support and ultimately improving patient outcomes.

Objective: Traditional natural language processing methods face limitations in accurately parsing diverse clinical language associated with substance use. Large language models offer promise in overcoming these challenges by adapting to diverse language patterns. This study investigates the application of the generative pretrained transformer (GPT) model, specifically GPT-3.5, for extracting tobacco, alcohol, and substance use information from patient discharge summaries in zero-shot and few-shot learning settings. This study contributes to the evolving landscape of health care informatics by showcasing the potential of advanced language models in extracting nuanced information critical for enhancing patient care.

Methods: The main data source for analysis in this paper is the Medical Information Mart for Intensive Care III (MIMIC-III) data set. Among all notes in this data set, we focused on discharge summaries. Prompt engineering was undertaken, involving an iterative exploration of diverse prompts. Leveraging carefully curated examples and refined prompts, we investigate the model's proficiency through zero-shot as well as few-shot prompting strategies.

Results: Results show GPT's varying effectiveness in identifying mentions of tobacco, alcohol, and substance use across learning scenarios. Zero-shot learning showed high accuracy in identifying substance use, whereas few-shot learning reduced that accuracy but improved the identification of substance use status, enhancing recall and F1-score at the expense of lower precision.

Conclusions: The strength of zero-shot learning in precisely extracting text spans that mention substance use demonstrates its effectiveness in situations in which comprehensive recall is important. Conversely, few-shot learning offers advantages when accurately determining the status of substance use is the primary focus, even if it involves a trade-off in precision. The results can help enhance early detection and intervention strategies, tailor treatment plans with greater precision, and ultimately contribute to a holistic understanding of patient health profiles. By integrating these artificial intelligence-driven methods into electronic health record systems, clinicians can gain immediate, comprehensive insights into substance use, enabling interventions that are not only timely but also more personalized and effective.

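The trade-off reported in the Results (higher recall and F1-score at the cost of lower precision) follows from how the three metrics are defined. The sketch below computes them from true-positive, false-positive, and false-negative counts. The function `precision_recall_f1` is a hypothetical helper, and the two sets of counts are invented purely to illustrate the pattern; they are not results from the study.

```python
from typing import Tuple


def precision_recall_f1(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Compute precision, recall, and F1 from raw counts (0.0 when undefined)."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Invented counts, for illustration only (NOT the study's numbers):
# a conservative extractor misses many mentions but rarely errs, while
# a liberal extractor finds more mentions and makes more false calls.
conservative = precision_recall_f1(tp=60, fp=5, fn=40)
liberal = precision_recall_f1(tp=90, fp=30, fn=10)

for name, (p, r, f1) in [("conservative", conservative), ("liberal", liberal)]:
    print(f"{name}: precision={p:.3f} recall={r:.3f} F1={f1:.3f}")
```

With these made-up counts, the liberal extractor has lower precision but higher recall and F1, the same direction of change the abstract describes for few-shot status identification.
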
Citation count: 0
Evaluating and Enhancing the Fitness-for-Purpose of Electronic Health Record Data: Qualitative Study on Current Practices and Pathway to an Automated Approach Within the Medical Informatics for Research and Care in University Medicine Consortium.
IF 3.1 Zone 3 Medicine Q2 MEDICAL INFORMATICS Pub Date: 2024-08-19 DOI: 10.2196/57153
Gaetan Kamdje Wabo, Preetha Moorthy, Fabian Siegel, Susanne A Seuchter, Thomas Ganslandt

Background: Leveraging electronic health record (EHR) data for clinical or research purposes heavily depends on data fitness. However, there is a lack of standardized frameworks to evaluate EHR data suitability, leading to inconsistent quality in data use projects (DUPs). This research focuses on the Medical Informatics for Research and Care in University Medicine (MIRACUM) Data Integration Centers (DICs) and examines empirical practices on assessing and automating the fitness-for-purpose of clinical data in German DIC settings.

Objective: The study aims (1) to capture and discuss how MIRACUM DICs evaluate and enhance the fitness-for-purpose of observational health care data and examine the alignment with existing recommendations and (2) to identify the requirements for designing and implementing a computer-assisted solution to evaluate EHR data fitness within MIRACUM DICs.

Methods: A qualitative approach was followed using an open-ended survey across DICs of 10 German university hospitals affiliated with MIRACUM. Data were analyzed using thematic analysis following an inductive qualitative method.

Results: All 10 MIRACUM DICs participated, with 17 participants revealing various approaches to assessing data fitness, including the 4-eyes principle and data consistency checks such as cross-system data value comparison. Common practices included a DUP-related feedback loop on data fitness and using self-designed dashboards for monitoring. Most experts had a computer science background and a master's degree, suggesting strong technological proficiency but potentially lacking clinical or statistical expertise. Nine key requirements for a computer-assisted solution were identified, including flexibility, understandability, extendibility, and practicability. Participants used heterogeneous data repositories for evaluating data quality criteria and practical strategies to communicate with research and clinical teams.

Conclusions: The study identifies gaps between current practices in MIRACUM DICs and existing recommendations, offering insights into the complexities of assessing and reporting clinical data fitness. Additionally, a tripartite modular framework for fitness-for-purpose assessment was introduced to streamline the forthcoming implementation. It provides valuable input for developing and integrating an automated solution across multiple locations. This may include statistical comparisons to advanced machine learning algorithms for operationalizing frameworks such as the 3×3 data quality assessment framework. These findings provide foundational evidence for future design and implementation studies to enhance data quality assessments for specific DUPs in observational health care settings.
