
Latest Publications in JMIR Medical Informatics

Comparative Study to Evaluate the Accuracy of Differential Diagnosis Lists Generated by Gemini Advanced, Gemini, and Bard for a Case Report Series Analysis: Cross-Sectional Study.
IF 3.1, CAS Zone 3 (Medicine), Q2 MEDICAL INFORMATICS, Pub Date: 2024-10-02, DOI: 10.2196/63010
Takanobu Hirosawa, Yukinori Harada, Kazuki Tokumasu, Takahiro Ito, Tomoharu Suzuki, Taro Shimizu

Background: Generative artificial intelligence (GAI) systems by Google have recently been updated from Bard to Gemini and Gemini Advanced as of December 2023. Gemini is a basic, free-to-use model after a user's login, while Gemini Advanced operates on a more advanced model requiring a fee-based subscription. These systems have the potential to enhance medical diagnostics. However, the impact of these updates on comprehensive diagnostic accuracy remains unknown.

Objective: This study aimed to compare the accuracy of the differential diagnosis lists generated by Gemini Advanced, Gemini, and Bard across comprehensive medical fields using case report series.

Methods: We identified a case report series with relevant final diagnoses published in the American Journal of Case Reports from January 2022 to March 2023. After excluding nondiagnostic cases and patients aged 10 years and younger, we included the remaining case reports. After refining the case sections into case descriptions, we input the same case descriptions into Gemini Advanced, Gemini, and Bard to generate the top 10 differential diagnosis lists. In total, 2 expert physicians independently evaluated whether the final diagnosis was included in the lists and its ranking. Any discrepancies were resolved by another expert physician. Bonferroni correction was applied to adjust the P values for the number of comparisons among the 3 GAI systems, setting the corrected significance level at P<.02.
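For readers checking the arithmetic, the corrected threshold follows from dividing the family-wise significance level by the number of pairwise comparisons among the 3 systems. The sketch below assumes the conventional familywise alpha of .05, which the abstract does not state explicitly; with 3 comparisons this gives .05/3, approximately .0167, reported as P<.02.

    # Bonferroni-corrected significance threshold for 3 pairwise comparisons
    # (Gemini Advanced vs Gemini, Gemini Advanced vs Bard, Gemini vs Bard).
    # The familywise alpha of 0.05 is an assumption; the abstract reports only
    # the resulting corrected level of P<.02.
    from itertools import combinations

    systems = ["Gemini Advanced", "Gemini", "Bard"]
    pairs = list(combinations(systems, 2))     # the 3 pairwise comparisons
    alpha_family = 0.05                        # assumed conventional level
    alpha_corrected = alpha_family / len(pairs)
    print(pairs)
    print(round(alpha_corrected, 4))           # 0.0167, i.e., roughly .02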

Results: In total, 392 case reports were included. The inclusion rates of the final diagnosis within the top 10 differential diagnosis lists were 73% (286/392) for Gemini Advanced, 76.5% (300/392) for Gemini, and 68.6% (269/392) for Bard. The top diagnoses matched the final diagnoses in 31.6% (124/392) for Gemini Advanced, 42.6% (167/392) for Gemini, and 31.4% (123/392) for Bard. Gemini demonstrated higher diagnostic accuracy than Bard both within the top 10 differential diagnosis lists (P=.02) and as the top diagnosis (P=.001). In addition, Gemini Advanced achieved significantly lower accuracy than Gemini in identifying the most probable diagnosis (P=.002).

Conclusions: The results of this study suggest that Gemini outperformed Bard in diagnostic accuracy following the model update. However, Gemini Advanced requires further refinement to optimize its performance for future artificial intelligence-enhanced diagnostics. These findings should be interpreted cautiously and considered primarily for research purposes, as these GAI systems have not been adjusted for medical diagnostics nor approved for clinical use.

{"title":"Comparative Study to Evaluate the Accuracy of Differential Diagnosis Lists Generated by Gemini Advanced, Gemini, and Bard for a Case Report Series Analysis: Cross-Sectional Study.","authors":"Takanobu Hirosawa, Yukinori Harada, Kazuki Tokumasu, Takahiro Ito, Tomoharu Suzuki, Taro Shimizu","doi":"10.2196/63010","DOIUrl":"https://doi.org/10.2196/63010","url":null,"abstract":"<p><strong>Background: </strong>Generative artificial intelligence (GAI) systems by Google have recently been updated from Bard to Gemini and Gemini Advanced as of December 2023. Gemini is a basic, free-to-use model after a user's login, while Gemini Advanced operates on a more advanced model requiring a fee-based subscription. These systems have the potential to enhance medical diagnostics. However, the impact of these updates on comprehensive diagnostic accuracy remains unknown.</p><p><strong>Objective: </strong>This study aimed to compare the accuracy of the differential diagnosis lists generated by Gemini Advanced, Gemini, and Bard across comprehensive medical fields using case report series.</p><p><strong>Methods: </strong>We identified a case report series with relevant final diagnoses published in the American Journal Case Reports from January 2022 to March 2023. After excluding nondiagnostic cases and patients aged 10 years and younger, we included the remaining case reports. After refining the case parts as case descriptions, we input the same case descriptions into Gemini Advanced, Gemini, and Bard to generate the top 10 differential diagnosis lists. In total, 2 expert physicians independently evaluated whether the final diagnosis was included in the lists and its ranking. Any discrepancies were resolved by another expert physician. Bonferroni correction was applied to adjust the P values for the number of comparisons among 3 GAI systems, setting the corrected significance level at P value <.02.</p><p><strong>Results: </strong>In total, 392 case reports were included. The inclusion rates of the final diagnosis within the top 10 differential diagnosis lists were 73% (286/392) for Gemini Advanced, 76.5% (300/392) for Gemini, and 68.6% (269/392) for Bard. The top diagnoses matched the final diagnoses in 31.6% (124/392) for Gemini Advanced, 42.6% (167/392) for Gemini, and 31.4% (123/392) for Bard. Gemini demonstrated higher diagnostic accuracy than Bard both within the top 10 differential diagnosis lists (P=.02) and as the top diagnosis (P=.001). In addition, Gemini Advanced achieved significantly lower accuracy than Gemini in identifying the most probable diagnosis (P=.002).</p><p><strong>Conclusions: </strong>The results of this study suggest that Gemini outperformed Bard in diagnostic accuracy following the model update. However, Gemini Advanced requires further refinement to optimize its performance for future artificial intelligence-enhanced diagnostics. 
These findings should be interpreted cautiously and considered primarily for research purposes, as these GAI systems have not been adjusted for medical diagnostics nor approved for clinical use.</p>","PeriodicalId":56334,"journal":{"name":"JMIR Medical Informatics","volume":null,"pages":null},"PeriodicalIF":3.1,"publicationDate":"2024-10-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142367689","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Addressing Information Biases Within Electronic Health Record Data to Improve the Examination of Epidemiologic Associations With Diabetes Prevalence Among Young Adults: Cross-Sectional Study.
IF 3.1, CAS Zone 3 (Medicine), Q2 MEDICAL INFORMATICS, Pub Date: 2024-10-01, DOI: 10.2196/58085
Sarah Conderino, Rebecca Anthopolos, Sandra S Albrecht, Shannon M Farley, Jasmin Divers, Andrea R Titus, Lorna E Thorpe

Background: Electronic health records (EHRs) are increasingly used for epidemiologic research to advance public health practice. However, key variables are susceptible to missing data or misclassification within EHRs, including demographic information or disease status, which could affect the estimation of disease prevalence or risk factor associations.

Objective: In this paper, we applied methods from the literature on missing data and causal inference to assess whether we could mitigate information biases when estimating measures of association between potential risk factors and diabetes among a patient population of New York City young adults.

Methods: We estimated the odds ratio (OR) for diabetes by race or ethnicity and asthma status using EHR data from NYU Langone Health. Methods from the missing data and causal inference literature were then applied to assess the ability to control for misclassification of health outcomes in the EHR data. We compared EHR-based associations with associations observed from 2 national health surveys, the Behavioral Risk Factor Surveillance System (BRFSS) and the National Health and Nutrition Examination Survey, representing traditional public health surveillance systems.
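As a point of reference for the estimation step only (not the missing-data or causal inference corrections that are the focus of the paper), an odds ratio by asthma status can be obtained by exponentiating a logistic regression coefficient. The sketch below is a minimal illustration; the column names, toy rows, and statsmodels usage are assumptions, not the study's code.

    # Unadjusted odds ratio for diabetes by asthma status from a patient-level
    # table; an illustrative sketch only, with hypothetical column names.
    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    ehr = pd.DataFrame({
        "diabetes": [1, 0, 0, 1, 0, 1, 0, 0],
        "asthma":   [1, 0, 1, 1, 0, 0, 0, 1],
    })
    fit = smf.logit("diabetes ~ asthma", data=ehr).fit(disp=False)
    or_asthma = np.exp(fit.params["asthma"])                 # odds ratio
    ci_low, ci_high = np.exp(fit.conf_int().loc["asthma"])   # 95% CI bounds
    print(round(or_asthma, 2), round(ci_low, 2), round(ci_high, 2))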

Results: Observed EHR-based associations between race or ethnicity and diabetes were comparable to health survey-based estimates, but the association between asthma and diabetes was significantly overestimated (OR_EHR 3.01, 95% CI 2.86-3.18 vs OR_BRFSS 1.23, 95% CI 1.09-1.40). Missing data and causal inference methods reduced information biases in these estimates, yielding relative differences from traditional estimates below 50% (OR_MissingData 1.79, 95% CI 1.67-1.92 and OR_Causal 1.42, 95% CI 1.34-1.51).

Conclusions: Findings suggest that without bias adjustment, EHR analyses may yield biased measures of association, driven in part by subgroup differences in health care use. However, applying missing data or causal inference frameworks can help control for and, importantly, characterize residual information biases in these estimates.

{"title":"Addressing Information Biases Within Electronic Health Record Data to Improve the Examination of Epidemiologic Associations With Diabetes Prevalence Among Young Adults: Cross-Sectional Study.","authors":"Sarah Conderino, Rebecca Anthopolos, Sandra S Albrecht, Shannon M Farley, Jasmin Divers, Andrea R Titus, Lorna E Thorpe","doi":"10.2196/58085","DOIUrl":"https://doi.org/10.2196/58085","url":null,"abstract":"<p><strong>Background: </strong>Electronic health records (EHRs) are increasingly used for epidemiologic research to advance public health practice. However, key variables are susceptible to missing data or misclassification within EHRs, including demographic information or disease status, which could affect the estimation of disease prevalence or risk factor associations.</p><p><strong>Objective: </strong>In this paper, we applied methods from the literature on missing data and causal inference to assess whether we could mitigate information biases when estimating measures of association between potential risk factors and diabetes among a patient population of New York City young adults.</p><p><strong>Methods: </strong>We estimated the odds ratio (OR) for diabetes by race or ethnicity and asthma status using EHR data from NYU Langone Health. Methods from the missing data and causal inference literature were then applied to assess the ability to control for misclassification of health outcomes in the EHR data. We compared EHR-based associations with associations observed from 2 national health surveys, the Behavioral Risk Factor Surveillance System (BRFSS) and the National Health and Nutrition Examination Survey, representing traditional public health surveillance systems.</p><p><strong>Results: </strong>Observed EHR-based associations between race or ethnicity and diabetes were comparable to health survey-based estimates, but the association between asthma and diabetes was significantly overestimated (OREHR 3.01, 95% CI 2.86-3.18 vs ORBRFSS 1.23, 95% CI 1.09-1.40). Missing data and causal inference methods reduced information biases in these estimates, yielding relative differences from traditional estimates below 50% (ORMissingData 1.79, 95% CI 1.67-1.92 and ORCausal 1.42, 95% CI 1.34-1.51).</p><p><strong>Conclusions: </strong>Findings suggest that without bias adjustment, EHR analyses may yield biased measures of association, driven in part by subgroup differences in health care use. However, applying missing data or causal inference frameworks can help control for and, importantly, characterize residual information biases in these estimates.</p>","PeriodicalId":56334,"journal":{"name":"JMIR Medical Informatics","volume":null,"pages":null},"PeriodicalIF":3.1,"publicationDate":"2024-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142367688","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Disambiguating Clinical Abbreviations by One-to-All Classification: Algorithm Development and Validation Study.
IF 3.1, CAS Zone 3 (Medicine), Q2 MEDICAL INFORMATICS, Pub Date: 2024-10-01, DOI: 10.2196/56955
Sheng-Feng Sung, Ya-Han Hu, Chong-Yan Chen

Background: Electronic medical records store extensive patient data and serve as a comprehensive repository, including textual medical records like surgical and imaging reports. Their utility in clinical decision support systems is substantial, but the widespread use of ambiguous and unstandardized abbreviations in clinical documents poses challenges for natural language processing in clinical decision support systems. Efficient abbreviation disambiguation methods are needed for effective information extraction.

Objective: This study aims to enhance the one-to-all (OTA) framework for clinical abbreviation expansion, which uses a single model to predict multiple abbreviation meanings. The objective is to improve OTA by developing context-candidate pairs and optimizing word embeddings in Bidirectional Encoder Representations From Transformers (BERT), evaluating the model's efficacy in expanding clinical abbreviations using real data.

Methods: Three datasets were used: the Medical Subject Headings Word Sense Disambiguation dataset, the University of Minnesota dataset, and a clinical dataset from Ditmanson Medical Foundation Chia-Yi Christian Hospital. Texts containing polysemous abbreviations were preprocessed and formatted for BERT. The study involved fine-tuning the pretrained models ClinicalBERT and BlueBERT and generating context-candidate dataset pairs for training and testing based on the method of Huang et al.
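The one-to-all setup of pairing a clinical context with each candidate expansion can be pictured as a binary sequence-pair classifier, roughly as sketched below. The checkpoint name is a generic placeholder and the classification head is untrained; the study fine-tuned ClinicalBERT and BlueBERT, so this illustrates the input format rather than reproducing their models.

    # Score one (context, candidate expansion) pair with a BERT sequence-pair
    # classifier; illustrative only, with a placeholder checkpoint and an
    # untrained classification head.
    import torch
    from transformers import AutoTokenizer, AutoModelForSequenceClassification

    checkpoint = "bert-base-uncased"   # placeholder for ClinicalBERT/BlueBERT weights
    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)

    context = "Pt c/o sudden SOB and pleuritic chest pain; PE to be ruled out."
    candidate = "PE: pulmonary embolism"
    inputs = tokenizer(context, candidate, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    match_probability = torch.softmax(logits, dim=-1)[0, 1].item()
    print(match_probability)   # meaningful only after fine-tuning on labeled pairs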

Results: BlueBERT achieved macro- and microaccuracies of 95.41% and 95.16%, respectively, on the Medical Subject Headings Word Sense Disambiguation dataset. It improved macroaccuracy by 0.54%-1.53% compared to two baselines, long short-term memory and deepBioWSD with random embedding. On the University of Minnesota dataset, BlueBERT recorded macro- and microaccuracies of 98.40% and 98.22%, respectively. Against the baselines of Word2Vec + support vector machine and BioWordVec + support vector machine, BlueBERT demonstrated a macroaccuracy improvement of 2.61%-4.13%.
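Because both macro- and microaccuracy are reported, the helper below spells out the usual distinction (per-abbreviation accuracies averaged with equal weight vs accuracy pooled over all test instances); this reflects the common definitions, not code released with the study.

    # Macroaccuracy: mean of per-abbreviation accuracies (each abbreviation
    # weighted equally). Microaccuracy: accuracy pooled over all instances.
    from collections import defaultdict

    def macro_micro_accuracy(records):
        # records: iterable of (abbreviation, true_sense, predicted_sense)
        counts = defaultdict(lambda: [0, 0])   # abbreviation -> [correct, total]
        for abbreviation, truth, prediction in records:
            counts[abbreviation][1] += 1
            if truth == prediction:
                counts[abbreviation][0] += 1
        macro = sum(c / t for c, t in counts.values()) / len(counts)
        micro = sum(c for c, _ in counts.values()) / sum(t for _, t in counts.values())
        return macro, micro

    example = [("PE", "pulmonary embolism", "pulmonary embolism"),
               ("PE", "physical examination", "pulmonary embolism"),
               ("MS", "multiple sclerosis", "multiple sclerosis")]
    print(macro_micro_accuracy(example))   # (0.75, 0.666...)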

Conclusions: This research preliminarily validated the effectiveness of the OTA method for abbreviation disambiguation in medical texts, demonstrating the potential to enhance both clinical staff efficiency and research effectiveness.

{"title":"Disambiguating Clinical Abbreviations by One-to-All Classification: Algorithm Development and Validation Study.","authors":"Sheng-Feng Sung, Ya-Han Hu, Chong-Yan Chen","doi":"10.2196/56955","DOIUrl":"https://doi.org/10.2196/56955","url":null,"abstract":"<p><strong>Background: </strong>Electronic medical records store extensive patient data and serve as a comprehensive repository, including textual medical records like surgical and imaging reports. Their utility in clinical decision support systems is substantial, but the widespread use of ambiguous and unstandardized abbreviations in clinical documents poses challenges for natural language processing in clinical decision support systems. Efficient abbreviation disambiguation methods are needed for effective information extraction.</p><p><strong>Objective: </strong>This study aims to enhance the one-to-all (OTA) framework for clinical abbreviation expansion, which uses a single model to predict multiple abbreviation meanings. The objective is to improve OTA by developing context-candidate pairs and optimizing word embeddings in Bidirectional Encoder Representations From Transformers (BERT), evaluating the model's efficacy in expanding clinical abbreviations using real data.</p><p><strong>Methods: </strong>Three datasets were used: Medical Subject Headings Word Sense Disambiguation, University of Minnesota, and Chia-Yi Christian Hospital from Ditmanson Medical Foundation Chia-Yi Christian Hospital. Texts containing polysemous abbreviations were preprocessed and formatted for BERT. The study involved fine-tuning pretrained models, ClinicalBERT and BlueBERT, generating dataset pairs for training and testing based on Huang et al's method.</p><p><strong>Results: </strong>BlueBERT achieved macro- and microaccuracies of 95.41% and 95.16%, respectively, on the Medical Subject Headings Word Sense Disambiguation dataset. It improved macroaccuracy by 0.54%-1.53% compared to two baselines, long short-term memory and deepBioWSD with random embedding. On the University of Minnesota dataset, BlueBERT recorded macro- and microaccuracies of 98.40% and 98.22%, respectively. Against the baselines of Word2Vec + support vector machine and BioWordVec + support vector machine, BlueBERT demonstrated a macroaccuracy improvement of 2.61%-4.13%.</p><p><strong>Conclusions: </strong>This research preliminarily validated the effectiveness of the OTA method for abbreviation disambiguation in medical texts, demonstrating the potential to enhance both clinical staff efficiency and research effectiveness.</p>","PeriodicalId":56334,"journal":{"name":"JMIR Medical Informatics","volume":null,"pages":null},"PeriodicalIF":3.1,"publicationDate":"2024-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142333446","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Practical Aspects of Using Large Language Models to Screen Abstracts for Cardiovascular Drug Development: Cross-Sectional Study.
IF 3.1, CAS Zone 3 (Medicine), Q2 MEDICAL INFORMATICS, Pub Date: 2024-09-30, DOI: 10.2196/64143
Jay G Ronquillo, Jamie Ye, Donal Gorman, Adina R Lemeshow, Stephen J Watt

Unlabelled: Cardiovascular drug development requires synthesizing relevant literature about indications, mechanisms, biomarkers, and outcomes. This short study investigates the performance, cost, and prompt engineering trade-offs of 3 large language models accelerating the literature screening process for cardiovascular drug development applications.
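To make the screening setup concrete, a call of the kind such a pipeline might issue is sketched below with one vendor SDK; the model name, system prompt, and inclusion criteria are illustrative assumptions and do not reproduce the prompts or models compared in the study.

    # Hypothetical single-abstract screening call; the model name and criteria
    # are placeholders, not the study's protocol.
    from openai import OpenAI

    client = OpenAI()   # expects OPENAI_API_KEY in the environment

    criteria = ("Include abstracts reporting cardiovascular drug indications, "
                "mechanisms, biomarkers, or clinical outcomes.")
    abstract_text = "..."   # the abstract under review

    response = client.chat.completions.create(
        model="gpt-4o-mini",   # placeholder model identifier
        temperature=0,
        messages=[
            {"role": "system",
             "content": "You screen abstracts for a drug development literature review. "
                        "Answer INCLUDE or EXCLUDE with a one-sentence justification."},
            {"role": "user", "content": f"Criteria: {criteria}\n\nAbstract: {abstract_text}"},
        ],
    )
    print(response.choices[0].message.content)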

{"title":"Practical Aspects of Using Large Language Models to Screen Abstracts for Cardiovascular Drug Development: Cross-Sectional Study.","authors":"Jay G Ronquillo, Jamie Ye, Donal Gorman, Adina R Lemeshow, Stephen J Watt","doi":"10.2196/64143","DOIUrl":"https://doi.org/10.2196/64143","url":null,"abstract":"<p><strong>Unlabelled: </strong>Cardiovascular drug development requires synthesizing relevant literature about indications, mechanisms, biomarkers, and outcomes. This short study investigates the performance, cost, and prompt engineering trade-offs of 3 large language models accelerating the literature screening process for cardiovascular drug development applications.</p>","PeriodicalId":56334,"journal":{"name":"JMIR Medical Informatics","volume":null,"pages":null},"PeriodicalIF":3.1,"publicationDate":"2024-09-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142376312","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Toward Better Semantic Interoperability of Data Element Repositories in Medicine: Analysis Study.
IF 3.1, CAS Zone 3 (Medicine), Q2 MEDICAL INFORMATICS, Pub Date: 2024-09-30, DOI: 10.2196/60293
Zhengyong Hu, Anran Wang, Yifan Duan, Jiayin Zhou, Wanfei Hu, Sizhu Wu

Background: Data element repositories facilitate high-quality medical data sharing by standardizing data and enhancing semantic interoperability. However, the application of repositories is confined to specific projects and institutions.

Objective: This study aims to explore potential issues and promote broader application of data element repositories within the medical field by evaluating and analyzing typical repositories.

Methods: Following the inclusion of 5 data element repositories through a literature review, a novel analysis framework consisting of 7 dimensions and 36 secondary indicators was constructed and used for evaluation and analysis.

Results: The study's results delineate the unique characteristics of different repositories and uncover specific issues in their construction. These issues include the absence of data reuse protocols and insufficient information regarding the application scenarios and efficacy of data elements. The repositories fully comply with only 45% (9/20) of the subprinciples for Findable and Reusable in the FAIR principle, while achieving a 90% (19/20 subprinciples) compliance rate for Accessible and 67% (10/15 subprinciples) for Interoperable.

Conclusions: The recommendations proposed in this study address the issues to improve the construction and application of repositories, offering valuable insights to data managers, computer experts, and other pertinent stakeholders.

{"title":"Toward Better Semantic Interoperability of Data Element Repositories in Medicine: Analysis Study.","authors":"Zhengyong Hu, Anran Wang, Yifan Duan, Jiayin Zhou, Wanfei Hu, Sizhu Wu","doi":"10.2196/60293","DOIUrl":"https://doi.org/10.2196/60293","url":null,"abstract":"<p><strong>Background: </strong>Data element repositories facilitate high-quality medical data sharing by standardizing data and enhancing semantic interoperability. However, the application of repositories is confined to specific projects and institutions.</p><p><strong>Objective: </strong>This study aims to explore potential issues and promote broader application of data element repositories within the medical field by evaluating and analyzing typical repositories.</p><p><strong>Methods: </strong>Following the inclusion of 5 data element repositories through a literature review, a novel analysis framework consisting of 7 dimensions and 36 secondary indicators was constructed and used for evaluation and analysis.</p><p><strong>Results: </strong>The study's results delineate the unique characteristics of different repositories and uncover specific issues in their construction. These issues include the absence of data reuse protocols and insufficient information regarding the application scenarios and efficacy of data elements. The repositories fully comply with only 45% (9/20) of the subprinciples for Findable and Reusable in the FAIR principle, while achieving a 90% (19/20 subprinciples) compliance rate for Accessible and 67% (10/15 subprinciples) for Interoperable.</p><p><strong>Conclusions: </strong>The recommendations proposed in this study address the issues to improve the construction and application of repositories, offering valuable insights to data managers, computer experts, and other pertinent stakeholders.</p>","PeriodicalId":56334,"journal":{"name":"JMIR Medical Informatics","volume":null,"pages":null},"PeriodicalIF":3.1,"publicationDate":"2024-09-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142333466","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Implementation of the Observational Medical Outcomes Partnership Model in Electronic Medical Record Systems: Evaluation Study Using Factor Analysis and Decision-Making Trial and Evaluation Laboratory-Best-Worst Methods.
IF 3.1, CAS Zone 3 (Medicine), Q2 MEDICAL INFORMATICS, Pub Date: 2024-09-27, DOI: 10.2196/58498
Ming Luo, Yu Gu, Feilong Zhou, Shaohong Chen

Background: Electronic medical record (EMR) systems are essential in health care for collecting and storing patient medical data. They provide critical information to doctors and caregivers, facilitating improved decision-making and patient care. Despite their significance, optimizing EMR systems is crucial for enhancing health care quality. Implementing the Observational Medical Outcomes Partnership (OMOP) shared data model represents a promising approach to improve EMR performance and overall health care outcomes.

Objective: This study aims to evaluate the effects of implementing the OMOP shared data model in EMR systems and to assess its impact on enhancing health care quality.

Methods: In this study, 3 distinct methodologies are used to explore various aspects of health care information systems. First, factor analysis is utilized to investigate the correlations between EMR systems and attitudes toward OMOP. Second, the best-worst method (BWM) is applied to determine the weights of criteria and subcriteria. Lastly, the decision-making trial and evaluation laboratory technique is used to illustrate the interactions and interdependencies among the identified criteria.
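For orientation, the decision-making trial and evaluation laboratory step boils down to a total-relation matrix computed from a normalized direct-influence matrix; the 3-by-3 scores below are invented, and the normalization follows the common textbook formulation, which may differ in detail from the authors' variant.

    # DEMATEL sketch: normalize a direct-influence matrix X, then compute the
    # total-relation matrix T = N (I - N)^-1. The influence scores are
    # illustrative only.
    import numpy as np

    X = np.array([[0, 3, 2],
                  [1, 0, 4],
                  [2, 1, 0]], dtype=float)        # expert-rated direct influence
    s = max(X.sum(axis=1).max(), X.sum(axis=0).max())
    N = X / s                                     # normalized direct-relation matrix
    T = N @ np.linalg.inv(np.eye(3) - N)          # total-relation matrix
    prominence = T.sum(axis=1) + T.sum(axis=0)    # D + R: overall importance
    net_effect = T.sum(axis=1) - T.sum(axis=0)    # D - R: net cause (+) or effect (-)
    print(np.round(T, 3))
    print(np.round(prominence, 3), np.round(net_effect, 3))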

Results: In this research, we evaluated the AliHealth EMR system by surveying 98 users and practitioners to assess its effectiveness and user satisfaction. The study reveals that among all components, "EMR resolution" holds the highest importance with a weight of 0.31007783, highlighting its significant role in the evaluation. Conversely, "EMR ease of use" has the lowest weight of 0.1860467, indicating that stakeholders prioritize the resolution aspect over ease of use in their assessment of EMR systems.

Conclusions: The findings highlight that stakeholders prioritize certain aspects of EMR systems, with "EMR resolution" being the most valued component.

{"title":"Implementation of the Observational Medical Outcomes Partnership Model in Electronic Medical Record Systems: Evaluation Study Using Factor Analysis and Decision-Making Trial and Evaluation Laboratory-Best-Worst Methods.","authors":"Ming Luo, Yu Gu, Feilong Zhou, Shaohong Chen","doi":"10.2196/58498","DOIUrl":"https://doi.org/10.2196/58498","url":null,"abstract":"<p><strong>Background: </strong>Electronic medical record (EMR) systems are essential in health care for collecting and storing patient medical data. They provide critical information to doctors and caregivers, facilitating improved decision-making and patient care. Despite their significance, optimizing EMR systems is crucial for enhancing health care quality. Implementing the Observational Medical Outcomes Partnership (OMOP) shared data model represents a promising approach to improve EMR performance and overall health care outcomes.</p><p><strong>Objective: </strong>This study aims to evaluate the effects of implementing the OMOP shared data model in EMR systems and to assess its impact on enhancing health care quality.</p><p><strong>Methods: </strong>In this study, 3 distinct methodologies are used to explore various aspects of health care information systems. First, factor analysis is utilized to investigate the correlations between EMR systems and attitudes toward OMOP. Second, the best-worst method (BWM) is applied to determine the weights of criteria and subcriteria. Lastly, the decision-making trial and evaluation laboratory technique is used to illustrate the interactions and interdependencies among the identified criteria.</p><p><strong>Results: </strong>In this research, we evaluated the AliHealth EMR system by surveying 98 users and practitioners to assess its effectiveness and user satisfaction. The study reveals that among all components, \"EMR resolution\" holds the highest importance with a weight of 0.31007783, highlighting its significant role in the evaluation. Conversely, \"EMR ease of use\" has the lowest weight of 0.1860467, indicating that stakeholders prioritize the resolution aspect over ease of use in their assessment of EMR systems.</p><p><strong>Conclusions: </strong>The findings highlight that stakeholders prioritize certain aspects of EMR systems, with \"EMR resolution\" being the most valued component.</p>","PeriodicalId":56334,"journal":{"name":"JMIR Medical Informatics","volume":null,"pages":null},"PeriodicalIF":3.1,"publicationDate":"2024-09-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142333447","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Development of a Cohort Analytics Tool for Monitoring Progression Patterns in Cardiovascular Diseases: Advanced Stochastic Modeling Approach.
IF 3.1, CAS Zone 3 (Medicine), Q2 MEDICAL INFORMATICS, Pub Date: 2024-09-24, DOI: 10.2196/59392
Arindam Brahma, Samir Chatterjee, Kala Seal, Ben Fitzpatrick, Youyou Tao
Background: The World Health Organization (WHO) reported that cardiovascular diseases (CVDs) are the leading cause of death worldwide. CVDs are chronic, with complex progression patterns involving episodes of comorbidities and multimorbidities. When dealing with chronic diseases, physicians often adopt a "watchful waiting" strategy, and actions are postponed until information is available. Population-level transition probabilities and progression patterns can be revealed by applying time-variant stochastic modeling methods to longitudinal patient data from cohort studies. Inputs from CVD practitioners indicate that tools to generate and visualize cohort transition patterns have many impactful clinical applications. The resultant computational model can be embedded in digital decision support tools for clinicians. However, to date, no study has attempted to accomplish this for CVDs.

Objective: This study aims to apply advanced stochastic modeling methods to uncover the transition probabilities and progression patterns from longitudinal episodic data of patient cohorts with CVD and thereafter use the computational model to build a digital clinical cohort analytics artifact demonstrating the actionability of such models.

Methods: Our data were sourced from 9 epidemiological cohort studies by the National Heart Lung and Blood Institute and comprised chronological records of 1274 patients associated with 4839 CVD episodes across 16 years. We then used the continuous-time Markov chain method to develop our model, which offers a robust approach to time-variant transitions between disease states in chronic diseases.

Results: Our study presents time-variant transition probabilities of CVD state changes, revealing patterns of CVD progression against time. We found that the transition from myocardial infarction (MI) to stroke has the fastest transition rate (mean transition time 3, SD 0 days, because only 1 patient had an MI-to-stroke transition in the dataset), and the transition from MI to angina is the slowest (mean transition time 1457, SD 1449 days). Congestive heart failure is the most probable first episode (371/840, 44.2%), followed by stroke (216/840, 25.7%). The resultant artifact is actionable as it can act as an eHealth cohort analytics tool, helping physicians gain insights into treatment and intervention strategies. Through expert panel interviews and surveys, we found 9 application use cases of our model.

Conclusions: Past research does not provide actionable cohort-level decision support tools based on a comprehensive, 10-state, continuous-time Markov chain model to unveil complex CVD progression patterns from real-world patient data and support clinical decision-making. This paper aims to address this crucial limitation. Our stochastic model-embedded artifact can help clinicians in efficient disease monitoring and intervention decisions, guided by objective data-driven insights from real-world patient data. Furthermore, with only 3 input data elements (a synthetic patient identifier, episode names, and episode times in days from a baseline date), the proposed model can reveal the progression patterns of any chronic disease.
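For readers unfamiliar with continuous-time Markov chains, the core computation is the matrix exponential of a generator (rate) matrix estimated from the episode data: P(t) = expm(Q t) gives the probability of occupying each state after t days. The 3-state sketch below uses invented rates and is not the 10-state model fitted in the study.

    # Continuous-time Markov chain sketch: transition probabilities over a
    # horizon of t days from a generator matrix Q. States and rates are
    # illustrative, not the study's 10-state model.
    import numpy as np
    from scipy.linalg import expm

    states = ["angina", "MI", "stroke"]
    # Off-diagonal entries are transition rates per day; each diagonal entry
    # makes its row sum to 0, as required for a generator matrix.
    Q = np.array([[-0.004,  0.003,  0.001],
                  [ 0.002, -0.005,  0.003],
                  [ 0.001,  0.002, -0.003]])

    t = 30.0                  # horizon in days
    P_t = expm(Q * t)         # P_t[i, j] = P(state j at day t | state i at day 0)
    print(np.round(P_t, 3))
    print(P_t.sum(axis=1))    # each row sums to 1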
{"title":"Development of a Cohort Analytics Tool for Monitoring Progression Patterns in Cardiovascular Diseases: Advanced Stochastic Modeling Approach.","authors":"Arindam Brahma, Samir Chatterjee, Kala Seal, Ben Fitzpatrick, Youyou Tao","doi":"10.2196/59392","DOIUrl":"10.2196/59392","url":null,"abstract":"&lt;p&gt;&lt;strong&gt;Background: &lt;/strong&gt;The World Health Organization (WHO) reported that cardiovascular diseases (CVDs) are the leading cause of death worldwide. CVDs are chronic, with complex progression patterns involving episodes of comorbidities and multimorbidities. When dealing with chronic diseases, physicians often adopt a \"watchful waiting\" strategy, and actions are postponed until information is available. Population-level transition probabilities and progression patterns can be revealed by applying time-variant stochastic modeling methods to longitudinal patient data from cohort studies. Inputs from CVD practitioners indicate that tools to generate and visualize cohort transition patterns have many impactful clinical applications. The resultant computational model can be embedded in digital decision support tools for clinicians. However, to date, no study has attempted to accomplish this for CVDs.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Objective: &lt;/strong&gt;This study aims to apply advanced stochastic modeling methods to uncover the transition probabilities and progression patterns from longitudinal episodic data of patient cohorts with CVD and thereafter use the computational model to build a digital clinical cohort analytics artifact demonstrating the actionability of such models.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Methods: &lt;/strong&gt;Our data were sourced from 9 epidemiological cohort studies by the National Heart Lung and Blood Institute and comprised chronological records of 1274 patients associated with 4839 CVD episodes across 16 years. We then used the continuous-time Markov chain method to develop our model, which offers a robust approach to time-variant transitions between disease states in chronic diseases.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Results: &lt;/strong&gt;Our study presents time-variant transition probabilities of CVD state changes, revealing patterns of CVD progression against time. We found that the transition from myocardial infarction (MI) to stroke has the fastest transition rate (mean transition time 3, SD 0 days, because only 1 patient had a MI-to-stroke transition in the dataset), and the transition from MI to angina is the slowest (mean transition time 1457, SD 1449 days). Congestive heart failure is the most probable first episode (371/840, 44.2%), followed by stroke (216/840, 25.7%). The resultant artifact is actionable as it can act as an eHealth cohort analytics tool, helping physicians gain insights into treatment and intervention strategies. Through expert panel interviews and surveys, we found 9 application use cases of our model.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Conclusions: &lt;/strong&gt;Past research does not provide actionable cohort-level decision support tools based on a comprehensive, 10-state, continuous-time Markov chain model to unveil complex CVD progression patterns from real-world patient data and support clinical decision-making. This paper aims to address this crucial limitation. 
Our stochastic model-embedded artifact can help clinicians in efficient disease monitoring and intervention deci","PeriodicalId":56334,"journal":{"name":"JMIR Medical Informatics","volume":null,"pages":null},"PeriodicalIF":3.1,"publicationDate":"2024-09-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142309223","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
State-of-the-Art Fast Healthcare Interoperability Resources (FHIR)-Based Data Model and Structure Implementations: Systematic Scoping Review.
IF 3.1, CAS Zone 3 (Medicine), Q2 MEDICAL INFORMATICS, Pub Date: 2024-09-24, DOI: 10.2196/58445
Parinaz Tabari, Gennaro Costagliola, Mattia De Rosa, Martin Boeker

Background: Data models are crucial for clinical research as they enable researchers to fully use the vast amount of clinical data stored in medical systems. Standardized data and well-defined relationships between data points are necessary to guarantee semantic interoperability. Using the Fast Healthcare Interoperability Resources (FHIR) standard for clinical data representation would be a practical methodology to enhance and accelerate interoperability and data availability for research.

Objective: This research aims to provide a comprehensive overview of the state-of-the-art and current landscape in FHIR-based data models and structures. In addition, we intend to identify and discuss the tools, resources, limitations, and other critical aspects mentioned in the selected research papers.

Methods: To ensure the extraction of reliable results, we followed the instructions of the PRISMA-ScR (Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for Scoping Reviews) checklist. We analyzed the indexed articles in PubMed, Scopus, Web of Science, IEEE Xplore, the ACM Digital Library, and Google Scholar. After identifying, extracting, and assessing the quality and relevance of the articles, we synthesized the extracted data to identify common patterns, themes, and variations in the use of FHIR-based data models and structures across different studies.

Results: On the basis of the reviewed articles, we could identify 2 main themes: dynamic (pipeline-based) and static data models. The articles were also categorized into health care use cases, including chronic diseases, COVID-19 and infectious diseases, cancer research, acute or intensive care, random and general medical notes, and other conditions. Furthermore, we summarized the important or common tools and approaches of the selected papers. These items included FHIR-based tools and frameworks, machine learning approaches, and data storage and security. The most common resource was "Observation" followed by "Condition" and "Patient." The limitations and challenges of developing data models were categorized based on the issues of data integration, interoperability, standardization, performance, and scalability or generalizability.
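Given that "Observation" was the most frequently used resource across the reviewed implementations, a minimal example of such a resource is shown below as a Python dictionary; the LOINC code, identifiers, and values are illustrative, and production systems would normally build and validate resources with a FHIR library and server rather than hand-written dictionaries.

    # Minimal FHIR R4-style Observation expressed as a Python dict for
    # illustration; identifiers and values are invented.
    import json

    observation = {
        "resourceType": "Observation",
        "status": "final",
        "code": {
            "coding": [{
                "system": "http://loinc.org",
                "code": "4548-4",   # hemoglobin A1c, chosen only as an example
                "display": "Hemoglobin A1c/Hemoglobin.total in Blood",
            }]
        },
        "subject": {"reference": "Patient/example-123"},
        "effectiveDateTime": "2024-06-01",
        "valueQuantity": {
            "value": 6.2,
            "unit": "%",
            "system": "http://unitsofmeasure.org",
            "code": "%",
        },
    }
    print(json.dumps(observation, indent=2))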

Conclusions: FHIR serves as a highly promising interoperability standard for developing real-world health care apps. The implementation of FHIR modeling for electronic health record data facilitates the integration, transmission, and analysis of data while also advancing translational research and phenotyping. Generally, FHIR-based exports of local data repositories improve data interoperability for systems and data warehouses across different settings. However, ongoing efforts to address existing limitations and challenges are essential for the successful implementation and integration of FHIR data models.

{"title":"State-of-the-Art Fast Healthcare Interoperability Resources (FHIR)-Based Data Model and Structure Implementations: Systematic Scoping Review.","authors":"Parinaz Tabari, Gennaro Costagliola, Mattia De Rosa, Martin Boeker","doi":"10.2196/58445","DOIUrl":"10.2196/58445","url":null,"abstract":"<p><strong>Background: </strong>Data models are crucial for clinical research as they enable researchers to fully use the vast amount of clinical data stored in medical systems. Standardized data and well-defined relationships between data points are necessary to guarantee semantic interoperability. Using the Fast Healthcare Interoperability Resources (FHIR) standard for clinical data representation would be a practical methodology to enhance and accelerate interoperability and data availability for research.</p><p><strong>Objective: </strong>This research aims to provide a comprehensive overview of the state-of-the-art and current landscape in FHIR-based data models and structures. In addition, we intend to identify and discuss the tools, resources, limitations, and other critical aspects mentioned in the selected research papers.</p><p><strong>Methods: </strong>To ensure the extraction of reliable results, we followed the instructions of the PRISMA-ScR (Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for Scoping Reviews) checklist. We analyzed the indexed articles in PubMed, Scopus, Web of Science, IEEE Xplore, the ACM Digital Library, and Google Scholar. After identifying, extracting, and assessing the quality and relevance of the articles, we synthesized the extracted data to identify common patterns, themes, and variations in the use of FHIR-based data models and structures across different studies.</p><p><strong>Results: </strong>On the basis of the reviewed articles, we could identify 2 main themes: dynamic (pipeline-based) and static data models. The articles were also categorized into health care use cases, including chronic diseases, COVID-19 and infectious diseases, cancer research, acute or intensive care, random and general medical notes, and other conditions. Furthermore, we summarized the important or common tools and approaches of the selected papers. These items included FHIR-based tools and frameworks, machine learning approaches, and data storage and security. The most common resource was \"Observation\" followed by \"Condition\" and \"Patient.\" The limitations and challenges of developing data models were categorized based on the issues of data integration, interoperability, standardization, performance, and scalability or generalizability.</p><p><strong>Conclusions: </strong>FHIR serves as a highly promising interoperability standard for developing real-world health care apps. The implementation of FHIR modeling for electronic health record data facilitates the integration, transmission, and analysis of data while also advancing translational research and phenotyping. Generally, FHIR-based exports of local data repositories improve data interoperability for systems and data warehouses across different settings. 
However, ongoing efforts to address existing limitations and challenges are essential for the successful implementation and integration of FHIR data models.</p>","PeriodicalId":56334,"journal":{"name":"JMIR Medical Informatics","volume":null,"pages":null},"PeriodicalIF":3.1,"publicationDate":"2024-09-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142309224","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Automated System to Capture Patient Symptoms From Multitype Japanese Clinical Texts: Retrospective Study.
IF 3.1, CAS Zone 3 (Medicine), Q2 MEDICAL INFORMATICS, Pub Date: 2024-09-24, DOI: 10.2196/58977
Tomohiro Nishiyama, Ayane Yamaguchi, Peitao Han, Lis Weiji Kanashiro Pereira, Yuka Otsuki, Gabriel Herman Bernardim Andrade, Noriko Kudo, Shuntaro Yada, Shoko Wakamiya, Eiji Aramaki, Masahiro Takada, Masakazu Toi
Background: Natural language processing (NLP) techniques can be used to analyze large amounts of electronic health record texts, which encompass various types of patient information such as quality of life, effectiveness of treatments, and adverse drug event (ADE) signals. As different aspects of a patient's status are stored in different types of documents, we propose an NLP system capable of processing 6 types of documents: physician progress notes, discharge summaries, radiology reports, radioisotope reports, nursing records, and pharmacist progress notes.

Objective: This study aimed to investigate the system's performance in detecting ADEs by evaluating the results from multitype texts. The main objective is to detect adverse events accurately using an NLP system.

Methods: We used data written in Japanese from 2289 patients with breast cancer, including medication data, physician progress notes, discharge summaries, radiology reports, radioisotope reports, nursing records, and pharmacist progress notes. Our system performs 3 processes: named entity recognition, normalization of symptoms, and aggregation of multiple types of documents from multiple patients. Among all patients with breast cancer, 103 and 112 with peripheral neuropathy (PN) received paclitaxel or docetaxel, respectively. We evaluate the utility of using multiple types of documents by correlation coefficient and regression analysis to compare their performance with each single type of document. All evaluations of detection rates with our system are performed 30 days after drug administration.

Results: Our system underestimated the incidence of paclitaxel-induced PN by 13.3 percentage points, detecting it in 60.7% of patients compared with 74.0% in previous research based on manual extraction. The Pearson correlation coefficient between the manual extraction and system results was 0.87. Although the pharmacist progress notes had the highest detection rate among the individual document types, the rate did not match the performance obtained using all documents. The estimated median duration of PN with paclitaxel was 92 days, whereas the previously reported median duration was 727 days. The number of events detected was highest in the physician's progress notes, followed by the pharmacist's and nursing records.

Conclusions: Considering the inherent cost of conditions that require constant monitoring of the patient's status, such as the treatment of PN, our system has a significant advantage in that it can immediately estimate the treatment duration without fine-tuning a new NLP model. Leveraging multitype documents is better than using single-type documents to improve detection performance. Although the onset time estimation was relatively accurate, the duration estimate might have been influenced by the length of the data follow-up period. The results suggest that our method of using multiple types of documents can detect more ADEs from clinical documents.
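The aggregation step, pooling symptom mentions extracted from every document type per patient and flagging events within 30 days of drug administration, can be sketched roughly as below; the column names and toy rows are assumptions rather than the study's schema.

    # Sketch of aggregating extracted symptom mentions across document types
    # and computing a 30-day detection rate; columns and rows are hypothetical.
    import pandas as pd

    mentions = pd.DataFrame({
        "patient_id":   [1, 1, 2, 3],
        "doc_type":     ["physician_note", "pharmacist_note",
                         "nursing_record", "discharge_summary"],
        "symptom":      ["peripheral neuropathy", "peripheral neuropathy",
                         "nausea", "peripheral neuropathy"],
        "days_from_rx": [12, 25, 40, 35],
    })

    within_window = mentions[(mentions["symptom"] == "peripheral neuropathy")
                             & (mentions["days_from_rx"] <= 30)]
    detected_patients = within_window["patient_id"].nunique()
    cohort_size = 3                          # patients who received the drug
    print(detected_patients / cohort_size)   # detection rate, here 0.33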
{"title":"Automated System to Capture Patient Symptoms From Multitype Japanese Clinical Texts: Retrospective Study.","authors":"Tomohiro Nishiyama, Ayane Yamaguchi, Peitao Han, Lis Weiji Kanashiro Pereira, Yuka Otsuki, Gabriel Herman Bernardim Andrade, Noriko Kudo, Shuntaro Yada, Shoko Wakamiya, Eiji Aramaki, Masahiro Takada, Masakazu Toi","doi":"10.2196/58977","DOIUrl":"10.2196/58977","url":null,"abstract":"&lt;p&gt;&lt;strong&gt;Background: &lt;/strong&gt;Natural language processing (NLP) techniques can be used to analyze large amounts of electronic health record texts, which encompasses various types of patient information such as quality of life, effectiveness of treatments, and adverse drug event (ADE) signals. As different aspects of a patient's status are stored in different types of documents, we propose an NLP system capable of processing 6 types of documents: physician progress notes, discharge summaries, radiology reports, radioisotope reports, nursing records, and pharmacist progress notes.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Objective: &lt;/strong&gt;This study aimed to investigate the system's performance in detecting ADEs by evaluating the results from multitype texts. The main objective is to detect adverse events accurately using an NLP system.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Methods: &lt;/strong&gt;We used data written in Japanese from 2289 patients with breast cancer, including medication data, physician progress notes, discharge summaries, radiology reports, radioisotope reports, nursing records, and pharmacist progress notes. Our system performs 3 processes: named entity recognition, normalization of symptoms, and aggregation of multiple types of documents from multiple patients. Among all patients with breast cancer, 103 and 112 with peripheral neuropathy (PN) received paclitaxel or docetaxel, respectively. We evaluate the utility of using multiple types of documents by correlation coefficient and regression analysis to compare their performance with each single type of document. All evaluations of detection rates with our system are performed 30 days after drug administration.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Results: &lt;/strong&gt;Our system underestimates by 13.3 percentage points (74.0%-60.7%), as the incidence of paclitaxel-induced PN was 60.7%, compared with 74.0% in the previous research based on manual extraction. The Pearson correlation coefficient between the manual extraction and system results was 0.87 Although the pharmacist progress notes had the highest detection rate among each type of document, the rate did not match the performance using all documents. The estimated median duration of PN with paclitaxel was 92 days, whereas the previously reported median duration of PN with paclitaxel was 727 days. The number of events detected in each document was highest in the physician's progress notes, followed by the pharmacist's and nursing records.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Conclusions: &lt;/strong&gt;Considering the inherent cost that requires constant monitoring of the patient's condition, such as the treatment of PN, our system has a significant advantage in that it can immediately estimate the treatment duration without fine-tuning a new NLP model. Leveraging multitype documents is better than using single-type documents to improve detection performance. Although the onset time estimation was relatively accurate, the duration might have been influenced by the length of the data follow-up period. 
The results suggest that our m","PeriodicalId":56334,"journal":{"name":"JMIR Medical Informatics","volume":null,"pages":null},"PeriodicalIF":3.1,"publicationDate":"2024-09-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142309222","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Evaluating the Bias in Hospital Data: Automatic Preprocessing of Patient Pathways Algorithm Development and Validation Study. 评估医院数据的偏差:患者路径自动预处理算法开发与验证研究》。
IF 3.1 3区 医学 Q2 MEDICAL INFORMATICS Pub Date : 2024-09-23 DOI: 10.2196/58978
Laura Uhl, Vincent Augusto, Benjamin Dalmas, Youenn Alexandre, Paolo Bercelli, Fanny Jardinaud, Saber Aloui

Background: The optimization of patient care pathways is crucial for hospital managers in the context of a scarcity of medical resources. Assuming unlimited capacities, the pathway of a patient would only be governed by pure medical logic to meet at best the patient's needs. However, logistical limitations (eg, resources such as inpatient beds) are often associated with delayed treatments and may ultimately affect patient pathways. This is especially true for unscheduled patients: when a patient in the emergency department needs to be admitted to another medical unit without disturbing the flow of planned hospitalizations.

Objective: In this study, we proposed a new framework to automatically detect activities in patient pathways that may be unrelated to patients' needs but rather induced by logistical limitations.

Methods: The scientific contribution lies in a method that transforms a database of historical pathways with bias into 2 databases: a labeled pathway database where each activity is labeled as relevant (related to a patient's needs) or irrelevant (induced by logistical limitations) and a corrected pathway database where each activity corresponds to the activity that would occur assuming unlimited resources. The labeling algorithm was assessed through medical expertise. In total, 2 case studies quantified the impact of our method of preprocessing health care data using process mining and discrete event simulation.
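To illustrate the labeling idea, the sketch below uses a toy rule, not the authors' expert-validated algorithm: short stays in hypothetical "gateway" units followed by a transfer are treated as logistically induced, and the corrected pathway keeps only the relevant activities. The unit names and threshold are invented for illustration.

```python
from dataclasses import dataclass

@dataclass
class Activity:
    unit: str        # medical unit visited
    hours: float     # time spent in the unit

# Toy rule: a short stay in a gateway unit (eg, emergency or a buffer ward)
# followed by a transfer is assumed to reflect bed shortages, not medical need.
GATEWAY_UNITS = {"emergency", "buffer_ward"}
MIN_RELEVANT_HOURS = 6.0

def label_pathway(pathway: list[Activity]) -> list[tuple[Activity, bool]]:
    """Return (activity, is_relevant) pairs for one patient pathway."""
    labeled = []
    for i, act in enumerate(pathway):
        is_last = i == len(pathway) - 1
        irrelevant = (
            act.unit in GATEWAY_UNITS
            and act.hours < MIN_RELEVANT_HOURS
            and not is_last  # the patient was moved on afterwards
        )
        labeled.append((act, not irrelevant))
    return labeled

def correct_pathway(pathway: list[Activity]) -> list[Activity]:
    """Pathway as it would look with unlimited resources: keep relevant steps only."""
    return [act for act, relevant in label_pathway(pathway) if relevant]

raw = [Activity("emergency", 8.0), Activity("buffer_ward", 4.0), Activity("cardiology", 72.0)]
print([a.unit for a in correct_pathway(raw)])  # ['emergency', 'cardiology'] under this toy rule
```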

Results: Focusing on unscheduled patient pathways, we collected data covering 12 months of activity at the Groupe Hospitalier Bretagne Sud in France. Our algorithm had 87% accuracy and demonstrated its usefulness for preprocessing traces and obtaining a clean database. The 2 case studies showed the importance of our preprocessing step before any analysis. The process graphs of the processed data had, on average, 40% (SD 10%) fewer variants than the raw data. The simulation revealed that 30% of the medical units had >1 bed difference in capacity between the processed and raw data.

Conclusions: Patient pathway data reflect the actual activity of hospitals that is governed by medical requirements and logistical limitations. Before using these data, these limitations should be identified and corrected. We anticipate that our approach can be generalized to obtain unbiased analyses of patient pathways for other hospitals.
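For context on the variant counts reported above: in process mining, a variant is a distinct sequence of activities in the event log. The short sketch below shows one common way to count variants before and after preprocessing, using invented pathways rather than the study's hospital data.

```python
from collections import Counter

# Hypothetical event log: one tuple of visited units per patient pathway
raw_log = [
    ("emergency", "buffer_ward", "cardiology"),
    ("emergency", "cardiology"),
    ("emergency", "buffer_ward", "cardiology"),
    ("emergency", "surgery"),
]
# The same log after preprocessing removes logistically induced steps
processed_log = [
    ("emergency", "cardiology"),
    ("emergency", "cardiology"),
    ("emergency", "cardiology"),
    ("emergency", "surgery"),
]

def count_variants(log):
    """Number of distinct activity sequences (variants) in an event log."""
    return len(Counter(log))

before, after = count_variants(raw_log), count_variants(processed_log)
print(f"variants: {before} -> {after} ({(before - after) / before:.0%} fewer)")
```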

{"title":"Evaluating the Bias in Hospital Data: Automatic Preprocessing of Patient Pathways Algorithm Development and Validation Study.","authors":"Laura Uhl, Vincent Augusto, Benjamin Dalmas, Youenn Alexandre, Paolo Bercelli, Fanny Jardinaud, Saber Aloui","doi":"10.2196/58978","DOIUrl":"10.2196/58978","url":null,"abstract":"<p><strong>Background: </strong>The optimization of patient care pathways is crucial for hospital managers in the context of a scarcity of medical resources. Assuming unlimited capacities, the pathway of a patient would only be governed by pure medical logic to meet at best the patient's needs. However, logistical limitations (eg, resources such as inpatient beds) are often associated with delayed treatments and may ultimately affect patient pathways. This is especially true for unscheduled patients-when a patient in the emergency department needs to be admitted to another medical unit without disturbing the flow of planned hospitalizations.</p><p><strong>Objective: </strong>In this study, we proposed a new framework to automatically detect activities in patient pathways that may be unrelated to patients' needs but rather induced by logistical limitations.</p><p><strong>Methods: </strong>The scientific contribution lies in a method that transforms a database of historical pathways with bias into 2 databases: a labeled pathway database where each activity is labeled as relevant (related to a patient's needs) or irrelevant (induced by logistical limitations) and a corrected pathway database where each activity corresponds to the activity that would occur assuming unlimited resources. The labeling algorithm was assessed through medical expertise. In total, 2 case studies quantified the impact of our method of preprocessing health care data using process mining and discrete event simulation.</p><p><strong>Results: </strong>Focusing on unscheduled patient pathways, we collected data covering 12 months of activity at the Groupe Hospitalier Bretagne Sud in France. Our algorithm had 87% accuracy and demonstrated its usefulness for preprocessing traces and obtaining a clean database. The 2 case studies showed the importance of our preprocessing step before any analysis. The process graphs of the processed data had, on average, 40% (SD 10%) fewer variants than the raw data. The simulation revealed that 30% of the medical units had >1 bed difference in capacity between the processed and raw data.</p><p><strong>Conclusions: </strong>Patient pathway data reflect the actual activity of hospitals that is governed by medical requirements and logistical limitations. Before using these data, these limitations should be identified and corrected. We anticipate that our approach can be generalized to obtain unbiased analyses of patient pathways for other hospitals.</p>","PeriodicalId":56334,"journal":{"name":"JMIR Medical Informatics","volume":null,"pages":null},"PeriodicalIF":3.1,"publicationDate":"2024-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142302034","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0