
Latest articles from JMIR Medical Informatics

Development of a Cohort Analytics Tool for Monitoring Progression Patterns in Cardiovascular Diseases: Advanced Stochastic Modeling Approach.
IF 3.1, CAS Tier 3 (Medicine), Q2 MEDICAL INFORMATICS. Pub Date: 2024-09-24. DOI: 10.2196/59392
Arindam Brahma, Samir Chatterjee, Kala Seal, Ben Fitzpatrick, Youyou Tao
<p><strong>Background: </strong>The World Health Organization (WHO) reported that cardiovascular diseases (CVDs) are the leading cause of death worldwide. CVDs are chronic, with complex progression patterns involving episodes of comorbidities and multimorbidities. When dealing with chronic diseases, physicians often adopt a "watchful waiting" strategy, and actions are postponed until more information is available. Population-level transition probabilities and progression patterns can be revealed by applying time-variant stochastic modeling methods to longitudinal patient data from cohort studies. Inputs from CVD practitioners indicate that tools to generate and visualize cohort transition patterns have many impactful clinical applications. The resultant computational model can be embedded in digital decision support tools for clinicians. However, to date, no study has attempted to accomplish this for CVDs.</p><p><strong>Objective: </strong>This study aims to apply advanced stochastic modeling methods to uncover the transition probabilities and progression patterns from longitudinal episodic data of patient cohorts with CVD and thereafter use the computational model to build a digital clinical cohort analytics artifact demonstrating the actionability of such models.</p><p><strong>Methods: </strong>Our data were sourced from 9 epidemiological cohort studies by the National Heart, Lung, and Blood Institute and comprised chronological records of 1274 patients associated with 4839 CVD episodes across 16 years. We then used the continuous-time Markov chain method to develop our model, which offers a robust approach to time-variant transitions between disease states in chronic diseases.</p><p><strong>Results: </strong>Our study presents time-variant transition probabilities of CVD state changes, revealing patterns of CVD progression against time. 
We found that the transition from myocardial infarction (MI) to stroke has the fastest transition rate (mean transition time 3, SD 0 days, because only 1 patient had an MI-to-stroke transition in the dataset), and the transition from MI to angina is the slowest (mean transition time 1457, SD 1449 days). Congestive heart failure is the most probable first episode (371/840, 44.2%), followed by stroke (216/840, 25.7%). The resultant artifact is actionable as it can act as an eHealth cohort analytics tool, helping physicians gain insights into treatment and intervention strategies. Through expert panel interviews and surveys, we found 9 application use cases of our model.</p><p><strong>Conclusions: </strong>Past research does not provide actionable cohort-level decision support tools based on a comprehensive, 10-state, continuous-time Markov chain model to unveil complex CVD progression patterns from real-world patient data and support clinical decision-making. This paper aims to address this crucial limitation. Our stochastic model-embedded artifact can help clinicians in efficient disease monitoring and intervention decision-making, guided by objective, data-driven insights from real-world patient data. Furthermore, with only 3 input data elements (a synthetic patient identifier, episode name, and episode time in days from the baseline date), the proposed model can reveal the progression patterns of any chronic disease.</p>
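The mean transition times reported above can be derived directly from episodic records. Below is a minimal, hypothetical Python sketch (not the authors' implementation) that computes per-transition mean and SD in days from the 3 data elements the abstract describes: a patient identifier, an episode name, and the episode time in days from baseline.

```python
from collections import defaultdict
from statistics import mean, pstdev

def transition_stats(episodes):
    """Compute mean and SD (in days) of transition times between
    consecutive disease states, pooled across patients.

    `episodes` is a list of (patient_id, state, day_from_baseline)
    tuples; the toy state names below are illustrative only.
    """
    by_patient = defaultdict(list)
    for pid, state, day in episodes:
        by_patient[pid].append((day, state))

    times = defaultdict(list)  # (from_state, to_state) -> [days, ...]
    for history in by_patient.values():
        history.sort()  # order each patient's episodes chronologically
        for (d0, s0), (d1, s1) in zip(history, history[1:]):
            times[(s0, s1)].append(d1 - d0)

    # (mean transition time, population SD, number of observed transitions)
    return {pair: (mean(ts), pstdev(ts), len(ts)) for pair, ts in times.items()}

# Toy records: one MI -> stroke transition, two MI -> angina transitions.
records = [
    ("p1", "MI", 0), ("p1", "stroke", 3),
    ("p2", "MI", 10), ("p2", "angina", 1000),
    ("p3", "MI", 5), ("p3", "angina", 1905),
]
stats = transition_stats(records)
```

A single observed transition yields SD 0, which is why the abstract reports the MI-to-stroke estimate (n=1) with that caveat.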
Citations: 0
State-of-the-Art Fast Healthcare Interoperability Resources (FHIR)-Based Data Model and Structure Implementations: Systematic Scoping Review.
IF 3.1, CAS Tier 3 (Medicine), Q2 MEDICAL INFORMATICS. Pub Date: 2024-09-24. DOI: 10.2196/58445
Parinaz Tabari, Gennaro Costagliola, Mattia De Rosa, Martin Boeker

Background: Data models are crucial for clinical research as they enable researchers to fully use the vast amount of clinical data stored in medical systems. Standardized data and well-defined relationships between data points are necessary to guarantee semantic interoperability. Using the Fast Healthcare Interoperability Resources (FHIR) standard for clinical data representation would be a practical methodology to enhance and accelerate interoperability and data availability for research.

Objective: This research aims to provide a comprehensive overview of the state-of-the-art and current landscape in FHIR-based data models and structures. In addition, we intend to identify and discuss the tools, resources, limitations, and other critical aspects mentioned in the selected research papers.

Methods: To ensure the extraction of reliable results, we followed the instructions of the PRISMA-ScR (Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for Scoping Reviews) checklist. We analyzed the indexed articles in PubMed, Scopus, Web of Science, IEEE Xplore, the ACM Digital Library, and Google Scholar. After identifying, extracting, and assessing the quality and relevance of the articles, we synthesized the extracted data to identify common patterns, themes, and variations in the use of FHIR-based data models and structures across different studies.

Results: On the basis of the reviewed articles, we could identify 2 main themes: dynamic (pipeline-based) and static data models. The articles were also categorized into health care use cases, including chronic diseases, COVID-19 and infectious diseases, cancer research, acute or intensive care, random and general medical notes, and other conditions. Furthermore, we summarized the important or common tools and approaches of the selected papers. These items included FHIR-based tools and frameworks, machine learning approaches, and data storage and security. The most common resource was "Observation" followed by "Condition" and "Patient." The limitations and challenges of developing data models were categorized based on the issues of data integration, interoperability, standardization, performance, and scalability or generalizability.

Conclusions: FHIR serves as a highly promising interoperability standard for developing real-world health care apps. The implementation of FHIR modeling for electronic health record data facilitates the integration, transmission, and analysis of data while also advancing translational research and phenotyping. Generally, FHIR-based exports of local data repositories improve data interoperability for systems and data warehouses across different settings. However, ongoing efforts to address existing limitations and challenges are essential for the successful implementation and integration of FHIR data models.
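As a concrete illustration of what FHIR-based data representation looks like in practice, here is a minimal sketch of an R4 Observation resource, the resource type the review found most common. The LOINC code, patient reference, and values are illustrative placeholders, not data from any reviewed study.

```python
import json

# A minimal FHIR R4 "Observation" resource sketched as a plain dict.
# LOINC 8867-4 (heart rate) and the patient reference are example values.
observation = {
    "resourceType": "Observation",
    "status": "final",
    "code": {
        "coding": [{
            "system": "http://loinc.org",
            "code": "8867-4",
            "display": "Heart rate",
        }]
    },
    "subject": {"reference": "Patient/example"},
    "effectiveDateTime": "2024-09-24T10:30:00Z",
    "valueQuantity": {
        "value": 72,
        "unit": "beats/minute",
        "system": "http://unitsofmeasure.org",
        "code": "/min",
    },
}

# Serialize as it would be exchanged with a FHIR server over REST.
payload = json.dumps(observation, indent=2)
```

Because every element above is defined by the FHIR specification, any conformant system can parse this payload without custom mapping, which is the interoperability benefit the review describes.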

Citations: 0
Automated System to Capture Patient Symptoms From Multitype Japanese Clinical Texts: Retrospective Study.
IF 3.1, CAS Tier 3 (Medicine), Q2 MEDICAL INFORMATICS. Pub Date: 2024-09-24. DOI: 10.2196/58977
Tomohiro Nishiyama, Ayane Yamaguchi, Peitao Han, Lis Weiji Kanashiro Pereira, Yuka Otsuki, Gabriel Herman Bernardim Andrade, Noriko Kudo, Shuntaro Yada, Shoko Wakamiya, Eiji Aramaki, Masahiro Takada, Masakazu Toi
<p><strong>Background: </strong>Natural language processing (NLP) techniques can be used to analyze large amounts of electronic health record texts, which encompass various types of patient information such as quality of life, effectiveness of treatments, and adverse drug event (ADE) signals. As different aspects of a patient's status are stored in different types of documents, we propose an NLP system capable of processing 6 types of documents: physician progress notes, discharge summaries, radiology reports, radioisotope reports, nursing records, and pharmacist progress notes.</p><p><strong>Objective: </strong>This study aimed to investigate the system's performance in detecting ADEs by evaluating the results from multitype texts. The main objective is to detect adverse events accurately using an NLP system.</p><p><strong>Methods: </strong>We used data written in Japanese from 2289 patients with breast cancer, including medication data, physician progress notes, discharge summaries, radiology reports, radioisotope reports, nursing records, and pharmacist progress notes. Our system performs 3 processes: named entity recognition, normalization of symptoms, and aggregation of multiple types of documents from multiple patients. Among all patients with breast cancer, 103 and 112 with peripheral neuropathy (PN) received paclitaxel or docetaxel, respectively. We evaluate the utility of using multiple types of documents by correlation coefficient and regression analysis to compare their performance with each single type of document. All evaluations of detection rates with our system are performed 30 days after drug administration.</p><p><strong>Results: </strong>Our system underestimates the incidence of paclitaxel-induced PN by 13.3 percentage points: it detected 60.7%, compared with 74.0% in previous research based on manual extraction. 
The Pearson correlation coefficient between the manual extraction and system results was 0.87. Although the pharmacist progress notes had the highest detection rate of any single document type, they did not match the performance achieved using all documents. The estimated median duration of PN with paclitaxel was 92 days, whereas the previously reported median duration of PN with paclitaxel was 727 days. The number of events detected in each document was highest in the physician's progress notes, followed by the pharmacist's and nursing records.</p><p><strong>Conclusions: </strong>Considering the inherent cost of constantly monitoring a patient's condition, as in the treatment of PN, our system has a significant advantage in that it can immediately estimate the treatment duration without fine-tuning a new NLP model. Leveraging multitype documents is better than using single-type documents to improve detection performance. Although the onset time estimation was relatively accurate, the duration might have been influenced by the length of the data follow-up period. The results suggest that our method of using multiple types of documents can detect more ADEs from clinical texts.</p>
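The 3 processes the system performs (named entity recognition, symptom normalization, and aggregation across document types) can be illustrated with a deliberately simplified, dictionary-based sketch. The lexicon entries and document snippets are hypothetical English stand-ins; the actual system processes Japanese clinical text with a trained NER model.

```python
import re
from collections import Counter

# Hypothetical lexicon: surface forms -> normalized symptom label.
LEXICON = {
    "numbness": "peripheral neuropathy",
    "tingling": "peripheral neuropathy",
    "nausea": "nausea",
}

def extract_symptoms(text):
    """Steps 1-2: dictionary-based entity recognition plus normalization."""
    found = set()
    for surface, normalized in LEXICON.items():
        if re.search(r"\b" + re.escape(surface) + r"\b", text.lower()):
            found.add(normalized)
    return found

def aggregate(documents):
    """Step 3: pool symptom mentions across document types for one patient."""
    counts = Counter()
    for doc_type, text in documents:
        for symptom in extract_symptoms(text):
            counts[symptom] += 1
    return counts

docs = [
    ("physician_note", "Patient reports numbness in both feet."),
    ("pharmacist_note", "Tingling persists after paclitaxel dose 4."),
    ("nursing_record", "Mild nausea this morning."),
]
symptom_counts = aggregate(docs)
```

Pooling across document types is what raises the detection rate: here, PN surfaces in both the physician and pharmacist notes, so either document alone would find it, but rarer events may appear in only one document type.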
Citations: 0
Evaluating the Bias in Hospital Data: Automatic Preprocessing of Patient Pathways Algorithm Development and Validation Study.
IF 3.1, CAS Tier 3 (Medicine), Q2 MEDICAL INFORMATICS. Pub Date: 2024-09-23. DOI: 10.2196/58978
Laura Uhl, Vincent Augusto, Benjamin Dalmas, Youenn Alexandre, Paolo Bercelli, Fanny Jardinaud, Saber Aloui

Background: The optimization of patient care pathways is crucial for hospital managers in the context of a scarcity of medical resources. Assuming unlimited capacities, the pathway of a patient would only be governed by pure medical logic to best meet the patient's needs. However, logistical limitations (eg, resources such as inpatient beds) are often associated with delayed treatments and may ultimately affect patient pathways. This is especially true for unscheduled patients: when a patient in the emergency department needs to be admitted to another medical unit without disturbing the flow of planned hospitalizations.

Objective: In this study, we proposed a new framework to automatically detect activities in patient pathways that may be unrelated to patients' needs but rather induced by logistical limitations.

Methods: The scientific contribution lies in a method that transforms a database of historical pathways with bias into 2 databases: a labeled pathway database where each activity is labeled as relevant (related to a patient's needs) or irrelevant (induced by logistical limitations) and a corrected pathway database where each activity corresponds to the activity that would occur assuming unlimited resources. The labeling algorithm was assessed through medical expertise. In total, 2 case studies quantified the impact of our method of preprocessing health care data using process mining and discrete event simulation.

Results: Focusing on unscheduled patient pathways, we collected data covering 12 months of activity at the Groupe Hospitalier Bretagne Sud in France. Our algorithm had 87% accuracy and demonstrated its usefulness for preprocessing traces and obtaining a clean database. The 2 case studies showed the importance of our preprocessing step before any analysis. The process graphs of the processed data had, on average, 40% (SD 10%) fewer variants than the raw data. The simulation revealed that 30% of the medical units had >1 bed difference in capacity between the processed and raw data.

Conclusions: Patient pathway data reflect the actual activity of hospitals that is governed by medical requirements and logistical limitations. Before using these data, these limitations should be identified and corrected. We anticipate that our approach can be generalized to obtain unbiased analyses of patient pathways for other hospitals.
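The labeling idea behind the framework can be sketched with a toy rule-based example. The activity names and the rule (flagging explicit bed-wait steps) are hypothetical; the study's algorithm infers such labels from historical pathway data rather than from a fixed list.

```python
# Hypothetical activities induced purely by logistical limitations.
LOGISTICAL_ACTIVITIES = {"bed_wait", "transfer_buffer_unit"}

def label_pathway(pathway):
    """Build the labeled pathway database entry:
    each activity tagged 'relevant' or 'irrelevant'."""
    return [
        (a, "irrelevant" if a in LOGISTICAL_ACTIVITIES else "relevant")
        for a in pathway
    ]

def correct_pathway(pathway):
    """Build the corrected pathway database entry:
    drop logistics-induced activities, as if capacity were unlimited."""
    return [a for a in pathway if a not in LOGISTICAL_ACTIVITIES]

raw = ["emergency_dept", "bed_wait", "cardiology_unit", "discharge"]
labeled = label_pathway(raw)
clean = correct_pathway(raw)
```

Running process mining on `clean` instead of `raw` is what reduces spurious variants: the bed-wait detour no longer creates a distinct pathway variant.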

Citations: 0
Personalized Prediction of Long-Term Renal Function Prognosis Following Nephrectomy Using Interpretable Machine Learning Algorithms: Case-Control Study.
IF 3.1 Tier 3 Medicine Q2 MEDICAL INFORMATICS Pub Date : 2024-09-20 DOI: 10.2196/52837
Lingyu Xu, Chenyu Li, Shuang Gao, Long Zhao, Chen Guan, Xuefei Shen, Zhihui Zhu, Cheng Guo, Liwei Zhang, Chengyu Yang, Quandong Bu, Bin Zhou, Yan Xu
<p><strong>Background: </strong>Acute kidney injury (AKI) is a common adverse outcome following nephrectomy. The progression from AKI to acute kidney disease (AKD) and subsequently to chronic kidney disease (CKD) remains a concern; yet, the predictive mechanisms for these transitions are not fully understood. Interpretable machine learning (ML) models offer insights into how clinical features influence long-term renal function outcomes after nephrectomy, providing a more precise framework for identifying patients at risk and supporting improved clinical decision-making processes.</p><p><strong>Objective: </strong>This study aimed to (1) evaluate postnephrectomy rates of AKI, AKD, and CKD, analyzing long-term renal outcomes along different trajectories; (2) interpret AKD and CKD models using Shapley Additive Explanations values and Local Interpretable Model-Agnostic Explanations algorithm; and (3) develop a web-based tool for estimating AKD or CKD risk after nephrectomy.</p><p><strong>Methods: </strong>We conducted a retrospective cohort study involving patients who underwent nephrectomy between July 2012 and June 2019. Patient data were randomly split into training, validation, and test sets, maintaining a ratio of 76.5:8.5:15. Eight ML algorithms were used to construct predictive models for postoperative AKD and CKD. The performance of the best-performing models was assessed using various metrics. We used various Shapley Additive Explanations plots and Local Interpretable Model-Agnostic Explanations bar plots to interpret the model and generated directed acyclic graphs to explore the potential causal relationships between features. Additionally, we developed a web-based prediction tool using the top 10 features for AKD prediction and the top 5 features for CKD prediction.</p><p><strong>Results: </strong>The study cohort comprised 1559 patients. Incidence rates for AKI, AKD, and CKD were 21.7% (n=330), 15.3% (n=238), and 10.6% (n=165), respectively. 
Among the evaluated ML models, the Light Gradient-Boosting Machine (LightGBM) model demonstrated superior performance, with an area under the receiver operating characteristic curve of 0.97 for AKD prediction and 0.96 for CKD prediction. Performance metrics and plots highlighted the model's competence in discrimination, calibration, and clinical applicability. Operative duration, hemoglobin, blood loss, urine protein, and hematocrit were identified as the top 5 features associated with predicted AKD. Baseline estimated glomerular filtration rate, pathology, trajectories of renal function, age, and total bilirubin were the top 5 features associated with predicted CKD. Additionally, we developed a web application using the LightGBM model to estimate AKD and CKD risks.</p><p><strong>Conclusions: </strong>An interpretable ML model effectively elucidated its decision-making process in identifying patients at risk of AKD and CKD following nephrectomy by enumerating critical features. The web-based calculator, based on the LightGBM model, can help formulate more personalized, evidence-based clinical strategies.</p>
{"title":"Personalized Prediction of Long-Term Renal Function Prognosis Following Nephrectomy Using Interpretable Machine Learning Algorithms: Case-Control Study.","authors":"Lingyu Xu, Chenyu Li, Shuang Gao, Long Zhao, Chen Guan, Xuefei Shen, Zhihui Zhu, Cheng Guo, Liwei Zhang, Chengyu Yang, Quandong Bu, Bin Zhou, Yan Xu","doi":"10.2196/52837","DOIUrl":"10.2196/52837","url":null,"abstract":"&lt;p&gt;&lt;strong&gt;Background: &lt;/strong&gt;Acute kidney injury (AKI) is a common adverse outcome following nephrectomy. The progression from AKI to acute kidney disease (AKD) and subsequently to chronic kidney disease (CKD) remains a concern; yet, the predictive mechanisms for these transitions are not fully understood. Interpretable machine learning (ML) models offer insights into how clinical features influence long-term renal function outcomes after nephrectomy, providing a more precise framework for identifying patients at risk and supporting improved clinical decision-making processes.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Objective: &lt;/strong&gt;This study aimed to (1) evaluate postnephrectomy rates of AKI, AKD, and CKD, analyzing long-term renal outcomes along different trajectories; (2) interpret AKD and CKD models using Shapley Additive Explanations values and Local Interpretable Model-Agnostic Explanations algorithm; and (3) develop a web-based tool for estimating AKD or CKD risk after nephrectomy.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Methods: &lt;/strong&gt;We conducted a retrospective cohort study involving patients who underwent nephrectomy between July 2012 and June 2019. Patient data were randomly split into training, validation, and test sets, maintaining a ratio of 76.5:8.5:15. Eight ML algorithms were used to construct predictive models for postoperative AKD and CKD. The performance of the best-performing models was assessed using various metrics. 
We used various Shapley Additive Explanations plots and Local Interpretable Model-Agnostic Explanations bar plots to interpret the model and generated directed acyclic graphs to explore the potential causal relationships between features. Additionally, we developed a web-based prediction tool using the top 10 features for AKD prediction and the top 5 features for CKD prediction.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Results: &lt;/strong&gt;The study cohort comprised 1559 patients. Incidence rates for AKI, AKD, and CKD were 21.7% (n=330), 15.3% (n=238), and 10.6% (n=165), respectively. Among the evaluated ML models, the Light Gradient-Boosting Machine (LightGBM) model demonstrated superior performance, with an area under the receiver operating characteristic curve of 0.97 for AKD prediction and 0.96 for CKD prediction. Performance metrics and plots highlighted the model's competence in discrimination, calibration, and clinical applicability. Operative duration, hemoglobin, blood loss, urine protein, and hematocrit were identified as the top 5 features associated with predicted AKD. Baseline estimated glomerular filtration rate, pathology, trajectories of renal function, age, and total bilirubin were the top 5 features associated with predicted CKD. Additionally, we developed a web application using the LightGBM model to estimate AKD and CKD risks.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Conclusions: &lt;/strong&gt;An interpretable ML model effectively elucidated its decision-making process in identifying patients at risk of AKD and CKD following nephrectomy by enumerating critical features. 
The web-based calculato","PeriodicalId":56334,"journal":{"name":"JMIR Medical Informatics","volume":"12 ","pages":"e52837"},"PeriodicalIF":3.1,"publicationDate":"2024-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11452755/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142302035","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
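The abstract above reports model quality as the area under the receiver operating characteristic curve (AUROC). As a reminder of what that metric measures, here is a small self-contained sketch; the labels and predicted risks are made up, and this is not the study's evaluation code.

```python
def auroc(labels, scores):
    """AUROC as the probability that a randomly chosen positive case is
    scored above a randomly chosen negative case (ties count one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

y = [1, 0, 1, 0, 1, 0]                  # 1 = developed postoperative AKD (hypothetical)
risk = [0.9, 0.2, 0.8, 0.4, 0.6, 0.1]   # hypothetical model-predicted risks
print(auroc(y, risk))  # 1.0 for perfectly separated scores
```

An AUROC near 0.97, as reported for the LightGBM AKD model, means almost every patient who developed AKD was assigned a higher predicted risk than almost every patient who did not.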
The Impact of Collaborative Documentation on Person-Centered Care: Textual Analysis of Clinical Notes.
IF 3.1 Tier 3 Medicine Q2 MEDICAL INFORMATICS Pub Date : 2024-09-20 DOI: 10.2196/52678
Victoria Stanhope, Nari Yoo, Elizabeth Matthews, Daniel Baslock, Yuanyuan Hu

Background: Collaborative documentation (CD) is a behavioral health practice involving shared writing of clinic visit notes by providers and consumers. Despite widespread dissemination of CD, research on its effectiveness or impact on person-centered care (PCC) has been limited. Principles of PCC planning, a recovery-based approach to service planning that operationalizes PCC, can inform the measurement of person-centeredness within clinical documentation.

Objective: This study aims to use the clinical informatics approach of natural language processing (NLP) to examine the impact of CD on person-centeredness in clinic visit notes. Using a dictionary-based approach, this study conducts a textual analysis of clinic notes from a community mental health center before and after staff were trained in CD.

Methods: This study used visit notes (n=1981) from 10 providers in a community mental health center 6 months before and after training in CD. LIWC-22 was used to assess all notes using the Linguistic Inquiry and Word Count (LIWC) dictionary, which categorizes over 5000 linguistic and psychological words. Twelve LIWC categories were selected and mapped onto PCC planning principles through the consensus of 3 domain experts. The LIWC-22 contextualizer was used to extract sentence fragments from notes corresponding to LIWC categories. Then, fixed-effects modeling was used to identify differences in notes before and after CD training while accounting for nesting within the provider.

Results: Sentence fragments identified by the contextualizing process illustrated how visit notes demonstrated PCC. The fixed effects analysis found a significant positive shift toward person-centeredness; this was observed in 6 of the selected LIWC categories post CD. Specifically, there was a notable increase in words associated with achievement (β=.774, P<.001), power (β=.831, P<.001), money (β=.204, P<.001), and physical health (β=.427, P=.03), while leisure words decreased (β=-.166, P=.002).

Conclusions: By using a dictionary-based approach, the study identified how CD might influence the integration of PCC principles within clinical notes. Although the results were mixed, the findings highlight the potential effectiveness of CD in enhancing person-centeredness in clinic notes. By leveraging NLP techniques, this research illuminated the value of narrative clinical notes in assessing the quality of care in behavioral health contexts. These findings underscore the promise of NLP for quality assurance in health care settings and emphasize the need for refining algorithms to more accurately measure PCC.

{"title":"The Impact of Collaborative Documentation on Person-Centered Care: Textual Analysis of Clinical Notes.","authors":"Victoria Stanhope, Nari Yoo, Elizabeth Matthews, Daniel Baslock, Yuanyuan Hu","doi":"10.2196/52678","DOIUrl":"10.2196/52678","url":null,"abstract":"<p><strong>Background: </strong>Collaborative documentation (CD) is a behavioral health practice involving shared writing of clinic visit notes by providers and consumers. Despite widespread dissemination of CD, research on its effectiveness or impact on person-centered care (PCC) has been limited. Principles of PCC planning, a recovery-based approach to service planning that operationalizes PCC, can inform the measurement of person-centeredness within clinical documentation.</p><p><strong>Objective: </strong>This study aims to use the clinical informatics approach of natural language processing (NLP) to examine the impact of CD on person-centeredness in clinic visit notes. Using a dictionary-based approach, this study conducts a textual analysis of clinic notes from a community mental health center before and after staff were trained in CD.</p><p><strong>Methods: </strong>This study used visit notes (n=1981) from 10 providers in a community mental health center 6 months before and after training in CD. LIWC-22 was used to assess all notes using the Linguistic Inquiry and Word Count (LIWC) dictionary, which categorizes over 5000 linguistic and psychological words. Twelve LIWC categories were selected and mapped onto PCC planning principles through the consensus of 3 domain experts. The LIWC-22 contextualizer was used to extract sentence fragments from notes corresponding to LIWC categories. Then, fixed-effects modeling was used to identify differences in notes before and after CD training while accounting for nesting within the provider.</p><p><strong>Results: </strong>Sentence fragments identified by the contextualizing process illustrated how visit notes demonstrated PCC. 
The fixed effects analysis found a significant positive shift toward person-centeredness; this was observed in 6 of the selected LIWC categories post CD. Specifically, there was a notable increase in words associated with achievement (β=.774, P<.001), power (β=.831, P<.001), money (β=.204, P<.001), physical health (β=.427, P=.03), while leisure words decreased (β=-.166, P=.002).</p><p><strong>Conclusions: </strong>By using a dictionary-based approach, the study identified how CD might influence the integration of PCC principles within clinical notes. Although the results were mixed, the findings highlight the potential effectiveness of CD in enhancing person-centeredness in clinic notes. By leveraging NLP techniques, this research illuminated the value of narrative clinical notes in assessing the quality of care in behavioral health contexts. These findings underscore the promise of NLP for quality assurance in health care settings and emphasize the need for refining algorithms to more accurately measure PCC.</p>","PeriodicalId":56334,"journal":{"name":"JMIR Medical Informatics","volume":"12 ","pages":"e52678"},"PeriodicalIF":3.1,"publicationDate":"2024-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11429664/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142302046","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
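The dictionary-based approach described in the Methods section can be pictured with a toy example: count how many words in a note fall into predefined lexical categories. The two tiny word sets below are hypothetical stand-ins for the LIWC dictionary's more than 5000 categorized words, and the note text is invented.

```python
# Hypothetical mini-dictionaries standing in for LIWC categories.
CATEGORIES = {
    "achievement": {"goal", "progress", "achieved"},
    "leisure": {"hobby", "movie"},
}

def category_counts(note):
    """Count occurrences of each category's words in a clinical note."""
    words = [w.strip(".,") for w in note.lower().split()]
    return {cat: sum(w in vocab for w in words)
            for cat, vocab in CATEGORIES.items()}

note = "Client achieved a major goal and reported progress at work."
print(category_counts(note))  # {'achievement': 3, 'leisure': 0}
```

In the study, per-note counts like these (produced by LIWC-22) became the outcomes in a fixed-effects model comparing notes before and after CD training, with notes nested within providers.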
PCEtoFHIR: Decomposition of Postcoordinated SNOMED CT Expressions for Storage as HL7 FHIR Resources
IF 3.2 Tier 3 Medicine Q2 MEDICAL INFORMATICS Pub Date : 2024-09-17 DOI: 10.2196/57853
Tessa Ohlsen, Josef Ingenerf, Andrea Essenwanger, Cora Drenkhahn
<strong>Background:</strong> To ensure interoperability, both structural and semantic standards must be followed. For exchanging medical data between information systems, the structural standard FHIR (Fast Healthcare Interoperability Resources) has recently gained popularity. Regarding semantic interoperability, the reference terminology SNOMED Clinical Terms (SNOMED CT), as a semantic standard, allows for postcoordination, offering advantages over many other vocabularies. These postcoordinated expressions (PCEs) make SNOMED CT an expressive and flexible interlingua, allowing for precise coding of medical facts. However, this comes at the cost of increased complexity, as well as challenges in storage and processing. Additionally, the boundary between semantic (terminology) and structural (information model) standards becomes blurred, leading to what is known as the TermInfo problem. Although often viewed critically, the TermInfo overlap can also be explored for its potential benefits, such as enabling flexible transformation of parts of PCEs. <strong>Objective:</strong> In this paper, an alternative solution for storing PCEs is presented, which involves combining them with the FHIR data model. Ultimately, all components of a PCE should be expressible solely through precoordinated concepts that are linked to the appropriate elements of the information model. <strong>Methods:</strong> The approach involves storing PCEs decomposed into their components in alignment with FHIR resources. By utilizing the Web Ontology Language (OWL) to generate an OWL ClassExpression, and combining it with an external reasoner and semantic similarity measures, a precoordinated SNOMED CT concept that most accurately describes the PCE is identified as a Superconcept. 
In addition, the nonmatching attribute relationships between the Superconcept and the PCE are identified as the “Delta.” Once SNOMED CT attributes are manually mapped to FHIR elements, FHIRPath expressions can be defined for both the Superconcept and the Delta, allowing the identified precoordinated codes to be stored within FHIR resources. <strong>Results:</strong> A web application called PCEtoFHIR was developed to implement this approach. In a validation process with 600 randomly selected precoordinated concepts, the formal correctness of the generated OWL ClassExpressions was verified. Additionally, 33 PCEs were used for two separate validation tests. Based on these validations, it was demonstrated that a previously proposed semantic similarity calculation is suitable for determining the Superconcept. Additionally, the 33 PCEs were used to confirm the correct functioning of the entire approach. Furthermore, the FHIR StructureMaps were reviewed and deemed meaningful by FHIR experts. <strong>Conclusions:</strong> PCEtoFHIR offers services to decompose PCEs for storage within FHIR resources. When creating structure mappings for specific subdomains of SNOMED CT concepts (eg, allergies) to desired FHIR profiles, the use of SNOMED CT expression templates has proven highly effective. Domain experts can create templates with appropriate mappings, which end users can then easily reuse in a constrained manner.
{"title":"PCEtoFHIR: Decomposition of Postcoordinated SNOMED CT Expressions for Storage as HL7 FHIR Resources","authors":"Tessa Ohlsen, Josef Ingenerf, Andrea Essenwanger, Cora Drenkhahn","doi":"10.2196/57853","DOIUrl":"https://doi.org/10.2196/57853","url":null,"abstract":"&lt;strong&gt;Background:&lt;/strong&gt; To ensure interoperability, both structural and semantic standards must be followed. For exchanging medical data between information systems, the structural standard FHIR (Fast Healthcare Interoperability Resources) has recently gained popularity. Regarding semantic interoperability, the reference terminology SNOMED Clinical Terms (SNOMED CT), as a semantic standard, allows for postcoordination, offering advantages over many other vocabularies. These postcoordinated expressions (PCEs) make SNOMED CT an expressive and flexible interlingua, allowing for precise coding of medical facts. However, this comes at the cost of increased complexity, as well as challenges in storage and processing. Additionally, the boundary between semantic (terminology) and structural (information model) standards becomes blurred, leading to what is known as the TermInfo problem. Although often viewed critically, the TermInfo overlap can also be explored for its potential benefits, such as enabling flexible transformation of parts of PCEs. &lt;strong&gt;Objective:&lt;/strong&gt; In this paper, an alternative solution for storing PCEs is presented, which involves combining them with the FHIR data model. Ultimately, all components of a PCE should be expressible solely through precoordinated concepts that are linked to the appropriate elements of the information model. &lt;strong&gt;Methods:&lt;/strong&gt; The approach involves storing PCEs decomposed into their components in alignment with FHIR resources. 
By utilizing the Web Ontology Language (OWL) to generate an OWL ClassExpression, and combining it with an external reasoner and semantic similarity measures, a precoordinated SNOMED CT concept that most accurately describes the PCE is identified as a Superconcept. In addition, the nonmatching attribute relationships between the Superconcept and the PCE are identified as the “Delta.” Once SNOMED CT attributes are manually mapped to FHIR elements, FHIRPath expressions can be defined for both the Superconcept and the Delta, allowing the identified precoordinated codes to be stored within FHIR resources. &lt;strong&gt;Results:&lt;/strong&gt; A web application called PCEtoFHIR was developed to implement this approach. In a validation process with 600 randomly selected precoordinated concepts, the formal correctness of the generated OWL ClassExpressions was verified. Additionally, 33 PCEs were used for two separate validation tests. Based on these validations, it was demonstrated that a previously proposed semantic similarity calculation is suitable for determining the Superconcept. Additionally, the 33 PCEs were used to confirm the correct functioning of the entire approach. Furthermore, the FHIR StructureMaps were reviewed and deemed meaningful by FHIR experts. &lt;strong&gt;Conclusions:&lt;/strong&gt; PCEtoFHIR offers services to decompose PCEs for storage within FHIR resources. 
When creating structure mappings for specific subdomains of SNOMED CT concepts (eg, allergies) to desired FHIR profil","PeriodicalId":56334,"journal":{"name":"JMIR Medical Informatics","volume":"29 1","pages":""},"PeriodicalIF":3.2,"publicationDate":"2024-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142257173","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
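The core decomposition idea, finding the precoordinated concept that best matches a PCE and keeping the leftover attributes as the Delta, can be sketched in a few lines. This is a deliberately crude stand-in for the paper's approach: set overlap replaces the OWL reasoner and semantic similarity measures, and the concept names and attribute-value pairs are invented.

```python
# A PCE represented as a set of (attribute, value) pairs (hypothetical).
pce = {("finding_site", "lung"), ("causative_agent", "pollen")}

# Candidate precoordinated concepts with their defining attributes (hypothetical).
candidates = {
    "Allergic asthma": {("finding_site", "lung")},
    "Allergic rhinitis": {("finding_site", "nose")},
}

def superconcept_and_delta(pce, candidates):
    """Pick the candidate with the largest attribute overlap (a crude proxy
    for semantic similarity) and return the nonmatching attributes as Delta."""
    best = max(candidates, key=lambda c: len(candidates[c] & pce))
    delta = pce - candidates[best]
    return best, delta

best, delta = superconcept_and_delta(pce, candidates)
print(best)   # Allergic asthma
print(delta)  # {('causative_agent', 'pollen')}
```

In PCEtoFHIR, the Superconcept and the Delta are then bound to FHIR elements via manually defined FHIRPath expressions, so only precoordinated codes need to be stored.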
Standardizing Corneal Transplantation Records Using openEHR: Case Study
IF 3.2 Tier 3 Medicine Q2 MEDICAL INFORMATICS Pub Date : 2024-09-16 DOI: 10.2196/48407
Diana Ferreira, Cristiana Neto, Francini Hak, António Abelha, Manuel Santos, José Machado
Background: Corneal transplantation, also known as keratoplasty, is a widely performed surgical procedure that aims to restore vision in patients with corneal damage. The success of corneal transplantation relies on the accurate and timely management of patient information, which can be enhanced using electronic health records (EHRs). However, conventional EHRs are often fragmented and lack standardization, leading to difficulties in information access and sharing, increased medical errors, and decreased patient safety. In the wake of these problems, there is a growing demand for standardized EHRs that can ensure the accuracy and consistency of patient data across health care organizations. Objective: This paper proposes the use of openEHR structures for standardizing corneal transplantation records. The main objective of this research was to improve the quality and interoperability of EHRs in corneal transplantation, making it easier for health care providers to capture, share, and analyze clinical information. Methods: A series of sequential steps were carried out in this study to implement standardized clinical records using openEHR specifications. These specifications furnish a methodical approach that ascertains the development of high-quality clinical records. In broad terms, the methodology followed encompasses the conduction of meetings with health care professionals and the modeling of archetypes, templates, forms, decision rules, and work plans. Results: This research resulted in a tailored solution that streamlines health care delivery and meets the needs of medical professionals involved in the corneal transplantation process while seamlessly aligning with contemporary clinical practices. The proposed solution culminated in the successful integration within a Portuguese hospital of 3 key components of openEHR specifications: forms, Decision Logic Modules, and Work Plans. 
A statistical analysis of data collected from May 1, 2022, to March 31, 2023, provided insight into how the new technologies were used within the corneal transplantation workflow. Despite the completion rate being only 63.9% (530/830), which can be explained by external factors such as patient health and the availability of donor organs, there was an overall improvement in task control and in the follow-up of patients’ clinical processes. Conclusions: This study shows that the adoption of openEHR structures represents a significant step forward in the standardization and optimization of corneal transplantation records. It offers a detailed demonstration of how to implement openEHR specifications and highlights the different advantages of standardizing EHRs in the field of corneal transplantation. Furthermore, it serves as a valuable reference for researchers and practitioners who are interested in advancing and improving the exploitation of EHRs in health care.
{"title":"Standardizing Corneal Transplantation Records Using openEHR: Case Study","authors":"Diana Ferreira, Cristiana Neto, Francini Hak, António Abelha, Manuel Santos, José Machado","doi":"10.2196/48407","DOIUrl":"https://doi.org/10.2196/48407","url":null,"abstract":"<strong>Background:</strong> Corneal transplantation, also known as keratoplasty, is a widely performed surgical procedure that aims to restore vision in patients with corneal damage. The success of corneal transplantation relies on the accurate and timely management of patient information, which can be enhanced using electronic health records (EHRs). However, conventional EHRs are often fragmented and lack standardization, leading to difficulties in information access and sharing, increased medical errors, and decreased patient safety. In the wake of these problems, there is a growing demand for standardized EHRs that can ensure the accuracy and consistency of patient data across health care organizations. <strong>Objective:</strong> This paper proposes the use of openEHR structures for standardizing corneal transplantation records. The main objective of this research was to improve the quality and interoperability of EHRs in corneal transplantation, making it easier for health care providers to capture, share, and analyze clinical information. <strong>Methods:</strong> A series of sequential steps were carried out in this study to implement standardized clinical records using openEHR specifications. These specifications furnish a methodical approach that ascertains the development of high-quality clinical records. In broad terms, the methodology followed encompasses the conduction of meetings with health care professionals and the modeling of archetypes, templates, forms, decision rules, and work plans. 
<strong>Results:</strong> This research resulted in a tailored solution that streamlines health care delivery and meets the needs of medical professionals involved in the corneal transplantation process while seamlessly aligning with contemporary clinical practices. The proposed solution culminated in the successful integration within a Portuguese hospital of 3 key components of openEHR specifications: forms, Decision Logic Modules, and Work Plans. A statistical analysis of data collected from May 1, 2022, to March 31, 2023, allowed for the perception of the use of the new technologies within the corneal transplantation workflow. Despite the completion rate being only 63.9% (530/830), which can be explained by external factors such as patient health and availability of donor organs, there was an overall improvement in terms of task control and follow-up of the patients’ clinical process. <strong>Conclusions:</strong> This study shows that the adoption of openEHR structures represents a significant step forward in the standardization and optimization of corneal transplantation records. It offers a detailed demonstration of how to implement openEHR specifications and highlights the different advantages of standardizing EHRs in the field of corneal transplantation. 
Furthermore, it serves as a valuable reference for researchers and practitioners who are interested in advancing and improving the exploitation of EHRs in health care.","PeriodicalId":56334,"journal":{"name":"JMIR Medical Informatics","volume":"78 1","pages":""},"PeriodicalIF":3.2,"publicationDate":"2024-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142257130","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Early Diagnosis of Hereditary Angioedema in Japan Based on a US Medical Dataset: Algorithm Development and Validation
IF 3.2, Medicine (CAS Tier 3), Q2 MEDICAL INFORMATICS, Pub Date: 2024-09-13, DOI: 10.2196/59858
Kouhei Yamashita, Yuji Nomoto, Tomoya Hirose, Akira Yutani, Akira Okada, Nayu Watanabe, Ken Suzuki, Munenori Senzaki, Tomohiro Kuroda
Background: Hereditary angioedema (HAE), a rare genetic disease, induces acute attacks of swelling in various regions of the body. Its prevalence is estimated at 1 in 50,000 people, with no reported bias among ethnic groups. Given that estimate, however, the diagnosed prevalence in Japan remains approximately 1 in 250,000, meaning that only about 20% of potential HAE cases have been identified. Objective: This study aimed to develop an artificial intelligence (AI) model that can detect patients with suspected HAE using medical history data (medical claims, prescriptions, and electronic medical records [EMRs]) from the United States. We also aimed to validate the model's detection performance for HAE cases on a Japanese dataset. Methods: The HAE patient and control groups were identified using the US claims and EMR datasets. We analyzed the characteristics of the diagnostic histories of patients with HAE and developed an AI model that predicts the probability of HAE based on a generalized linear model and a bootstrap method. The model was then applied to EMR data from the Kyoto University Hospital to verify its applicability to the Japanese dataset. Results: Precision and sensitivity were measured to validate model performance. On the comprehensive US dataset, the precision score was 2% in the initial model development step; our model can thus screen out suspected patients, of whom 1 in 50 has HAE. In the validation step with Japanese EMR data, the precision score was 23.6%, which exceeded our expectations. We achieved a sensitivity score of 61.5% on the US dataset and 37.6% in the validation exercise using data from a single Japanese hospital. Overall, our model could predict patients with typical HAE symptoms. Conclusions: This study indicates that our AI model can detect HAE in patients with typical symptoms and is effective on Japanese data.
However, further prospective clinical studies are required to investigate whether this model can be used to diagnose HAE.
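The abstract above describes fitting a generalized linear model with a bootstrap method to estimate each patient's probability of HAE. A minimal sketch of that idea is shown below: a logistic GLM is refit on bootstrap resamples and the predicted probabilities are averaged. The feature matrix, the screening threshold, and the function name are all illustrative assumptions, since the study's real predictors (claims, prescriptions, and EMR histories) are not public.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic stand-in features: an illustration only, not the study's
# actual claims/prescription/EMR predictors.
n_patients, n_features = 400, 5
X = rng.normal(size=(n_patients, n_features))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=n_patients) > 1.5).astype(int)

def bootstrap_glm_probabilities(X, y, n_boot=100, seed=0):
    """Fit a logistic GLM on bootstrap resamples and average the
    predicted probabilities, giving a bagged risk estimate."""
    rng = np.random.default_rng(seed)
    probs = np.zeros((n_boot, len(X)))
    for b in range(n_boot):
        idx = rng.integers(0, len(X), size=len(X))  # resample with replacement
        model = LogisticRegression(max_iter=1000).fit(X[idx], y[idx])
        probs[b] = model.predict_proba(X)[:, 1]
    return probs.mean(axis=0)

p_hae = bootstrap_glm_probabilities(X, y)
flagged = p_hae > 0.5  # the 0.5 screening threshold is an assumption
```

Averaging over resamples stabilizes the probability estimates relative to a single fit, which matters when the positive class (HAE) is rare.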
Citations: 0
Natural Language Processing Versus Diagnosis Code–Based Methods for Postherpetic Neuralgia Identification: Algorithm Development and Validation
IF 3.2, Medicine (CAS Tier 3), Q2 MEDICAL INFORMATICS, Pub Date: 2024-09-10, DOI: 10.2196/57949
Chengyi Zheng, Bradley Ackerson, Sijia Qiu, Lina S Sy, Leticia I Vega Daily, Jeannie Song, Lei Qian, Yi Luo, Jennifer H Ku, Yanjun Cheng, Jun Wu, Hung Fu Tseng
Background: Diagnosis codes and prescription data are used in algorithms to identify postherpetic neuralgia (PHN), a debilitating complication of herpes zoster (HZ). Because of the questionable accuracy of codes and prescription data, manual chart review is sometimes used to identify PHN in electronic health records (EHRs), which can be costly and time-consuming. Objective: To develop and validate a natural language processing (NLP) algorithm for automatically identifying PHN from unstructured EHR data, and to compare its performance with that of code-based methods. Methods: This retrospective study used EHR data from Kaiser Permanente Southern California, a large integrated health care system that serves over 4.8 million members. The source population included members aged ≥50 years who received an incident HZ diagnosis and an accompanying antiviral prescription between 2018 and 2020 and had ≥1 encounter within 90-180 days of the incident HZ diagnosis. The study team manually reviewed the EHRs and identified PHN cases. For NLP development and validation, 500 and 800 random samples, respectively, were selected from the source population. The sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), F-score, and Matthews correlation coefficient (MCC) of the NLP and code-based methods were evaluated using chart-reviewed results as the reference standard. Results: The NLP algorithm identified PHN cases with 90.9% sensitivity, 98.5% specificity, 82.0% PPV, and 99.3% NPV. The composite scores of the NLP algorithm were 0.89 (F-score) and 0.85 (MCC). The prevalence of PHN in the validation data was 6.9% (reference standard), 7.6% (NLP), and 5.4%-13.1% (code-based). The code-based methods achieved 52.7%-61.8% sensitivity, 89.8%-98.4% specificity, 27.6%-72.1% PPV, and 96.3%-97.1% NPV. The F-scores and MCCs ranged from 0.45 to 0.59 and from 0.32 to 0.61, respectively.
Conclusions: The automated NLP-based approach identified PHN cases from the EHR with good accuracy. This method could be useful in population-based PHN research.
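The metrics reported in the abstract above (sensitivity, specificity, PPV, NPV, F-score, MCC) are all derived from a single 2×2 confusion matrix against the chart-review reference standard. The sketch below shows the standard formulas; the example counts are hypothetical, chosen only to illustrate the computation, and are not the study's actual table.

```python
import math

def classification_metrics(tp, fp, fn, tn):
    """Standard evaluation metrics computed from confusion-matrix
    counts (reference standard vs. algorithm output)."""
    sensitivity = tp / (tp + fn)            # a.k.a. recall
    specificity = tn / (tn + fp)
    ppv = tp / (tp + fp)                    # a.k.a. precision
    npv = tn / (tn + fn)
    f_score = 2 * ppv * sensitivity / (ppv + sensitivity)
    # Matthews correlation coefficient: balanced even with rare positives
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {"sensitivity": sensitivity, "specificity": specificity,
            "ppv": ppv, "npv": npv, "f_score": f_score, "mcc": mcc}

# Hypothetical counts for illustration (not the study's actual 2x2 table).
metrics = classification_metrics(tp=90, fp=20, fn=10, tn=880)
```

Because PHN is rare (~7% prevalence here), NPV and specificity are high almost by default; the F-score and MCC, which weight the rare positive class, are the more discriminating summaries, which is why the paper reports them alongside the four basic rates.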
Citations: 0