
JAMIA Open: Latest Articles

Implementation and evaluation of an electronic health record-integrated app for postpartum monitoring of hypertensive disorders of pregnancy using patient-contributed data collection.
IF 2.1 Q2 HEALTH CARE SCIENCES & SERVICES Pub Date : 2023-11-14 eCollection Date: 2023-12-01 DOI: 10.1093/jamiaopen/ooad098
Prashila Dullabh, Krysta K Heaney-Huls, Andrew B Chiao, Melissa G Callaham, Priyanka Desai, Nicole A Gauthreaux, Nitu Kashyap, David F Lobach, Aziz Boxwala

Remote monitoring of women experiencing hypertensive disorders of pregnancy (HDP) can provide timely life-saving data, particularly if these data are integrated into existing patient and clinical workflows. This pilot intervention of a smartphone application (app) for postpartum monitoring of hypertensive disorders integrates patient-contributed data into electronic health records (EHRs) to support monitoring and clinical decision-making. Results from the evaluation of the pilot highlight the resources needed when implementing the app, challenges for integrating an app into the EHR, and the usability and utility of the HDP monitoring app for patient and clinician users. The implementation team's key observations included the importance of a local clinical champion, more robust patient involvement and support for the remote patient monitoring program, an impetus for EHR developers to adopt data integration standards, and a need to expand the capabilities of the standards to support interventions using patient-contributed data.

Citations: 0
Normalization of drug and therapeutic concepts with Thera-Py.
IF 2.1 Q2 HEALTH CARE SCIENCES & SERVICES Pub Date : 2023-11-08 eCollection Date: 2023-12-01 DOI: 10.1093/jamiaopen/ooad093
Matthew Cannon, James Stevenson, Kori Kuzma, Susanna Kiwala, Jeremy L Warner, Obi L Griffith, Malachi Griffith, Alex H Wagner

Objective: The diversity of nomenclature and naming strategies makes therapeutic terminology difficult to manage and harmonize. As the number and complexity of available therapeutic ontologies continue to increase, the need for harmonized cross-resource mappings is becoming increasingly apparent. This study creates harmonized concept mappings that enable the linking together of like-concepts despite source-dependent differences in data structure or semantic representation.

Materials and methods: For this study, we created Thera-Py, a Python package and web API that constructs searchable concepts for drugs and therapeutic terminologies using 9 public resources and thesauri. By using a directed graph approach, Thera-Py captures commonly used aliases, trade names, annotations, and associations for any given therapeutic and combines them under a single concept record.

Results: We highlight the creation of 16 069 unique merged therapeutic concepts from 9 distinct sources using Thera-Py and observe that the overlap of therapeutic concepts in 2 or more knowledge bases increased from 9.8% to 41.8% after harmonization using Thera-Py.

Conclusion: We observe that Thera-Py tends to normalize therapeutic concepts to their underlying active ingredients (excluding nondrug therapeutics, eg, radiation therapy, biologics), and unifies all available descriptors regardless of ontological origin.
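
The idea of combining aliases and trade names from several sources under a single concept record can be illustrated with a small sketch. This is not Thera-Py's code or API: the record identifiers (`sourceA:001`, etc.), the `RECORDS` structure, and the `merge_concepts` function are hypothetical, and the sketch swaps the directed graph described in the abstract for a simpler undirected union-find over cross-references.

```python
from collections import defaultdict

# Hypothetical input: (record id, cross-references to other records, labels).
# The source identifiers are placeholders, not real database accessions.
RECORDS = [
    ("sourceA:001", ["sourceB:17"], ["imatinib"]),
    ("sourceB:17", ["sourceC:9"], ["Gleevec", "imatinib mesylate"]),
    ("sourceC:9", [], ["STI-571"]),
    ("sourceA:002", [], ["aspirin"]),
]


def merge_concepts(records):
    """Group records connected by cross-references and pool their labels."""
    parent = {}

    def find(x):
        # Return the representative of x's group, compressing the path.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        # Merge two groups; the lexicographically smallest id becomes the root.
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[max(ra, rb)] = min(ra, rb)

    for rec_id, xrefs, _ in records:
        find(rec_id)
        for xref in xrefs:
            union(rec_id, xref)

    merged = defaultdict(lambda: {"members": set(), "labels": set()})
    for rec_id, _, labels in records:
        root = find(rec_id)
        merged[root]["members"].add(rec_id)
        merged[root]["labels"].update(labels)
    return dict(merged)


merged = merge_concepts(RECORDS)
```

Here the three cross-referenced records collapse into one merged concept keyed by `sourceA:001`, while the unlinked record stays on its own. A directed variant would additionally need a rule for which source's identifier is treated as canonical, which the abstract does not specify.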

Citations: 0
Creation of a data commons for substance misuse related health research through privacy-preserving patient record linkage between hospitals and state agencies.
IF 2.1 Q2 HEALTH CARE SCIENCES & SERVICES Pub Date : 2023-11-02 eCollection Date: 2023-12-01 DOI: 10.1093/jamiaopen/ooad092
Majid Afshar, Madeline Oguss, Thomas A Callaci, Timothy Gruenloh, Preeti Gupta, Claire Sun, Askar Safipour Afshar, Joseph Cavanaugh, Matthew M Churpek, Edwin Nyakoe-Nyasani, Huong Nguyen-Hilfiger, Ryan Westergaard, Elizabeth Salisbury-Afshar, Megan Gussick, Brian Patterson, Claire Manneh, Jomol Mathew, Anoop Mayampurath

Objectives: Substance misuse is a complex and heterogeneous set of conditions associated with high mortality and regional/demographic variations. Existing data systems are siloed and have been ineffective in curtailing the substance misuse epidemic. Therefore, we aimed to build a novel informatics platform, the Substance Misuse Data Commons (SMDC), by integrating multiple data modalities to provide a unified record of information crucial to improving outcomes in substance misuse patients.

Materials and methods: The SMDC was created by linking electronic health record (EHR) data from adult cases of substance (alcohol, opioid, nonopioid drug) misuse at the University of Wisconsin hospitals to socioeconomic and state agency data. To ensure private and secure data exchange, Privacy-Preserving Record Linkage (PPRL) and Honest Broker services were utilized. The overlap in mortality reporting among the EHR, state Vital Statistics, and a commercial national data source was assessed.

Results: The SMDC included data from 36 522 patients experiencing 62 594 healthcare encounters. Over half of patients were linked to the statewide ambulance database and prescription drug monitoring program. Chronic diseases accounted for most underlying causes of death, while drug-related overdoses constituted 8%. Our analysis of mortality revealed a 49.1% overlap across the 3 data sources. Nonoverlapping deaths were associated with poor socioeconomic indicators.

Discussion: Through PPRL, the SMDC enabled the longitudinal integration of multimodal data. Combining death data from local, state, and national sources enhanced mortality tracking and exposed disparities.

Conclusion: The SMDC provides a comprehensive resource for clinical providers and policymakers to inform interventions targeting substance misuse-related hospitalizations, overdoses, and death.
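
The abstract does not say how PPRL was implemented or how the 49.1% mortality overlap was computed, so the following is only a rough sketch under stated assumptions. It assumes one common PPRL style, in which each site replaces normalized identifiers with a keyed hash (HMAC-SHA256) before exchange, and it assumes overlap is defined as the records found in all three sources divided by the records found in any of them. The `tokenize` and `three_way_overlap` functions, the secret key, and the toy people are all hypothetical.

```python
import hashlib
import hmac


def tokenize(first, last, dob, secret):
    """Return a keyed hash of normalized identifiers (assumed PPRL token)."""
    normalized = f"{first.strip().lower()}|{last.strip().lower()}|{dob}"
    return hmac.new(secret, normalized.encode("utf-8"), hashlib.sha256).hexdigest()


def three_way_overlap(a, b, c):
    """Share of records present in all three sets, out of records in any set."""
    union = a | b | c
    return len(a & b & c) / len(union) if union else 0.0


# Hypothetical shared key; in practice it would be managed by a trusted party.
SECRET = b"example-key-not-for-real-use"

# Toy death records from three sources (made-up people, for illustration only).
ehr_deaths = {
    tokenize("Ann", "Lee", "1970-01-01", SECRET),
    tokenize("Bo", "Kim", "1980-05-05", SECRET),
}
state_deaths = {
    tokenize(" ann", "LEE ", "1970-01-01", SECRET),  # same person, messy entry
    tokenize("Cy", "Roe", "1990-09-09", SECRET),
}
commercial_deaths = {tokenize("Ann", "Lee", "1970-01-01", SECRET)}

overlap = three_way_overlap(ehr_deaths, state_deaths, commercial_deaths)
```

In this toy case one of three distinct people appears in every source, so `overlap` is one third. Exact-match tokens like these miss typos and name changes; tolerance for such errors would require a different encoding than the one sketched here.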

Citations: 0
Using machine learning to improve anaphylaxis case identification in medical claims data.
IF 2.5 Q2 HEALTH CARE SCIENCES & SERVICES Pub Date : 2023-10-27 eCollection Date: 2023-12-01 DOI: 10.1093/jamiaopen/ooad090
Kamil Can Kural, Ilya Mazo, Mark Walderhaug, Luis Santana-Quintero, Konstantinos Karagiannis, Elaine E Thompson, Jeffrey A Kelman, Ravi Goud

Objective: Anaphylaxis is a severe life-threatening allergic reaction, and its accurate identification in healthcare databases can harness the potential of "Big Data" for healthcare or public health purposes.

Methods: This study used claims data obtained between October 1, 2015 and February 28, 2019 from the CMS database to examine the utility of machine learning in identifying incident anaphylaxis cases. We created a feature selection pipeline to identify critical features between different datasets. Then a variety of unsupervised and supervised methods were used (eg, Sammon mapping and eXtreme Gradient Boosting) to train models on datasets of differing data quality, which reflects the varying availability and potential rarity of ground truth data in medical databases.

Results: Resulting machine learning model accuracies ranged between 47.7% and 94.4% when tested on ground truth data. Finally, we found new features to help experts enhance existing case-finding algorithms.

Discussion: Developing precise algorithms to detect medical outcomes in claims can be a laborious and expensive process, particularly for conditions presented and coded diversely. We found it beneficial to filter out highly potent codes used for data curation to identify underlying patterns and features. To improve rule-based algorithms where necessary, researchers could use model explainers to determine noteworthy features, which could then be shared with experts and included in the algorithm.

Conclusion: Our work suggests machine learning models can perform at levels similar to a previously published expert case-finding algorithm, while also having the potential to improve performance or streamline algorithm construction processes by identifying new relevant features for algorithm construction.

Citations: 0
Development and application of pharmacological statin-associated muscle symptoms phenotyping algorithms using structured and unstructured electronic health records data.
IF 2.1 Q2 HEALTH CARE SCIENCES & SERVICES Pub Date : 2023-10-24 eCollection Date: 2023-12-01 DOI: 10.1093/jamiaopen/ooad087
Boguang Sun, Pui Ying Yew, Chih-Lin Chi, Meijia Song, Matt Loth, Rui Zhang, Robert J Straka

Importance: Statins are widely prescribed cholesterol-lowering medications in the United States, but their clinical benefits can be diminished by statin-associated muscle symptoms (SAMS), leading to discontinuation.

Objectives: In this study, we aimed to develop and validate a pharmacological SAMS clinical phenotyping algorithm using electronic health records (EHRs) data from Minnesota Fairview.

Materials and methods: We retrieved structured and unstructured EHR data of statin users and manually ascertained a gold standard set of SAMS cases and controls using the published SAMS-Clinical Index tool from clinical notes in 200 patients. We developed machine learning algorithms and rule-based algorithms that incorporated various criteria, including ICD codes, statin allergy, creatine kinase elevation, and keyword mentions in clinical notes. We applied the best-performing algorithm to the statin cohort to identify SAMS.

Results: We identified 16 889 patients who started statins in the Fairview EHR system from 2010 to 2020. The combined rule-based (CRB) algorithm, which utilized both clinical notes and structured data criteria, achieved similar performance compared to machine learning algorithms with a precision of 0.85, recall of 0.71, and F1 score of 0.77 against the gold standard set. Applying the CRB algorithm to the statin cohort, we identified the pharmacological SAMS prevalence to be 1.9% and selective risk factors which included female gender, coronary artery disease, hypothyroidism, and use of immunosuppressants or fibrates.

Discussion and conclusion: Our study developed and validated a simple pharmacological SAMS phenotyping algorithm that can be used to create SAMS case/control cohorts for further analysis, which could in turn support the development of a SAMS risk prediction model.
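
As a quick consistency check on the reported metrics, the F1 score is the harmonic mean of precision and recall, so it can be recomputed from the two values given in the Results. The `f1_score` helper below is a hypothetical name written for this check and is not taken from the study.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall reported for the combined rule-based (CRB) algorithm.
f1 = f1_score(0.85, 0.71)
print(round(f1, 2))  # 0.77
```

The recomputed value, 2 × 0.85 × 0.71 / (0.85 + 0.71) ≈ 0.774, rounds to the reported F1 score of 0.77.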

Citations: 0
Black American women's attitudes toward seeking mental health services and use of mobile technology to support the management of anxiety.
IF 2.1 Q2 HEALTH CARE SCIENCES & SERVICES Pub Date : 2023-10-17 eCollection Date: 2023-12-01 DOI: 10.1093/jamiaopen/ooad088
Terika McCall, Meagan Foster, Holly R Tomlin, Todd A Schwartz

Objectives: This study aimed to understand Black American women's attitudes toward seeking mental health services and using mobile technology to receive support for managing anxiety.

Methods: A self-administered web-based questionnaire was launched in October 2019 and closed in January 2020. Women who identified as Black/African American were eligible to participate. The survey consisted of approximately 70 questions and covered topics such as attitudes toward seeking professional psychological help, acceptability of using a mobile phone to receive mental health care, and screening for anxiety.

Results: The findings of the study (N = 395) showed that younger Black women were more likely to have greater severity of anxiety than their older counterparts. Respondents were more comfortable using a voice call or video call than text messaging or a mobile app to communicate with a professional and receive support to manage anxiety. Younger age, higher income, and greater scores for psychological openness and help-seeking propensity increased the odds of indicating agreement with using mobile technology to communicate with a professional. Black women in the Southern region of the United States had twice the odds of agreeing to the use of mobile apps compared with women in the Midwest and Northeast regions.

Discussion: Black American women, in general, have favorable views toward the use of mobile technology to receive support to manage anxiety.

Conclusion: Preferences and cultural appropriateness of resources should be assessed on an individual basis to increase likelihood of adoption and engagement with digital mental health interventions for management of anxiety.

Citations: 0
Medical Informatics Operating Room Vitals and Events Repository (MOVER): a public-access operating room database.
IF 2.1 Q2 HEALTH CARE SCIENCES & SERVICES Pub Date : 2023-10-17 eCollection Date: 2023-12-01 DOI: 10.1093/jamiaopen/ooad084
Muntaha Samad, Mirana Angel, Joseph Rinehart, Yuzo Kanomata, Pierre Baldi, Maxime Cannesson

Objectives: Artificial intelligence (AI) holds great promise for transforming the healthcare industry. However, despite its potential, AI is yet to see widespread deployment in clinical settings in significant part due to the lack of publicly available clinical data and the lack of transparency in the published AI algorithms. There are few clinical data repositories publicly accessible to researchers to train and test AI algorithms, and even fewer that contain specialized data from the perioperative setting. To address this gap, we present and release the Medical Informatics Operating Room Vitals and Events Repository (MOVER).

Materials and methods: This first release of MOVER includes adult patients who underwent surgery at the University of California, Irvine Medical Center from 2015 to 2022. Data for patients who underwent surgery were captured from 2 different sources: High-fidelity physiological waveforms from all of the operating rooms were captured in real time and matched with electronic medical record data.

Results: MOVER includes data from 58 799 unique patients and 83 468 surgeries. MOVER is available for download at https://doi.org/10.24432/C5VS5G; it can be downloaded by anyone who signs a data usage agreement (DUA), a requirement intended to restrict access to legitimate researchers.

Discussion: To the best of our knowledge, MOVER is the only freely available public data repository that contains electronic health record and high-fidelity physiological waveform data for patients undergoing surgery.

Conclusion: MOVER is freely available to all researchers who sign a DUA, and we hope that it will accelerate the integration of AI into healthcare settings, ultimately leading to improved patient outcomes.

JAMIA Open, volume 6, issue 4, ooad084. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10582520/pdf/ooad084.pdf
Citations: 0
Sequential autoencoders for feature engineering and pretraining in major depressive disorder risk prediction.
IF 2.1 Q2 HEALTH CARE SCIENCES & SERVICES Pub Date : 2023-10-09 eCollection Date: 2023-12-01 DOI: 10.1093/jamiaopen/ooad086
Barrett W Jones, Warren D Taylor, Colin G Walsh

Objectives: We evaluated autoencoders as a feature engineering and pretraining technique to improve major depressive disorder (MDD) prognostic risk prediction. Autoencoders can represent temporal feature relationships not identified by aggregate features. The predictive performance of autoencoders of multiple sequential structures was evaluated as feature engineering and pretraining strategies on an array of prediction tasks and compared to a restricted Boltzmann machine (RBM) and random forests as a benchmark.

Materials and methods: We studied MDD patients from Vanderbilt University Medical Center. Autoencoder models with Attention and long short-term memory (LSTM) layers were trained to create latent representations of the input data. Predictive performance was evaluated temporally by fitting random forest models to predict future outcomes with engineered features as input and by using autoencoder weights to initialize neural network layers. We evaluated area under the precision-recall curve (AUPRC) trends and variation over the study population's treatment course.
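The AUPRC metric named above can be illustrated with a small sketch. This is not the authors' evaluation code: it uses average precision, one common estimator of AUPRC, and the function name `average_precision` and the toy labels and scores are assumptions made for illustration only.

```python
def average_precision(y_true, scores):
    """Average precision: mean of the precision values at each true positive,
    taken in order of decreasing score. Ties are broken by input order."""
    n_pos = sum(1 for y in y_true if y)
    if n_pos == 0:
        raise ValueError("average precision is undefined without positives")

    # Indices sorted from highest to lowest predicted score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

    true_positives = 0
    precision_sum = 0.0
    for rank, i in enumerate(order, start=1):
        if y_true[i]:
            true_positives += 1
            precision_sum += true_positives / rank  # precision at this rank
    return precision_sum / n_pos


# Toy data (hypothetical): positives are ranked 1st and 3rd.
labels = [1, 0, 1, 0]
risk_scores = [0.9, 0.8, 0.7, 0.1]
ap = average_precision(labels, risk_scores)
```

A model that ranks patients at random has an expected AUPRC close to the outcome prevalence, so for a rare outcome even small absolute values can represent an improvement over chance.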

Results: The pretrained LSTM model improved predictive performance over pretrained Attention models and benchmarks in 3 of 4 outcomes including self-harm/suicide attempt (AUPRCs, LSTM pretrained = 0.012, Attention pretrained = 0.010, RBM = 0.009, random forest = 0.005). The use of autoencoders for feature engineering had varied results, with benchmarks outperforming LSTM and Attention encodings on the self-harm/suicide attempt outcome (AUPRCs, LSTM encodings = 0.003, Attention encodings = 0.004, RBM = 0.009, random forest = 0.005).

Discussion: Improvement in prediction resulting from pretraining has the potential for increased clinical impact of MDD risk models. We did not find evidence that the use of temporal feature encodings was additive to predictive performance in the study population. This suggests that predictive information retained by model weights may be lost during encoding. LSTM pretrained model predictive performance is shown to be clinically useful and improves over state-of-the-art predictors in the MDD phenotype. LSTM model performance warrants consideration of use in future related studies.

Conclusion: LSTM models with pretrained weights from autoencoders were able to outperform the benchmark and a pretrained Attention model. Future researchers developing risk models in MDD may benefit from the use of LSTM autoencoder pretrained weights.

JAMIA Open, volume 6, issue 4, ooad086. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10561992/pdf/
Citations: 0
Application of natural language processing to identify social needs from patient medical notes: development and assessment of a scalable, performant, and rule-based model in an integrated healthcare delivery system.
IF 2.1 Q2 HEALTH CARE SCIENCES & SERVICES Pub Date : 2023-10-04 eCollection Date: 2023-12-01 DOI: 10.1093/jamiaopen/ooad085
Geoffrey M Gray, Ayah Zirikly, Luis M Ahumada, Masoud Rouhizadeh, Thomas Richards, Christopher Kitchen, Iman Foroughmand, Elham Hatef

Objectives: To develop and test a scalable, performant, and rule-based model for identifying 3 major domains of social needs (residential instability, food insecurity, and transportation issues) from the unstructured data in electronic health records (EHRs).

Materials and methods: We included patients aged 18 years or older who received care at the Johns Hopkins Health System (JHHS) between July 2016 and June 2021 and had at least 1 unstructured (free-text) note in their EHR during the study period. We used a combination of manual lexicon curation and semiautomated lexicon creation for feature development. We developed an initial rule-based pipeline (Match Pipeline) using 2 keyword sets for each social needs domain. We performed rule-based keyword matching for distinct lexicons and tested the algorithm using an annotated dataset comprising 192 patients. Starting with a set of expert-identified keywords, we tested the adjustments by evaluating false positives and negatives identified in the labeled dataset. We assessed the performance of the algorithm using measures of precision, recall, and F1 score.
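As a rough illustration of rule-based keyword matching and the precision, recall, and F1 measures described above, the sketch below may help. It is not the study's Match Pipeline: the lexicon entries, the names `LEXICONS`, `match_domains`, and `precision_recall_f1`, and the example note are all hypothetical, and the sketch does not handle negation ("denies homelessness") or spelling variants.

```python
import re

# Hypothetical lexicons; the study's actual keyword sets are not listed in the abstract.
LEXICONS = {
    "residential_instability": ["homeless", "eviction", "shelter"],
    "food_insecurity": ["food insecurity", "food pantry", "skipping meals"],
    "transportation_issues": ["no transportation", "lack of transportation"],
}


def match_domains(note: str) -> set[str]:
    """Return the social-needs domains whose keywords appear in a free-text note."""
    text = note.lower()
    found = set()
    for domain, keywords in LEXICONS.items():
        for keyword in keywords:
            # Word boundaries prevent matches inside longer words.
            if re.search(r"\b" + re.escape(keyword) + r"\b", text):
                found.add(domain)
                break
    return found


def precision_recall_f1(predicted, actual):
    """Compute precision, recall, and F1 from parallel lists of booleans."""
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)
    fn = sum(1 for p, a in zip(predicted, actual) if a and not p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


domains = match_domains("Patient is currently homeless and uses a food pantry.")
metrics = precision_recall_f1([True, True, False, False], [True, False, True, False])
```

Reviewing the false positives and false negatives that such a matcher produces on a labeled set is the kind of step the abstract describes for adjusting the keyword lists.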

Results: The algorithm for identifying residential instability had the best overall performance, with weighted averages for precision, recall, and F1 score of 0.92, 0.84, and 0.92 for identifying patients with homelessness and 0.84, 0.82, and 0.79 for identifying patients with housing insecurity. Metrics for the food insecurity algorithm were high, but the transportation issues algorithm had the lowest overall performance.

Discussion: The NLP algorithm for identifying social needs at JHHS performed relatively well and offers an opportunity for implementation in a healthcare system.

Conclusion: The NLP approach developed in this project could be adapted and potentially operationalized in the routine data processes of a healthcare system.

JAMIA Open, volume 6, issue 4, ooad085. PDF: https://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_pdf/2e/eb/ooad085.PMC10550267.pdf
Citations: 0
Exploring the impact of missingness on racial disparities in predictive performance of a machine learning model for emergency department triage
IF 2.1 Q2 HEALTH CARE SCIENCES & SERVICES Pub Date : 2023-10-04 DOI: 10.1093/jamiaopen/ooad107
Stephanie Teeple, Aria G. Smith, Matthew F. Toerper, Scott Levin, Scott Halpern, Oluwakemi Badaki‐Makun, J. Hinson
Objective: To investigate how missing data in the patient problem list may impact racial disparities in the predictive performance of a machine learning (ML) model for emergency department (ED) triage. Racial disparities may exist in the missingness of EHR data (eg, systematic differences in access, testing, and/or treatment) that can impact model predictions across racialized patient groups. We use an ML model that predicts patients’ risk for adverse events to produce triage-level recommendations, patterned after a clinical decision support tool deployed at multiple EDs. We compared the model’s predictive performance on sets of observed (problem list data at the point of triage) versus manipulated (updated to the more complete problem list at the end of the encounter) test data. These differences were compared between Black and non-Hispanic White patient groups using multiple performance measures relevant to health equity. There were modest, but significant, changes in predictive performance comparing the observed to manipulated models across both Black and non-Hispanic White patient groups; c-statistic improvement ranged between 0.027 and 0.058. The manipulation produced no between-group differences in c-statistic by race. However, there were small between-group differences in other performance measures, with greater change for non-Hispanic White patients. Problem list missingness impacted model performance for both patient groups, with marginal differences detected by race. Further exploration is needed to examine how missingness may contribute to racial disparities in clinical model predictions across settings. The novel manipulation method demonstrated may aid future research.
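The comparison described above, a change in c-statistic between observed and manipulated test data within each patient group, can be sketched as follows. This is an illustrative assumption about how such a comparison might be computed, not the study's code; the function names, group labels, and toy scores are hypothetical.

```python
def c_statistic(y_true, scores):
    """Probability that a randomly chosen positive case receives a higher score
    than a randomly chosen negative case; ties count as one half."""
    pos = [s for y, s in zip(y_true, scores) if y]
    neg = [s for y, s in zip(y_true, scores) if not y]
    if not pos or not neg:
        raise ValueError("c-statistic needs at least one positive and one negative")
    concordant = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return concordant / (len(pos) * len(neg))


def c_statistic_change_by_group(y_true, observed, manipulated, groups):
    """Per-group difference: c-statistic on manipulated minus observed scores."""
    changes = {}
    for group in sorted(set(groups)):
        idx = [i for i, g in enumerate(groups) if g == group]
        y = [y_true[i] for i in idx]
        before = c_statistic(y, [observed[i] for i in idx])
        after = c_statistic(y, [manipulated[i] for i in idx])
        changes[group] = after - before
    return changes


# Toy data (hypothetical): only group "A" scores change after manipulation.
groups = ["A"] * 4 + ["B"] * 4
outcomes = [1, 0, 1, 0, 1, 0, 1, 0]
observed_scores = [0.9, 0.8, 0.7, 0.1, 0.9, 0.8, 0.7, 0.1]
manipulated_scores = [0.9, 0.6, 0.7, 0.1, 0.9, 0.8, 0.7, 0.1]
changes = c_statistic_change_by_group(
    outcomes, observed_scores, manipulated_scores, groups
)
```

Comparing the per-group changes, rather than only the pooled c-statistic, is what allows a between-group difference to be detected.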
Citations: 0