AMIA ... Annual Symposium proceedings. AMIA Symposium: Latest Articles

Leveraging A Clinical Dashboard and Process Mappings to Improve Treatment Access and Outcomes for Women Veterans with Urinary Incontinence.
Pub Date : 2024-01-11 eCollection Date: 2023-01-01
Grace Gao, Camille P Vaughan, Alayne D Markland, Kayla Reinicke, Neeraja Annavaram, Zachary Burningham

In support of the Improving Primary Care Understanding of Resources and Screening for Urinary Incontinence to Enhance Treatment initiative with the Veterans Health Administration, we developed a clinical dashboard to support primary care providers in identifying underdiagnosed, undertreated women Veterans with urinary incontinence. This paper describes our dashboard development and evaluation. We employed a user-centered design in determining dashboard requirements, interface design, and functionality. We invited early users at three pilot sites to formal usability reviews. We quantified the dashboard usability using the System Usability Scale and administered surveys and interviews for insights on performance. We used process maps to uncover how end users engaged with the dashboard within their local environments. User evaluations showed the dashboard to be a helpful instrument for identifying these women Veterans, with good to excellent usability performance. User feedback offers a user-driven pathway for further developing the dashboard so that it helps clinicians better care for women Veterans with urinary incontinence.
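
The abstract reports System Usability Scale (SUS) results without showing how a score is derived. As a minimal sketch, the standard SUS scoring rule (ten items rated 1 to 5, odd items scored as rating minus 1, even items as 5 minus rating, sum multiplied by 2.5) can be written as below. The function name and the example ratings are hypothetical and are not taken from the paper.

```python
def sus_score(responses):
    """Return the 0-100 SUS score for one respondent's ten 1-5 ratings."""
    if len(responses) != 10 or any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("SUS needs ten ratings, each an integer from 1 to 5")
    total = 0
    for item, rating in enumerate(responses, start=1):
        # Odd-numbered items are positively worded, even-numbered items negatively worded.
        total += (rating - 1) if item % 2 == 1 else (5 - rating)
    return total * 2.5


# Hypothetical respondent, for illustration only.
example_ratings = [4, 2, 5, 1, 4, 2, 5, 2, 4, 1]
print(sus_score(example_ratings))
```

Scores from several respondents are typically averaged to summarize a system's usability.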

AMIA ... Annual Symposium proceedings. AMIA Symposium, vol. 2023, pp. 359-368. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10785906/pdf/
Citations: 0
Leveraging Unlabeled Clinical Data to Boost Performance of Risk Stratification Models for Suspected Acute Coronary Syndrome.
Pub Date : 2024-01-11 eCollection Date: 2023-01-01
Yutong Wu, David Conlan, Siegfried Perez, Anthony Nguyen

The performance of deep learning models in the health domain is severely limited by the scarcity of labeled data, especially for specific clinical-domain tasks. Conversely, vast amounts of unlabeled clinical data are available to be exploited to improve deep learning models whose labeled training data are limited. This paper investigates the use of task-specific unlabeled data to boost the performance of classification models for the risk stratification of suspected acute coronary syndrome. By leveraging large numbers of unlabeled clinical notes in task-adaptive language model pretraining, valuable prior task-specific knowledge can be attained. Based on such pretrained models, task-specific fine-tuning with limited labeled data produces better performance. Extensive experiments demonstrate that task-specific language models pretrained on task-specific unlabeled data can significantly improve the performance of the downstream models for specific classification tasks.

AMIA ... Annual Symposium proceedings. AMIA Symposium, vol. 2023, pp. 744-753. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10785873/pdf/
Citations: 0
Outliers in diagnosis ratios: A clue toward possibly absent data.
Pub Date : 2024-01-11 eCollection Date: 2023-01-01
Dmitry Morozyuk, Mark G Weiner

The evaluation of completeness of real-world data is a particularly challenging component of data quality assessment because the degree of truly versus erroneously absent data is unknown. Among inpatient data sets, while absolute counts of admissions having specific categories of diagnoses in the principal or any position may vary depending on hospital size, we hypothesized that the ratio of these parameters will be preserved across sites, with outliers suggesting the potential for erroneously absent data. For several categories of clinical conditions assigned to inpatient admissions, we analyzed the ratio of their recording as the principal diagnosis versus any diagnosis across several hospitals and compared the ratios against a national benchmark. Our analysis showed ratios that matched clinical expectations, with reasonable preservation of ratios across sites. However, some conditions exhibited more variability in the ratios and some sites had many outliers possibly reflecting data quality issues that warrant further attention.
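
The ratio comparison described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' method: the site names, counts, benchmark value, and the relative-deviation threshold `tolerance` are all hypothetical, and the paper may define outliers differently.

```python
def principal_to_any_ratio(principal_count, any_count):
    """Ratio of admissions with the condition as principal diagnosis to those with it in any position."""
    return principal_count / any_count if any_count else None


def flag_outlier_sites(site_counts, benchmark_ratio, tolerance=0.5):
    """Flag sites whose ratio deviates from the benchmark by more than `tolerance` (relative)."""
    flagged = {}
    for site, (principal, any_position) in site_counts.items():
        ratio = principal_to_any_ratio(principal, any_position)
        if ratio is None:
            flagged[site] = None  # no admissions recorded at all for this condition
        elif abs(ratio - benchmark_ratio) / benchmark_ratio > tolerance:
            flagged[site] = round(ratio, 3)
    return flagged


# Hypothetical counts for one condition category: (principal position, any position).
site_counts = {"site_a": (120, 400), "site_b": (95, 310), "site_c": (20, 390)}
outliers = flag_outlier_sites(site_counts, benchmark_ratio=0.30)
print(outliers)
```

A ratio works here because it is insensitive to hospital size, so a site that deviates sharply is a candidate for missing secondary or principal diagnoses rather than simply a smaller hospital.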

AMIA ... Annual Symposium proceedings. AMIA Symposium, vol. 2023, pp. 1175-1182. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10785923/pdf/
Citations: 0
Sensitive Data Detection with High-Throughput Machine Learning Models in Electrical Health Records.
Pub Date : 2024-01-11 eCollection Date: 2023-01-01
Kai Zhang, Xiaoqian Jiang

In the era of big data, there is an increasing need for healthcare providers, communities, and researchers to share data and collaborate to improve health outcomes, generate valuable insights, and advance research. The Health Insurance Portability and Accountability Act of 1996 (HIPAA) is a federal law designed to protect sensitive health information by defining regulations for protected health information (PHI). However, it does not provide efficient tools for detecting or removing PHI before data sharing. One of the challenges in this area of research is the heterogeneous nature of PHI fields in data across different parties. This variability makes rule-based sensitive variable identification systems that work on one database fail on another. To address this issue, our paper explores the use of machine learning algorithms to identify sensitive variables in structured data, thus facilitating the de-identification process. We made a key observation that the distributions of metadata of PHI fields and non-PHI fields are very different. Based on this novel finding, we engineered over 30 features from the metadata of the original features and used machine learning to build classification models to automatically identify PHI fields in structured Electronic Health Record (EHR) data. We trained the model on a variety of large EHR databases from different data sources and found that our algorithm achieves 99% accuracy when detecting PHI-related fields for unseen datasets. The implications of our study are significant and can benefit industries that handle sensitive data.
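
To make the idea of metadata-derived features concrete, here is a small sketch that summarizes a column by properties of its values rather than by its name or meaning. The four features shown are assumptions chosen for illustration; the abstract does not list the 30+ features the authors engineered, and the sample columns are invented.

```python
def column_metadata_features(values):
    """Summarize one column using value-level metadata only."""
    present = [str(v) for v in values if v is not None and str(v) != ""]
    n = len(values)
    if not present:
        return {"missing_rate": 1.0 if n else 0.0, "unique_ratio": 0.0,
                "mean_length": 0.0, "digit_fraction": 0.0}
    total_chars = sum(len(v) for v in present)
    digits = sum(ch.isdigit() for v in present for ch in v)
    return {
        "missing_rate": 1 - len(present) / n,              # share of empty cells
        "unique_ratio": len(set(present)) / len(present),  # identifiers tend toward 1.0
        "mean_length": total_chars / len(present),
        "digit_fraction": digits / total_chars,
    }


# Hypothetical columns: an identifier-like field and a low-cardinality field.
record_number = ["100234", "100235", "100236", "100237"]
sex = ["F", "M", "F", "F"]
print(column_metadata_features(record_number))
print(column_metadata_features(sex))
```

Feature vectors like these, one per column, could then be passed to any standard classifier that labels each column as PHI or non-PHI.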

AMIA ... Annual Symposium proceedings. AMIA Symposium, vol. 2023, pp. 814-823. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10785837/pdf/
Citations: 0
An Automated Strategy to Calculate Medication Regimen Complexity.
Pub Date : 2024-01-11 eCollection Date: 2023-01-01
Yuzhi Lu, Ariel R Green, Rosalphie Quiles, Casey Overby Taylor

Understanding medication regimen complexity is important for identifying which patients may benefit from pharmacist interventions. The Medication Regimen Complexity Index (MRCI), a 65-item tool that quantifies complexity by incorporating the count, dosage form, frequency, and additional administration instructions of prescription medicines, provides a more nuanced way of assessing complexity. The goal of this study was to construct and validate a computational strategy to automate the calculation of MRCI. The performance of our strategy was evaluated by comparing our calculated MRCI values with gold-standard values, using correlation coefficients and population distributions. The results revealed satisfactory performance in calculating the sub-score of MRCI that includes dosage form and frequency (76 to 80% match with gold standard), and fair performance for the sub-score related to additional directions (52% match with gold standard). Our automated strategy shows potential to reduce the effort of manually calculating MRCI and highlights areas for future development efforts.
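
The structure of such a calculation can be sketched as three sub-scores (dosage form, frequency, additional directions) summed into a total. Everything below is an assumption for illustration: the weight tables are placeholders rather than the published MRCI weights, the rule that each distinct dosage form is counted once is one reading of the instrument, and the sketch does not reproduce the authors' pipeline.

```python
# Placeholder weights for illustration only; NOT the published MRCI tables.
FORM_WEIGHTS = {"tablet": 1, "inhaler": 4}
FREQUENCY_WEIGHTS = {"once daily": 1, "twice daily": 2}
DIRECTION_WEIGHTS = {"take with food": 1}


def mrci_sketch(regimen):
    """Sum three sub-scores for a regimen given as a list of medication dicts."""
    # Assumption: each distinct dosage form contributes once per regimen.
    section_a = sum(FORM_WEIGHTS[form] for form in {med["form"] for med in regimen})
    # Assumption: frequency and direction weights are added per medication.
    section_b = sum(FREQUENCY_WEIGHTS[med["frequency"]] for med in regimen)
    section_c = sum(DIRECTION_WEIGHTS[d] for med in regimen for d in med.get("directions", []))
    return {"A": section_a, "B": section_b, "C": section_c,
            "total": section_a + section_b + section_c}


# Hypothetical three-drug regimen.
regimen = [
    {"form": "tablet", "frequency": "once daily", "directions": ["take with food"]},
    {"form": "tablet", "frequency": "twice daily"},
    {"form": "inhaler", "frequency": "twice daily"},
]
print(mrci_sketch(regimen))
```

In practice the hard part is mapping free-text prescription instructions onto these table keys, which is consistent with the lower match rate the abstract reports for additional directions.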

AMIA ... Annual Symposium proceedings. AMIA Symposium, vol. 2023, pp. 1077-1086. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10785893/pdf/
Citations: 0
Using Case Mix Index within Diagnosis-Related Groups to Evaluate Variation in Hospitalization Costs at a Large Academic Medical Center.
Pub Date : 2024-01-11 eCollection Date: 2023-01-01
Selina Pi, Jonathan Masterson, Stephen P Ma, Conor K Corbin, Arnold Milstein, Jonathan H Chen

In analyzing direct hospitalization cost and clinical data from an academic medical center, commonly used metrics such as diagnosis-related group (DRG) weight explain approximately 37% of cost variability, but a substantial amount of variation remains unaccounted for by case mix index (CMI) alone. Using CMI as a benchmark, we isolate and target individual DRGs with higher than expected average costs for specific quality improvement efforts. While DRGs summarize hospitalization care after discharge, a predictive model using only information known before admission explained up to 60% of cost variability for two DRGs with a high excess cost burden. This level of variability likely reflects underlying patient factors that are not modifiable (e.g., age and prior comorbidities) and therefore less useful for health systems to target for intervention. However, the remaining unexplained variation can be inspected in further studies to discover operational factors that health systems can target to improve quality and value for their patients. Since DRG weights represent the expected resource consumption for a specific hospitalization type relative to the average hospitalization, the data-driven approach we demonstrate can be utilized by any health institution to quantify excess costs and potential savings among DRGs.
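
One way to read "higher than expected average costs" is to treat the institution's average cost per unit of DRG weight as the benchmark and compare each DRG's mean cost against its weight times that rate. The sketch below follows that reading; it is an assumption, not the authors' exact procedure, and the DRG codes, weights, and costs are invented.

```python
def excess_cost_by_drg(admissions, drg_weights):
    """Return {drg: mean actual cost minus weight-based expected cost}.

    admissions: list of (drg_code, cost) pairs for individual hospitalizations.
    drg_weights: {drg_code: relative weight}.
    """
    total_cost = sum(cost for _, cost in admissions)
    total_weight = sum(drg_weights[drg] for drg, _ in admissions)
    cost_per_weight = total_cost / total_weight  # institution-wide cost per unit of weight

    costs_by_drg = {}
    for drg, cost in admissions:
        costs_by_drg.setdefault(drg, []).append(cost)

    return {
        drg: sum(costs) / len(costs) - drg_weights[drg] * cost_per_weight
        for drg, costs in costs_by_drg.items()
    }


# Hypothetical data: two DRGs with relative weights 1.0 and 2.0.
weights = {"DRG_X": 1.0, "DRG_Y": 2.0}
admissions = [("DRG_X", 10000), ("DRG_X", 14000), ("DRG_Y", 18000)]
excess = excess_cost_by_drg(admissions, weights)
print(excess)
```

A positive value marks a DRG whose average cost exceeds what its weight alone would predict, which is the kind of group the abstract proposes targeting for quality improvement.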

AMIA ... Annual Symposium proceedings. AMIA Symposium, vol. 2023, pp. 1201-1208. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10785921/pdf/
Citations: 0
Towards Understanding the Generalization of Medical Text-to-SQL Models and Datasets.
Pub Date : 2024-01-11 eCollection Date: 2023-01-01
Richard Tarbell, Kim-Kwang Raymond Choo, Glenn Dietrich, Anthony Rios

Electronic medical records (EMRs) are stored in relational databases. It can be challenging to access the required information if the user is unfamiliar with the database schema or general database fundamentals. Hence, researchers have explored text-to-SQL generation methods that provide healthcare professionals direct access to EMR data without needing a database expert. However, currently available datasets have been essentially "solved", with state-of-the-art models achieving accuracy greater than or near 90%. In this paper, we show that there is still a long way to go before solving text-to-SQL generation in the medical domain. To show this, we create new splits of the existing medical text-to-SQL dataset MIMICSQL that better measure the generalizability of the resulting models. We evaluate state-of-the-art language models on our new split and observe substantial drops in performance, with accuracy falling from as high as 92% to 28%, which leaves substantial room for improvement. Moreover, we introduce a novel data augmentation approach to improve the generalizability of the language models. Overall, this paper is the first step towards developing more robust text-to-SQL models in the medical domain.
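
The abstract does not say how the new splits are built. One common way to test generalization in text-to-SQL, shown here purely as an assumption, is to hold out whole query templates so that test queries differ from training queries in structure and not only in literal values. The helper names, table, and questions below are hypothetical.

```python
import re


def sql_template(sql):
    """Mask string and numeric literals so queries differing only in values share a template."""
    masked = re.sub(r"'[^']*'|\"[^\"]*\"", "<str>", sql)
    masked = re.sub(r"\b\d+(?:\.\d+)?\b", "<num>", masked)
    return " ".join(masked.lower().split())


def split_by_template(pairs, test_templates):
    """Send every (question, sql) pair whose template is held out to the test set."""
    train, test = [], []
    for question, sql in pairs:
        (test if sql_template(sql) in test_templates else train).append((question, sql))
    return train, test


# Hypothetical question/SQL pairs.
pairs = [
    ("how many patients are older than 60?", "SELECT COUNT(*) FROM demographic WHERE age > 60"),
    ("how many patients are older than 75?", "SELECT COUNT(*) FROM demographic WHERE age > 75"),
    ("list patients named john", "SELECT name FROM demographic WHERE name = 'john'"),
]
held_out = {sql_template(pairs[0][1])}
train, test = split_by_template(pairs, held_out)
print(len(train), len(test))
```

Under a random split the first two questions could land on opposite sides, letting a model score well by memorizing the template; a template-level split removes that shortcut.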

AMIA ... Annual Symposium proceedings. AMIA Symposium, vol. 2023, pp. 669-678. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10785918/pdf/
Citations: 0
Computational Phenotyping of OMOP CDM Normalized EHR for Prenatal and Postpartum Episodes: An Informatics Framework and Clinical Implementation on All of Us.
Pub Date : 2024-01-11 eCollection Date: 2023-01-01
Tianchu Lyu, Chen Liang

The use of Electronic Health Records (EHR) in pregnancy care and obstetrics-gynecology (OB/GYN) research has increased in recent years. In pregnancy, timing is important because clinical characteristics, risks, and patient management are different in each stage of pregnancy. However, the difficulty of accurately differentiating pregnancy episodes and temporal information of clinical events presents unique challenges for EHR phenotyping. In this work, we introduced the concept of time relativity and proposed a comprehensive framework of computational phenotyping for prenatal and postpartum episodes based on the Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM). We implemented it on the All of Us national EHR database and identified 6,280 pregnancies with accurate start and end dates among 5,399 female patients. With the ability to identify different episodes in pregnancy care, this framework provides new opportunities for phenotyping complex clinical events and gestational morbidities for pregnant women, thus improving maternal and infant health.
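
The abstract names "time relativity" without defining it. One plausible reading, sketched below as an assumption, is to express each clinical event relative to the start or end date of a pregnancy episode and label it prenatal or postpartum accordingly. The function name, the 84-day postpartum window, and the dates are hypothetical and do not come from the paper.

```python
from datetime import date


def classify_event(event_date, pregnancy_start, pregnancy_end, postpartum_days=84):
    """Return (label, offset_days) for an event relative to one pregnancy episode.

    The offset is counted from the pregnancy start for events up to the end date,
    and from the pregnancy end for later events.
    """
    if event_date < pregnancy_start:
        return "before pregnancy", (event_date - pregnancy_start).days
    if event_date <= pregnancy_end:
        return "prenatal", (event_date - pregnancy_start).days
    days_after = (event_date - pregnancy_end).days
    label = "postpartum" if days_after <= postpartum_days else "after postpartum window"
    return label, days_after


# Hypothetical episode and events.
start, end = date(2022, 1, 1), date(2022, 10, 1)
print(classify_event(date(2022, 3, 12), start, end))
print(classify_event(date(2022, 10, 15), start, end))
```

Relative offsets like these make events from different pregnancies comparable, since two diagnoses recorded at day 70 of gestation line up regardless of their calendar dates.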

AMIA ... Annual Symposium proceedings. AMIA Symposium, vol. 2023, pp. 1096-1104. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10785883/pdf/
Citations: 0
Creating Augmented Reality Holograms for Polytrauma Patients Using 3D Slicer and Holomedicine Medical Image Platform.
Pub Date : 2024-01-11 eCollection Date: 2023-01-01
Wei-Shao Sun, Chun-Chuan Sun, Lorenzo Porta, Ting-Kai Yang, Shih-Hao Su, Shih-Hung Liu, Tsung-Hsin Chou, Shyr-Chyr Chen, Joshua Ho, Chien-Chang Lee

In traumatology, physicians rely heavily on computed tomography (CT) 2D axial scans to identify and assess the patient's injuries after an accident. However, in some cases it can be difficult to rigorously evaluate the real extent of the damage considering only the bidimensional slices produced by the CT, and some life-threatening lesions can be missed. With the development of 3D holographic rendering and extended reality (XR) technology, CT images can be projected in a 3D format through head-mounted holographic displays, allowing multi-view from different angles and interactive slice intersections, thus increasing anatomical intelligibility. In this article, we explain how to import CT scans into holographic displays for 3D visualization and further compare the methodology with traditional bidimensional reading.

AMIA ... Annual Symposium proceedings. AMIA Symposium, vol. 2023, pp. 663-668. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10785888/pdf/
Citations: 0
Development of a Knowledge Graph Embeddings Model for Pain.
Pub Date : 2024-01-11 eCollection Date: 2023-01-01
Jaya Chaturvedi, Tao Wang, Sumithra Velupillai, Robert Stewart, Angus Roberts

Pain is a complex concept that can interconnect with other concepts such as a disorder that might cause pain, a medication that might relieve pain, and so on. To fully understand the context of pain experienced by either an individual or across a population, we may need to examine all concepts related to pain and the relationships between them. This is especially useful when modeling pain that has been recorded in electronic health records. Knowledge graphs represent concepts and their relations by an interlinked network, enabling semantic and context-based reasoning in a computationally tractable form. These graphs can, however, be too large for efficient computation. Knowledge graph embeddings help to resolve this by representing the graphs in a low-dimensional vector space. These embeddings can then be used in various downstream tasks such as classification and link prediction. The various relations associated with pain which are required to construct such a knowledge graph can be obtained from external medical knowledge bases such as SNOMED CT, a hierarchical systematic nomenclature of medical terms. A knowledge graph built in this way could be further enriched with real-world examples of pain and its relations extracted from electronic health records. This paper describes the construction of such knowledge graph embedding models of pain concepts, extracted from the unstructured text of mental health electronic health records, combined with external knowledge created from relations described in SNOMED CT, and their evaluation on a subject-object link prediction task. The performance of the models was compared with other baseline models.
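
To make the link prediction idea in the abstract concrete, here is a minimal sketch. It assumes a TransE-style scoring function, which is a common choice but is not named in the abstract, so treat it as an illustration rather than the authors' model. The entity names, relation names, and 2-D vectors are hypothetical and hand-picked; real embeddings would be learned from the graph.

```python
import math

# Hypothetical, hand-picked 2-D embeddings (not learned, not taken from the paper).
ENTITY_VECS = {
    "pain": (1.0, 0.0),
    "disorder": (1.0, 1.0),
    "medication": (0.0, -1.0),
}
RELATION_VECS = {
    "caused_by": (0.0, 1.0),
    "relieved_by": (-1.0, -1.0),
}


def transe_score(head, relation, tail):
    """TransE-style plausibility: negative distance between head + relation and tail."""
    h = ENTITY_VECS[head]
    r = RELATION_VECS[relation]
    t = ENTITY_VECS[tail]
    translated = [hi + ri for hi, ri in zip(h, r)]
    return -math.dist(translated, t)


def rank_tails(head, relation):
    """Rank every other entity as a candidate object for (head, relation, ?)."""
    candidates = [e for e in ENTITY_VECS if e != head]
    return sorted(
        candidates,
        key=lambda tail: transe_score(head, relation, tail),
        reverse=True,
    )


if __name__ == "__main__":
    for relation in RELATION_VECS:
        print(relation, rank_tails("pain", relation))
```

Under these toy vectors, the candidate whose embedding lies closest to `head + relation` is ranked first, which is the basic mechanism a subject-object link prediction evaluation measures.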

AMIA ... Annual Symposium proceedings. AMIA Symposium, vol. 2023, pp. 299-308. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10785867/pdf/
Citations: 0