AMIA Joint Summits on Translational Science proceedings. AMIA Joint Summits on Translational Science: latest articles

Pragmatic De-Identification of Cross-Domain Unstructured Documents: A Utility-Preserving Approach with Relation Extraction Filtering.
Liubov Nedoshivina, Anisa Halimi, Joao Bettencourt-Silva, Stefano Braghin

The volume of information, and in particular personal information, generated each day is increasing at a staggering rate. The ability to leverage such information depends greatly on satisfying the many compliance and privacy regulations that are appearing around the world. We present READI, a utility-preserving framework for de-identifying unstructured documents. READI leverages Named Entity Recognition and Relation Extraction to improve the quality of entity detection, thus improving the overall quality of the de-identification process. In this proof-of-concept study, we evaluate the proposed approach on two different datasets and compare it with existing state-of-the-art approaches. We show that the Relation Extraction-based Approach for De-Identification (READI) notably reduces the number of false positives and improves the utility of the de-identified text.

Journal Article. Published 2024-05-31. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11141830/pdf/
Citations: 0
SABER: Statistical Identification of Loci of Interest in GWAS Summary Statistics using a Bayesian Gaussian Mixture Model.
Rachit Kumar, Rasika Venkatesh, Marylyn D Ritchie

Genome-wide association studies (GWAS) remain a popular method for identifying novel genetic associations with human phenotypes and have provided many insights into the etiology of many diseases. However, GWAS provide limited support for how a genetic association might contribute to disease due to inherent limitations, such as linkage disequilibrium. As such, many methods that operate on GWAS summary statistics have been developed to generate evidence for functional pathways or for variants of interest, but they require defining the genomic region bounds for loci of interest. At present, there are limited methods for determining these bounds in a rigorous, reproducible way. We present a novel statistical method, Statistical Analysis for Bayesian Estimation of Regions (SABER), that uses Bayesian Gaussian mixture models to reproducibly generate ratios that quantify whether particular genomic positions represent the bounds of loci of interest and can be used to delineate genomic regions for downstream analyses.

Journal Article. Published 2024-05-31. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11141805/pdf/
Citations: 0
Topology-based Clustering of Functional Brain Networks in an Alzheimer's Disease Cohort.
Frederick H Xu, Michael Gao, Jiong Chen, Sumita Garai, Duy Anh Duong-Tran, Yize Zhao, Li Shen

Alzheimer's disease is a progressive neurodegenerative disease with many identifying biomarkers for diagnosis. However, whole-brain phenomena, particularly in functional MRI modalities, are neither fully understood nor fully characterized. Here we apply topological data analysis (TDA)-based methods of persistent homology to functional brain networks from the ADNI-3 cohort to perform a subtyping experiment using unsupervised clustering techniques. We then investigate variations in QT-PAD challenge features across the identified clusters. Using a Wasserstein distance kernel with a variety of clustering algorithms, we found that the 0th-homology Wasserstein distance kernel and spectral clustering yielded clusters with significant differences in whole brain and medial temporal lobe (MTL) volume, thus demonstrating an intrinsic link between whole brain functional topology and brain morphometric structure. These findings demonstrate the importance of the MTL in functional connectivity and the efficacy of using TDA-based machine learning methods in network neuroscience and neurodegenerative disease subtyping.
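
As a rough illustration of the 0th-homology ingredient mentioned above (not the authors' pipeline): in a Vietoris-Rips filtration over a distance matrix, every connected component is born at 0, and the finite death times equal the edge weights of a minimum spanning tree. The sketch below computes those death times with a union-find. The function name `h0_death_times` and the toy matrix are hypothetical, and how functional connectivity is turned into distances (for example 1 - correlation) is an assumption the abstract does not specify.

```python
from itertools import combinations


def h0_death_times(dist):
    """Finite H0 death times of the Rips filtration on a symmetric distance matrix."""
    n = len(dist)
    parent = list(range(n))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Process edges in order of increasing distance (Kruskal's algorithm).
    edges = sorted((dist[i][j], i, j) for i, j in combinations(range(n), 2))
    deaths = []
    for weight, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            # Two components merge here, so one H0 class dies at this scale.
            parent[root_i] = root_j
            deaths.append(weight)
    return deaths


# Toy 4-node distance matrix (made-up values, not ADNI data).
toy = [
    [0.0, 0.2, 0.9, 0.8],
    [0.2, 0.0, 0.7, 0.6],
    [0.9, 0.7, 0.0, 0.1],
    [0.8, 0.6, 0.1, 0.0],
]
print(h0_death_times(toy))  # [0.1, 0.2, 0.6]
```

Each subject's network would yield one such list of death times. Comparing subjects with a Wasserstein distance and clustering on the resulting kernel are separate steps that this sketch does not cover.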

Journal Article. Published 2024-05-31. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11141857/pdf/
Citations: 0
Automated Family Histories Significantly Improve Risk Prediction in an EHR.
Xiayuan Huang, Ross Kleiman, David Page, Scott Hebbring

We recently demonstrated that electronically constructed family pedigrees (e-pedigrees) have great value in epidemiologic research using electronic health record (EHR) data. Prior to this work, it has been well accepted that family health history is a major predictor for a wide spectrum of diseases, reflecting shared effects of genetics, environment, and lifestyle. With the widespread digitalization of patient data via EHRs, there is an unprecedented opportunity to use machine learning algorithms to better predict disease risk. Although predictive models have previously been constructed for a few important diseases, we currently know very little about how accurately the risk for most diseases can be predicted. It is further unknown whether the incorporation of e-pedigrees in machine learning can improve the value of these models. In this study, we devised a family pedigree-driven high-throughput machine learning pipeline to simultaneously predict risks for thousands of diagnosis codes using thousands of input features. Models were built to predict future disease risk for three time windows using both Logistic Regression and XGBoost. For example, using XGBoost without e-pedigrees, we achieved average areas under the receiver operating characteristic curve (AUCs) of 0.82, 0.77, and 0.71 for 1, 6, and 24 months, respectively. When e-pedigree features were added to the XGBoost pipeline, AUCs increased to 0.83, 0.79, and 0.74 for the same three time periods, respectively. E-pedigrees similarly improved the predictions when using Logistic Regression. These results emphasize the potential value of incorporating family health history via e-pedigrees into machine learning with no further human time.

Journal Article. Published 2024-05-31. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11141855/pdf/
Citations: 0
A Study of Biomedical Relation Extraction Using GPT Models.
Jeffrey Zhang, Maxwell Wibert, Huixue Zhou, Xueqing Peng, Qingyu Chen, Vipina K Keloth, Yan Hu, Rui Zhang, Hua Xu, Kalpana Raja

Relation Extraction (RE) is a natural language processing (NLP) task for extracting semantic relations between biomedical entities. Recent developments in pre-trained large language models (LLMs) have motivated NLP researchers to use them for various NLP tasks. We investigated GPT-3.5-turbo and GPT-4 on extracting relations from three standard datasets: EU-ADR, Gene Associations Database (GAD), and ChemProt. Unlike existing approaches that use datasets with masked entities, we used three versions of each dataset in our experiments: a version with masked entities, a second version with the original (unmasked) entities, and a third version with abbreviations replaced by the original terms. We developed prompts for the various versions and used the chat completion model from the GPT API. Our approach achieved an F1-score of 0.498 to 0.809 for GPT-3.5-turbo, and a highest F1-score of 0.84 for GPT-4. For certain experiments, the performance of GPT, BioBERT, and PubMedBERT is almost the same.
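
For readers unfamiliar with the metric quoted above, the F1-score is the harmonic mean of precision and recall. The following is a minimal sketch for binary relation labels, assuming 1 means "relation present". The function name and the toy labels are hypothetical and are not taken from EU-ADR, GAD, or ChemProt.

```python
def f1_score(gold, pred, positive=1):
    """Harmonic mean of precision and recall for one positive class."""
    pairs = list(zip(gold, pred))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    if tp == 0:
        # No true positives: precision or recall is zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up gold labels and model predictions for six sentences.
gold = [1, 1, 1, 0, 0, 1]
pred = [1, 0, 1, 1, 0, 1]
print(f1_score(gold, pred))  # 3 TP, 1 FP, 1 FN -> precision = recall = 0.75
```

Whether the paper averages F1 per class or reports it for the positive class only is not stated in the abstract, so the single-class form here is an assumption.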

Journal Article. Published 2024-05-31. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11141827/pdf/
Citations: 0
An Automated Approach for Identifying Erroneous IS-A Relations in SNOMED CT.
Ran Hu, Jay Shi, Licong Cui, Rashmie Abeysinghe

SNOMED CT is the most comprehensive clinical terminology employed worldwide and enhancing its accuracy is of utmost importance. In this work, we introduce an automated approach to identifying erroneous IS-A relations in SNOMED CT. We first extract linked concept-pairs from which we generate Term Difference Pairs (TDPs) that contain differences between the concepts. Given a TDP, if the reversed TDP also exists and the number of linked-pairs generating this TDP is less than those generating the reversed TDP, then we suggest the former linked-pairs as potentially erroneous IS-A relations. We applied this approach to the Clinical finding and Procedure subhierarchies of the 2022 March US Edition of SNOMED CT, and obtained 52 potentially erroneous IS-A relations and a candidate list of 48 linked-pairs. A domain expert confirmed 41 out of 52 (78.8%) are valid and identified 26 erroneous IS-A relations out of 48 linked-pairs demonstrating the effectiveness of the approach.
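
The counting rule described above can be sketched as follows. This is a simplified reading, not the authors' implementation: it assumes each linked pair is a (child, parent) IS-A pair of concept names and that a Term Difference Pair is the pair of word sets unique to each name. The paper's actual definitions may differ, and the function names and toy concept names are hypothetical.

```python
from collections import defaultdict


def term_difference_pair(child, parent):
    """Words unique to the child name and words unique to the parent name."""
    child_words = set(child.lower().split())
    parent_words = set(parent.lower().split())
    return (frozenset(child_words - parent_words),
            frozenset(parent_words - child_words))


def suspicious_links(is_a_pairs):
    """Flag links whose TDP is rarer than its reversed TDP."""
    by_tdp = defaultdict(list)
    for child, parent in is_a_pairs:
        by_tdp[term_difference_pair(child, parent)].append((child, parent))

    flagged = []
    for (only_child, only_parent), links in by_tdp.items():
        # .get() avoids inserting new keys while iterating.
        reverse_links = by_tdp.get((only_parent, only_child), [])
        if reverse_links and len(links) < len(reverse_links):
            flagged.extend(links)
    return flagged


# Toy (child, parent) names for illustration only, not checked against SNOMED CT.
pairs = [
    ("acute bronchitis", "bronchitis"),
    ("acute sinusitis", "sinusitis"),
    ("gastritis", "acute gastritis"),
]
print(suspicious_links(pairs))  # [('gastritis', 'acute gastritis')]
```

In the toy data, two links drop the word "acute" when moving from child to parent and one link adds it, so the minority direction is the one flagged for expert review.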

Journal Article. Published 2024-05-31. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11141797/pdf/
Citations: 0
An Explainable Artificial Intelligence-enabled ECG Framework for the Prediction of Subclinical Coronary Atherosclerosis.
Changho Han, Dukyong Yoon

Coronary artery calcium (CAC) as assessed by computed tomography (CT) is a marker of subclinical coronary atherosclerosis. However, routine application of CAC scoring via CT is limited by high costs and accessibility. An electrocardiogram (ECG) is a widely-used, sensitive, cost-effective, non-invasive, and radiation-free diagnostic tool. Considering this, if artificial intelligence (AI)-enabled electrocardiograms (ECGs) could opportunistically detect CAC, it would be particularly beneficial for the asymptomatic or subclinical populations, acting as an initial screening measure, paving the way for further confirmatory tests and preventive strategies, a step ahead of conventional practices. With this aim, we developed an AI-enabled ECG framework that not only predicts a CAC score ≥400 but also offers a visual explanation of the associated potential morphological ECG changes, and tested its efficacy on individuals undergoing health checkups, a group primarily comprising healthy or subclinical individuals. To ensure broader applicability, we performed external validation at a separate institution.

Journal Article. Published 2024-05-31. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11141849/pdf/
Citations: 0
A Traumatic Brain Injury Prescreening Tool for Intimate Partner Violence Patients Using Initial Clinical Reports and Machine Learning.
Abheet Singh Sachdeva, Avery Bell, Dr Jacob Furst, Dorothy A Kozlowski, Sonya Crabtree-Nelson, Daniela Raicu

Research studies have presented an unappreciated relationship between intimate partner violence (IPV) survivors and symptoms of traumatic brain injuries (TBI). Within these IPV survivors, resulting TBIs are not always identified during emergency room visits. This demonstrates a need for a prescreening tool that identifies IPV survivors who should receive TBI screening. We present a model that measures similarities to clinical reports for confirmed TBI cases to identify whether a patient should be screened for TBI. This is done through an ensemble of three supervised learning classifiers which work in two distinct feature spaces. Individual classifiers are trained on clinical reports and then used to create an ensemble that needs only one positive label to indicate a patient should be screened for TBI.
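
The "one positive label is enough" rule described above amounts to a logical OR over the individual classifiers, which favors sensitivity over specificity. Below is a minimal sketch of that combination step only. The three stand-in classifiers are hypothetical keyword rules used purely as placeholders. They are not clinical criteria and are not the trained supervised models from the paper.

```python
def needs_tbi_screening(report, classifiers):
    """Return True if at least one classifier labels the report positive."""
    return any(clf(report) for clf in classifiers)


def mentions(keyword):
    """Placeholder classifier: positive when the keyword appears in the report."""
    return lambda report: keyword in report.lower()


# Three stand-ins for the trained classifiers (illustrative placeholders only).
classifiers = [
    mentions("loss of consciousness"),
    mentions("strangulation"),
    mentions("head"),
]

print(needs_tbi_screening("Patient reports a blow to the head.", classifiers))  # True
print(needs_tbi_screening("Sprained ankle after a fall.", classifiers))         # False
```

An OR rule can only add positives relative to any single classifier, which suits a prescreening tool where a missed case is costlier than an extra referral.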

Journal Article. Published 2024-05-31. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11141795/pdf/
Citations: 0
Aiming for Relevance.
Bar Eini-Porat, Danny Eytan, Uri Shalit

Vital signs are crucial in intensive care units (ICUs). They are used to track the patient's state and to identify clinically significant changes. Predicting vital sign trajectories is valuable for early detection of adverse events. However, conventional machine learning metrics like RMSE often fail to capture the true clinical relevance of such predictions. We introduce novel vital sign prediction performance metrics that align with clinical contexts, focusing on deviations from clinical norms, overall trends, and trend deviations. These metrics are derived from empirical utility curves obtained in a previous study through interviews with ICU clinicians. We validate the metrics' usefulness using simulated and real clinical datasets (MIMIC and eICU). Furthermore, we employ these metrics as loss functions for neural networks, resulting in models that excel in predicting clinically significant events. This research paves the way for clinically relevant machine learning model evaluation and optimization, promising to improve ICU patient care.
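
To make the contrast with plain RMSE concrete, here is a hypothetical error measure that penalizes mistakes more heavily when the true value lies outside a normal range. It is only a sketch of the general idea of weighting by deviation from clinical norms. It is not one of the paper's metrics, and the range bounds and the weight of 3.0 are arbitrary placeholders rather than values derived from the clinicians' utility curves.

```python
def norm_weighted_mse(y_true, y_pred, low, high, outside_weight=3.0):
    """Mean squared error with extra weight where the true value is out of range."""
    total = 0.0
    for truth, guess in zip(y_true, y_pred):
        # Errors on abnormal true values count outside_weight times as much.
        weight = 1.0 if low <= truth <= high else outside_weight
        total += weight * (truth - guess) ** 2
    return total / len(y_true)


# Made-up heart-rate values; the 60-100 range is an illustrative placeholder.
y_true = [80, 130]
y_pred = [82, 120]
print(norm_weighted_mse(y_true, y_pred, low=60, high=100))  # (1*4 + 3*100) / 2 = 152.0
```

Under plain mean squared error the same predictions would score (4 + 100) / 2 = 52.0, so the weighted version shifts attention toward the abnormal reading.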

{"title":"Aiming for Relevance.","authors":"Bar Eini-Porat, Danny Eytan, Uri Shalit","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>Vital signs are crucial in intensive care units (ICUs). They are used to track the patient's state and to identify clinically significant changes. Predicting vital sign trajectories is valuable for early detection of adverse events. However, conventional machine learning metrics like RMSE often fail to capture the true clinical relevance of such predictions. We introduce novel vital sign prediction performance metrics that align with clinical contexts, focusing on deviations from clinical norms, overall trends, and trend deviations. These metrics are derived from empirical utility curves obtained in a previous study through interviews with ICU clinicians. We validate the metrics' usefulness using simulated and real clinical datasets (MIMIC and eICU). Furthermore, we employ these metrics as loss functions for neural networks, resulting in models that excel in predicting clinically significant events. This research paves the way for clinically relevant machine learning model evaluation and optimization, promising to improve ICU patient care.</p>","PeriodicalId":72181,"journal":{"name":"AMIA Joint Summits on Translational Science proceedings. AMIA Joint Summits on Translational Science","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-05-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11141809/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141201732","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Learning Phenotypic Associations for Parkinson's Disease with Longitudinal Clinical Records.
Weishen Pan, Chang Su, Jacqueline R M A Maasch, Kun Chen, Claire Henchcliffe, Fei Wang

Parkinson's disease (PD) is associated with multiple clinical motor and non-motor manifestations. Understanding of PD etiologies has been informed by a growing number of genetic mutations and various fluid-based and brain imaging biomarkers. However, the mechanisms underlying its varied phenotypic features remain elusive. The present work introduces a data-driven approach for generating phenotypic association graphs for PD cohorts. Data collected by the Parkinson's Progression Markers Initiative (PPMI), the Parkinson's Disease Biomarkers Program (PDBP), and the Fox Investigation for New Discovery of Biomarkers (BioFIND) were analyzed by this approach to identify heterogeneous and longitudinal phenotypic associations that may provide insight into the pathology of this complex disease. Findings based on the phenotypic association graphs could improve understanding of longitudinal PD pathologies and how these relate to patient symptomology.

Citations: 0