Capturing biomarkers associated with Alzheimer disease subtypes using data distribution characteristics
Authors: Kenneth Smith, Sharlee Climer
Journal: Frontiers in Computational Neuroscience (JCR Q2, Mathematical & Computational Biology; impact factor 2.1)
Published: 2024-09-03
DOI: 10.3389/fncom.2024.1388504 (https://doi.org/10.3389/fncom.2024.1388504)
Citations: 0
Abstract
Capturing biomarkers associated with Alzheimer disease subtypes using data distribution characteristics
Late-onset Alzheimer disease (AD) is a highly complex disease with multiple subtypes, as demonstrated by its disparate risk factors, pathological manifestations, and clinical traits. Discovery of biomarkers to diagnose specific AD subtypes is a key step towards understanding biological mechanisms underlying this enigmatic disease, generating candidate drug targets, and selecting participants for drug trials. Popular statistical methods for evaluating candidate biomarkers, fold change (FC) and area under the receiver operating characteristic curve (AUC), were designed for homogeneous data and we demonstrate the inherent weaknesses of these approaches when used to evaluate subtypes representing less than half of the diseased cases. We introduce a unique evaluation metric that is based on the distribution of the values, rather than the magnitude of the values, to identify analytes that are associated with a subset of the diseased cases, thereby revealing potential biomarkers for subtypes. Our approach, Bimodality Coefficient Difference (BCD), computes the difference between the degrees of bimodality for the cases and controls. We demonstrate the effectiveness of our approach with large-scale synthetic data trials containing nearly perfect subtypes. In order to reveal novel AD biomarkers for heterogeneous subtypes, we applied BCD to gene expression data for 8,650 genes for 176 AD cases and 187 controls. Our results confirm the utility of BCD for identifying subtypes of heterogeneous diseases.
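The idea of scoring an analyte by the difference in bimodality between cases and controls can be sketched in a few lines. The abstract does not give the exact formula, so this sketch assumes Sarle's bimodality coefficient in its population form, (skewness^2 + 1) / kurtosis; the paper's definition may differ. The function names `bimodality_coefficient` and `bcd`, the synthetic data, and the 30% subtype fraction are all hypothetical; the group sizes (176 cases, 187 controls) are borrowed from the abstract purely for illustration.

```python
import random


def bimodality_coefficient(values):
    """Assumed definition: Sarle's coefficient, (skewness^2 + 1) / kurtosis.

    Uses plain (biased) moment estimates. A normal distribution gives about
    1/3, a uniform distribution 5/9, and two equal point masses give 1.
    """
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # variance
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    m4 = sum((v - mean) ** 4 for v in values) / n  # fourth central moment
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2  # non-excess kurtosis (3 for a normal)
    return (skewness ** 2 + 1) / kurtosis


def bcd(cases, controls):
    """Bimodality of the cases minus bimodality of the controls."""
    return bimodality_coefficient(cases) - bimodality_coefficient(controls)


# Hypothetical synthetic analyte: controls are unimodal, while roughly 30%
# of cases (a minority subtype) have strongly elevated values.
rng = random.Random(0)
controls = [rng.gauss(0.0, 1.0) for _ in range(187)]
cases = [rng.gauss(4.0, 1.0) if i < 53 else rng.gauss(0.0, 1.0)
         for i in range(176)]

print("BC cases:   ", round(bimodality_coefficient(cases), 3))
print("BC controls:", round(bimodality_coefficient(controls), 3))
print("BCD:        ", round(bcd(cases, controls), 3))
```

Under these assumptions the score depends on the shape of each group's distribution rather than on the size of the mean shift, which is why a subtype covering fewer than half of the cases can still stand out.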
About the journal:
Frontiers in Computational Neuroscience is a first-tier electronic journal devoted to promoting theoretical modeling of brain function and fostering interdisciplinary interactions between theoretical and experimental neuroscience. Progress in understanding the amazing capabilities of the brain is still limited, and we believe that it will only come with deep theoretical thinking and mutually stimulating cooperation between different disciplines and approaches. We therefore invite original contributions on a wide range of topics that present the fruits of such cooperation, or provide stimuli for future alliances. We aim to provide an interactive forum for cutting-edge theoretical studies of the nervous system, and for promulgating the best theoretical research to the broader neuroscience community. Models of all styles and at all levels are welcome, from biophysically motivated realistic simulations of neurons and synapses to high-level abstract models of inference and decision making. While the journal is primarily focused on theoretically based and driven research, we welcome experimental studies that validate and test theoretical conclusions.
Also: comp neuro