Multi-Source Learning for Joint Analysis of Incomplete Multi-Modality Neuroimaging Data.

KDD : proceedings. International Conference on Knowledge Discovery & Data Mining, pp. 1149-1157. Pub Date: 2012-01-01. DOI: 10.1145/2339530.2339710

Lei Yuan, Yalin Wang, Paul M Thompson, Vaibhav A Narayan, Jieping Ye


Citations: 46

Abstract

Incomplete data present serious problems when integrating large-scale brain imaging data sets from different imaging modalities. In the Alzheimer's Disease Neuroimaging Initiative (ADNI), for example, over half of the subjects lack cerebrospinal fluid (CSF) measurements; an independent half of the subjects do not have fluorodeoxyglucose positron emission tomography (FDG-PET) scans; many lack proteomics measurements. Traditionally, subjects with missing measures are discarded, resulting in a severe loss of available information. We address this problem by proposing two novel learning methods in which all samples with at least one available data source can be used. In the first method, we divide the samples according to the availability of data sources and learn shared sets of features with state-of-the-art sparse learning methods. The second method learns a base classifier for each data source independently, so that each source is represented by a single column of prediction scores; we then estimate the missing prediction scores and combine them with the existing ones to build a multi-source fusion model. To illustrate the proposed approaches, we classify patients from the ADNI study into groups with Alzheimer's disease (AD), mild cognitive impairment (MCI) and normal controls, based on the multi-modality data. At baseline, ADNI's 780 participants (172 AD, 397 MCI, 211 Normal) have at least one of four data types: magnetic resonance imaging (MRI), FDG-PET, CSF and proteomics. These data are used to test our algorithms. Comprehensive experiments show that our proposed methods yield stable and promising results.
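
The data handling behind the two methods described above can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions: the subject IDs, the score values, and the helper names `group_by_availability` and `fill_missing_scores` are hypothetical, and column-mean filling is a stand-in chosen for simplicity, since the abstract does not say how the missing prediction scores are estimated. The sparse feature learning, the base classifiers, and the fusion model themselves are not shown.

```python
from collections import defaultdict
from statistics import mean

# The four data types named in the abstract.
SOURCES = ("MRI", "FDG-PET", "CSF", "proteomics")


def group_by_availability(subjects):
    """Group subject IDs by which sources they have (first step of method 1).

    `subjects` maps a subject ID to a dict {source: value}; a source that is
    absent or None counts as missing. Subjects with no source are skipped.
    """
    groups = defaultdict(list)
    for sid, data in subjects.items():
        pattern = tuple(s for s in SOURCES if data.get(s) is not None)
        if pattern:
            groups[pattern].append(sid)
    return dict(groups)


def fill_missing_scores(scores):
    """Complete a subject-by-source score table (middle step of method 2).

    ASSUMPTION: each missing score is replaced by the mean of the observed
    scores for that source. This placeholder is not the paper's estimator.
    """
    col_means = {}
    for s in SOURCES:
        observed = [row[s] for row in scores.values() if row.get(s) is not None]
        if observed:
            col_means[s] = mean(observed)
    return {
        sid: {
            s: (row[s] if row.get(s) is not None else col_means[s])
            for s in col_means
        }
        for sid, row in scores.items()
    }


# Hypothetical per-source prediction scores for four made-up subjects.
toy_scores = {
    "s1": {"MRI": 0.9, "FDG-PET": 0.7, "CSF": None, "proteomics": None},
    "s2": {"MRI": 0.2, "FDG-PET": None, "CSF": 0.4, "proteomics": None},
    "s3": {"MRI": 0.6, "FDG-PET": 0.5, "CSF": 0.8, "proteomics": 0.3},
    "s4": {"MRI": 0.4, "FDG-PET": 0.3},
}

groups = group_by_availability(toy_scores)
filled = fill_missing_scores(toy_scores)

for pattern, ids in groups.items():
    print(pattern, ids)
for sid, row in filled.items():
    print(sid, row)
```

In this sketch every subject with at least one source stays in the analysis, which is the point the abstract makes against discarding incomplete subjects. The completed table `filled` is what a fusion model would then be trained on.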
