Imputing Brain Measurements Across Data Sets via Graph Neural Networks.

Yixin Wang, Wei Peng, Susan F Tapert, Qingyu Zhao, Kilian M Pohl
PRedictive Intelligence in MEdicine. PRIME (Workshop), vol. 14277, pp. 172-183. Published 2023-01-01. DOI: 10.1007/978-3-031-46005-0_15 (https://doi.org/10.1007/978-3-031-46005-0_15). Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10634632/pdf/

Abstract

Publicly available data sets of structural MRIs might not contain specific measurements of brain Regions of Interest (ROIs) that are important for training machine learning models. For example, the curvature scores computed by Freesurfer are not released by the Adolescent Brain Cognitive Development (ABCD) Study. One can address this issue by reapplying Freesurfer to the data set. However, this approach is generally computationally and labor intensive (e.g., requiring quality control). An alternative is to impute the missing measurements via a deep learning approach. However, state-of-the-art imputation methods are designed to estimate randomly missing values rather than measurements that are missing entirely. We therefore propose to re-frame the imputation problem as a prediction task on another (public) data set that contains the missing measurements and shares some ROI measurements with the data sets of interest. A deep learning model is then trained to predict the missing measurements from the shared ones and is afterwards applied to the other data sets. Our proposed algorithm models the dependencies between ROI measurements via a graph neural network (GNN) and accounts for demographic differences in brain measurements (e.g., sex) by feeding the graph encoding into a parallel architecture. The architecture simultaneously optimizes a graph decoder to impute values and a classifier to predict demographic factors. We test the approach, called Demographic Aware Graph-based Imputation (DAGI), on imputing those missing Freesurfer measurements of ABCD (N=3760; minimum age 12 years) by training the predictor on those publicly released by the National Consortium on Alcohol and Neurodevelopment in Adolescence (NCANDA, N=540). 5-fold cross-validation on NCANDA reveals that the imputed scores are more accurate than those generated by linear regressors and other deep learning models. Adding the imputed scores to a classifier trained to identify sex also results in higher accuracy than using only the Freesurfer scores provided by ABCD.
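
The re-framing described in the abstract (train a predictor on a source data set that contains both the shared and the missing measurements, then apply it to a target data set that contains only the shared ones) can be illustrated with a minimal sketch. This is not DAGI: it uses a closed-form ridge regressor, roughly in the spirit of the linear-regressor baselines the abstract mentions, and it runs on synthetic data. The array names, the numbers of shared and missing measurements, and the regularization strength `alpha` are all assumptions made for illustration; only the sample sizes (540 and 3760) mirror those quoted for NCANDA and ABCD.

```python
import numpy as np

def fit_ridge(X, Y, alpha=1.0):
    """Closed-form ridge regression with an unpenalized intercept."""
    x_mean, y_mean = X.mean(axis=0), Y.mean(axis=0)
    Xc, Yc = X - x_mean, Y - y_mean
    W = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(X.shape[1]), Xc.T @ Yc)
    b = y_mean - x_mean @ W
    return W, b

def predict(X, W, b):
    """Predict the missing measurements from the shared ones."""
    return X @ W + b

def cross_val_mse(X, Y, k=5, alpha=1.0, seed=0):
    """Mean squared error of the ridge predictor under k-fold cross-validation."""
    idx = np.random.default_rng(seed).permutation(len(X))
    folds = np.array_split(idx, k)
    errors = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        W, b = fit_ridge(X[train], Y[train], alpha)
        errors.append(np.mean((predict(X[test], W, b) - Y[test]) ** 2))
    return float(np.mean(errors))

# Synthetic stand-ins (hypothetical; NOT real NCANDA or ABCD measurements).
rng = np.random.default_rng(42)
n_shared, n_missing = 8, 3                      # assumed feature counts
true_W = rng.normal(size=(n_shared, n_missing))

# Source data set: has both shared and "missing" measurements.
source_shared = rng.normal(size=(540, n_shared))
source_missing = source_shared @ true_W + 0.1 * rng.normal(size=(540, n_missing))

# Target data set: only the shared measurements are available.
target_shared = rng.normal(size=(3760, n_shared))

# Step 1: estimate accuracy on the source data set with 5-fold cross-validation.
cv_mse = cross_val_mse(source_shared, source_missing, k=5)

# Step 2: train on all source subjects, then impute for the target data set.
W, b = fit_ridge(source_shared, source_missing)
target_imputed = predict(target_shared, W, b)

# Reference point: error of always predicting the per-measurement mean.
baseline_mse = float(np.mean((source_missing - source_missing.mean(axis=0)) ** 2))
print(f"5-fold CV MSE: {cv_mse:.4f} (mean-prediction MSE: {baseline_mse:.4f})")
```

Because the target data set has no ground truth for the missing measurements, accuracy can only be estimated on the source data set, which is why the sketch cross-validates there before imputing. The approach in the abstract replaces the linear map with a GNN encoder whose output feeds both a decoder and a demographic classifier.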
