Generative models of MRI-derived neuroimaging features and associated dataset of 18,000 samples.

ArXiv publication date: 2024-10-01

Sai Spandana Chintapalli, Rongguang Wang, Zhijian Yang, Vasiliki Tassopoulou, Fanyang Yu, Vishnu Bashyam, Guray Erus, Pratik Chaudhari, Haochang Shou, Christos Davatzikos

PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11275685/pdf/

Abstract

Availability of large and diverse medical datasets is often challenged by privacy and data sharing restrictions. For successful application of machine learning techniques for disease diagnosis, prognosis, and precision medicine, large amounts of data are necessary for model building and optimization. To help overcome such limitations in the context of brain MRI, we present GenMIND: a collection of generative models of normative regional volumetric features derived from structural brain imaging. GenMIND models are trained on real brain imaging regional volumetric measures from the iSTAGING consortium, which encompasses over 40,000 MRI scans across 13 studies, incorporating covariates such as age, sex, and race. Leveraging GenMIND, we produce and offer 18,000 synthetic samples spanning the adult lifespan (ages 22-90 years), alongside the model's capability to generate unlimited data. Experimental results indicate that samples generated from GenMIND agree with the distributions obtained from real data. Most importantly, the generated normative data significantly enhance the accuracy of downstream machine learning models on tasks such as disease classification. Data and models are available at: https://huggingface.co/spaces/rongguangw/GenMIND.
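The abstract does not say which model class GenMIND uses, so the following is only a minimal sketch of the general idea of covariate-conditioned generation of regional volumes. It swaps in a plain linear-Gaussian model (a stand-in, not the authors' method), and every name in it (`fit_conditional_gaussian`, `sample_synthetic`, the three toy "regions") is hypothetical. The "real" table is simulated placeholder data, not iSTAGING or GenMIND output.

```python
import numpy as np


def fit_conditional_gaussian(covariates, features):
    """Fit features ~ N([1, covariates] @ weights, sigma) by least squares."""
    design = np.column_stack([np.ones(len(covariates)), covariates])
    weights, *_ = np.linalg.lstsq(design, features, rcond=None)
    residuals = features - design @ weights
    sigma = np.cov(residuals, rowvar=False)  # residual covariance across regions
    return weights, sigma


def sample_synthetic(weights, sigma, covariates, rng):
    """Draw one synthetic feature vector per row of covariates."""
    design = np.column_stack([np.ones(len(covariates)), covariates])
    mean = design @ weights
    noise = rng.multivariate_normal(
        np.zeros(sigma.shape[0]), sigma, size=len(covariates)
    )
    return mean + noise


rng = np.random.default_rng(0)

# Placeholder "real" data: age (22-90), binary sex, three toy regional volumes.
n = 2000
age = rng.uniform(22, 90, size=n)
sex = rng.integers(0, 2, size=n)
covs = np.column_stack([age, sex])
toy_weights = np.array(
    [[10.0, 5.0, 8.0],      # intercepts (arbitrary units)
     [-0.03, -0.01, -0.02],  # assumed age effect per region
     [0.5, 0.2, 0.1]]        # assumed sex effect per region
)
real = np.column_stack([np.ones(n), covs]) @ toy_weights
real = real + rng.normal(0.0, 0.3, size=(n, 3))

# Fit on the placeholder data, then generate synthetic samples for the
# same covariate values and compare marginal means region by region.
weights, sigma = fit_conditional_gaussian(covs, real)
synthetic = sample_synthetic(weights, sigma, covs, rng)
mean_gap = np.abs(real.mean(axis=0) - synthetic.mean(axis=0))
print("largest per-region gap in means:", mean_gap.max())
```

Comparing marginal means is only the crudest check that generated samples agree with the source distribution; a fuller comparison would also look at variances, correlations between regions, and trends across age.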

Chinese translation (machine-generated; where it differs, the English original takes precedence)

The Chinese version of this record carries the title "NeuroSynth: MRI-derived neuroanatomical generative models and associated dataset of 18,000 samples" and refers to the models as NeuroSynth rather than GenMIND. Its text otherwise matches the English abstract, and it gives the location of the data and models as https://huggingface.co/spaces/rongguangw/neuro-synth.

Source journal: ArXiv
