Contrastive Self-supervised Learning for Neurodegenerative Disorder Classification

medRxiv - Radiology and Imaging, Pub Date: 2024-07-04, DOI: 10.1101/2024.07.03.24309882

Vadym Gryshchuk, Devesh Singh, Stefan J. Teipel, Martin Dyrba

Citations: 0

Abstract

Neurodegenerative diseases such as Alzheimer's disease (AD) or frontotemporal lobar degeneration (FTLD) involve specific loss of brain volume, detectable in vivo using T1-weighted MRI scans. Supervised machine learning approaches for classifying neurodegenerative diseases require a diagnostic label for each sample. However, it can be difficult to obtain expert labels for a large amount of data. Self-supervised learning (SSL) offers an alternative for training machine learning models without data labels. We investigated whether SSL models can be applied to distinguish between different neurodegenerative disorders in an interpretable manner. Our method comprises a feature extractor and a downstream classification head. A deep convolutional neural network trained in a contrastive self-supervised way serves as the feature extractor, learning latent representations, while the classifier head is a single-layer perceptron. We used N=2694 T1-weighted MRI scans from four data cohorts: two ADNI datasets, AIBL and FTLDNI, including cognitively normal controls (CN), cases with prodromal and clinical AD, as well as FTLD cases differentiated into their subtypes. Our results showed that the feature extractor trained in a self-supervised way provides generalizable and robust representations for the downstream classification. For AD vs. CN, our model achieves 82% balanced accuracy on the test subset and 80% on an independent holdout dataset. Similarly, the model for the behavioral variant of frontotemporal dementia (BV) vs. CN attains an 88% balanced accuracy on the test subset. The average feature attribution heatmaps obtained by the Integrated Gradients method highlighted hallmark regions, i.e., temporal gray matter atrophy for AD and insular atrophy for BV. In conclusion, our models perform comparably to state-of-the-art supervised deep learning approaches. This suggests that the SSL methodology can successfully make use of unannotated neuroimaging datasets as training data while remaining robust and interpretable.
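
The abstract states that the feature extractor is trained contrastively but does not name the objective. As a minimal sketch, assuming an NT-Xent loss of the kind used in SimCLR (an assumption, not a detail confirmed by the abstract), the idea can be illustrated on toy 2-D embeddings. The function names `cosine` and `nt_xent_loss` and the temperature value are hypothetical choices for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nt_xent_loss(view_a, view_b, temperature=0.5):
    """NT-Xent loss; view_a[i] and view_b[i] embed two augmentations of scan i.

    Assumed objective for illustration only; the abstract does not specify it.
    """
    embeddings = view_a + view_b
    n = len(view_a)
    total = 0.0
    for i in range(2 * n):
        positive = (i + n) % (2 * n)  # index of the other view of the same scan
        # Similarities to every other embedding (the anchor itself is excluded).
        logits = [cosine(embeddings[i], embeddings[j]) / temperature
                  for j in range(2 * n) if j != i]
        log_denominator = math.log(sum(math.exp(value) for value in logits))
        positive_logit = cosine(embeddings[i], embeddings[positive]) / temperature
        total += log_denominator - positive_logit
    return total / (2 * n)


# Toy embeddings: correctly paired views should yield a lower loss
# than the same views with the pairing deliberately mismatched.
view_a = [[1.0, 0.0], [0.0, 1.0]]
view_b = [[0.9, 0.1], [0.1, 0.9]]
aligned = nt_xent_loss(view_a, view_b)
mismatched = nt_xent_loss(view_a, view_b[::-1])
print(aligned < mismatched)  # True
```

Minimizing such a loss pulls the two views of one scan together and pushes other scans apart, which is one way a network can learn representations without diagnostic labels.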
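
The results above are reported as balanced accuracy, which is the mean of the per-class recalls and is therefore not inflated by class imbalance. The following sketch shows the calculation on made-up labels; the label lists are illustrative and are not data from the study.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall over the classes present in y_true."""
    classes = sorted(set(y_true))
    recalls = []
    for label in classes:
        indices = [i for i, truth in enumerate(y_true) if truth == label]
        correct = sum(1 for i in indices if y_pred[i] == label)
        recalls.append(correct / len(indices))
    return sum(recalls) / len(recalls)


# Illustrative, imbalanced toy labels: 8 controls and 2 AD cases.
y_true = ["CN"] * 8 + ["AD"] * 2
y_pred = ["CN"] * 10  # a trivial classifier that always predicts CN

plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(plain_accuracy)                     # 0.8
print(balanced_accuracy(y_true, y_pred))  # 0.5
```

A classifier that ignores the minority class scores 80% plain accuracy here but only 50% balanced accuracy, which is why the latter is the more informative figure for cohorts with unequal group sizes.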
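
Integrated Gradients attributes a prediction to input features by integrating the model's gradient along a straight path from a baseline input to the actual input. The sketch below applies it to a toy single-layer perceptron with a sigmoid output. The weights, inputs, all-zero baseline and step count are hypothetical, and the study computes attributions on MRI volumes rather than on three scalar features.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(x, weights, bias):
    """Single-layer perceptron with a sigmoid output."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def integrated_gradients(x, baseline, weights, bias, steps=200):
    """Midpoint Riemann approximation of Integrated Gradients."""
    gradient_sums = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        p = predict(point, weights, bias)
        # d sigmoid(w.x + b) / d x_i = p * (1 - p) * w_i
        for i, w in enumerate(weights):
            gradient_sums[i] += p * (1.0 - p) * w
    return [(xi - b) * g / steps for xi, b, g in zip(x, baseline, gradient_sums)]


# Hypothetical toy model and input (not taken from the paper).
weights, bias = [1.5, -2.0, 0.0], 0.1
x, baseline = [1.0, 0.5, 3.0], [0.0, 0.0, 0.0]

attributions = integrated_gradients(x, baseline, weights, bias)
output_gap = predict(x, weights, bias) - predict(baseline, weights, bias)
# Completeness property: attributions sum to the change in model output.
print(abs(sum(attributions) - output_gap) < 1e-6)  # True
```

The third feature has zero weight and so receives zero attribution, which is the behaviour that makes averaged attribution maps usable as heatmaps of the regions driving a classification.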
