Harmonization for Parkinson's Disease Multi-Dataset T1 MRI Morphometry Classification.

NeuroSci (IF 1.6, JCR Q3, Clinical Neurology) | Pub Date: 2024-11-29 | DOI: 10.3390/neurosci5040042

Mohammed Saqib, Silvina G Horovitz

NeuroSci, vol. 5, no. 4, pp. 600-613. Journal Article. Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11678312/pdf/

Citations: 0

Abstract

Classification of disease and healthy volunteer cohorts provides a useful clinical alternative to traditional group statistics because it yields individualized predictions. Classifiers for neurodegenerative disease can be trained on structural MRI morphometry, but require large multi-scanner datasets, introducing confounding batch effects. We test ComBat, a common harmonization model, in an example application to classify subjects with Parkinson's disease from healthy volunteers and identify common pitfalls, including data leakage. We used a multi-dataset cohort of 372 subjects (216 with Parkinson's disease, 156 healthy volunteers) from 11 identified scanners. We extracted both FreeSurfer and the determinant of Jacobian morphometry to compare single-scanner and multi-scanner classification pipelines. We confirm the presence of batch effects by running single-scanner classifiers, which achieved widely divergent AUCs on scanner-specific datasets (mean: 0.651 ± 0.144). Multi-scanner classifiers that considered neurobiological batch effects between sites could easily achieve a test AUC of 0.902, though pipelines that prevented data leakage could only achieve a test AUC of 0.550. We conclude that batch effects remain a major issue for classification problems, such that even impressive single-scanner classifiers are unlikely to generalize to multiple scanners, and that solving for batch effects in a classifier problem must avoid circularity and the reporting of overly optimistic results.
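The data-leakage pitfall named in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline and it is not ComBat: it is a simplified per-scanner location/scale adjustment, with no empirical Bayes shrinkage and no biological covariates, run on synthetic data. The class name `LocationScaleHarmonizer`, the variable names, and the 60/20 split are all hypothetical. The point it shows is structural: harmonization parameters are estimated on the training subjects only and then applied unchanged to the held-out subjects.

```python
import numpy as np


class LocationScaleHarmonizer:
    """Simplified per-scanner location/scale adjustment.

    Illustrative stand-in only: unlike ComBat, it uses no empirical
    Bayes shrinkage and preserves no biological covariates.
    """

    def fit(self, X, batch):
        X = np.asarray(X, dtype=float)
        batch = np.asarray(batch)
        # Reference distribution, estimated from the training data only.
        self.grand_mean_ = X.mean(axis=0)
        self.pooled_std_ = X.std(axis=0) + 1e-8
        # Per-scanner mean and standard deviation for each feature.
        self.params_ = {}
        for b in np.unique(batch):
            Xb = X[batch == b]
            self.params_[b] = (Xb.mean(axis=0), Xb.std(axis=0) + 1e-8)
        return self

    def transform(self, X, batch):
        X = np.asarray(X, dtype=float)
        batch = np.asarray(batch)
        out = np.empty_like(X)
        for b in np.unique(batch):
            if b not in self.params_:
                raise ValueError(f"scanner {b!r} was not seen during fit")
            mu, sd = self.params_[b]
            mask = batch == b
            # Standardize within scanner, then map to the reference scale.
            out[mask] = (X[mask] - mu) / sd * self.pooled_std_ + self.grand_mean_
        return out


# Synthetic example: two scanners, the second with an offset and a scale change.
rng = np.random.default_rng(0)
n_per_scanner, n_features = 40, 5
scanner = np.repeat([0, 1], n_per_scanner)
X = rng.normal(size=(2 * n_per_scanner, n_features))
X[scanner == 1] = X[scanner == 1] * 2.0 + 3.0

# Split first, so that held-out subjects never influence the harmonizer.
idx = rng.permutation(len(X))
train_idx, test_idx = idx[:60], idx[60:]

harmonizer = LocationScaleHarmonizer().fit(X[train_idx], scanner[train_idx])
X_train_h = harmonizer.transform(X[train_idx], scanner[train_idx])
X_test_h = harmonizer.transform(X[test_idx], scanner[test_idx])
```

Calling `fit` on all 80 subjects before splitting would be the leaky variant, because the test subjects would then shape the scanner parameters used to transform them. A design consequence of fitting on training data only is that a scanner absent from the training set cannot be harmonized; the sketch raises an error in that case rather than guessing.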

Source journal: NeuroSci

Self-citation rate: 0.00%

Review time: 11 weeks