Challenges in multi-task learning for fMRI-based diagnosis: Benefits for psychiatric conditions and CNVs would likely require thousands of patients

A. Harvey, Clara A Moreau, K. Kumar, Guillaume Huguet, S. Urchs, H. Sharmarke, K. Jizi, Charles-Olivier Martin, N. Younis, P. Tamer, J.-L. Martineau, P. Orban, Ana Isabel Silva, Jeremy Hall, M. B. van den Bree, Michael J. Owen, David E J Linden, Sarah Lippé, C. Bearden, Guillaume Dumas, Sébastien Jacquemont, P. Bellec
Imaging Neuroscience, pp. 1-20. Published 2024-07-01. DOI: 10.1162/imag_a_00222 (https://doi.org/10.1162/imag_a_00222)

Abstract

There is a growing interest in using machine learning (ML) models to perform automatic diagnosis of psychiatric conditions; however, generalising the prediction of ML models to completely independent data can lead to a sharp decrease in performance. Patients with different psychiatric diagnoses have traditionally been studied independently, yet there is a growing recognition of neuroimaging signatures shared across them as well as rare genetic copy number variants (CNVs). In this work, we assess the potential of multi-task learning (MTL) to improve accuracy by characterising multiple related conditions with a single model, making use of information shared across diagnostic categories and exposing the model to a larger and more diverse dataset. As a proof of concept, we first established the efficacy of MTL in a context where there is clearly information shared across tasks: the same target (age or sex) is predicted at different sites of data collection in a large functional magnetic resonance imaging (fMRI) dataset compiled from multiple studies. MTL generally led to substantial gains relative to independent prediction at each site. Performing scaling experiments on the UK Biobank, we observed that performance was highly dependent on sample size: for large sample sizes (N > 6000) sex prediction was better using MTL across three sites (N = K per site) than prediction at a single site (N = 3K), but for small samples (N < 500) MTL was actually detrimental for age prediction. We then used established machine-learning methods to benchmark the diagnostic accuracy of each of the 7 CNVs (N = 19–103) and 4 psychiatric conditions (N = 44–472) independently, replicating the accuracy previously reported in the literature on psychiatric conditions. We observed that MTL hurt performance when applied across the full set of diagnoses, and complementary analyses failed to identify pairs of conditions which would benefit from MTL. Taken together, our results show that if a successful multi-task diagnostic model of psychiatric conditions were to be developed with resting-state fMRI, it would likely require datasets with thousands of patients across different diagnoses.
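The multi-site proof of concept in the abstract, where the same target is predicted at several sites with one model, can be illustrated with a toy sketch of parameter sharing. This is not the authors' model or data: it assumes a linear regressor whose weights are shared across sites and whose intercepts are site-specific, trained by full-batch gradient descent on synthetic data. All names (`make_site`, `train_mtl`, the true weights and site offsets) are hypothetical and chosen only for illustration.

```python
import random


def make_site(rng, n, w_true, offset, noise=0.1):
    """Synthetic data for one site: y = w_true . x + site offset + noise."""
    data = []
    for _ in range(n):
        x = [rng.uniform(-1.0, 1.0) for _ in w_true]
        y = sum(wi * xi for wi, xi in zip(w_true, x)) + offset
        y += rng.gauss(0.0, noise)
        data.append((x, y))
    return data


def predict(w, b_site, x):
    """Shared weights w, plus the intercept of the site the sample came from."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b_site


def mse(w, b, sites):
    """Mean squared error pooled over all sites (tasks)."""
    total, count = 0.0, 0
    for t, data in enumerate(sites):
        for x, y in data:
            total += (predict(w, b[t], x) - y) ** 2
            count += 1
    return total / count


def train_mtl(sites, dim, lr=0.05, epochs=200):
    """Full-batch gradient descent on the pooled MSE.

    Every site contributes to the gradient of the shared weights w,
    while each intercept b[t] is updated only by samples from site t.
    """
    w = [0.0] * dim
    b = [0.0] * len(sites)
    n_total = sum(len(data) for data in sites)
    for _ in range(epochs):
        grad_w = [0.0] * dim
        grad_b = [0.0] * len(sites)
        for t, data in enumerate(sites):
            for x, y in data:
                err = predict(w, b[t], x) - y
                for i in range(dim):
                    grad_w[i] += 2.0 * err * x[i] / n_total
                grad_b[t] += 2.0 * err / n_total
        w = [wi - lr * gi for wi, gi in zip(w, grad_w)]
        b = [bi - lr * gi for bi, gi in zip(b, grad_b)]
    return w, b


# Assumed toy setup: three sites sharing one signal but differing by an offset.
rng = random.Random(0)
w_true = [1.5, -2.0, 0.5]   # hypothetical shared signal
offsets = [0.0, 1.0, -1.0]  # hypothetical site effects
sites = [make_site(rng, 50, w_true, off) for off in offsets]

loss_before = mse([0.0] * 3, [0.0] * 3, sites)
w, b = train_mtl(sites, dim=3)
loss_after = mse(w, b, sites)
print("pooled MSE before:", loss_before, "after:", loss_after)
```

In this sketch the shared weights see all 150 samples while each intercept sees only 50, which is the intuition behind pooling sites. The abstract's finding that MTL can hurt at small N is an empirical result that a toy convex model like this one does not reproduce.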