HarmonyTM: multi-center data harmonization applied to distributed learning for Parkinson's disease classification.

Journal of Medical Imaging (IF 1.9, Q3, RADIOLOGY, NUCLEAR MEDICINE & MEDICAL IMAGING) Pub Date: 2024-09-01 Epub Date: 2024-09-20 DOI: 10.1117/1.JMI.11.5.054502
Raissa Souza, Emma A M Stanley, Vedant Gulve, Jasmine Moore, Chris Kang, Richard Camicioli, Oury Monchi, Zahinoor Ismail, Matthias Wilms, Nils D Forkert
Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11413651/pdf/

Abstract

Purpose: Distributed learning is widely used to comply with data-sharing regulations and access diverse datasets for training machine learning (ML) models. The traveling model (TM) is a distributed learning approach that sequentially trains with data from one center at a time, which is especially advantageous when dealing with limited local datasets. However, a critical concern emerges when centers utilize different scanners for data acquisition, which could potentially lead models to exploit these differences as shortcuts. Although data harmonization can mitigate this issue, current methods typically rely on large or paired datasets, which can be impractical to obtain in distributed setups.
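
The sequential training pattern described above can be illustrated with a minimal sketch. It is not the authors' implementation: the logistic-regression model, the toy one-feature data, and the names `train_at_center`, `traveling_model`, and `predict` are all assumptions made for illustration. The only point it demonstrates is that a single model visits one center at a time and is updated with that center's local data, so raw data never has to be pooled.

```python
import math

def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict(weights, bias, features):
    # Probability of the positive class for one sample.
    return sigmoid(sum(w * x for w, x in zip(weights, features)) + bias)

def train_at_center(weights, bias, samples, lr=0.1):
    # One pass of stochastic gradient descent over a single center's data.
    for features, label in samples:
        err = predict(weights, bias, features) - label  # d(log-loss)/dz
        weights = [w - lr * err * x for w, x in zip(weights, features)]
        bias -= lr * err
    return weights, bias

def traveling_model(centers, n_features, cycles=20, lr=0.1):
    # The same model travels from center to center; only weights move.
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(cycles):
        for samples in centers:
            weights, bias = train_at_center(weights, bias, samples, lr)
    return weights, bias

# Hypothetical toy data: three small centers, label 1 when the feature is positive.
centers = [
    [([0.9], 1), ([-0.9], 0)],
    [([0.5], 1), ([-0.5], 0), ([0.7], 1), ([-0.7], 0)],
    [([0.3], 1), ([-0.3], 0)],
]
weights, bias = traveling_model(centers, n_features=1)
```

Because each center contributes only a few samples per visit, repeated cycles over the centers let the model accumulate information that no single center could provide on its own.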

Approach: We introduced HarmonyTM, a data harmonization method tailored for the TM. HarmonyTM effectively mitigates bias in the model's feature representation while retaining crucial disease-related information, all without requiring extensive datasets. Specifically, we employed adversarial training to "unlearn" bias from the features used in the model for classifying Parkinson's disease (PD). We evaluated HarmonyTM using multi-center three-dimensional (3D) neuroimaging datasets from 83 centers using 23 different scanners.
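
The abstract does not state which loss is used for the adversarial "unlearning" step. One common formulation, assumed here purely for illustration, alternates between training a scanner-classification head on the features and updating the feature encoder with a confusion loss that pushes the scanner head's output toward a uniform distribution. The sketch below shows only that confusion loss; the function names and the example logits are hypothetical.

```python
import math

def softmax(logits):
    # Subtract the maximum for numerical stability.
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, target_index):
    # Loss for training the scanner head to recognize the true scanner.
    return -math.log(softmax(logits)[target_index])

def confusion_loss(logits):
    # Cross-entropy between a uniform target and the scanner prediction.
    # It is smallest when every scanner receives equal probability.
    probs = softmax(logits)
    return -sum(math.log(p) for p in probs) / len(probs)

n_scanners = 23  # number of scanners reported in the abstract

# Hypothetical scanner-head outputs for one sample.
uninformative = [0.0] * n_scanners                # features reveal nothing
informative = [5.0] + [0.0] * (n_scanners - 1)    # features point to scanner 0

loss_uninformative = confusion_loss(uninformative)
loss_informative = confusion_loss(informative)
```

Under this assumption, minimizing the confusion loss with respect to the encoder removes scanner-identifying information from the features, while the disease-classification loss is kept so that disease-related information is retained.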

Results: Our results show that HarmonyTM improved PD classification accuracy from 72% to 76% and reduced (unwanted) scanner classification accuracy from 53% to 30% in the TM setup.

Conclusion: HarmonyTM is a method tailored for harmonizing 3D neuroimaging data within the TM approach, aiming to minimize shortcut learning in distributed setups. This prevents the disease classifier from leveraging scanner-specific details to classify patients with or without PD, which is a key aspect of deploying ML models in clinical applications.

Source journal
Journal of Medical Imaging (RADIOLOGY, NUCLEAR MEDICINE & MEDICAL IMAGING)
CiteScore: 4.10
Self-citation rate: 4.20%
Articles published: 0
Journal description: JMI covers fundamental and translational research, as well as applications, focused on medical imaging, which continue to yield physical and biomedical advancements in the early detection, diagnostics, and therapy of disease as well as in the understanding of normal. The scope of JMI includes: Imaging physics, Tomographic reconstruction algorithms (such as those in CT and MRI), Image processing and deep learning, Computer-aided diagnosis and quantitative image analysis, Visualization and modeling, Picture archiving and communications systems (PACS), Image perception and observer performance, Technology assessment, Ultrasonic imaging, Image-guided procedures, Digital pathology, Biomedical applications of biomedical imaging. JMI allows for the peer-reviewed communication and archiving of scientific developments, translational and clinical applications, reviews, and recommendations for the field.