Towards More Efficient Data Valuation in Healthcare Federated Learning using Ensembling

Sourav Kumar, A. Lakshminarayanan, Ken Chang, Feri Guretno, Ivan Ho Mien, Jayashree Kalpathy-Cramer, Pavitra Krishnaswamy, Praveer Singh
{"title":"Towards More Efficient Data Valuation in Healthcare Federated Learning using Ensembling","authors":"Sourav Kumar, A. Lakshminarayanan, Ken Chang, Feri Guretno, Ivan Ho Mien, Jayashree Kalpathy-Cramer, Pavitra Krishnaswamy, Praveer Singh","doi":"10.48550/arXiv.2209.05424","DOIUrl":null,"url":null,"abstract":"Federated Learning (FL) wherein multiple institutions collaboratively train a machine learning model without sharing data is becoming popular. Participating institutions might not contribute equally - some contribute more data, some better quality data or some more diverse data. To fairly rank the contribution of different institutions, Shapley value (SV) has emerged as the method of choice. Exact SV computation is impossibly expensive, especially when there are hundreds of contributors. Existing SV computation techniques use approximations. However, in healthcare where the number of contributing institutions are likely not of a colossal scale, computing exact SVs is still exorbitantly expensive, but not impossible. For such settings, we propose an efficient SV computation technique called SaFE (Shapley Value for Federated Learning using Ensembling). We empirically show that SaFE computes values that are close to exact SVs, and that it performs better than current SV approximations. This is particularly relevant in medical imaging setting where widespread heterogeneity across institutions is rampant and fast accurate data valuation is required to determine the contribution of each participant in multi-institutional collaborative learning.","PeriodicalId":72833,"journal":{"name":"Distributed, collaborative, and federated learning, and affordable AI and healthcare for resource diverse global health : Third MICCAI Workshop, DeCaF 2022 and Second MICCAI Workshop, FAIR 2022, held in conjunction with MICCAI 2022, Sin...","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2022-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"4","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Distributed, collaborative, and federated learning, and affordable AI and healthcare for resource diverse global health : Third MICCAI Workshop, DeCaF 2022 and Second MICCAI Workshop, FAIR 2022, held in conjunction with MICCAI 2022, Sin...","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.48550/arXiv.2209.05424","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 4

Abstract

Federated Learning (FL), wherein multiple institutions collaboratively train a machine learning model without sharing data, is becoming popular. Participating institutions might not contribute equally: some contribute more data, some better-quality data, and some more diverse data. To fairly rank the contributions of different institutions, the Shapley value (SV) has emerged as the method of choice. Exact SV computation is prohibitively expensive, especially when there are hundreds of contributors, so existing SV computation techniques rely on approximations. However, in healthcare, where the number of contributing institutions is likely modest, computing exact SVs is still exorbitantly expensive, but not impossible. For such settings, we propose an efficient SV computation technique called SaFE (Shapley Value for Federated Learning using Ensembling). We empirically show that SaFE computes values close to exact SVs and that it performs better than current SV approximations. This is particularly relevant in the medical imaging setting, where heterogeneity across institutions is widespread and fast, accurate data valuation is required to determine the contribution of each participant in multi-institutional collaborative learning.
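To make the cost argument concrete, the sketch below shows exact Shapley value computation over a small set of institutions, together with an ensembling-based coalition utility in the spirit of the paper's title. This is a minimal illustration, not the authors' implementation: it assumes coalition utility is the validation accuracy of an ensemble built by averaging per-institution model predictions, and the function names (`shapley_values`, `make_ensemble_utility`) are hypothetical.

```python
from itertools import combinations
from math import factorial

import numpy as np


def shapley_values(institutions, utility):
    """Exact Shapley values for a small number of institutions.

    `utility(coalition)` returns the performance (e.g. validation accuracy)
    of a model built from that coalition's data. With n institutions this
    touches all 2^n coalitions, which is feasible only for modest n, as in
    multi-institutional healthcare collaborations.
    """
    n = len(institutions)
    values = {inst: 0.0 for inst in institutions}
    for inst in institutions:
        others = [i for i in institutions if i != inst]
        for k in range(len(others) + 1):
            for subset in combinations(others, k):
                # Standard Shapley weight for a coalition of size k.
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                marginal = utility(subset + (inst,)) - utility(subset)
                values[inst] += weight * marginal
    return values


def make_ensemble_utility(local_predictions, y_val):
    """Coalition utility via ensembling (illustrative assumption, not the exact SaFE recipe).

    `local_predictions[inst]` holds each institution's pre-computed class
    probabilities on a shared validation set, so any coalition can be scored
    by averaging its members' predictions instead of retraining a model.
    """
    def utility(coalition):
        if not coalition:
            return 0.0
        avg = np.mean([local_predictions[i] for i in coalition], axis=0)
        return float((avg.argmax(axis=1) == y_val).mean())
    return utility
```

The point of the ensembling step is the cost profile: each institution trains (or contributes predictions from) one local model, so the expensive part scales with the number of institutions, while the 2^n coalition evaluations reduce to cheap averaging of cached predictions rather than 2^n federated training runs.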