Towards More Efficient Data Valuation in Healthcare Federated Learning using Ensembling.

Sourav Kumar, A Lakshminarayanan, Ken Chang, Feri Guretno, Ivan Ho Mien, Jayashree Kalpathy-Cramer, Pavitra Krishnaswamy, Praveer Singh
DOI: 10.1007/978-3-031-18523-6_12 (https://doi.org/10.1007/978-3-031-18523-6_12)
Published in: Distributed, collaborative, and federated learning, and affordable AI and healthcare for resource diverse global health : Third MICCAI Workshop, DeCaF 2022 and Second MICCAI Workshop, FAIR 2022, held in conjunction with MICCAI 2022, Sin...
Publication date: 2022-09-01 (Epub 2022-10-07)
Publication type: Journal Article
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9890952/pdf/nihms-1859434.pdf
Citation count: 0

Abstract

Federated Learning (FL), wherein multiple institutions collaboratively train a machine learning model without sharing data, is becoming popular. Participating institutions might not contribute equally: some contribute more data, some better-quality data, and some more diverse data. To fairly rank the contributions of different institutions, the Shapley value (SV) has emerged as the method of choice. Exact SV computation is impossibly expensive, especially when there are hundreds of contributors, so existing SV computation techniques use approximations. However, in healthcare, where the number of contributing institutions is likely not of a colossal scale, computing exact SVs is still exorbitantly expensive, but not impossible. For such settings, we propose an efficient SV computation technique called SaFE (Shapley Value for Federated Learning using Ensembling). We empirically show that SaFE computes values that are close to exact SVs, and that it performs better than current SV approximations. This is particularly relevant in medical imaging settings, where heterogeneity across institutions is rampant and fast, accurate data valuation is required to determine the contribution of each participant in multi-institutional collaborative learning.
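
To make the cost of exact SV computation concrete, the following is a minimal sketch of the textbook Shapley formula applied to three institutions. It is not the SaFE method from the paper. The institution names, the sample counts, and the `toy_utility` function are hypothetical stand-ins; in a real FL setting each utility call would presumably mean training and evaluating a model for that coalition, and there are 2^n coalitions for n institutions.

```python
from itertools import combinations
from math import factorial


def exact_shapley(players, utility):
    """Exact Shapley values by enumerating every coalition (textbook formula)."""
    n = len(players)
    values = {p: 0.0 for p in players}
    for p in players:
        others = [q for q in players if q != p]
        for k in range(n):
            # Weight of a coalition of size k that does not contain p.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal contribution of p when joining coalition s.
                values[p] += weight * (utility(s | {p}) - utility(s))
    return values


# Hypothetical data: made-up sample counts per institution (assumption).
SAMPLES = {"A": 100, "B": 300, "C": 600}


def toy_utility(coalition):
    """Stand-in for model performance: diminishing returns in total data size."""
    total = sum(SAMPLES[i] for i in coalition)
    return total / (total + 500)


sv = exact_shapley(list(SAMPLES), toy_utility)
for name, value in sv.items():
    print(f"{name}: {value:.4f}")
```

The values sum to the utility of the full coalition, and under this toy utility the institution with more data receives a larger value. The nested enumeration is what makes the exact computation expensive as the number of institutions grows.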
