Thoughts on Non-IID Data Impact in Healthcare with Federated Learning Medical Blockchain
Zonyin Shae, Kun-Yi Chen, Chi-Yu Chang, Yuan-Yu Tsai, C. Chou, William I. Baskett, Chi-Ren Shyu, J. J. Tsai
2022 IEEE 4th International Conference on Cognitive Machine Intelligence (CogMI), published 2022-12-01
DOI: https://doi.org/10.1109/CogMI56440.2022.00013
Citations: 1
Abstract
It is commonly believed that the more good-quality training data are aggregated, the better the resulting Artificial Intelligence (AI) model performs. In the medical domain, however, this belief does not hold in general, because healthcare data sets sourced from different hospitals are often not identically distributed (Non-IID). This poses severe technical challenges for effectively aggregating individual hospital data sets. In this vision paper, instead of offering complete solutions, we discuss some questions and food for thought with the goal of aiding effective data aggregation and improving federated learning (FL) AI model performance: (1) how to benchmark and measure the Non-IID degree of medical data sets; (2) how to include Non-IID degree metrics in the FL data aggregation mechanism; (3) how to search for the optimal global model creation strategy among a group of many medical data sets; and (4) whether FL can achieve better performance than centralized learning. We discuss these questions by outlining a visionary approach for exploring a medical blockchain FL mechanism that effectively aggregates medical data across multiple healthcare systems to serve large populations with broad demographics.
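To make items (1) and (2) concrete, the following is a minimal sketch, not the authors' method. The abstract does not specify a Non-IID metric or an aggregation rule, so everything here is an assumption for illustration: the Non-IID degree of a hospital is taken to be the Jensen-Shannon divergence between its label distribution and the pooled label distribution, and the aggregation weights are FedAvg-style sample counts scaled by a hypothetical penalty `(1 - degree) ** alpha`. All function names and the `alpha` parameter are invented for this example. In a real deployment, sites would presumably share only label histograms, not raw labels.

```python
import math
from collections import Counter


def label_distribution(labels, classes):
    """Return the empirical probability of each class in `labels`."""
    counts = Counter(labels)
    total = len(labels)
    return [counts[c] / total for c in classes]


def js_divergence(p, q):
    """Jensen-Shannon divergence with base-2 logs, bounded in [0, 1]."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Terms with zero probability in x contribute nothing.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def non_iid_degree(site_labels, classes):
    """Assumed metric: divergence of each site's labels from the pooled labels."""
    pooled = [y for labels in site_labels.values() for y in labels]
    global_dist = label_distribution(pooled, classes)
    return {
        site: js_divergence(label_distribution(labels, classes), global_dist)
        for site, labels in site_labels.items()
    }


def aggregation_weights(site_labels, degrees, alpha=1.0):
    """Sample-count weights scaled by the hypothetical penalty (1 - degree) ** alpha."""
    raw = {
        site: len(labels) * (1.0 - degrees[site]) ** alpha
        for site, labels in site_labels.items()
    }
    total = sum(raw.values())
    return {site: w / total for site, w in raw.items()}


def aggregate(site_params, weights):
    """Weighted average of per-site parameter vectors (lists of floats)."""
    size = len(next(iter(site_params.values())))
    return [
        sum(weights[site] * params[i] for site, params in site_params.items())
        for i in range(size)
    ]
```

With `alpha=0` the weights reduce to plain sample-count weighting as in FedAvg; larger values of `alpha` shrink the contribution of sites whose label mix departs more from the pooled one. Whether such down-weighting helps or hurts is exactly the kind of open question the abstract raises.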