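Distributed Privacy Preserving Decision Support System for Predicting Hospitalization Risk in Hospitals with Insufficient Data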

2012 11th International Conference on Machine Learning and Applications. Pub Date: 2012-12-12. DOI: 10.1109/ICMLA.2012.180

George Mathew, Zoran Obradovic


Citations: 11

Abstract


Dynamically building prediction models that draw suggestive knowledge from multiple sources is of great interest for clinical decision support. Such models are valuable when the local clinical data repository does not hold enough records to draw conclusions from. However, because of privacy concerns, hospitals are reluctant to divulge patient records. Consequently, a distributed model-building mechanism that uses only the statistics from multiple hospitals' databases is valuable. Our DIDT algorithm builds a model in that fashion. In this study, using National Inpatient Sample (NIS) data for 2009, we demonstrate that the DIDT algorithm can help hospitals collaboratively build a better decision-making model when each has too few records to build a good local model. Using 262 attributes for model building, we show that 9 collaborating hospitals, each with fewer than 100 diabetes-related hospitalizations, collectively achieved a 9.9% improvement in hospitalization prediction accuracy with a distributed model, compared with relying on their own local models. With the local diabetes risk prediction models at these 9 hospitals, 159 of 357 patients were misclassified and prediction was impossible for another 16, so 175 patients did not receive a correct prediction. Our integrated model reduced the number of misclassified patients to 138, effectively providing accurate early diagnostics to 37 additional patients (175 - 138 = 37). We also introduce the concept of banding, which improves the DIDT algorithm by logically combining multiple hospitals when a large number of hospitals is involved, reducing the number of cross-validation folds.
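The abstract does not spell out how DIDT works internally, so the following is only a minimal sketch of the general idea of building a tree split from statistics alone. It assumes an ID3-style information-gain criterion, which is an assumption made here and not something the abstract confirms. Each hospital reports only counts of (attribute value, class) pairs, and a coordinator pools those counts to pick the split. The function names, the `hospitalized` label, and the toy attributes `insulin` and `age_band` are all hypothetical.

```python
from collections import Counter
from math import log2


def local_counts(records, attribute, label="hospitalized"):
    """Run inside one hospital: return only (attribute value, class) counts."""
    return Counter((r[attribute], r[label]) for r in records)


def entropy(class_counts):
    """Shannon entropy (bits) of a class-count table."""
    total = sum(class_counts.values())
    return -sum((n / total) * log2(n / total)
                for n in class_counts.values() if n)


def information_gain(pooled):
    """Information gain of a split, computed from pooled counts only."""
    by_class = Counter()
    by_value = {}
    for (value, cls), n in pooled.items():
        by_class[cls] += n
        by_value.setdefault(value, Counter())[cls] += n
    total = sum(by_class.values())
    remainder = sum(sum(c.values()) / total * entropy(c)
                    for c in by_value.values())
    return entropy(by_class) - remainder


def best_split(hospitals, attributes, label="hospitalized"):
    """Coordinator: sum each hospital's counts and pick the best attribute."""
    gains = {}
    for attribute in attributes:
        pooled = Counter()
        for records in hospitals:
            # Only the count table crosses the hospital boundary.
            pooled.update(local_counts(records, attribute, label))
        gains[attribute] = information_gain(pooled)
    return max(gains, key=gains.get), gains


# Hypothetical toy data: two hospitals with two records each.
hospital_a = [
    {"insulin": "yes", "age_band": "old", "hospitalized": 1},
    {"insulin": "no", "age_band": "old", "hospitalized": 0},
]
hospital_b = [
    {"insulin": "yes", "age_band": "young", "hospitalized": 1},
    {"insulin": "no", "age_band": "young", "hospitalized": 0},
]

best, gains = best_split([hospital_a, hospital_b], ["insulin", "age_band"])
print(best, gains)
```

In this toy setup the pooled counts make `insulin` the chosen split, even though the coordinator never sees an individual record. A full tree would repeat the same count exchange at every node, restricted to the records that reach that node.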
