A Query Workflow Design to Perform Automatable Distributed Regression Analysis in Large Distributed Data Networks.

Qoua L Her, Jessica M Malenfant, Sarah Malek, Yury Vilk, Jessica Young, Lingling Li, Jeffery Brown, Sengwee Toh
Journal: EGEMS (Washington, DC), p. 11
Published: 2018-05-25
DOI: 10.5334/egems.209 (https://doi.org/10.5334/egems.209)
Citations: 19

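Abstract

Introduction: Patient privacy and data security concerns often limit the feasibility of pooling patient-level data from multiple sources for analysis. Distributed data networks (DDNs) that employ privacy-protecting analytical methods, such as distributed regression analysis (DRA), can mitigate these concerns. However, DRA is not routinely implemented in large DDNs.

Objective: We describe the design and implementation of a process framework and query workflow that allow automatable DRA in real-world DDNs that use PopMedNet™, an open-source distributed networking software platform.

Methods: We surveyed and catalogued existing hardware and software configurations at all data partners in the Sentinel System, a PopMedNet-driven DDN. Key guiding principles for the design included minimal disruption to the current PopMedNet query workflow and minimal modification of data partners' hardware configurations and software requirements.

Results: We developed and implemented a three-step process framework and PopMedNet query workflow that enable automatable DRA: 1) assembling a de-identified patient-level dataset at each data partner, 2) distributing a DRA package to data partners for local iterative analysis, and 3) iteratively transferring intermediate files between the data partners and the analysis center. The DRA query workflow is agnostic to statistical software, accommodates different regression models, and allows different levels of user-specified automation.

Discussion: The process framework can be generalized to, and the query workflow adopted by, other PopMedNet-based DDNs.

Conclusion: DRA has great potential to change the paradigm of data analysis in DDNs. Successful implementation of DRA in Sentinel will facilitate adoption of the analytic approach in other DDNs.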
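The iterative exchange in step 3 can be illustrated with a minimal sketch. The abstract does not name a specific algorithm, so the example below assumes a logistic regression fitted by Newton-Raphson, where each data partner returns only summary-level statistics (a gradient vector and a Hessian matrix) and the analysis center sums them to update the coefficients. All function names (`partner_step`, `analysis_center_fit`, `make_partner_data`) and the simulated data are hypothetical, and the in-process function calls stand in for the intermediate file transfers that would run through PopMedNet in a real network.

```python
import numpy as np


def partner_step(X, y, beta):
    """Hypothetical data-partner step: compute summary statistics locally.

    Only the gradient and Hessian leave the partner; the patient-level
    rows in X and y stay on site.
    """
    p = 1.0 / (1.0 + np.exp(-(X @ beta)))           # fitted probabilities
    gradient = X.T @ (y - p)                        # score vector
    hessian = X.T @ (X * (p * (1.0 - p))[:, None])  # observed information
    return gradient, hessian


def analysis_center_fit(partners, n_features, tol=1e-8, max_iter=25):
    """Hypothetical analysis-center loop: sum partner summaries, update beta."""
    beta = np.zeros(n_features)
    for _ in range(max_iter):
        results = [partner_step(X, y, beta) for X, y in partners]
        gradient = sum(g for g, _ in results)
        hessian = sum(h for _, h in results)
        step = np.linalg.solve(hessian, gradient)   # Newton-Raphson update
        beta = beta + step
        if np.max(np.abs(step)) < tol:              # converged
            break
    return beta


def make_partner_data(rng, n, true_beta):
    """Simulate one partner's de-identified dataset (illustration only)."""
    covariates = rng.normal(size=(n, len(true_beta) - 1))
    X = np.column_stack([np.ones(n), covariates])   # intercept + covariates
    p = 1.0 / (1.0 + np.exp(-(X @ true_beta)))
    y = (rng.random(n) < p).astype(float)
    return X, y


rng = np.random.default_rng(0)
true_beta = np.array([-0.5, 1.0, -2.0])
partners = [make_partner_data(rng, n, true_beta) for n in (400, 600, 500)]

# Distributed fit: partners share only gradients and Hessians.
beta_distributed = analysis_center_fit(partners, n_features=3)

# Reference fit on physically pooled data, for comparison only.
X_all = np.vstack([X for X, _ in partners])
y_all = np.concatenate([y for _, y in partners])
beta_pooled = analysis_center_fit([(X_all, y_all)], n_features=3)

print("distributed:", beta_distributed)
print("pooled:     ", beta_pooled)
```

Because the pooled gradient and Hessian are the sums of the per-partner ones, the distributed estimate should agree with the pooled-data estimate up to floating-point error, which is the property that lets this kind of analysis avoid sharing patient-level records.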