Challenges in public healthcare research data warehouse integration and operationalisation.

International Journal of Population Data Science (IF 1.6; JCR Q3, Health Care Sciences & Services). Pub Date: 2022-08-25. DOI: 10.23889/ijpds.v7i3.1859
Tanya Ravipati, N. Andrew, V. Srikanth, R. Beare
Citations: 0

Abstract

Objectives: Public health service organisations use multiple patient administration and electronic health record systems. We describe the implementation of a data warehouse automation tool within the National Centre for Healthy Ageing (NCHA) data platform to operationalise a research data warehouse and optimise data quality and data provision for health services research.

Approach: The traditional data warehouse life cycle comprises repetitive manual tasks and a dependency on specialist developers. Automation tools overcome most of these inefficiencies. We conducted an internal risk-benefit analysis, which was validated against published literature on data warehouse optimisation and automation. Industry-based data warehouse automation tools were reviewed to align the NCHA requirements with each tool’s functionality. Tools were then shortlisted and evaluated over a six-week period on: (1) automation of standard tasks; (2) data pipeline alignment with the World Health Organization’s (WHO) Data Quality Review Framework; and (3) resource dependency risk mitigation through a Proof of Concept (PoC).

Results: The priority areas identified by the risk-benefit analysis included: end-to-end data warehouse automation; auto scripting; connectivity/linkage with multiple sources, reverse/forward engineering, audit trail conformance, scalability, multiple data warehouse architectures support, automated documentation; data management including data quality; and post-subscription independence. Twenty scientific publications were included in the final literature review (10% within healthcare) and supported the majority of the identified priority areas. The industry-based review identified 11 suitable data warehouse/Extract-Transform-Load (ETL) automation tools. Five tools demonstrated adequate performance for task automation, data quality management, reduced dependency on specialist developers and on-premise linkage compatibility. Two automation tools were each tested for six weeks through PoC development. One automation tool met 8 of the 10 automation requirements and was selected for implementation.

Conclusion: Data warehouse development processes are complex and time-consuming. Tools that automate repetitive tasks and scripting increase consistency while reducing dependency on specialist staff. Integrated data quality management minimises the time researchers spend pre-processing patient-level data sourced through a semi-automated data warehouse.
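The abstract reports that the selected tool met 8 of 10 automation requirements but does not list them or say which two were unmet. As a rough illustration of how such a requirements tally could be recorded, the sketch below reads the priority areas named in the Results as a ten-item checklist; that reading is an assumption, not something the abstract states. The tool names ("Tool A", "Tool B") and the met/unmet assignments are placeholders, not study data.

```python
from dataclasses import dataclass

# Priority areas as worded in the Results, read here as ten checklist items.
# ASSUMPTION: the abstract does not say these are the "10 automation requirements".
REQUIREMENTS = [
    "end-to-end data warehouse automation",
    "auto scripting",
    "connectivity/linkage with multiple sources",
    "reverse/forward engineering",
    "audit trail conformance",
    "scalability",
    "multiple data warehouse architectures support",
    "automated documentation",
    "data management including data quality",
    "post-subscription independence",
]


@dataclass
class ToolAssessment:
    name: str      # hypothetical tool label, not a real product
    met: set[str]  # requirements judged as met in a proof of concept

    def score(self) -> int:
        # Count only items that appear in the checklist.
        return len(self.met & set(REQUIREMENTS))

    def unmet(self) -> list[str]:
        # Checklist items the tool did not meet, in checklist order.
        return [r for r in REQUIREMENTS if r not in self.met]


def select_tool(assessments: list[ToolAssessment]) -> ToolAssessment:
    # Pick the tool that meets the most requirements.
    return max(assessments, key=lambda a: a.score())


# PLACEHOLDER data: which items each tool meets is invented for illustration.
tool_a = ToolAssessment("Tool A", set(REQUIREMENTS[:8]))
tool_b = ToolAssessment("Tool B", set(REQUIREMENTS[:5]))

best = select_tool([tool_a, tool_b])
print(f"{best.name}: {best.score()}/{len(REQUIREMENTS)} requirements met")
```

A plain count treats every requirement as equally important. A real evaluation might weight items such as post-subscription independence more heavily, which the abstract neither confirms nor rules out.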
Source journal
CiteScore: 2.50
Self-citation rate: 0.00%
Articles published: 386
Review time: 20 weeks
Latest articles from this journal
Neonates With In-Utero SSRI Exposure (NeoWISE): a retrospective cohort study examining the effect of newborn feeding method on newborn withdrawal.
Secondary use of routinely collected administrative health data for epidemiologic research: Answering research questions using data collected for a different purpose.
Validity of heart failure diagnoses, treatments, and readmissions in the Danish National Patient Registry.
Creating an 11-year longitudinal substance use harm cohort from linked health and census data to analyse social drivers of health.
Research data use in a digital society: a deliberative public engagement.