INSPIRE datahub: a pan-African integrated suite of services for harmonising longitudinal population health data using OHDSI tools.

Frontiers in Digital Health (IF 3.2; JCR Q1, Health Care Sciences & Services). Published: 2024-01-29 (eCollection 2024-01-01). DOI: 10.3389/fdgth.2024.1329630
Tathagata Bhattacharjee, Sylvia Kiwuwa-Muyingo, Chifundo Kanjala, Molulaqhooa L Maoyi, David Amadi, Michael Ochola, Damazo Kadengye, Arofan Gregory, Agnes Kiragga, Amelia Taylor, Jay Greenfield, Emma Slaymaker, Jim Todd
Volume 6, article 1329630. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10859396/pdf/

Abstract

Introduction: Population health data integration remains a critical challenge in low- and middle-income countries (LMICs), hindering the generation of actionable insights to inform policy and decision-making. This paper proposes a pan-African, Findable, Accessible, Interoperable, and Reusable (FAIR) research architecture and infrastructure named the INSPIRE datahub. This cloud-based Platform-as-a-Service (PaaS) and on-premises setup aims to enhance the discovery, integration, and analysis of clinical data, population-based surveys, and other health data sources.

Methods: The INSPIRE datahub, part of the Implementation Network for Sharing Population Information from Research Entities (INSPIRE), employs the Observational Health Data Sciences and Informatics (OHDSI) open-source stack of tools and the Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM) to harmonise data from African longitudinal population studies. Operating on Microsoft Azure and Amazon Web Services cloud platforms, and on on-premises servers, the architecture offers adaptability and scalability for other cloud providers and technology infrastructure. The OHDSI-based tools enable a comprehensive suite of services for data pipeline development, profiling, mapping, extraction, transformation, loading, documentation, anonymization, and analysis.

Results: The INSPIRE datahub's "On-ramp" services facilitate the integration of data and metadata from diverse sources into the OMOP CDM. The datahub supports the implementation of OMOP CDM across data producers, harmonizing source data semantically with standard vocabularies and structurally conforming to OMOP table structures. Leveraging OHDSI tools, the datahub performs quality assessment and analysis of the transformed data. It ensures FAIR data by establishing metadata flows, capturing provenance throughout the ETL processes, and providing accessible metadata for potential users. The ETL provenance is documented in a machine- and human-readable Implementation Guide (IG), enhancing transparency and usability.
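
As a rough illustration of the semantic and structural harmonisation described above, the following minimal Python sketch maps one source record onto a few columns of the OMOP CDM `person` table and logs a provenance entry for the mapping. It is not taken from the INSPIRE pipeline: the source field names (`id`, `sex`, `dob`), the `to_person_row` helper, and the shape of the provenance entry are hypothetical. The gender concept IDs (8507 for male, 8532 for female) come from the OHDSI standard vocabulary, and 0 is the usual convention for a value with no matching concept.

```python
from dataclasses import dataclass
from datetime import date

# OMOP standard concept IDs for gender (OHDSI standard vocabulary).
GENDER_CONCEPTS = {"M": 8507, "F": 8532}


@dataclass
class PersonRow:
    """A subset of the columns of the OMOP CDM person table."""
    person_id: int
    gender_concept_id: int
    year_of_birth: int
    gender_source_value: str


def to_person_row(source: dict, provenance: list) -> PersonRow:
    """Map one hypothetical source record to an OMOP-style person row.

    Each mapping decision is appended to `provenance` so the ETL step
    can later be documented (field names here are assumptions).
    """
    raw_sex = source["sex"].strip().upper()
    concept_id = GENDER_CONCEPTS.get(raw_sex, 0)  # 0 = no matching concept
    provenance.append({
        "source_field": "sex",
        "source_value": source["sex"],
        "target": "person.gender_concept_id",
        "mapped_to": concept_id,
    })
    return PersonRow(
        person_id=source["id"],
        gender_concept_id=concept_id,
        year_of_birth=date.fromisoformat(source["dob"]).year,
        gender_source_value=source["sex"],
    )


provenance_log = []
row = to_person_row({"id": 1, "sex": "f", "dob": "1990-05-17"}, provenance_log)
```

Keeping the original value in `gender_source_value` alongside the standard concept ID means the transformation stays traceable back to the source data, which is the role the abstract assigns to provenance capture.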

Conclusion: The pan-African INSPIRE datahub presents a scalable and systematic solution for integrating health data in LMICs. By adhering to FAIR principles and leveraging established standards like OMOP CDM, this architecture addresses the current gap in generating evidence to support policy and decision-making for improving the well-being of LMIC populations. The federated research network provisions allow data producers to maintain control over their data, fostering collaboration while respecting data privacy and security concerns. A use-case demonstrated the pipeline using OHDSI and other open-source tools.
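
The federated arrangement mentioned above can be pictured with a small Python sketch in which each data producer computes aggregate counts locally and only those aggregates leave the site. This is an assumption-laden illustration, not the mechanism described in the paper: the functions `local_counts` and `combine`, the `condition` field, and the small-cell suppression threshold `min_cell` are all hypothetical.

```python
from collections import Counter


def local_counts(records, min_cell=5):
    """Run at a data producer: count records per condition.

    Counts below `min_cell` are dropped before anything is shared
    (a hypothetical disclosure-control rule, not from the paper).
    """
    counts = Counter(r["condition"] for r in records)
    return {name: n for name, n in counts.items() if n >= min_cell}


def combine(site_results):
    """Run at the coordinating node: sum the aggregates from all sites."""
    total = Counter()
    for result in site_results:
        total.update(result)
    return dict(total)


# Two hypothetical sites; row-level records never leave local_counts().
site_a = [{"condition": "hypertension"}] * 6 + [{"condition": "diabetes"}] * 2
site_b = [{"condition": "hypertension"}] * 5 + [{"condition": "diabetes"}] * 7

combined = combine([local_counts(site_a), local_counts(site_b)])
```

In this sketch the coordinator sees only per-site totals, so each producer keeps control of its row-level data while still contributing to a network-wide result.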
