Development of an OpenMRS-OMOP ETL tool to support informatics research and collaboration in LMICs

Juan Espinoza, Sab Sikder, Armine Lulejian, Barry Levine
Computer Methods and Programs in Biomedicine Update, Volume 4, Article 100119. Published 2023-01-01.
DOI: 10.1016/j.cmpbup.2023.100119
https://www.sciencedirect.com/science/article/pii/S2666990023000277

Abstract

Background

As more low- and middle-income countries (LMICs) implement electronic health record systems (EHRs), informatics has become an important component of global health. OpenMRS is a popular open-source EHR that has been implemented in over 60 countries. As in high-income countries, interoperability and research capabilities remain a challenge. The Observational Medical Outcomes Partnership (OMOP) common data model (CDM) is one of the most relevant CDMs for supporting EHR-based research and data sharing, but its adoption has been limited in LMICs. To address this gap, we developed an OpenMRS-to-OMOP extract, transform, and load (ETL) tool using Talend.

Methods

We built on existing documentation to develop a comprehensive concept map from OpenMRS to OMOP. The OMOP domains were reviewed for concepts that overlap with OpenMRS, and a core set of tables was selected for ETL development. Specific variables in the OpenMRS tables were then identified and mapped to OMOP domain fields. The ETL tool was then developed using MySQL Workbench, PostgreSQL, and Talend.
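To illustrate what one entry in such a concept map might look like, the following is a minimal sketch that maps a single OpenMRS `person` row to an OMOP `person` record. It is not the authors' mapping: the function name `openmrs_person_to_omop` and the choice to set race and ethnicity to 0 (no matching concept) are assumptions made for the example. The gender concept IDs 8507 (male) and 8532 (female) are the OMOP standard concepts.

```python
from datetime import date

# OMOP standard gender concepts; 0 means "no matching concept".
GENDER_CONCEPTS = {"M": 8507, "F": 8532}


def openmrs_person_to_omop(row: dict) -> dict:
    """Map one OpenMRS `person` row (as a dict) to an OMOP `person` record.

    Hypothetical helper for illustration only; a real mapping would also
    need to handle estimated birthdates, voided rows, and missing birth years.
    """
    birthdate = row.get("birthdate")  # a datetime.date or None
    gender = (row.get("gender") or "").strip().upper()
    return {
        "person_id": row["person_id"],
        "gender_concept_id": GENDER_CONCEPTS.get(gender, 0),
        "year_of_birth": birthdate.year if birthdate else None,
        "month_of_birth": birthdate.month if birthdate else None,
        "day_of_birth": birthdate.day if birthdate else None,
        # Assumption: race and ethnicity are not mapped in this sketch.
        "race_concept_id": 0,
        "ethnicity_concept_id": 0,
        # Keep the original value so the mapping can be audited later.
        "gender_source_value": row.get("gender"),
    }


example = openmrs_person_to_omop(
    {"person_id": 42, "gender": "F", "birthdate": date(1990, 5, 17)}
)
```

Keeping the source value next to the mapped concept ID is a common OMOP convention, since it lets unmapped or mis-mapped values be reviewed after loading.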

Results

Seven of 14 OMOP domains were selected for ETL pipeline development. The location, person, and provider domains required the fewest Talend job components: at most two tDBInput components, one tMap, and one tDBOutput. Care_site, observation_period, observation, and person_death all required additional Talend components to properly transform the respective data fields. It took 15 min to transform 9,932 OpenMRS observation records to OMOP (about 11 records per second).
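The simplest job shape described above (read from a source database, map fields, write to a target database) can be sketched outside Talend as follows. This is an illustration of the pattern, not the authors' Talend job: it uses Python's built-in `sqlite3` with in-memory databases as stand-ins for the MySQL source and PostgreSQL target, and the function names `run_location_job` and `demo` are hypothetical. The column-to-field pairing shown is one plausible mapping for the location domain, chosen for the example.

```python
import sqlite3


def run_location_job(source: sqlite3.Connection, target: sqlite3.Connection) -> int:
    """Copy OpenMRS-style location rows into an OMOP-style location table."""
    # Input step (analogous to tDBInput): read rows from the source table.
    rows = source.execute(
        "SELECT location_id, address1, address2, city_village, "
        "state_province, postal_code, name FROM location"
    ).fetchall()

    # Mapping step (analogous to tMap): rename source columns to target fields.
    mapped = [
        {
            "location_id": r[0],
            "address_1": r[1],
            "address_2": r[2],
            "city": r[3],
            "state": r[4],
            "zip": r[5],
            "location_source_value": r[6],
        }
        for r in rows
    ]

    # Output step (analogous to tDBOutput): write mapped rows to the target.
    target.executemany(
        "INSERT INTO location (location_id, address_1, address_2, city, state, "
        "zip, location_source_value) VALUES (:location_id, :address_1, "
        ":address_2, :city, :state, :zip, :location_source_value)",
        mapped,
    )
    target.commit()
    return len(mapped)


def demo() -> list:
    """Build tiny in-memory source and target tables and run the job once."""
    source = sqlite3.connect(":memory:")
    target = sqlite3.connect(":memory:")
    source.execute(
        "CREATE TABLE location (location_id INTEGER, name TEXT, address1 TEXT, "
        "address2 TEXT, city_village TEXT, state_province TEXT, postal_code TEXT)"
    )
    source.execute(
        "INSERT INTO location VALUES "
        "(1, 'Example Clinic', '1 Main Rd', NULL, 'Exampleville', 'EX', '00000')"
    )
    target.execute(
        "CREATE TABLE location (location_id INTEGER, address_1 TEXT, address_2 TEXT, "
        "city TEXT, state TEXT, zip TEXT, location_source_value TEXT)"
    )
    run_location_job(source, target)
    return target.execute(
        "SELECT location_id, city, zip, location_source_value FROM location"
    ).fetchall()
```

Domains such as observation would need more than this single mapping step, for example lookups against a concept map, which is consistent with the additional components reported for those jobs.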

Conclusions

It is feasible to develop a free, open-source ETL pipeline to transform clinical data in OpenMRS instances into OMOP. Processing large datasets is swift and scalable, with potential for further improvement. Using this tool alongside OpenMRS can dramatically increase the potential for global health informatics collaborations and for building local infrastructure and research capacity. Further testing and development will be required prior to widespread dissemination, along with appropriate documentation and training resources.
