Background
As more low- and middle-income countries (LMICs) implement electronic health record systems (EHRs), informatics has become an important component of global health. OpenMRS is a popular open-source EHR that has been implemented in over 60 countries. As in high-income countries, interoperability and research capabilities remain a challenge. The Observational Medical Outcomes Partnership (OMOP) common data model (CDM) is one of the most relevant CDMs for supporting EHR-based research and data sharing, but its adoption has been limited in LMICs. To address this gap, we developed an OpenMRS-to-OMOP extract, transform, and load (ETL) tool using Talend.
Methods
We built on existing documentation to develop a comprehensive concept map from OpenMRS to OMOP. The OMOP domains were reviewed for overlapping concepts in OpenMRS, and a core set of tables was selected for ETL development. Specific variables that mapped to OMOP domain fields were then identified in the OpenMRS tables. Finally, the ETL tool was developed using MySQL Workbench, PostgreSQL, and Talend.
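To illustrate what a field-level mapping of this kind involves, the following is a minimal Python sketch of transforming one OpenMRS person row into an OMOP person record. It is not the Talend job described here: the function name `openmrs_person_to_omop`, the use of plain dictionaries, and the choice to set race and ethnicity to concept 0 are assumptions made for illustration. The gender concept IDs (8507 for male, 8532 for female) are the standard OMOP vocabulary values.

```python
from datetime import date

# Standard OMOP gender concepts: 8507 = MALE, 8532 = FEMALE.
# 0 is the OMOP convention for "no matching concept".
GENDER_CONCEPT_IDS = {"M": 8507, "F": 8532}


def openmrs_person_to_omop(row: dict) -> dict:
    """Map one OpenMRS `person` row (hypothetical dict form) to an OMOP `person` record."""
    birthdate = row.get("birthdate")
    gender = (row.get("gender") or "").upper()
    return {
        "person_id": row["person_id"],
        "gender_concept_id": GENDER_CONCEPT_IDS.get(gender, 0),
        # OMOP requires year_of_birth; rows with no birthdate would need
        # separate handling, which this sketch does not attempt.
        "year_of_birth": birthdate.year if birthdate else None,
        "month_of_birth": birthdate.month if birthdate else None,
        "day_of_birth": birthdate.day if birthdate else None,
        # Assumption: race and ethnicity are not mapped, so they default to 0.
        "race_concept_id": 0,
        "ethnicity_concept_id": 0,
        "person_source_value": str(row["person_id"]),
        "gender_source_value": row.get("gender"),
    }


example = openmrs_person_to_omop(
    {"person_id": 1, "gender": "F", "birthdate": date(1990, 5, 17)}
)
```

In this sketch, unrecognized or missing gender values fall back to concept 0 rather than raising an error, so the source value is preserved in `gender_source_value` for later review.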
Results
Seven of 14 OMOP domains were selected for ETL pipeline development. The location, person, and provider domains required the fewest Talend job components: ≤2 tDBInputs, 1 tMap, and 1 tDBOutput. Care_site, observation_period, observation, and person_death all required additional Talend components to properly transform the respective data fields. Transforming 9,932 OpenMRS observation records to OMOP took 15 min.
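The reported run time can be restated as a throughput figure. The short Python sketch below does that arithmetic; the helper name `records_per_minute` is hypothetical, and the result describes only this single run, not a benchmark of the tool.

```python
def records_per_minute(n_records: int, minutes: float) -> float:
    """Average throughput for a single ETL run."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return n_records / minutes


rate = records_per_minute(9932, 15)
print(f"{rate:.0f} records/min, {rate / 60:.1f} records/s")
```

This works out to roughly 662 observation records per minute, or about 11 per second. Extrapolating that rate to larger datasets would assume that run time grows linearly with record count, which is not established here.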
Conclusions
It is feasible to develop a free, open-source ETL pipeline to transform clinical data in OpenMRS instances into OMOP. Processing was swift in our testing and appears scalable to larger datasets, with room for further improvement. Using this tool alongside OpenMRS could substantially increase the potential for global health informatics collaborations and for building local infrastructure and research capacity. Further testing and development, along with appropriate documentation and training resources, will be required prior to widespread dissemination.