Semantic enrichment of Pomeranian health study data using LOINC and WHO-FIC terminology mapping principles
Esther Thea Inau, Dörte Radke, Linda Bird, Susanne Westphal, Till Ittermann, Christian Schäfer, Matthias Nauck, Atinkut Alamirrew Zeleke, Carsten Oliver Schmidt, Dagmar Waltemath
JAMIA Open, 8(2), ooaf010. Published 2025-03-06. DOI: https://doi.org/10.1093/jamiaopen/ooaf010
Abstract
Objective: To semantically enrich the laboratory data dictionary of the Study of Health in Pomerania (SHIP), a population-based cohort study, with LOINC to achieve better compliance with the FAIR principles for data stewardship.
Materials and methods: We employed a workflow that maps codes from the SHIP-START-4 laboratory data dictionary to LOINC codes following the terminology mapping principles and best practices recommended by the World Health Organization Family of International Classifications (WHO-FIC) Network.
Results: We annotated 71 of the 72 source codes (98.6%) in the SHIP-START-4 laboratory data dictionary with LOINC codes. Thirty-two source codes were mapped to a single LOINC code (cardinality 1:1) and 39 resulted in a complex mapping. All of the successful mappings are equivalent (=) matches.
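The bookkeeping behind these figures can be illustrated with a minimal sketch. It is not the authors' workflow: the `Mapping` record, the `summarize` helper, and the source item names are hypothetical, and treating a "complex" mapping as one with more than one target LOINC code is an assumption, since the abstract does not define the term. The example entries are illustrative and are not taken from the SHIP data dictionary.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Mapping:
    """One data dictionary item and its target codes (hypothetical structure)."""
    source_code: str                  # hypothetical source variable name
    loinc_codes: Tuple[str, ...]      # zero, one, or several target LOINC codes
    relationship: Optional[str] = "="  # "=" marks an equivalent match

    @property
    def cardinality(self) -> str:
        # Assumption: more than one target code counts as a complex mapping.
        if not self.loinc_codes:
            return "unmapped"
        return "1:1" if len(self.loinc_codes) == 1 else "complex"


def summarize(mappings):
    """Return per-cardinality counts and the percentage of mapped items."""
    counts = Counter(m.cardinality for m in mappings)
    mapped = sum(n for kind, n in counts.items() if kind != "unmapped")
    coverage = 100.0 * mapped / len(mappings) if mappings else 0.0
    return counts, round(coverage, 1)


# Illustrative entries only; not taken from the SHIP data dictionary.
example = [
    Mapping("src_item_a", ("2345-7",)),
    Mapping("src_item_b", ("2345-7", "14749-6")),
    Mapping("src_item_c", (), relationship=None),
]
counts, coverage = summarize(example)
print(dict(counts), coverage)
```

Applied to a dictionary with 32 one-to-one mappings, 39 complex mappings, and one unmapped item, the same helper would report the coverage stated above.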
Discussion: We increased the FAIRness of the SHIP laboratory data dictionary by semantically enriching laboratory items with links to an accessible, established, and machine-readable language for knowledge representation (LOINC). Our mapping improves semantic data retrieval and integration. However, a LOINC code does not capture all clinically relevant information about a measurement. These missing aspects must therefore also be considered when interpreting the data.
Conclusion: Semantically enriching the SHIP-START-4 laboratory data dictionary has contributed to its improved data interoperability and reuse. We recommend that data owners and standardization experts collaboratively perform annotations before data collection starts instead of doing this retrospectively. These experiences may inform the development of standard operating procedures for annotating data dictionaries developed for other population-based cohort studies.