Semantic modelling of common data elements for rare disease registries, and a prototype workflow for their deployment over registry data

IF 1.6 · CAS Zone 3 (Engineering & Technology) · JCR Q3 (Mathematical & Computational Biology) · Journal of Biomedical Semantics · Pub Date: 2022-03-15 · DOI: 10.1186/s13326-022-00264-6
Kaliyaperumal, Rajaram; Wilkinson, Mark D.; Moreno, Pablo Alarcón; Benis, Nirupama; Cornet, Ronald; dos Santos Vieira, Bruna; Dumontier, Michel; Bernabé, César Henrique; Jacobsen, Annika; Le Cornec, Clémence M. A.; Godoy, Mario Prieto; Queralt-Rosinach, Núria; Schultze Kool, Leo J.; Swertz, Morris A.; van Damme, Philip; van der Velde, K. Joeri; Lalout, Nawel; Zhang, Shuxin; Roos, Marco
Citations: 16

Abstract

The European Platform on Rare Disease Registration (EU RD Platform) aims to address the fragmentation of European rare disease (RD) patient data, scattered among hundreds of independent and non-coordinating registries, by establishing standards for integration and interoperability. The first practical output of this effort was a set of 16 Common Data Elements (CDEs) that should be implemented by all RD registries. Interoperability, however, requires decisions beyond data elements, including data models, formats, and semantics. Within the European Joint Programme on Rare Diseases (EJP RD), we aim to further the goals of the EU RD Platform by generating reusable RD semantic model templates that follow the FAIR Data Principles. Through a team-based iterative approach, we created semantically grounded models to represent each of the CDEs, using the SemanticScience Integrated Ontology as the core framework for representing the entities and their relationships. Within that framework, we mapped the concepts represented in the CDEs, and their possible values, into domain ontologies such as the Orphanet Rare Disease Ontology, Human Phenotype Ontology and National Cancer Institute Thesaurus. Finally, we created an exemplar, reusable ETL pipeline that we will be deploying over these non-coordinating data repositories to assist them in creating model-compliant FAIR data without requiring site-specific coding or expertise in Linked Data or FAIR. Within the EJP RD project, we determined that creating reusable, expert-designed templates reduced or eliminated the requirement for our participating biomedical domain experts and rare disease data hosts to understand OWL semantics. This enabled them to publish highly expressive FAIR data using tools and approaches that were already familiar to them.
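To make the template idea concrete, the following is a minimal sketch of how one row of registry data might be expanded into RDF through a fixed, expert-designed pattern. It is not the authors' pipeline and does not reproduce the published CDE models: the base IRI, the column names (`patient_id`, `orpha_code`, `diagnosis_label`), the sample row, and the simplified three-triple pattern are all assumptions made for illustration. The two SemanticScience Integrated Ontology properties are cited from memory and should be checked against the ontology before reuse.

```python
import csv
import io

# Hypothetical base IRI for resources minted by a registry (an assumption).
BASE = "https://example.org/registry/"
SIO = "http://semanticscience.org/resource/"
ORDO = "http://www.orpha.net/ORDO/Orphanet_"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# SIO properties assumed for this sketch; verify the identifiers before reuse.
HAS_ATTRIBUTE = SIO + "SIO_000008"  # 'has attribute'
HAS_VALUE = SIO + "SIO_000300"      # 'has value'


def iri(value):
    """Wrap an IRI string in N-Triples angle brackets."""
    return f"<{value}>"


def literal(value):
    """Render a plain string literal, escaping backslashes and quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def diagnosis_triples(row):
    """Expand one CSV row into N-Triples lines using a fixed template.

    The data host only supplies the row; the shape of the triples is
    decided once, by whoever designs the template.
    """
    patient = BASE + "patient/" + row["patient_id"]
    diagnosis = BASE + "diagnosis/" + row["patient_id"]
    return [
        f"{iri(patient)} {iri(HAS_ATTRIBUTE)} {iri(diagnosis)} .",
        f"{iri(diagnosis)} {iri(RDF_TYPE)} {iri(ORDO + row['orpha_code'])} .",
        f"{iri(diagnosis)} {iri(HAS_VALUE)} {literal(row['diagnosis_label'])} .",
    ]


def csv_to_ntriples(text):
    """Read CSV text with a header row and return all generated triples."""
    lines = []
    for row in csv.DictReader(io.StringIO(text)):
        lines.extend(diagnosis_triples(row))
    return lines


# Illustrative sample data, not taken from any registry.
SAMPLE = "patient_id,orpha_code,diagnosis_label\np001,558,Marfan syndrome\n"

if __name__ == "__main__":
    for line in csv_to_ntriples(SAMPLE):
        print(line)
```

In a sketch like this, the registry maintainer works only with a familiar tabular file, while the ontology choices stay inside the template functions, which is the division of labour the abstract describes.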