How to customize common data models for rare diseases: an OMOP-based implementation and lessons learned.

Orphanet Journal of Rare Diseases (IF 3.4; Zone 2, Medicine; JCR Q2, Genetics & Heredity) Pub Date: 2024-08-14 DOI: 10.1186/s13023-024-03312-9
Najia Ahmadi, Michele Zoch, Oya Guengoeze, Carlo Facchinello, Antonia Mondorf, Katharina Stratmann, Khader Musleh, Hans-Peter Erasmus, Jana Tchertov, Richard Gebler, Jannik Schaaf, Lena S Frischen, Azadeh Nasirian, Jiabin Dai, Elisa Henke, Douglas Tremblay, Andrew Srisuwananukorn, Martin Bornhäuser, Christoph Röllig, Jan-Niklas Eckardt, Jan Moritz Middeke, Markus Wolfien, Martin Sedlmayr
Citations: 0

Abstract

How to customize common data models for rare diseases: an OMOP-based implementation and lessons learned.

Background: Given the geographical sparsity of Rare Diseases (RDs), assembling a cohort is often a challenging task. Common data models (CDMs) can harmonize disparate sources of data, which can then serve as the basis of decision support systems and artificial intelligence-based studies, leading to new insights in the field. This work seeks to support the design of large-scale multi-center studies for rare diseases.

Methods: In an interdisciplinary group, we derived a list of RD data elements in three medical domains (endocrinology, gastroenterology, and pneumonology) in an iterative process, according to specialist knowledge and clinical guidelines. We then defined an RD data structure that matched all our data elements and built Extract, Transform, Load (ETL) processes to transfer the structure to a joint CDM. To ensure interoperability of our developed CDM and its subsequent use for further RD domains, we ultimately mapped it to the Observational Medical Outcomes Partnership (OMOP) CDM. We then included a fourth domain, hematology, as a proof-of-concept and mapped an acute myeloid leukemia (AML) dataset to the developed CDM.

Results: We have developed an OMOP-based rare diseases common data model (RD-CDM) using data elements from the three domains (endocrinology, gastroenterology, and pneumonology) and tested the CDM using data from the hematology domain. The total study cohort included 61,697 patients. After aligning our modules with those of the Medical Informatics Initiative (MII) Core Dataset (CDS), we leveraged its ETL process. This facilitated the seamless transfer of the demographic information, diagnoses, procedures, laboratory results, and medication modules from our RD-CDM to the OMOP CDM. For the phenotypes and genotypes, we developed a second ETL process. Finally, we derived lessons learned for customizing our RD-CDM for different RDs.
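
To make the module-to-OMOP transfer described above more concrete, the following is a minimal sketch of what mapping a demographic and a diagnosis record to OMOP CDM rows might look like. It is not the authors' ETL or the MII ETL: the SourcePatient class, its field names, the two functions, and the concept_lookup dictionary are all hypothetical and chosen only for illustration. The column names follow the OMOP person and condition_occurrence tables (only a subset is shown), 8507 and 8532 are the standard OMOP gender concepts, and 0 is the OMOP convention for "no matching concept".

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import date

# Standard OMOP gender concepts (8507 = MALE, 8532 = FEMALE).
GENDER_CONCEPTS = {"male": 8507, "female": 8532}
NO_MATCHING_CONCEPT = 0  # OMOP convention for unmapped values


@dataclass
class SourcePatient:
    """Hypothetical source record; the field names are illustrative only."""
    patient_id: int
    sex: str
    birth_date: date
    diagnoses: list[tuple[str, date]]  # (source diagnosis code, diagnosis date)


def to_person_row(p: SourcePatient) -> dict:
    """Build a subset of an OMOP `person` row from the demographic fields."""
    return {
        "person_id": p.patient_id,
        "gender_concept_id": GENDER_CONCEPTS.get(p.sex.lower(), NO_MATCHING_CONCEPT),
        "year_of_birth": p.birth_date.year,
        "month_of_birth": p.birth_date.month,
        "day_of_birth": p.birth_date.day,
        "gender_source_value": p.sex,
    }


def to_condition_rows(p: SourcePatient, concept_lookup: dict[str, int]) -> list[dict]:
    """Build a subset of OMOP `condition_occurrence` rows, one per diagnosis.

    `concept_lookup` maps source codes to standard concept ids; in a real ETL
    it would be derived from the OMOP vocabulary tables. Codes that are not
    in the lookup are kept as source values and mapped to concept id 0.
    """
    return [
        {
            "person_id": p.patient_id,
            "condition_concept_id": concept_lookup.get(code, NO_MATCHING_CONCEPT),
            "condition_start_date": dx_date,
            "condition_source_value": code,
        }
        for code, dx_date in p.diagnoses
    ]


if __name__ == "__main__":
    patient = SourcePatient(1, "female", date(1980, 5, 17), [("C92.0", date(2020, 1, 2))])
    # 123456 is a placeholder, not a real OMOP concept id.
    print(to_person_row(patient))
    print(to_condition_rows(patient, {"C92.0": 123456}))
```

Keeping the source code in condition_source_value while writing 0 for unmapped concepts means that records with incomplete vocabulary coverage are preserved and can be remapped later, rather than being dropped during the load.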

Discussion: This work can serve as a blueprint for other domains, as its modularized structure could be extended towards novel data types. Reaching a comprehensive CDM requires an interdisciplinary group of stakeholders who actively support the project's progress.

Conclusion: The customized data structure related to our RD-CDM can be used to perform multi-center studies to test data-driven hypotheses on a larger scale and take advantage of the analytical tools offered by the OHDSI community.

Source journal
Orphanet Journal of Rare Diseases (Medicine: Research & Experimental)
CiteScore: 6.30
Self-citation rate: 8.10%
Articles published: 418
Review time: 4-8 weeks
Journal description: Orphanet Journal of Rare Diseases is an open access, peer-reviewed journal that encompasses all aspects of rare diseases and orphan drugs. The journal publishes high-quality reviews on specific rare diseases. In addition, the journal may consider articles on clinical trial outcome reports, either positive or negative, and articles on public health issues in the field of rare diseases and orphan drugs. The journal does not accept case reports.