The FAIR Data Point Populator: collaborative FAIRification and population of FAIR Data Points

bioRxiv - Bioinformatics. Pub Date: 2024-09-10. DOI: 10.1101/2024.09.06.611505

Daphne Wijnbergen, Rajaram Kaliyaperumal, Kees Burger, Luiz Olavo Bonino da Silva Santos, Barend Mons, Marco Roos, Eleni Mina



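Abstract

Background: Use of the FAIR principles (Findable, Accessible, Interoperable and Reusable) allows the rapidly growing number of biomedical datasets to be optimally (re)used. An important aspect of the FAIR principles is metadata. The FAIR Data Point specifications and reference implementation have been designed as an example of how to publish metadata according to the FAIR principles. Various tools for creating metadata exist, but many of them have limitations, such as interfaces that are not intuitive, metadata that does not adhere to a common metadata schema, limited scalability, and inefficient collaboration. We aim to address these limitations with the FAIR Data Point Populator.

Results: The FAIR Data Point Populator consists of a GitHub workflow together with Excel templates that have tooltips, validation and documentation. The Excel templates are targeted towards non-technical users and can be used collaboratively in online spreadsheet software. A more technical user then uses the GitHub workflow to read multiple entries in the Excel sheets and transform them into machine-readable metadata. This metadata is then automatically uploaded to a connected FAIR Data Point. We applied the FAIR Data Point Populator to the metadata of two datasets and a patient registry. We were then able to run a query on the FAIR Data Point Index to retrieve one of the datasets.

Conclusion: The FAIR Data Point Populator addresses several limitations of other tools. It makes creating metadata easier, ensures adherence to a common metadata schema, allows bulk creation of metadata entries, and increases collaboration. As a result, the barrier to entry for FAIRification is lower, which enables more people to create FAIR data.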
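The spreadsheet-to-metadata step described in the Results can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes a CSV export in place of the Excel templates, and the column names (`title`, `description`, `publisher`, `license`), the `/dataset/{index}` IRI pattern, and all function names are hypothetical. It only shows the general idea of validating each row and rendering the valid ones as DCAT-style Turtle; the upload to a FAIR Data Point is left out.

```python
import csv
import io

# Hypothetical column names; the real Excel templates define their own.
REQUIRED = ("title", "description", "publisher", "license")

PREFIXES = (
    "@prefix dcat: <http://www.w3.org/ns/dcat#> .\n"
    "@prefix dcterms: <http://purl.org/dc/terms/> .\n"
)


def escape(text):
    """Escape a value for use inside a double-quoted Turtle literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def validate(row):
    """Return the names of required columns that are missing or blank."""
    return [col for col in REQUIRED if not (row.get(col) or "").strip()]


def row_to_turtle(row, base_iri, index):
    """Render one spreadsheet row as a dcat:Dataset description."""
    # Assumption: a placeholder IRI built from the row number.
    subject = f"<{base_iri}/dataset/{index}>"
    lines = [
        f"{subject} a dcat:Dataset ;",
        f'    dcterms:title "{escape(row["title"].strip())}" ;',
        f'    dcterms:description "{escape(row["description"].strip())}" ;',
        f'    dcterms:publisher <{row["publisher"].strip()}> ;',
        f'    dcterms:license <{row["license"].strip()}> .',
    ]
    return "\n".join(lines)


def sheet_to_turtle(csv_text, base_iri):
    """Convert every valid row; collect (row_number, missing_columns) for the rest."""
    documents, errors = [], []
    reader = csv.DictReader(io.StringIO(csv_text))
    for index, row in enumerate(reader, start=1):
        missing = validate(row)
        if missing:
            errors.append((index, missing))
        else:
            documents.append(PREFIXES + "\n" + row_to_turtle(row, base_iri, index))
    return documents, errors


# Synthetic example rows, not taken from the paper.
SHEET = """title,description,publisher,license
Example registry,Synthetic example entry,https://example.org/org/1,https://creativecommons.org/licenses/by/4.0/
,Row with no title,https://example.org/org/1,https://creativecommons.org/licenses/by/4.0/
"""

docs, errs = sheet_to_turtle(SHEET, "https://example.org/fdp")
print(docs[0])
print(errs)
```

Reporting incomplete rows separately, rather than stopping at the first one, matches the bulk-entry use case: one bad row does not block the remaining entries from being converted.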


