The FAIR Data Point Populator: collaborative FAIRification and population of FAIR Data Points

Daphne Wijnbergen, Rajaram Kaliyaperumal, Kees Burger, Luiz Olavo Bonino da Silva Santos, Barend Mons, Marco Roos, Eleni Mina
Journal: bioRxiv - Bioinformatics
Publication date: 2024-09-10
DOI: 10.1101/2024.09.06.611505 (https://doi.org/10.1101/2024.09.06.611505)
Citations: 0

Abstract

Background: Use of the FAIR principles (Findable, Accessible, Interoperable and Reusable) allows the rapidly growing number of biomedical datasets to be optimally (re)used. An important aspect of the FAIR principles is metadata. The FAIR Data Point specifications and reference implementation have been designed as an example of how to publish metadata according to the FAIR principles. Various tools for creating metadata exist, but many of them have limitations, such as unintuitive interfaces, metadata that does not adhere to a common metadata schema, limited scalability, and inefficient collaboration. We aim to address these limitations with the FAIR Data Point Populator.

Results: The FAIR Data Point Populator consists of a GitHub workflow together with Excel templates that have tooltips, validation and documentation. The Excel templates are targeted towards non-technical users and can be used collaboratively in online spreadsheet software. A more technical user then uses the GitHub workflow to read multiple entries from the Excel sheets and transform them into machine-readable metadata. This metadata is then automatically uploaded to a connected FAIR Data Point. We applied the FAIR Data Point Populator to the metadata of two datasets and a patient registry. We were then able to run a query on the FAIR Data Point Index to retrieve one of the datasets.

Conclusion: The FAIR Data Point Populator addresses several limitations of other tools. It makes creating metadata easier, ensures adherence to a common metadata schema, allows bulk creation of metadata entries and increases collaboration. As a result, the barrier to entry for FAIRification is lower, which enables more people to create FAIR data.
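The transformation step described in the Results (spreadsheet entries turned into machine-readable metadata) can be illustrated with a minimal Python sketch. This is not the authors' GitHub workflow. It assumes a hypothetical template with the columns `title`, `description` and `keywords` (semicolon-separated), reads it as CSV rather than Excel so that only the standard library is needed, and emits DCAT dataset descriptions in Turtle. The base IRI, the sample rows and all function names are made up for illustration. Uploading the result to a FAIR Data Point is left out, since that depends on the target server and its credentials.

```python
import csv
import io

# Hypothetical template content; the real Excel templates may use other columns.
SAMPLE_SHEET = """title,description,keywords
Example dataset A,Synthetic description for illustration,genomics;rare disease
Example dataset B,Another illustrative entry,registry
"""

PREFIXES = (
    "@prefix dcat: <http://www.w3.org/ns/dcat#> .\n"
    "@prefix dcterms: <http://purl.org/dc/terms/> .\n"
)


def escape_literal(text):
    """Escape backslashes, double quotes and newlines for a Turtle string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def row_to_turtle(row, base_iri, index):
    """Turn one spreadsheet row into a Turtle description of a dcat:Dataset."""
    subject = f"<{base_iri}dataset/{index}>"
    lines = [
        f"{subject} a dcat:Dataset ;",
        f'    dcterms:title "{escape_literal(row["title"])}" ;',
        f'    dcterms:description "{escape_literal(row["description"])}" ;',
    ]
    # Keywords are assumed to be separated by semicolons within one cell.
    keywords = [k.strip() for k in row["keywords"].split(";") if k.strip()]
    for keyword in keywords:
        lines.append(f'    dcat:keyword "{escape_literal(keyword)}" ;')
    # Close the statement: swap the trailing " ;" of the last line for " ."
    lines[-1] = lines[-1][:-2] + " ."
    return "\n".join(lines)


def sheet_to_turtle(csv_text, base_iri):
    """Convert every row of the sheet into one Turtle document (bulk creation)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    blocks = [
        row_to_turtle(row, base_iri, i) for i, row in enumerate(reader, start=1)
    ]
    return PREFIXES + "\n" + "\n\n".join(blocks) + "\n"


# The base IRI is a placeholder, not a real FAIR Data Point.
turtle = sheet_to_turtle(SAMPLE_SHEET, "https://example.org/fdp/")
print(turtle)
```

Each row becomes one dataset description, which mirrors the bulk creation of metadata entries mentioned in the abstract. A real pipeline would likely use an RDF library and validate the rows against the metadata schema before upload.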