Universal Workflow Language and Software Enables Geometric Learning and FAIR Scientific Protocol Reporting

Robert W. Epps, Amanda A. Volk, Robert R. White, Robert Tirawat, Rosemary C. Bramante, Joseph J. Berry
arXiv - PHYS - Physics and Society. Published 2024-09-05.
DOI: arxiv-2409.05899 (https://doi.org/arxiv-2409.05899)
Citations: 0

Abstract

The modern technological landscape has trended towards increased precision and greater digitization of information. However, the methods used to record and communicate scientific procedures have remained largely unchanged over the last century. Written text as the primary means for communicating scientific protocols poses notable limitations in human and machine information transfer. In this work, we present the Universal Workflow Language (UWL) and the open-source Universal Workflow Language interface (UWLi). UWL is a graph-based data architecture that can capture arbitrary scientific procedures through workflow representation of protocol steps and embedded procedure metadata. It is machine readable, discipline agnostic, and compatible with FAIR reporting standards. UWLi is an accompanying software package for building UWL files and converting them into tabular and plain text representations in a controlled, detailed, and multilingual format. UWL transcription of protocols from three high-impact publications revealed substantial deficiencies in the detail of the reported procedures: seventeen procedural ambiguities and thirty missing parameters for every one hundred words in the published procedures. In addition to preventing and identifying procedural omission, UWL files were found to be compatible with geometric learning techniques for representing scientific protocols. In a surrogate function designed to represent an arbitrary multi-step experimental process, graph transformer networks were able to predict outcomes in approximately 6,000 fewer experiments than equivalent linear models. Implementation of UWL and UWLi into the scientific reporting process will result in higher reproducibility between both experimentalists and machines, thus providing an avenue to more effective modeling and control of complex systems.
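To make the idea of a graph-based protocol representation concrete, the following is a minimal sketch in Python. It is not the UWL schema or the UWLi API, neither of which is described on this page. The class names (`Step`, `Workflow`), the method names, and the example steps and values are all hypothetical, chosen only to illustrate how steps stored as graph nodes with embedded parameters can be rendered as a table or as plain text, and how unreported parameters can be flagged.

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    """One protocol step (hypothetical model): an action plus its parameters.

    A parameter value of None marks a parameter the source text did not report.
    """
    step_id: str
    action: str
    params: dict = field(default_factory=dict)


@dataclass
class Workflow:
    """Directed graph of steps; each edge points from a step to a successor."""
    steps: dict = field(default_factory=dict)  # step_id -> Step
    edges: list = field(default_factory=list)  # (source_id, target_id)

    def add_step(self, step, after=None):
        self.steps[step.step_id] = step
        if after is not None:
            self.edges.append((after, step.step_id))

    def ordered(self):
        """Return steps in topological order (Kahn's algorithm)."""
        indegree = {sid: 0 for sid in self.steps}
        for _, dst in self.edges:
            indegree[dst] += 1
        ready = [sid for sid, deg in indegree.items() if deg == 0]
        order = []
        while ready:
            sid = ready.pop(0)
            order.append(self.steps[sid])
            for src, dst in self.edges:
                if src == sid:
                    indegree[dst] -= 1
                    if indegree[dst] == 0:
                        ready.append(dst)
        return order

    def to_table(self):
        """Tabular view: one row per (step, action, parameter, value)."""
        return [(s.step_id, s.action, name, value)
                for s in self.ordered()
                for name, value in s.params.items()]

    def to_text(self):
        """Plain-text view: one line per step, flagging unreported values."""
        lines = []
        for s in self.ordered():
            described = ", ".join(
                f"{k}={'UNSPECIFIED' if v is None else v}"
                for k, v in s.params.items())
            lines.append(f"{s.step_id}: {s.action} ({described})")
        return "\n".join(lines)

    def missing_parameters(self):
        """List (step_id, parameter) pairs whose value was not reported."""
        return [(sid, name)
                for sid, _, name, value in self.to_table() if value is None]


# Illustrative protocol with made-up steps and values.
wf = Workflow()
wf.add_step(Step("s1", "dissolve", {"solvent": "DMF", "volume_mL": 1.0}))
wf.add_step(Step("s2", "spin_coat", {"speed_rpm": 4000, "duration_s": None}),
            after="s1")
wf.add_step(Step("s3", "anneal", {"temperature_C": 100, "duration_min": None}),
            after="s2")

print(wf.to_text())
print(wf.missing_parameters())
```

Because each step declares its parameters explicitly, an omitted value shows up as a detectable gap rather than a silent absence, which is the kind of check the abstract attributes to UWL transcription. The same node and edge lists could also serve as input to a graph learning model, although the abstract does not describe how the authors encode UWL files for that purpose.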