生物多样性信息系统时间数据来源的自动生成

Q2 Computer Science Interdisciplinary Journal of Information, Knowledge, and Management Pub Date : 2022-01-01 DOI:10.28945/5003

Zaenal Akbar, Dadan Ridwan Saleh, Yulia Aris Kartika, W. Fatriasari, Adila A Krisnadhi, Deded Sarip Nawawi

{"title":"生物多样性信息系统时间数据来源的自动生成","authors":"Zaenal Akbar, Dadan Ridwan Saleh, Yulia Aris Kartika, W. Fatriasari, Adila A Krisnadhi, Deded Sarip Nawawi","doi":"10.28945/5003","DOIUrl":null,"url":null,"abstract":"Aim/Purpose: Although the significance of data provenance has been recognized in a variety of sectors, there is currently no standardized technique or approach for gathering data provenance. The present automated technique mostly employs workflow-based strategies. Unfortunately, the majority of current information systems do not embrace the strategy, particularly biodiversity information systems in which data is acquired by a variety of persons using a wide range of equipment, tools, and protocols. Background: This article presents an automated technique for producing temporal data provenance that is independent of biodiversity information systems. The approach is dependent on the changes in contextual information of data items. By mapping the modifications to a schema, a standardized representation of data provenance may be created. Consequently, temporal information may be automatically inferred. Methodology: The research methodology consists of three main activities: database event detection, event-schema mapping, and temporal information inference. First, a list of events will be detected from databases. After that, the detected events will be mapped to an ontology, so a common representation of data provenance will be obtained. Based on the derived data provenance, rule-based reasoning will be automatically used to infer temporal information. Consequently, a temporal provenance will be produced. Contribution: This paper provides a new method for generating data provenance automatically without interfering with the existing biodiversity information system. In addition to this, it does not mandate that any information system adheres to any particular form. Ontology and the rule-based system as the core components of the solution have been confirmed to be highly valuable in biodiversity science. Findings: Detaching the solution from any biodiversity information system provides scalability in the implementation. Based on the evaluation of a typical biodiversity information system for species traits of plants, a high number of temporal information can be generated to the highest degree possible. Using rules to encode different types of knowledge provides high flexibility to generate temporal information, enabling different temporal-based analyses and reasoning. Recommendations for Practitioners: The strategy is based on the contextual information of data items, yet most information systems simply save the most recent ones. As a result, in order for the solution to function properly, database snapshots must be stored on a frequent basis. Furthermore, a more practical technique for recording changes in contextual information would be preferable. Recommendation for Researchers: The capability to uniformly represent events using a schema has paved the way for automatic inference of temporal information. Therefore, a richer representation of temporal information should be investigated further. Also, this work demonstrates that rule-based inference provides flexibility to encode different types of knowledge from experts. Consequently, a variety of temporal-based data analyses and reasoning can be performed. Therefore, it will be better to investigate multiple domain-oriented knowledge using the solution. Impact on Society: Using a typical information system to store and manage biodiversity data has not prohibited us from generating data provenance. Since there is no restriction on the type of information system, our solution has a high potential to be widely adopted. Future Research: The data analysis of this work was limited to species traits data. However, there are other types of biodiversity data, including genetic composition, species population, and community composition. In the future, this work will be expanded to cover all those types of biodiversity data. The ultimate goal is to have a standard methodology or strategy for collecting provenance from any biodiversity data regardless of how the data was stored or managed.","PeriodicalId":38962,"journal":{"name":"Interdisciplinary Journal of Information, Knowledge, and Management","volume":"1 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":"{\"title\":\"Automatic Generation of Temporal Data Provenance From Biodiversity Information Systems\",\"authors\":\"Zaenal Akbar, Dadan Ridwan Saleh, Yulia Aris Kartika, W. Fatriasari, Adila A Krisnadhi, Deded Sarip Nawawi\",\"doi\":\"10.28945/5003\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Aim/Purpose: Although the significance of data provenance has been recognized in a variety of sectors, there is currently no standardized technique or approach for gathering data provenance. The present automated technique mostly employs workflow-based strategies. Unfortunately, the majority of current information systems do not embrace the strategy, particularly biodiversity information systems in which data is acquired by a variety of persons using a wide range of equipment, tools, and protocols. Background: This article presents an automated technique for producing temporal data provenance that is independent of biodiversity information systems. The approach is dependent on the changes in contextual information of data items. By mapping the modifications to a schema, a standardized representation of data provenance may be created. Consequently, temporal information may be automatically inferred. Methodology: The research methodology consists of three main activities: database event detection, event-schema mapping, and temporal information inference. First, a list of events will be detected from databases. After that, the detected events will be mapped to an ontology, so a common representation of data provenance will be obtained. Based on the derived data provenance, rule-based reasoning will be automatically used to infer temporal information. Consequently, a temporal provenance will be produced. Contribution: This paper provides a new method for generating data provenance automatically without interfering with the existing biodiversity information system. In addition to this, it does not mandate that any information system adheres to any particular form. Ontology and the rule-based system as the core components of the solution have been confirmed to be highly valuable in biodiversity science. Findings: Detaching the solution from any biodiversity information system provides scalability in the implementation. Based on the evaluation of a typical biodiversity information system for species traits of plants, a high number of temporal information can be generated to the highest degree possible. Using rules to encode different types of knowledge provides high flexibility to generate temporal information, enabling different temporal-based analyses and reasoning. Recommendations for Practitioners: The strategy is based on the contextual information of data items, yet most information systems simply save the most recent ones. As a result, in order for the solution to function properly, database snapshots must be stored on a frequent basis. Furthermore, a more practical technique for recording changes in contextual information would be preferable. Recommendation for Researchers: The capability to uniformly represent events using a schema has paved the way for automatic inference of temporal information. Therefore, a richer representation of temporal information should be investigated further. Also, this work demonstrates that rule-based inference provides flexibility to encode different types of knowledge from experts. Consequently, a variety of temporal-based data analyses and reasoning can be performed. Therefore, it will be better to investigate multiple domain-oriented knowledge using the solution. Impact on Society: Using a typical information system to store and manage biodiversity data has not prohibited us from generating data provenance. Since there is no restriction on the type of information system, our solution has a high potential to be widely adopted. Future Research: The data analysis of this work was limited to species traits data. However, there are other types of biodiversity data, including genetic composition, species population, and community composition. In the future, this work will be expanded to cover all those types of biodiversity data. The ultimate goal is to have a standard methodology or strategy for collecting provenance from any biodiversity data regardless of how the data was stored or managed.\",\"PeriodicalId\":38962,\"journal\":{\"name\":\"Interdisciplinary Journal of Information, Knowledge, and Management\",\"volume\":\"1 1\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-01-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"2\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Interdisciplinary Journal of Information, Knowledge, and Management\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.28945/5003\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"Computer Science\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Interdisciplinary Journal of Information, Knowledge, and Management","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.28945/5003","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"Computer Science","Score":null,"Total":0}

引用次数: 2

摘要

目的/目的:虽然数据来源的重要性已在各个部门得到认可，但目前还没有收集数据来源的标准化技术或方法。目前的自动化技术主要采用基于工作流的策略。不幸的是，目前大多数信息系统都不接受这一战略，特别是生物多样性信息系统，其中的数据是由各种各样的人使用各种各样的设备、工具和协议获取的。背景:本文提出了一种独立于生物多样性信息系统的时间数据来源的自动化技术。该方法依赖于数据项上下文信息的变化。通过将修改映射到模式，可以创建数据来源的标准化表示。因此，时间信息可以自动推断出来。方法论:研究方法论包括三个主要活动:数据库事件检测、事件模式映射和时间信息推断。首先，将从数据库检测到事件列表。之后，将检测到的事件映射到本体，从而获得数据来源的通用表示。基于导出的数据来源，自动使用基于规则的推理来推断时间信息。因此，将产生一个时间来源。贡献:本文提供了一种不干扰现有生物多样性信息系统的自动生成数据来源的新方法。除此之外，它并不强制要求任何信息系统遵循任何特定的形式。本体和基于规则的系统作为解决方案的核心组成部分，已被证实在生物多样性科学中具有很高的价值。发现:将解决方案从任何生物多样性信息系统中分离出来，在实施中提供了可扩展性。通过对一个典型的植物物种性状生物多样性信息系统的评价，可以最大程度地生成大量的时间信息。使用规则对不同类型的知识进行编码，为生成时间信息提供了高度的灵活性，从而支持不同的基于时间的分析和推理。对从业者的建议:该策略基于数据项的上下文信息，然而大多数信息系统只是保存最新的信息。因此，为了使解决方案正常工作，必须经常存储数据库快照。此外，最好采用一种更实用的技术来记录上下文信息的变化。给研究人员的建议:使用模式统一表示事件的能力为时间信息的自动推断铺平了道路。因此，应该进一步研究时间信息的更丰富的表示。此外，这项工作表明，基于规则的推理为编码来自专家的不同类型的知识提供了灵活性。因此，可以执行各种基于时间的数据分析和推理。因此，使用该解决方案可以更好地研究多个面向领域的知识。对社会的影响:使用典型的信息系统来存储和管理生物多样性数据并不妨碍我们生成数据来源。由于对信息系统的类型没有限制，我们的解决方案有很大的被广泛采用的潜力。未来研究方向:本工作的数据分析仅限于物种性状数据。然而，还有其他类型的生物多样性数据，包括遗传组成、物种种群和群落组成。在未来，这项工作将扩大到涵盖所有这些类型的生物多样性数据。最终目标是从任何生物多样性数据中收集来源的标准方法或策略，而不管数据是如何存储或管理的。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

Automatic Generation of Temporal Data Provenance From Biodiversity Information Systems

Aim/Purpose: Although the significance of data provenance has been recognized in a variety of sectors, there is currently no standardized technique or approach for gathering data provenance. The present automated technique mostly employs workflow-based strategies. Unfortunately, the majority of current information systems do not embrace the strategy, particularly biodiversity information systems in which data is acquired by a variety of persons using a wide range of equipment, tools, and protocols. Background: This article presents an automated technique for producing temporal data provenance that is independent of biodiversity information systems. The approach is dependent on the changes in contextual information of data items. By mapping the modifications to a schema, a standardized representation of data provenance may be created. Consequently, temporal information may be automatically inferred. Methodology: The research methodology consists of three main activities: database event detection, event-schema mapping, and temporal information inference. First, a list of events will be detected from databases. After that, the detected events will be mapped to an ontology, so a common representation of data provenance will be obtained. Based on the derived data provenance, rule-based reasoning will be automatically used to infer temporal information. Consequently, a temporal provenance will be produced. Contribution: This paper provides a new method for generating data provenance automatically without interfering with the existing biodiversity information system. In addition to this, it does not mandate that any information system adheres to any particular form. Ontology and the rule-based system as the core components of the solution have been confirmed to be highly valuable in biodiversity science. Findings: Detaching the solution from any biodiversity information system provides scalability in the implementation. Based on the evaluation of a typical biodiversity information system for species traits of plants, a high number of temporal information can be generated to the highest degree possible. Using rules to encode different types of knowledge provides high flexibility to generate temporal information, enabling different temporal-based analyses and reasoning. Recommendations for Practitioners: The strategy is based on the contextual information of data items, yet most information systems simply save the most recent ones. As a result, in order for the solution to function properly, database snapshots must be stored on a frequent basis. Furthermore, a more practical technique for recording changes in contextual information would be preferable. Recommendation for Researchers: The capability to uniformly represent events using a schema has paved the way for automatic inference of temporal information. Therefore, a richer representation of temporal information should be investigated further. Also, this work demonstrates that rule-based inference provides flexibility to encode different types of knowledge from experts. Consequently, a variety of temporal-based data analyses and reasoning can be performed. Therefore, it will be better to investigate multiple domain-oriented knowledge using the solution. Impact on Society: Using a typical information system to store and manage biodiversity data has not prohibited us from generating data provenance. Since there is no restriction on the type of information system, our solution has a high potential to be widely adopted. Future Research: The data analysis of this work was limited to species traits data. However, there are other types of biodiversity data, including genetic composition, species population, and community composition. In the future, this work will be expanded to cover all those types of biodiversity data. The ultimate goal is to have a standard methodology or strategy for collecting provenance from any biodiversity data regardless of how the data was stored or managed.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Interdisciplinary Journal of Information, Knowledge, and Management Computer Science-Computer Science (all)

CiteScore

2.30

自引率

0.00%

发文量