Automatic Generation of Temporal Data Provenance From Biodiversity Information Systems

Zaenal Akbar, Dadan Ridwan Saleh, Yulia Aris Kartika, W. Fatriasari, Adila A Krisnadhi, Deded Sarip Nawawi
Journal: Interdisciplinary Journal of Information, Knowledge, and Management
Publication date: 2022-01-01
Publication type: Journal Article
DOI: 10.28945/5003 (https://doi.org/10.28945/5003)
JCR: Q2 (Computer Science)
Citations: 2

Abstract

Aim/Purpose: Although the significance of data provenance has been recognized in a variety of sectors, there is currently no standardized technique or approach for gathering it. Existing automated techniques mostly employ workflow-based strategies. Unfortunately, the majority of current information systems do not embrace such strategies, particularly biodiversity information systems, in which data is acquired by a variety of people using a wide range of equipment, tools, and protocols.

Background: This article presents an automated technique for producing temporal data provenance that is independent of the biodiversity information system. The approach relies on changes in the contextual information of data items. By mapping these changes to a schema, a standardized representation of data provenance can be created, from which temporal information can be inferred automatically.

Methodology: The research methodology consists of three main activities: database event detection, event-schema mapping, and temporal information inference. First, a list of events is detected from the databases. The detected events are then mapped to an ontology to obtain a common representation of data provenance. Finally, rule-based reasoning is applied automatically to the derived data provenance to infer temporal information, producing a temporal provenance.

Contribution: This paper provides a new method for generating data provenance automatically without interfering with the existing biodiversity information system. It also does not require an information system to adhere to any particular form. The ontology and the rule-based system, the core components of the solution, have been confirmed to be highly valuable in biodiversity science.

Findings: Detaching the solution from any particular biodiversity information system makes the implementation scalable. In an evaluation on a typical biodiversity information system for plant species traits, a large amount of temporal information can be generated. Using rules to encode different types of knowledge provides high flexibility in generating temporal information, enabling different temporal analyses and reasoning.

Recommendations for Practitioners: The strategy is based on the contextual information of data items, yet most information systems store only the most recent values. For the solution to function properly, database snapshots must therefore be stored frequently. A more practical technique for recording changes in contextual information would be preferable.

Recommendation for Researchers: The capability to represent events uniformly using a schema has paved the way for automatic inference of temporal information, so richer representations of temporal information should be investigated further. This work also demonstrates that rule-based inference provides the flexibility to encode different types of expert knowledge, so that a variety of temporal data analyses and reasoning can be performed. It would therefore be worthwhile to apply the solution to knowledge from multiple domains.

Impact on Society: Using a typical information system to store and manage biodiversity data did not prevent us from generating data provenance. Since there is no restriction on the type of information system, our solution has a high potential to be widely adopted.

Future Research: The data analysis in this work was limited to species traits data. However, there are other types of biodiversity data, including genetic composition, species population, and community composition. In the future, this work will be expanded to cover all of those types. The ultimate goal is a standard methodology or strategy for collecting provenance from any biodiversity data, regardless of how the data is stored or managed.
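
The three activities named under Methodology (event detection, event-schema mapping, rule-based temporal inference) can be illustrated with a small sketch. This is not the authors' implementation. It makes several assumptions: events are found by diffing two consecutive database snapshots, an event is timestamped with the time of the snapshot in which the change first appears, and the `ex:` vocabulary, the record identifier `trait-1`, the field `leaf_area_cm2`, and both rules are hypothetical names chosen only for illustration. The abstract does not name the ontology or the rules actually used.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

# Hypothetical snapshot shape: record id -> {field name: value}.
Snapshot = dict[str, dict[str, str]]
Triple = tuple[str, str, object]


@dataclass(frozen=True)
class Event:
    record_id: str
    kind: str              # "insert", "update", or "delete"
    field: Optional[str]   # changed field for updates, otherwise None
    at: datetime           # assumed: time of the snapshot showing the change


def detect_events(old: Snapshot, new: Snapshot, at: datetime) -> list[Event]:
    """Activity 1: derive events by diffing two consecutive snapshots."""
    events = [Event(rid, "insert", None, at) for rid in sorted(new.keys() - old.keys())]
    for rid in sorted(old.keys() & new.keys()):
        for field in sorted(old[rid].keys() | new[rid].keys()):
            if old[rid].get(field) != new[rid].get(field):
                events.append(Event(rid, "update", field, at))
    events += [Event(rid, "delete", None, at) for rid in sorted(old.keys() - new.keys())]
    return events


# Hypothetical vocabulary standing in for the ontology terms.
PREDICATE_FOR = {
    "insert": "ex:createdAt",
    "update": "ex:modifiedAt",
    "delete": "ex:deletedAt",
}


def map_to_schema(events: list[Event]) -> set[Triple]:
    """Activity 2: express each event as a (subject, predicate, object) triple."""
    return {(f"ex:record/{e.record_id}", PREDICATE_FOR[e.kind], e.at) for e in events}


def rule_last_changed(triples: set[Triple]) -> set[Triple]:
    """Illustrative rule: the latest creation or modification time per record."""
    latest: dict[str, datetime] = {}
    for s, p, o in triples:
        if p in ("ex:createdAt", "ex:modifiedAt"):
            latest[s] = max(latest.get(s, o), o)
    return {(s, "ex:lastChangedAt", t) for s, t in latest.items()}


def rule_lifetime(triples: set[Triple]) -> set[Triple]:
    """Illustrative rule: the time between creation and deletion of a record."""
    created = {s: o for s, p, o in triples if p == "ex:createdAt"}
    deleted = {s: o for s, p, o in triples if p == "ex:deletedAt"}
    return {(s, "ex:lifetime", deleted[s] - created[s])
            for s in created.keys() & deleted.keys()}


def infer(triples: set[Triple], rules) -> set[Triple]:
    """Activity 3: apply each rule once and add the inferred triples."""
    return set(triples).union(*(rule(triples) for rule in rules))


# Made-up snapshots taken one week apart (not data from the paper).
t0, t1, t2 = datetime(2022, 1, 1), datetime(2022, 1, 8), datetime(2022, 1, 15)
s0: Snapshot = {}
s1: Snapshot = {"trait-1": {"leaf_area_cm2": "12.4"}}
s2: Snapshot = {"trait-1": {"leaf_area_cm2": "12.9"}}
s3: Snapshot = {}

events = detect_events(s0, s1, t0) + detect_events(s1, s2, t1) + detect_events(s2, s3, t2)
provenance = infer(map_to_schema(events), [rule_last_changed, rule_lifetime])

for triple in sorted(provenance, key=lambda t: t[:2]):
    print(triple)
```

Because the sketch reads only snapshots, it never touches the running information system, which mirrors the non-interference property claimed in the abstract. It also shows why snapshot frequency matters: any change that is made and reverted between two snapshots produces no event at all.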