The materials design ontology
Patrick Lambrix, Rickard Armiento, Huanyu Li, Olaf Hartig, Mina Abd Nikooie Pour, Ying Li
In the materials design domain, much of the data from materials calculations is stored in heterogeneous databases with different data and access models. Accessing and integrating data from different sources is therefore challenging. As ontology-based access and integration alleviates these issues, in this paper we address data access and interoperability for computational materials databases by developing the Materials Design Ontology. This ontology is inspired and guided by the OPTIMADE effort, which aims to make materials databases interoperable and includes many of the data providers in computational materials science. In this paper, we first describe the development and the content of the Materials Design Ontology. Then, we use a topic-model-based approach to propose additional candidate concepts for the ontology. Finally, we show the use of the Materials Design Ontology through a proof-of-concept implementation of a data access and integration system for materials databases based on the ontology. This paper is an extension of (In The Semantic Web – ISWC 2020 – 19th International Semantic Web Conference, Proceedings, Part II (2020) 212–227, Springer) with results from (In ESWC Workshop on Domain Ontologies for Research Data Management in Industry Commons of Materials and Manufacturing (2021) 1–11) and currently unpublished results regarding an application using the ontology.
{"title":"The materials design ontology","authors":"Patrick Lambrix, Rickard Armiento, Huanyu Li, Olaf Hartig, Mina Abd Nikooie Pour, Ying Li","doi":"10.3233/sw-233340","DOIUrl":"https://doi.org/10.3233/sw-233340","url":null,"abstract":"In the materials design domain, much of the data from materials calculations is stored in different heterogeneous databases with different data and access models. Therefore, accessing and integrating data from different sources is challenging. As ontology-based access and integration alleviates these issues, in this paper we address data access and interoperability for computational materials databases by developing the Materials Design Ontology. This ontology is inspired by and guided by the OPTIMADE effort that aims to make materials databases interoperable and includes many of the data providers in computational materials science. In this paper, first, we describe the development and the content of the Materials Design Ontology. Then, we use a topic model-based approach to propose additional candidate concepts for the ontology. Finally, we show the use of the Materials Design Ontology by a proof-of-concept implementation of a data access and integration system for materials databases based on the ontology.11 This paper is an extension of (In The Semantic Web – ISWC 2020 – 19th International Semantic Web Conference, Proceedings, Part II (2000) 212–227 Springer) with results from (In ESWC Workshop on Domain Ontologies for Research Data Management in Industry Commons of Materials and Manufacturing 2021 1–11) and currently unpublished results regarding an application using the ontology.","PeriodicalId":48694,"journal":{"name":"Semantic Web","volume":"49 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135090731","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Deriving semantic validation rules from industrial standards: An OPC UA study
Yashoda Saisree Bareedu, Thomas Frühwirth, C. Niedermeier, M. Sabou, Gernot Steindl, Aparna Saisree Thuluva, Stefani Tsaneva, Nilay Tufek Ozkaya
Industrial standards provide guidelines for data modeling to ensure interoperability between stakeholders of an industry branch (e.g., robotics). Most frequently, such guidelines are provided in an unstructured format (e.g., pdf documents), which hampers the automated validation of information objects (e.g., data models) that rely on such standards with respect to the modeling constraints prescribed by the guidelines. This raises the risk of costly interoperability errors induced by the incorrect use of the standards. There is, therefore, an increased interest in the automatic semantic validation of information objects based on industrial standards. In this paper we focus on an approach to semantic validation that formally represents the modeling constraints from unstructured documents as explicit, machine-actionable rules (to be used subsequently for semantic validation) and (semi-)automatically extracts such rules from pdf documents. While our approach aims to be generically applicable, we exemplify its adaptation in the concrete context of the OPC UA industrial standard, given its large-scale adoption among important industrial stakeholders and the OPC UA internal efforts towards semantic validation. We conclude that (i) it is feasible to represent modeling constraints from the standard specifications as rules, which can be organized in a taxonomy and represented using Semantic Web technologies such as OWL and SPARQL; (ii) we could automatically identify modeling constraints in the specification documents by inspecting the tables (P = 87%) and text of these documents (F1 up to 94%); and (iii) the translation of the modeling constraints into formal rules could be fully automated when constraints were extracted from tables, and required a human-in-the-loop approach for constraints extracted from text.
{"title":"Deriving semantic validation rules from industrial standards: An OPC UA study","authors":"Yashoda Saisree Bareedu, Thomas Frühwirth, C. Niedermeier, M. Sabou, Gernot Steindl, Aparna Saisree Thuluva, Stefani Tsaneva, Nilay Tufek Ozkaya","doi":"10.3233/sw-233342","DOIUrl":"https://doi.org/10.3233/sw-233342","url":null,"abstract":"Industrial standards provide guidelines for data modeling to ensure interoperability between stakeholders of an industry branch (e.g., robotics). Most frequently, such guidelines are provided in an unstructured format (e.g., pdf documents) which hampers the automated validations of information objects (e.g., data models) that rely on such standards in terms of their compliance with the modeling constraints prescribed by the guidelines. This raises the risk of costly interoperability errors induced by the incorrect use of the standards. There is, therefore, an increased interest in automatic semantic validation of information objects based on industrial standards. In this paper we focus on an approach to semantic validation by formally representing the modeling constraints from unstructured documents as explicit, machine-actionable rules (to be then used for semantic validation) and (semi-)automatically extracting such rules from pdf documents. While our approach aims to be generically applicable, we exemplify an adaptation of the approach in the concrete context of the OPC UA industrial standard, given its large-scale adoption among important industrial stakeholders and the OPC UA internal efforts towards semantic validation. We conclude that (i) it is feasible to represent modeling constraints from the standard specifications as rules, which can be organized in a taxonomy and represented using Semantic Web technologies such as OWL and SPARQL; (ii) we could automatically identify modeling constraints in the specification documents by inspecting the tables ( P = 87 %) and text of these documents (F1 up to 94%); (iii) the translation of the modeling constraints into formal rules could be fully automated when constraints were extracted from tables and required a Human-in-the-loop approach for constraints extracted from text.","PeriodicalId":48694,"journal":{"name":"Semantic Web","volume":"57 1","pages":""},"PeriodicalIF":3.0,"publicationDate":"2023-06-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78271951","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Data journeys: Explaining AI workflows through abstraction
E. Daga, Paul Groth
Artificial intelligence systems are not simply built on a single dataset or trained model. Instead, they are built through complex data science workflows involving multiple datasets, models, preparation scripts, and algorithms. Given this complexity, in order to understand these AI systems, we need to provide explanations of their functioning at higher levels of abstraction. To tackle this problem, we focus on the extraction and representation of data journeys from these workflows. A data journey is a multi-layered semantic representation of data processing activity linked to data science code and assets. We propose an ontology to capture the essential elements of a data journey and an approach to extract such data journeys. Using a corpus of Python notebooks from Kaggle, we show that we are able to capture high-level semantic data flow that is more compact than the code structure itself. Furthermore, we show that introducing an intermediate knowledge graph representation outperforms models that rely only on the code itself. Finally, we report on a user survey to reflect on the challenges and opportunities presented by computational data journeys for explainable AI.
Semantic Web, published 2023-06-15. https://doi.org/10.3233/sw-233407
LOD4Culture: Easy exploration of cultural heritage linked open data
Guillermo Vega-Gorgojo
LOD4Culture is a web application that exploits Cultural Heritage Linked Open Data for tourism and education purposes. Since target users are not fluent in Semantic Web technologies, the user interface is designed to hide the intricacies of RDF and SPARQL. An interactive map is provided for exploring world-wide Cultural Heritage sites; it can be filtered by type and uses cluster markers to adapt the view to different zoom levels. LOD4Culture also includes a Cultural Heritage entity browser that builds comprehensive visualizations of sites, artists, and artworks. All data exchanges rely on a generator of REST APIs over Linked Open Data that translates API calls into SPARQL queries across multiple sources, including Wikidata and DBpedia. Since March 2022, more than 1.7K users have employed LOD4Culture. The application has been mentioned many times in social media and has been featured in the DBpedia Newsletter, in the list of Wikidata tools for visualizing data, and in the open data applications list of datos.gob.es.
Semantic Web, published 2023-06-15. https://doi.org/10.3233/sw-233358
A benchmark dataset with Knowledge Graph generation for Industry 4.0 production lines
Muhammad Yahya, Aabid Ali, Qaiser Mehmood, Lan Yang, J. Breslin, M. Ali
Industry 4.0 (I4.0) is a new era in the industrial revolution that emphasizes machine connectivity, automation, and data analytics. The I4.0 pillars, such as autonomous robots, cloud computing, horizontal and vertical system integration, and the industrial internet of things, have increased the performance and efficiency of production lines in the manufacturing industry. Knowledge Graphs (KGs) have emerged as a significant technology for storing the semantics of domain entities, and have been used in a variety of industries, including banking, the automobile industry, oil and gas, pharmaceuticals and health care, publishing, and media. Over the past years, efforts have been made to propose semantic models to represent manufacturing domain knowledge; one such model is the Reference Generalized Ontological Model (RGOM, https://w3id.org/rgom). However, like other models, its adaptability is not ensured due to the lack of manufacturing data. In this paper, we aim to develop a benchmark dataset for knowledge graph generation in Industry 4.0 production lines and to show the benefits of using ontologies and semantic annotations of data, showcasing how the I4.0 industry can benefit from KGs and semantic datasets. This work is the result of collaboration with production line managers, supervisors, and engineers in the football industry to acquire realistic production line data (https://github.com/MuhammadYahta/ManufacturingProductionLineDataSetGeneration-Football, https://zenodo.org/record/7779522). The data is mapped and populated to the RGOM classes and relationships using an automated solution based on the Jena API, producing an I4.0 KG that contains more than 2.5 million axioms and about 1 million instances. This KG enables us to demonstrate the adaptability and usefulness of the RGOM. Our research helps production line staff to take timely decisions by exploiting the information embedded in the KG. In relation to this, the adaptability of the RGOM is demonstrated with the help of a use case scenario to discover required information such as the current temperature at a particular time, the status of a motor, or the tools deployed on a machine.
{"title":"A benchmark dataset with Knowledge Graph generation for Industry 4.0 production lines","authors":"Muhammad Yahya, Aabid Ali, Qaiser Mehmood, Lan Yang, J. Breslin, M. Ali","doi":"10.3233/sw-233431","DOIUrl":"https://doi.org/10.3233/sw-233431","url":null,"abstract":"Industry 4.0 (I4.0) is a new era in the industrial revolution that emphasizes machine connectivity, automation, and data analytics. The I4.0 pillars such as autonomous robots, cloud computing, horizontal and vertical system integration, and the industrial internet of things have increased the performance and efficiency of production lines in the manufacturing industry. Over the past years, efforts have been made to propose semantic models to represent the manufacturing domain knowledge, one such model is Reference Generalized Ontological Model (RGOM).11 https://w3id.org/rgom However, its adaptability like other models is not ensured due to the lack of manufacturing data. In this paper, we aim to develop a benchmark dataset for knowledge graph generation in Industry 4.0 production lines and to show the benefits of using ontologies and semantic annotations of data to showcase how the I4.0 industry can benefit from KGs and semantic datasets. This work is the result of collaboration with the production line managers, supervisors, and engineers in the football industry to acquire realistic production line data22 https://github.com/MuhammadYahta/ManufacturingProductionLineDataSetGeneration-Football,.33 https://zenodo.org/record/7779522 Knowledge Graphs (KGs) or Knowledge Graph (KG) have emerged as a significant technology to store the semantics of the domain entities. KGs have been used in a variety of industries, including banking, the automobile industry, oil and gas, pharmaceutical and health care, publishing, media, etc. The data is mapped and populated to the RGOM classes and relationships using an automated solution based on JenaAPI, producing an I4.0 KG. It contains more than 2.5 million axioms and about 1 million instances. This KG enables us to demonstrate the adaptability and usefulness of the RGOM. Our research helps the production line staff to take timely decisions by exploiting the information embedded in the KG. In relation to this, the RGOM adaptability is demonstrated with the help of a use case scenario to discover required information such as current temperature at a particular time, the status of the motor, tools deployed on the machine, etc.","PeriodicalId":48694,"journal":{"name":"Semantic Web","volume":"1 1","pages":""},"PeriodicalIF":3.0,"publicationDate":"2023-06-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85107358","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
ConSolid: A federated ecosystem for heterogeneous multi-stakeholder projects
Jeroen Werbrouck, Pieter Pauwels, Jakob Beetz, Ruben Verborgh, Erik Mannens
In many industries, multiple parties collaborate on a larger project, while each of those stakeholders participates in multiple independent projects simultaneously. A double patchwork can thus be identified, with a many-to-many relationship between actors and collaborative projects. One key example is the construction industry, where every project is unique and involves specialists for many subdomains, ranging from architectural design and technical installations to geospatial information, governmental regulation, and sometimes even historical research. A digital representation of this process and its outcomes requires semantic interoperability between these subdomains, which, however, often work with heterogeneous and unstructured data. In this paper we propose to address this double patchwork via a decentralized ecosystem for multi-stakeholder, multi-industry collaborations dealing with heterogeneous information snippets. At its core, this ecosystem, called ConSolid, builds upon the Solid specifications for Web decentralization, but extends these both on a (meta)data pattern level and on a microservice level. To increase the robustness of data allocation and filtering, we identify the need to go beyond Solid’s current LDP-inspired interfaces to a Solid Pod and introduce the concept of metadata-generated ‘virtual views’, to be generated using an access-controlled SPARQL interface to a Pod. A recursive, scalable way to discover multi-vault aggregations is proposed, along with data patterns for connecting and aligning heterogeneous (RDF and non-RDF) resources across vaults in a mediatype-agnostic fashion. We demonstrate the use and benefits of the ecosystem using minimal running examples, concluding with the setup of an example use case from the Architecture, Engineering, Construction and Operations (AECO) industry.
{"title":"ConSolid: A federated ecosystem for heterogeneous multi-stakeholder projects","authors":"Jeroen Werbrouck, Pieter Pauwels, Jakob Beetz, Ruben Verborgh, Erik Mannens","doi":"10.3233/sw-233396","DOIUrl":"https://doi.org/10.3233/sw-233396","url":null,"abstract":"In many industries, multiple parties collaborate on a larger project. At the same time, each of those stakeholders participates in multiple independent projects simultaneously. A double patchwork can thus be identified, with a many-to-many relationship between actors and collaborative projects. One key example is the construction industry, where every project is unique, involving specialists for many subdomains, ranging from the architectural design over technical installations to geospatial information, governmental regulation and sometimes even historical research. A digital representation of this process and its outcomes requires semantic interoperability between these subdomains, which however often work with heterogeneous and unstructured data. In this paper we propose to address this double patchwork via a decentralized ecosystem for multi-stakeholder, multi-industry collaborations dealing with heterogeneous information snippets. At its core, this ecosystem, called ConSolid, builds upon the Solid specifications for Web decentralization, but extends these both on a (meta)data pattern level and on microservice level. To increase the robustness of data allocation and filtering, we identify the need to go beyond Solid’s current LDP-inspired interfaces to a Solid Pod and introduce the concept of metadata-generated ‘virtual views’, to be generated using an access-controlled SPARQL interface to a Pod. A recursive, scalable way to discover multi-vault aggregations is proposed, along with data patterns for connecting and aligning heterogeneous (RDF and non-RDF) resources across vaults in a mediatype-agnostic fashion. We demonstrate the use and benefits of the ecosystem using minimal running examples, concluding with the setup of an example use case from the Architecture, Engineering, Construction and Operations (AECO) industry.","PeriodicalId":48694,"journal":{"name":"Semantic Web","volume":"47 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135051037","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Modelling and Simulation (M&S) are core tools for designing, analysing and operating today’s industrial systems. They often also represent both a valuable asset and a significant investment. Typically, their use is constrained to a software environment intended to be used by engineers on a single computer. However, the knowledge relevant to a task involving modelling and simulation is in general distributed in nature, even across organizational boundaries, and may be large in volume. Therefore, it is desirable to increase the FAIRness (Findability, Accessibility, Interoperability, and Reuse) of M&S capabilities; to enable their use in loosely coupled systems of systems; and to support their composition and execution by intelligent software agents. In this contribution, the suitability of Semantic Web technologies to achieve these goals is investigated and an open-source proof of concept-implementation based on the Functional Mock-up Interface (FMI) standard is presented. Specifically, models, model instances, and simulation results are exposed through a hypermedia API and an implementation of the Pragmatic Proof Algorithm (PPA) is used to successfully demonstrate the API’s use by a generic software agent. The solution shows an increased degree of FAIRness and fully supports its use in loosely coupled systems. The FAIRness could be further improved by providing more “ rich” (meta)data.
{"title":"Dynamic system models and their simulation in the Semantic Web","authors":"Moritz Stüber, Georg Frey","doi":"10.3233/sw-233359","DOIUrl":"https://doi.org/10.3233/sw-233359","url":null,"abstract":"Modelling and Simulation (M&S) are core tools for designing, analysing and operating today’s industrial systems. They often also represent both a valuable asset and a significant investment. Typically, their use is constrained to a software environment intended to be used by engineers on a single computer. However, the knowledge relevant to a task involving modelling and simulation is in general distributed in nature, even across organizational boundaries, and may be large in volume. Therefore, it is desirable to increase the FAIRness (Findability, Accessibility, Interoperability, and Reuse) of M&S capabilities; to enable their use in loosely coupled systems of systems; and to support their composition and execution by intelligent software agents. In this contribution, the suitability of Semantic Web technologies to achieve these goals is investigated and an open-source proof of concept-implementation based on the Functional Mock-up Interface (FMI) standard is presented. Specifically, models, model instances, and simulation results are exposed through a hypermedia API and an implementation of the Pragmatic Proof Algorithm (PPA) is used to successfully demonstrate the API’s use by a generic software agent. The solution shows an increased degree of FAIRness and fully supports its use in loosely coupled systems. The FAIRness could be further improved by providing more “ rich” (meta)data.","PeriodicalId":48694,"journal":{"name":"Semantic Web","volume":"4 1","pages":""},"PeriodicalIF":3.0,"publicationDate":"2023-06-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78743959","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Incremental schema integration for data wrangling via knowledge graphs
Javier Flores, Kashif Rabbani, S. Nadal, Cristina Gómez, Oscar Romero, E. Jamin, S. Dasiopoulou
Virtual data integration is the go-to approach for data wrangling in data-driven decision-making. In this paper, we focus on automating schema integration, which extracts a homogenised representation of the data source schemata and integrates them into a global schema to enable virtual data integration. Schema integration requires a set of well-known constructs: the data source schemata and wrappers, a global integrated schema, and the mappings between them. Based on these, virtual data integration systems enable fast and on-demand data exploration via query rewriting. Unfortunately, the generation of such constructs is currently performed in a largely manual manner, hindering its feasibility in real scenarios. This is aggravated when dealing with heterogeneous and evolving data sources. To overcome these issues, we propose a fully-fledged semi-automatic and incremental approach grounded on knowledge graphs to generate the required schema integration constructs in four main steps: bootstrapping, schema matching, schema integration, and generation of system-specific constructs. We also present NextiaDI, a tool implementing our approach. Finally, a comprehensive evaluation is presented to scrutinize our approach.
{"title":"Incremental schema integration for data wrangling via knowledge graphs","authors":"Javier Flores, Kashif Rabbani, S. Nadal, Cristina Gómez, Oscar Romero, E. Jamin, S. Dasiopoulou","doi":"10.3233/sw-233347","DOIUrl":"https://doi.org/10.3233/sw-233347","url":null,"abstract":"Virtual data integration is the current approach to go for data wrangling in data-driven decision-making. In this paper, we focus on automating schema integration, which extracts a homogenised representation of the data source schemata and integrates them into a global schema to enable virtual data integration. Schema integration requires a set of well-known constructs: the data source schemata and wrappers, a global integrated schema and the mappings between them. Based on them, virtual data integration systems enable fast and on-demand data exploration via query rewriting. Unfortunately, the generation of such constructs is currently performed in a largely manual manner, hindering its feasibility in real scenarios. This becomes aggravated when dealing with heterogeneous and evolving data sources. To overcome these issues, we propose a fully-fledged semi-automatic and incremental approach grounded on knowledge graphs to generate the required schema integration constructs in four main steps: bootstrapping, schema matching, schema integration, and generation of system-specific constructs. We also present Nextia DI , a tool implementing our approach. Finally, a comprehensive evaluation is presented to scrutinize our approach.","PeriodicalId":48694,"journal":{"name":"Semantic Web","volume":"228 1","pages":""},"PeriodicalIF":3.0,"publicationDate":"2023-06-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79475110","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Enhancing awareness of industrial robots in collaborative manufacturing
A. Umbrico, A. Cesta, Andrea Orlandini
The diffusion of Human-Robot Collaborative cells is hindered by several barriers. Classical control approaches do not yet seem fully suitable for facing the variability conveyed by the presence of human operators working beside robots. The capabilities of representing heterogeneous knowledge and performing abstract reasoning are crucial to enhance the flexibility of control solutions. To this aim, the ontology SOHO (Sharework Ontology for Human-Robot Collaboration) has been specifically designed for representing Human-Robot Collaboration scenarios, following a context-based approach. This work brings several contributions. It proposes an extension of SOHO to better characterize behavioral constraints of collaborative tasks. Furthermore, it presents a knowledge extraction procedure designed to automate the synthesis of Artificial Intelligence plan-based controllers for realizing flexible coordination of human and robot behaviors in collaborative tasks. The generality of the ontological model and the developed representation capabilities, as well as the validity of the synthesized planning domains, are evaluated on a number of realistic industrial scenarios where collaborative robots are actually deployed.
Semantic Web, published 2023-06-08. https://doi.org/10.3233/sw-233394
Separability and Its Approximations in Ontology-based Data Management
Gianluca Cima, Federico Croce, Maurizio Lenzerini
Given two datasets, i.e., two sets of tuples of constants, representing positive and negative examples, logical separability is the reasoning task of finding a formula in a certain target query language that separates them. As already pointed out in previous works, this task turns out to be relevant in several application scenarios such as concept learning and generating referring expressions. Besides, if we think of the input datasets of positive and negative examples as composed of tuples of constants classified, respectively, positively and negatively by a black-box model, then the separating formula can be used to provide global post-hoc explanations of such a model. In this paper, we study the separability task in the context of Ontology-based Data Management (OBDM), in which a domain ontology provides a high-level, logic-based specification of a domain of interest, semantically linked through suitable mapping assertions to the data source layer of an information system. Since a formula that properly separates (proper separation) two input datasets does not always exist, our first contribution is to propose (best) approximations of the proper separation, called (minimally) complete and (maximally) sound separations. We do this by presenting a general framework for separability in OBDM. Then, in a scenario that uses by far the most popular languages for the OBDM paradigm, our second contribution is a comprehensive study of three natural computational problems associated with the framework, namely Verification (check whether a given formula is a proper, complete, or sound separation of two given datasets), Existence (check whether a proper, or best approximated separation of two given datasets exists at all), and Computation (compute any proper, or any best approximated separation of two given datasets).
Semantic Web, published 2023-06-08. https://doi.org/10.3233/sw-233391