Journal of Web Semantics: Latest Articles

IndeGx: A model and a framework for indexing RDF knowledge graphs with SPARQL-based test suits
IF 2.5 | Zone 3, Computer Science | Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2023-04-01 | DOI: 10.1016/j.websem.2023.100775
Pierre Maillot, Olivier Corby, Catherine Faron, Fabien Gandon, Franck Michel

In recent years, a large number of RDF datasets have been built and published on the Web in fields as diverse as linguistics or life sciences, as well as general datasets such as DBpedia or Wikidata. The joint exploitation of these datasets requires specific knowledge about their content, access points, and commonalities. However, not all datasets contain a self-description, and not all access points can handle the complex queries used to generate such a description.

In this article, we provide a standard-based approach to generate the description of a dataset. The generated descriptions as well as the process of their computation are expressed using standard vocabularies and languages. We implemented our approach into a framework, called IndeGx, where each indexing feature and its computation is collaboratively and declaratively defined in a GitHub repository. We have experimented IndeGx on a set of 339 RDF datasets with endpoints listed in public catalogs, over 8 months. The results show that we can collect, as much as possible, important characteristics of the datasets depending on their availability and capacities. The resulting index captures the commonalities, variety and disparity in the offered content and services and it provides an important support to any application designed to query RDF datasets.
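
As a rough illustration of what a declaratively defined indexing feature could look like, the sketch below pairs a SPARQL test query with a template that records its result as a VoID statement. The feature layout, the function names `void_triple_count` and `describe`, the `run_query` callback, and the example endpoint URL are all assumptions made for illustration; none of them is taken from IndeGx itself, whose actual rule format is not described in this abstract.

```python
from typing import Callable, Optional

# Hypothetical feature definition: a SPARQL test query whose result is
# recorded as a VoID statement. This is not IndeGx's actual rule format.
TRIPLE_COUNT_QUERY = "SELECT (COUNT(*) AS ?n) WHERE { ?s ?p ?o }"


def void_triple_count(endpoint_url: str, count: int) -> str:
    """Format a triple count as a Turtle snippet using the VoID vocabulary."""
    if count < 0:
        raise ValueError("count must be non-negative")
    # Simplification: the endpoint URL doubles as the dataset IRI.
    return (
        "@prefix void: <http://rdfs.org/ns/void#> .\n"
        f"<{endpoint_url}> a void:Dataset ;\n"
        f"    void:sparqlEndpoint <{endpoint_url}> ;\n"
        f"    void:triples {count} .\n"
    )


def describe(
    endpoint_url: str, run_query: Callable[[str, str], int]
) -> Optional[str]:
    """Run the test query; return None when the endpoint cannot answer it."""
    try:
        count = run_query(endpoint_url, TRIPLE_COUNT_QUERY)
    except (TimeoutError, ConnectionError):
        # Some endpoints cannot handle the query; the feature stays undescribed.
        return None
    return void_triple_count(endpoint_url, count)


if __name__ == "__main__":
    # Stub runner standing in for a real SPARQL client (assumed, not real data).
    print(describe("http://example.org/sparql", lambda url, query: 42))
```

Returning `None` on failure mirrors the abstract's point that a description can only be collected as far as the endpoint's availability and capacities allow.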

Journal of Web Semantics, Volume 76, Article 100775
Citations: 6
Stream reasoning with DatalogMTL
IF 2.5 | Zone 3, Computer Science | Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2023-04-01 | DOI: 10.1016/j.websem.2023.100776
Przemysław A. Wałęga, Mark Kaminski, Dingmin Wang, Bernardo Cuenca Grau

We study stream reasoning in DatalogMTL—an extension of Datalog with metric temporal operators. We propose a sound and complete stream reasoning algorithm that is applicable to forward-propagating DatalogMTL programs, in which propagation of derived information towards past time points is precluded. Memory consumption in our generic algorithm depends both on the properties of the rule set and the input data stream; in particular, it depends on the distances between timestamps occurring in data. This may be undesirable in certain practical scenarios since these distances can be very small, in which case the algorithm may require large amounts of memory. To address this issue, we propose a second algorithm, where the size of the required memory becomes independent on the timestamps in the data at the expense of disallowing punctual intervals in the rule set. We have implemented our approach as an extension of the DatalogMTL reasoner MeTeoR and tested it experimentally. The obtained results support the feasibility of our approach in practice.

Journal of Web Semantics, Volume 76, Article 100776
Citations: 1
Decentralized semantic provision of personal health streams
IF 2.5 | Zone 3, Computer Science | Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2023-04-01 | DOI: 10.1016/j.websem.2023.100774
Jean-Paul Calbimonte, Orfeas Aidonopoulos, Fabien Dubosson, Benjamin Pocklington, Ilia Kebets, Pierre-Mikael Legris, Michael Schumacher

Personalized healthcare is nowadays driven by the increasing volumes of patient data, observed and produced continuously thanks to medical devices, mobile sensors, patient-reported outcomes, among other data sources. This data is made available as streams, due to their dynamic nature, which represents an important challenge for processing, querying and interpreting the incoming information. In addition, the sensitive nature of healthcare data poses significant restrictions regarding privacy, which has led to the emergence of decentralized personal data management systems. Data semantics play a key role in order to enable both decentralization and integration of personal health data, as they introduce the capability to represent knowledge and information using ontologies and semantic vocabularies. In this paper we describe the SemPryv system, which provides the means to manage personal health data streams enriched with semantic information. SemPryv is designed as a decentralized system, so that users have the possibility of hosting their personal data at different sites, while keeping control of access rights. The semantization of data in SemPryv is implemented through different strategies, ranging from rule-based annotation to machine learning-based suggestions, fed from third-party specialized healthcare metadata providers. The system has been made available as Open Source, and is integrated as part of the Pryv.io platform used and commercialized in the healthcare and personal data management industry.

Journal of Web Semantics, Volume 76, Article 100774
Citations: 2
A parametric similarity method: Comparative experiments based on semantically annotated large datasets
IF 2.5 | Zone 3, Computer Science | Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2023-04-01 | DOI: 10.1016/j.websem.2023.100773
Antonio De Nicola, Anna Formica, Michele Missikoff, Elaheh Pourabbas, Francesco Taglino

We present the parametric method SemSim^p aimed at measuring semantic similarity of digital resources. SemSim^p is based on the notion of information content, and it leverages a reference ontology and taxonomic reasoning, encompassing different approaches for weighting the concepts of the ontology. In particular, weights can be computed by considering either the available digital resources or the structure of the reference ontology of a given domain. SemSim^p is assessed against six representative semantic similarity methods for comparing sets of concepts proposed in the literature, by carrying out an experimentation that includes both a statistical analysis and an expert judgment evaluation. To the purpose of achieving a reliable assessment, we used a real-world large dataset based on the Digital Library of the Association for Computing Machinery (ACM), and a reference ontology derived from the ACM Computing Classification System (ACM-CCS). For each method, we considered two indicators. The first concerns the degree of confidence to identify the similarity among the papers belonging to some special issues selected from the ACM Transactions on Information Systems journal, the second the Pearson correlation with human judgment. The results reveal that one of the configurations of SemSim^p outperforms the other assessed methods. An additional experiment performed in the domain of physics shows that, in general, SemSim^p provides better results than the other similarity methods.
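
To make the idea of frequency-based information content concrete, here is a minimal sketch that weights concepts by how often resources are annotated with them and then compares two concepts with Lin's similarity measure. Lin's measure is a standard information-content measure substituted here for illustration; it is not the method described in the abstract, which compares sets of concepts and supports several weighting schemes. The toy taxonomy, the annotation counts, and all function names are invented.

```python
import math

# Toy taxonomy (child -> parent) and annotation counts; both are invented.
PARENT = {
    "databases": "information_systems",
    "web_search": "information_systems",
    "information_systems": "computing",
    "hardware": "computing",
    "computing": None,
}
ANNOTATIONS = {
    "databases": 30,
    "web_search": 10,
    "information_systems": 10,
    "hardware": 50,
    "computing": 0,
}


def ancestors(concept):
    """Return the path from a concept up to the root, concept included."""
    path = []
    while concept is not None:
        path.append(concept)
        concept = PARENT[concept]
    return path


def cumulative_count(concept):
    """Count resources annotated with the concept or any of its descendants."""
    return sum(n for c, n in ANNOTATIONS.items() if concept in ancestors(c))


def information_content(concept):
    """IC(c) = log(N / count(c)): rarer concepts carry more information."""
    total = sum(ANNOTATIONS.values())
    return math.log(total / cumulative_count(concept))


def lin_similarity(a, b):
    """Lin's measure: 2 * IC(lowest common ancestor) / (IC(a) + IC(b))."""
    if a == b:
        return 1.0
    ancestors_of_b = set(ancestors(b))
    common = next(c for c in ancestors(a) if c in ancestors_of_b)
    denominator = information_content(a) + information_content(b)
    if denominator == 0:
        return 0.0
    return 2 * information_content(common) / denominator


if __name__ == "__main__":
    print(lin_similarity("databases", "web_search"))
    print(lin_similarity("databases", "hardware"))
```

In this toy data, two siblings under `information_systems` share an informative ancestor and get a positive score, while concepts that only meet at the root score zero because the root carries no information.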

Journal of Web Semantics, Volume 76, Article 100773
Citations: 2
Answering Count Questions with Structured Answers from Text
IF 2.5 | Zone 3, Computer Science | Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2023-04-01 | DOI: 10.1016/j.websem.2022.100769
Shrestha Ghosh, Simon Razniewski, Gerhard Weikum

In this work we address the challenging case of answering count queries in web search, such as number of songs by John Lennon. Prior methods merely answer these with a single, and sometimes puzzling number or return a ranked list of text snippets with different numbers. This paper proposes a methodology for answering count queries with inference, contextualization and explanatory evidence. Unlike previous systems, our method infers final answers from multiple observations, supports semantic qualifiers for the counts, and provides evidence by enumerating representative instances. Experiments with a wide variety of queries, including existing benchmark show the benefits of our method, and the influence of specific parameter settings. Our code, data and an interactive system demonstration are publicly available at https://github.com/ghoshs/CoQEx and https://nlcounqer.mpi-inf.mpg.de/.
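
One simple way to infer a single answer from several noisy observations is a confidence-weighted median, sketched below. The abstract does not say which aggregation the authors use, so the weighted median is an assumption chosen for illustration, and the function name and sample observations are invented rather than taken from the CoQEx code or data.

```python
def weighted_median(observations):
    """Return the count at which cumulative confidence reaches half the total.

    observations: list of (count, confidence) pairs with confidence > 0.
    """
    if not observations:
        raise ValueError("need at least one observation")
    ordered = sorted(observations)
    half = sum(weight for _, weight in ordered) / 2
    running = 0.0
    for value, weight in ordered:
        running += weight
        if running >= half:
            return value
    return ordered[-1][0]  # Fallback for floating-point rounding.


if __name__ == "__main__":
    # Invented counts extracted from text snippets, with extractor confidence.
    snippets = [(180, 0.9), (150, 0.4), (180, 0.7), (1, 0.2), (600, 0.3)]
    print(weighted_median(snippets))
```

A median is less sensitive than a mean to outlying snippets, such as the low-confidence values 1 and 600 in the invented sample.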

Journal of Web Semantics, Volume 76, Article 100769
Citations: 4
Building a Knowledge Graph for the History of Vienna with Semantic MediaWiki
IF 2.5 | Zone 3, Computer Science | Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2023-04-01 | DOI: 10.1016/j.websem.2022.100771
Bernhard Krabina

While research on semantic wikis is declining, Semantic MediaWiki (SMW) can still play an important role in the emerging field of knowledge graph curation.

The Vienna History Wiki, a large knowledge base curated by the city government in collaboration with other institutions and the general public, provides an ideal use case for demonstrating strengths and weaknesses of SMW as well as discussing the challenges of co-curation in a cultural heritage setting. This paper describes processes like collaborative editing, interlinking unique identifiers on the web, sharing data with Wikidata, making use of Schema.org, and other ontologies. It presents insights from a user survey, access statistics, and a knowledge graph analysis.

This work contributes to the scarce research in wiki usage outside of the Wikipedia ecosystem as well as to the field of community-based knowledge graph curation. The availability of a now significantly improved RDF representation indicates future directions for research and practice.

Journal of Web Semantics, Volume 76, Article 100771
Citations: 5
From tabular data to knowledge graphs: A survey of semantic table interpretation tasks and methods
IF 2.5 | Zone 3, Computer Science | Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2023-04-01 | DOI: 10.1016/j.websem.2022.100761
Jixiong Liu, Yoan Chabot, Raphaël Troncy, Viet-Phi Huynh, Thomas Labbé, Pierre Monnin

Tabular data often refers to data that is organized in a table with rows and columns. We observe that this data format is widely used on the Web and within enterprise data repositories. Tables potentially contain rich semantic information that still needs to be interpreted. The process of extracting meaningful information out of tabular data with respect to a semantic artefact, such as an ontology or a knowledge graph, is often referred to as Semantic Table Interpretation (STI) or Semantic Table Annotation. In this survey paper, we aim to provide a comprehensive and up-to-date state-of-the-art review of the different tasks and methods that have been proposed so far to perform STI. First, we propose a new categorization that reflects the heterogeneity of table types that one can encounter, revealing different challenges that need to be addressed. Next, we define five major sub-tasks that STI deals with even if the literature has mostly focused on three sub-tasks so far. We review and group the many approaches that have been proposed into three macro families and we discuss their performance and limitations with respect to the various datasets and benchmarks proposed by the community. Finally, we detail what are the remaining scientific barriers to be able to truly automatically interpret any type of tables that can be found in the wild Web.

Journal of Web Semantics, Volume 76, Article 100761
Citations: 14
Explainable argumentation as a service
IF 2.5 | Zone 3, Computer Science | Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2023-04-01 | DOI: 10.1016/j.websem.2023.100772
Nikolaos I. Spanoudakis, Georgios Gligoris, Adamos Koumi, Antonis C. Kakas

Gorgias Cloud offers an integrated application development environment that facilitates the development of argumentation-based systems over the internet. Argumentation is offered as a service in a way that this allows application systems to remotely access the argumentation service and utilize the results of the argumentative computation. Moreover, the service results include the explanation of the decision in both human and machine-readable formats. The first is useful for allowing the application validation to be done by experts, while the second is useful for development. It appears that this is the first case where argumentation is offered to developers in such an open and distributed way.

Journal of Web Semantics, Volume 76, Article 100772
Citations: 1
Optimizing a tableau reasoner and its implementation in Prolog
IF 2.5 | Zone 3, Computer Science | Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2021-10-01 | DOI: 10.2139/ssrn.3945445
Riccardo Zese, Giuseppe Cota
Journal of Web Semantics, Volume 63 (1), 100677
Citations: 2
DABGEO: A Reusable and Usable Global Energy Ontology for the Energy Domain
IF 2.5, Zone 3 (Computer Science), Q3 (Computer Science, Artificial Intelligence), Pub Date: 2020-02-03, DOI: 10.2139/ssrn.3531214
Javier Cuenca, F. Larrinaga, E. Curry
The heterogeneity of energy ontologies hinders interoperability between ontology-based energy management applications, which in turn makes large-scale energy management difficult. There is therefore a need for a global ontology that provides common vocabularies to represent the energy subdomains. A global energy ontology must balance reusability and usability to moderate the effort required to reuse it in different applications. This paper presents DABGEO, a reusable and usable global ontology for the energy domain that provides a common representation of the energy domains covered by existing energy ontologies. Ontology engineers can reuse DABGEO to develop ontologies for specific energy management applications. In contrast to previous global energy ontologies, it follows a layered structure to balance reusability and usability. We give an overview of the structure of DABGEO and explain how to reuse it in a particular application case. The paper also includes an evaluation of DABGEO to demonstrate that it provides a balance between reusability and usability.
Citations: 10