
Latest publications in Semantic Web

Online approximative SPARQL query processing for COUNT-DISTINCT queries with web preemption
IF 3 | CAS Tier 3 (Computer Science) | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2022-05-26 | DOI: 10.3233/sw-222842
Julien Aimonier-Davat, H. Skaf-Molli, P. Molli, Arnaud Grall, Thomas Minier
Getting complete results when processing aggregate queries on public SPARQL endpoints is challenging, mainly due to the application of quotas. Although Web preemption supports online processing of aggregate queries on preemptable SPARQL servers, data transfer is still very large when processing count-distinct aggregate queries. In this paper, it is shown that count-distinct aggregate queries can be approximated with low data transfer by extending the partial aggregation operator with HyperLogLog++ sketches. Experimental results demonstrate that the proposed approach outperforms existing approaches by orders of magnitude in terms of the amount of data transferred.
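The key property the abstract relies on is that a distinct count can be replaced by a small, mergeable sketch. The following toy HyperLogLog in Python illustrates that idea; it is a generic sketch of the data structure under illustrative parameters, not the paper's HyperLogLog++ implementation or its integration with web preemption.

```python
import hashlib
import math

def _hash64(value: str) -> int:
    # Deterministic 64-bit hash of a string value.
    return int.from_bytes(hashlib.sha1(value.encode("utf-8")).digest()[:8], "big")

class HyperLogLog:
    """Toy HyperLogLog with m = 2**p registers; each register keeps the
    maximum 'rank' (leading-zero count + 1) among hashes routed to it."""

    def __init__(self, p: int = 12):
        self.p = p
        self.m = 1 << p
        self.registers = [0] * self.m

    def add(self, value: str) -> None:
        h = _hash64(value)
        idx = h >> (64 - self.p)               # first p bits choose a register
        rest = h & ((1 << (64 - self.p)) - 1)  # remaining 64 - p bits
        rank = (64 - self.p) - rest.bit_length() + 1
        if rank > self.registers[idx]:
            self.registers[idx] = rank

    def merge(self, other: "HyperLogLog") -> None:
        # Register-wise max: this is what makes the sketch usable as a
        # mergeable partial aggregate across query quanta.
        self.registers = [max(a, b) for a, b in zip(self.registers, other.registers)]

    def count(self) -> float:
        alpha = 0.7213 / (1 + 1.079 / self.m)
        raw = alpha * self.m * self.m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:
            return self.m * math.log(self.m / zeros)  # linear-counting correction
        return raw
```

Each sketch occupies a few kilobytes regardless of how many distinct values it has seen, which is why shipping sketches instead of value sets keeps data transfer low.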
Semantic Web, pp. 735–755.
Citations: 4
TermitUp: Generation and enrichment of linked terminologies
IF 3 | CAS Tier 3 (Computer Science) | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2022-05-26 | DOI: 10.3233/sw-222885
Patricia Martín-Chozas, Karen Vázquez-Flores, Pablo Calleja, Elena Montiel-Ponsoda, V. Rodríguez-Doncel
Domain-specific terminologies play a central role in many language technology solutions. Substantial manual effort is still involved in the creation of such resources, and many of them are published in proprietary formats that cannot be easily reused in other applications. Automatic term extraction tools help alleviate this cumbersome task. However, their results are usually in the form of plain lists of terms or as unstructured data with limited linguistic information. Initiatives such as the Linguistic Linked Open Data cloud (LLOD) foster the publication of language resources in open structured formats, specifically RDF, and their linking to other resources on the Web of Data. In order to leverage the wealth of linguistic data in the LLOD and speed up the creation of linked terminological resources, we propose TermitUp, a service that generates enriched domain specific terminologies directly from corpora, and publishes them in open and structured formats. TermitUp is composed of five modules performing terminology extraction, terminology post-processing, terminology enrichment, term relation validation and RDF publication. As part of the pipeline implemented by this service, existing resources in the LLOD are linked with the resulting terminologies, contributing in this way to the population of the LLOD cloud. TermitUp has been used in the framework of European projects tackling different fields, such as the legal domain, with promising results. Different alternatives on how to model enriched terminologies are considered and good practices illustrated with examples are proposed.
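As a rough illustration of the two ends of such a pipeline (term extraction and RDF publication), consider the Python sketch below. The frequency-based extractor, the stop-word list, and the `http://example.org/term/` namespace are placeholders for illustration; they are not TermitUp's actual modules or vocabulary.

```python
import re
from collections import Counter

STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "are", "for", "shape"}

def extract_terms(corpus, min_freq=2):
    # Stage 1 (toy): single-word term extraction by simple frequency filtering.
    tokens = re.findall(r"[a-zA-Z][a-zA-Z-]+", corpus.lower())
    counts = Counter(t for t in tokens if t not in STOPWORDS)
    return sorted(t for t, c in counts.items() if c >= min_freq)

def publish_skos(terms, base="http://example.org/term/"):
    # Stage 5 (toy): publish the terminology as SKOS concepts in Turtle.
    lines = ["@prefix skos: <http://www.w3.org/2004/02/skos/core#> ."]
    for t in terms:
        uri = base + t.replace(" ", "-")
        lines.append(f'<{uri}> a skos:Concept ; skos:prefLabel "{t}"@en .')
    return "\n".join(lines)
```

The intermediate stages (post-processing, enrichment, relation validation) would sit between these two functions; publishing in SKOS/Turtle is what makes the result linkable to other LLOD resources.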
Semantic Web, pp. 967–986.
Citations: 3
Modular ontology modeling
IF 3 | CAS Tier 3 (Computer Science) | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2022-05-20 | DOI: 10.3233/sw-222886
C. Shimizu, K. Hammar, P. Hitzler
Reusing ontologies for new purposes, or adapting them to new use-cases, is frequently difficult. In our experiences, we have found this to be the case for several reasons: (i) differing representational granularity in ontologies and in use-cases, (ii) lacking conceptual clarity in potentially reusable ontologies, (iii) lack and difficulty of adherence to good modeling principles, and (iv) a lack of reuse emphasis and process support available in ontology engineering tooling. In order to address these concerns, we have developed the Modular Ontology Modeling (MOMo) methodology, and its supporting tooling infrastructure, CoModIDE (the Comprehensive Modular Ontology IDE – “commodity”). MOMo builds on the established eXtreme Design methodology, and like it emphasizes modular development and design pattern reuse; but crucially adds the extensive use of graphical schema diagrams, and tooling that support them, as vehicles for knowledge elicitation from experts. In this paper, we present the MOMo workflow in detail, and describe several useful resources for executing it. In particular, we provide a thorough and rigorous evaluation of CoModIDE in its role of supporting the MOMo methodology’s graphical modeling paradigm. We find that CoModIDE significantly improves approachability of such a paradigm, and that it displays a high usability.
Semantic Web, pp. 459–489.
Citations: 18
Deep understanding of everyday activity commands for household robots
IF 3 | CAS Tier 3 (Computer Science) | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2022-05-13 | DOI: 10.3233/sw-222973
Sebastian Höffner, R. Porzel, Maria M. Hedblom, M. Pomarlan, Vanja Sophie Cangalovic, Johannes Pfau, J. Bateman, R. Malaka
Going from natural language directions to fully specified executable plans for household robots involves a challenging variety of reasoning steps. In this paper, a processing pipeline to tackle these steps for natural language directions is proposed and implemented. It uses the ontological Socio-physical Model of Activities (SOMA) as a common interface between its components. The pipeline includes a natural language parser and a module for natural language grounding. Several reasoning steps formulate simulation plans, in which robot actions are guided by data gathered using human computation. As a last step, the pipeline simulates the given natural language direction inside a virtual environment. The major advantage of employing an overarching ontological framework is that its asserted facts can be stored alongside the semantics of directions, contextual knowledge, and annotated activity models in one central knowledge base. This allows for a unified and efficient knowledge retrieval across all pipeline components, providing flexibility and reasoning capabilities as symbolic knowledge is combined with annotated sub-symbolic models.
Semantic Web, pp. 895–909.
Citations: 1
LL(O)D and NLP perspectives on semantic change for humanities research
IF 3 | CAS Tier 3 (Computer Science) | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2022-05-06 | DOI: 10.3233/sw-222848
F. Armaselu, E. Apostol, Anas Fahad Khan, Chaya Liebeskind, Barbara McGillivray, Ciprian-Octavian Truică, A. Utka, G. Oleškevičienė, M. Erp
This paper presents an overview of the LL(O)D and NLP methods, tools and data for detecting and representing semantic change, with its main application in humanities research. The paper’s aim is to provide the starting point for the construction of a workflow and set of multilingual diachronic ontologies within the humanities use case of the COST Action Nexus Linguarum, European network for Web-centred linguistic data science, CA18209. The survey focuses on the essential aspects needed to understand the current trends and to build applications in this area of study.
Semantic Web, pp. 1051–1080.
Citations: 6
Editorial of the Special Issue on Deep Learning and Knowledge Graphs
IF 3 | CAS Tier 3 (Computer Science) | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2022-03-30 | DOI: 10.3233/sw-223099
Mehwish Alam, D. Buscaldi, Michael Cochez, Francesco Osborne, Diego Reforgiato Recupero, H. Sack
Mehwish Alam and Harald Sack: FIZ-Karlsruhe, Leibniz Institute for Information Infrastructure, and Karlsruhe Institute of Technology, Karlsruhe, Germany (mehwish.alam@kit.edu, harald.sack@kit.edu)
Davide Buscaldi: LIPN, Université Sorbonne Paris Nord, France (buscaldi@lipn.fr)
Michael Cochez: University of Amsterdam, the Netherlands (m.cochez@vu.nl)
Francesco Osborne: KMi, The Open University, UK (francesco.osborne@open.ac.uk)
Diego Reforgiato Recupero: Department of Mathematics and Computer Science, University of Cagliari, Italy (diego.reforgiato@unica.it)
Semantic Web, pp. 293–297.
Citations: 1
Answer selection in community question answering exploiting knowledge graph and context information
IF 3 | CAS Tier 3 (Computer Science) | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2022-03-29 | DOI: 10.3233/sw-222970
Golshan Assadat Afzali Boroujeni, Heshaam Faili, Yadollah Yaghoobzadeh
With the increasing popularity of knowledge graph (KG), many applications such as sentiment analysis, trend prediction, and question answering use KG for better performance. Despite the obvious usefulness of commonsense and factual information in the KGs, to the best of our knowledge, KGs have been rarely integrated into the task of answer selection in community question answering (CQA). In this paper, we propose a novel answer selection method in CQA by using the knowledge embedded in KGs. We also learn a latent-variable model for learning the representations of the question and answer, jointly optimizing generative and discriminative objectives. It also uses the question category for producing context-aware representations for questions and answers. Moreover, the model uses variational autoencoders (VAE) in a multi-task learning process with a classifier to produce class-specific representations for answers. The experimental results on three widely used datasets demonstrate that our proposed method is effective and outperforms the existing baselines significantly.
Semantic Web, pp. 339–356.
Citations: 2
Tab2KG: Semantic table interpretation with lightweight semantic profiles
IF 3 | CAS Tier 3 (Computer Science) | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2022-03-29 | DOI: 10.3233/SW-222993
Simon Gottschalk, Elena Demidova
Tabular data plays an essential role in many data analytics and machine learning tasks. Typically, tabular data does not possess any machine-readable semantics. In this context, semantic table interpretation is crucial for making data analytics workflows more robust and explainable. This article proposes Tab2KG – a novel method that targets the interpretation of tables with previously unseen data and automatically infers their semantics to transform them into semantic data graphs. We introduce original lightweight semantic profiles that enrich a domain ontology’s concepts and relations and represent domain and table characteristics. We propose a one-shot learning approach that relies on these profiles to map a tabular dataset containing previously unseen instances to a domain ontology. In contrast to the existing semantic table interpretation approaches, Tab2KG relies on the semantic profiles only and does not require any instance lookup. This property makes Tab2KG particularly suitable in the data analytics context, in which data tables typically contain new instances. Our experimental evaluation on several real-world datasets from different application domains demonstrates that Tab2KG outperforms state-of-the-art semantic table interpretation baselines.
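The "profile instead of instance lookup" idea can be sketched in a few lines of Python: profile each column with cheap statistics and map an unseen column to the domain property whose stored profile is closest. The three statistics and the `ex:` property names below are illustrative assumptions, not Tab2KG's actual profile features.

```python
def _is_number(v):
    # Best-effort numeric test for string-typed cell values.
    try:
        float(v)
        return True
    except ValueError:
        return False

def column_profile(values):
    """A toy 'semantic profile' of one table column: a few cheap statistics,
    computed without looking up any instances in a knowledge graph."""
    n = len(values)
    return {
        "numeric_ratio": sum(_is_number(v) for v in values) / n,
        "distinct_ratio": len(set(values)) / n,
        "avg_length": sum(len(str(v)) for v in values) / n,
    }

def profile_distance(p, q):
    # L1 distance between two profiles over their shared features.
    return sum(abs(p[k] - q[k]) for k in p)

def match_column(values, domain_profiles):
    """Map an unseen column to the domain property with the closest profile."""
    p = column_profile(values)
    return min(domain_profiles, key=lambda name: profile_distance(p, domain_profiles[name]))
```

Because only profiles are compared, a new table with entirely unseen instances can still be matched, which is the property the abstract highlights for data analytics workloads.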
Semantic Web, pp. 571–597.
Citations: 4
Methodologies for publishing linked open government data on the Web: A systematic mapping and a unified process model
IF 3 | CAS Tier 3 (Computer Science) | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2022-02-08 | DOI: 10.3233/sw-222896
B. Penteado, J. Maldonado, Seiji Isotani
Since the beginning of the release of open data by many countries, different methodologies for publishing linked data have been proposed. However, for various reasons they do not seem to have been adopted by early studies exploring linked data. In this work, we conducted a systematic mapping of the literature to synthesize the different approaches around the following topics: common steps, associated tools and practices, quality assessment validations, and evaluation of the methodology. The findings show a core set of activities, based on the linked data principles, but with additional critical steps for practical use at scale. Furthermore, although a fair amount of quality issues are reported in the literature, very few of these methodologies embed validation steps in their process. We describe an integrated overview of the different activities and how they can be executed with appropriate tools. We also present research challenges that need to be addressed in future work in this area.
Semantic Web, pp. 585–610.
Citations: 5
Continuous multi-query optimization for subgraph matching over dynamic graphs
IF 3.0 CAS Tier 3 (Computer Science) Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2022-01-28 DOI: 10.3233/sw-212864
Xi Wang, Qianzhen Zhang, Deke Guo, Xiang Zhao
There is a growing need to perform real-time analytics on dynamic graphs in order to deliver the value of big data to users. An important problem in such applications is continuously identifying and monitoring critical patterns while fine-grained updates arrive on the graphs at high velocity. A lot of effort has been made to develop practical solutions to these problems. Despite these efforts, existing algorithms show long running times and limited scalability when dealing with large and/or many graphs. In this paper, we study the problem of continuous multi-query optimization for subgraph matching over dynamic graph data. (1) We propose the annotated query graph, obtained by merging the multiple queries into one. (2) Based on the annotated query, we employ a concise auxiliary data structure to represent partial solutions in a compact form. (3) In addition, we propose an efficient maintenance strategy to detect the queries affected by each update and report the corresponding matches in one pass. (4) Extensive experiments over real-life and synthetic datasets verify the effectiveness and efficiency of our approach and confirm a two-orders-of-magnitude improvement by the proposed solution.
{"title":"Continuous multi-query optimization for subgraph matching over dynamic graphs","authors":"Xi Wang, Qianzhen Zhang, Deke Guo, Xiang Zhao","doi":"10.3233/sw-212864","DOIUrl":"https://doi.org/10.3233/sw-212864","url":null,"abstract":"There is a growing need to perform real-time analytics on dynamic graphs in order to deliver the values of big data to users. An important problem from such applications is continuously identifying and monitoring critical patterns when fine-grained updates at a high velocity occur on the graphs. A lot of efforts have been made to develop practical solutions for these problems. Despite the efforts, existing algorithms showed limited running time and scalability in dealing with large and/or many graphs. In this paper, we study the problem of continuous multi-query optimization for subgraph matching over dynamic graph data. (1) We propose annotated query graph, which is obtained by merging the multi-queries into one. (2) Based on the annotated query, we employ a concise auxiliary data structure to represent partial solutions in a compact form. (3) In addition, we propose an efficient maintenance strategy to detect the affected queries for each update and report corresponding matches in one pass. (4) Extensive experiments over real-life and synthetic datasets verify the effectiveness and efficiency of our approach and confirm a two orders of magnitude improvement of the proposed solution.","PeriodicalId":48694,"journal":{"name":"Semantic Web","volume":"41 1","pages":"601-622"},"PeriodicalIF":3.0,"publicationDate":"2022-01-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78534916","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
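The maintenance idea in the abstract — reporting, for each fine-grained update, only the matches that update produces, rather than re-evaluating the query over the whole graph — can be illustrated with a deliberately small sketch. The triangle query and adjacency-set structure below are simplifying assumptions chosen for brevity; the paper's annotated query graphs and auxiliary data structures generalize this to arbitrary patterns and multiple merged queries.

```python
# Toy illustration of incremental pattern matching over a dynamic graph:
# on each edge insertion we report only the *new* matches of a fixed
# triangle query, instead of re-running the query from scratch.
from collections import defaultdict

class DynamicGraph:
    def __init__(self):
        self.adj = defaultdict(set)   # adjacency sets (undirected graph)

    def insert_edge(self, u, v):
        """Insert edge (u, v) and return the triangles it completes."""
        # Any common neighbour of u and v closes a new triangle, so the
        # new matches are found from the two endpoints' neighbourhoods
        # alone -- no global re-evaluation is needed.
        new_matches = [tuple(sorted((u, v, w)))
                       for w in self.adj[u] & self.adj[v]]
        self.adj[u].add(v)
        self.adj[v].add(u)
        return new_matches

g = DynamicGraph()
print(g.insert_edge("a", "b"))   # → []  (no triangle yet)
print(g.insert_edge("b", "c"))   # → []
print(g.insert_edge("a", "c"))   # → [('a', 'b', 'c')]
```

Each insertion touches only the neighbourhoods of its two endpoints, which is the property that makes one-pass reporting of affected matches possible; the hard part the paper addresses is keeping such per-update work small when the pattern is larger than a triangle and many queries run concurrently.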