RevOnt: Reverse engineering of competency questions from knowledge graphs via language models

Journal of Web Semantics (IF 2.1; CAS Zone 3, Computer Science; JCR Q3, Computer Science, Artificial Intelligence) | Pub Date: 2024-05-17 | DOI: 10.1016/j.websem.2024.100822
Fiorela Ciroku, Jacopo de Berardinis, Jongmo Kim, Albert Meroño-Peñuela, Valentina Presutti, Elena Simperl
Citation count: 0

Abstract

The process of developing ontologies (formal, explicit specifications of a shared conceptualisation) is addressed by well-known methodologies. As with any engineering effort, its foundation is the collection of requirements, which includes the elicitation of competency questions. Competency questions are defined by interacting with domain and application experts, or by investigating existing datasets that may be used to populate the ontology, i.e. its knowledge graph. The growing popularity and accessibility of knowledge graphs provides an opportunity to support this phase with automatic tools. In this work, we explore the possibility of extracting competency questions from a knowledge graph. This reverses the traditional workflow, in which knowledge graphs are built from ontologies, which in turn are engineered from competency questions. We describe in detail RevOnt, an approach that extracts and abstracts triples from a knowledge graph, generates questions based on triple verbalisations, and filters the resulting questions to yield a meaningful set of competency questions. The approach is implemented using the Wikidata knowledge graph as a use case, and contributes a set of core competency questions from 20 domains present in the WDV dataset. To evaluate RevOnt, we contribute a new dataset of manually annotated, high-quality competency questions, and compare the extracted competency questions against these human references by calculating their BLEU score. The results for the abstraction and question generation components of the approach show good to high quality. The accuracy of the filtering component is above 86%, which is comparable to state-of-the-art classification approaches.
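The three stages named in the abstract (abstracting triples, generating questions, filtering) can be illustrated with a small sketch. This is not the RevOnt implementation: the abstract says questions are generated via language models, whereas a fixed template stands in here, and simple duplicate removal stands in for the filtering component, whose method the abstract does not describe. The `Triple` class, the `ENTITY_TYPES` lookup, the toy triples, and all function names are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Triple:
    subject: str
    predicate: str
    obj: str


# Hypothetical type lookup; a real knowledge graph would supply entity types.
ENTITY_TYPES = {
    "Douglas Adams": "human",
    "Jane Austen": "human",
    "Cambridge": "city",
    "Steventon": "village",
}

# Toy instance-level triples (illustrative data, not taken from the paper).
TRIPLES = [
    Triple("Douglas Adams", "place of birth", "Cambridge"),
    Triple("Jane Austen", "place of birth", "Steventon"),
    Triple("Douglas Adams", "occupation", "writer"),
]


def abstract_triple(triple, types):
    """Replace entities with their types, leaving unknown entities as-is."""
    return Triple(
        types.get(triple.subject, triple.subject),
        triple.predicate,
        types.get(triple.obj, triple.obj),
    )


def generate_question(triple):
    """Template stand-in for language-model question generation."""
    return f"What is the {triple.predicate} of a {triple.subject}?"


def filter_questions(questions):
    """Stand-in filter: drop case-insensitive duplicates, keep first seen."""
    seen = set()
    kept = []
    for question in questions:
        key = question.strip().lower()
        if key not in seen:
            seen.add(key)
            kept.append(question)
    return kept


def reverse_engineer(triples, types):
    """Run abstraction, question generation, and filtering in sequence."""
    abstracted = [abstract_triple(t, types) for t in triples]
    return filter_questions([generate_question(t) for t in abstracted])


questions = reverse_engineer(TRIPLES, ENTITY_TYPES)
for question in questions:
    print(question)
```

In this toy run the two place-of-birth triples collapse into a single type-level question after abstraction, which is the effect the abstraction and filtering steps are meant to achieve: many instance-level facts yield a small set of schema-level questions.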

Source journal: Journal of Web Semantics
Category: Engineering & Technology, Computer Science: Artificial Intelligence
CiteScore: 6.20
Self-citation rate: 12.00%
Articles published: 22
Review time: 14.6 weeks
Journal description: The Journal of Web Semantics is an interdisciplinary journal based on research and applications of various subject areas that contribute to the development of a knowledge-intensive and intelligent service Web. These areas include knowledge technologies, ontology, agents, databases and the semantic grid; disciplines such as information retrieval, language technology, human-computer interaction and knowledge discovery are of major relevance as well. All aspects of Semantic Web development are covered. The publication of large-scale experiments and their analysis is also encouraged, to clearly illustrate scenarios and methods that introduce semantics into existing Web interfaces, contents and services. The journal emphasizes the publication of papers that combine theories, methods and experiments from different subject areas in order to deliver innovative semantic methods and applications.