Learning Rules from KGs Guided by Language Models

Zihang Peng, Daria Stepanova, Vinh Thinh Ho, Heike Adel, Alessandra Russo, Simon Ott
{"title":"Learning Rules from KGs Guided by Language Models","authors":"Zihang Peng, Daria Stepanova, Vinh Thinh Ho, Heike Adel, Alessandra Russo, Simon Ott","doi":"arxiv-2409.07869","DOIUrl":null,"url":null,"abstract":"Advances in information extraction have enabled the automatic construction of\nlarge knowledge graphs (e.g., Yago, Wikidata or Google KG), which are widely\nused in many applications like semantic search or data analytics. However, due\nto their semi-automatic construction, KGs are often incomplete. Rule learning\nmethods, concerned with the extraction of frequent patterns from KGs and\ncasting them into rules, can be applied to predict potentially missing facts. A\ncrucial step in this process is rule ranking. Ranking of rules is especially\nchallenging over highly incomplete or biased KGs (e.g., KGs predominantly\nstoring facts about famous people), as in this case biased rules might fit the\ndata best and be ranked at the top based on standard statistical metrics like\nrule confidence. To address this issue, prior works proposed to rank rules not\nonly relying on the original KG but also facts predicted by a KG embedding\nmodel. At the same time, with the recent rise of Language Models (LMs), several\nworks have claimed that LMs can be used as alternative means for KG completion.\nIn this work, our goal is to verify to which extent the exploitation of LMs is\nhelpful for improving the quality of rule learning systems.","PeriodicalId":501030,"journal":{"name":"arXiv - CS - Computation and Language","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2024-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - CS - Computation and Language","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2409.07869","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

Advances in information extraction have enabled the automatic construction of large knowledge graphs (e.g., Yago, Wikidata, or the Google KG), which are widely used in applications such as semantic search and data analytics. However, due to their semi-automatic construction, KGs are often incomplete. Rule learning methods, which extract frequent patterns from KGs and cast them into rules, can be applied to predict potentially missing facts. A crucial step in this process is rule ranking. Ranking rules is especially challenging over highly incomplete or biased KGs (e.g., KGs that predominantly store facts about famous people), since in this case biased rules may fit the data best and be ranked at the top by standard statistical metrics such as rule confidence. To address this issue, prior works proposed ranking rules based not only on the original KG but also on facts predicted by a KG embedding model. At the same time, with the recent rise of Language Models (LMs), several works have claimed that LMs can serve as an alternative means for KG completion. In this work, our goal is to verify to what extent exploiting LMs helps improve the quality of rule learning systems.
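
To make the ranking step concrete, below is a minimal sketch of how rule confidence can be computed over a toy KG, together with a hybrid variant in the spirit of the embedding-based ranking mentioned above, in which a head fact missing from the KG is partially credited according to an external predictor's score. The chain rule, the toy triples, the `toy_predictor`, and the mixing weight `alpha` are all illustrative assumptions, not the paper's actual data or method.

```python
# Sketch: standard vs. hybrid confidence for the chain rule
#   hasGrandparent(X, Z) <- hasParent(X, Y), hasParent(Y, Z)
# All data and helper names below are illustrative assumptions.

# Toy KG as a set of (subject, relation, object) triples.
kg = {
    ("anna", "hasParent", "bob"),
    ("bob",  "hasParent", "carl"),
    ("anna", "hasGrandparent", "carl"),
    ("dan",  "hasParent", "eve"),
    ("eve",  "hasParent", "frank"),
    # ("dan", "hasGrandparent", "frank") is missing: the KG is incomplete.
}

def body_groundings():
    """All (x, z) pairs satisfying hasParent(x, y) AND hasParent(y, z)."""
    parents = [(s, o) for (s, r, o) in kg if r == "hasParent"]
    return [(x, z) for (x, y1) in parents for (y2, z) in parents if y1 == y2]

def standard_confidence():
    """support / #body groundings, counting only heads present in the KG."""
    body = body_groundings()
    support = sum((x, "hasGrandparent", z) in kg for (x, z) in body)
    return support / len(body) if body else 0.0

def hybrid_confidence(predict_prob, alpha=0.5):
    """Blend KG membership with an external predictor's score
    (e.g., a KG embedding model or an LM) for missing head facts."""
    body = body_groundings()
    score = 0.0
    for (x, z) in body:
        if (x, "hasGrandparent", z) in kg:
            score += 1.0  # head fact already known
        else:
            score += alpha * predict_prob(x, "hasGrandparent", z)
    return score / len(body) if body else 0.0

def toy_predictor(s, r, o):
    """Stand-in (hypothetical) for an embedding model or LM plausibility score."""
    return 0.9 if (s, o) == ("dan", "frank") else 0.1

print(standard_confidence())             # 0.5: only anna's head fact is known
print(hybrid_confidence(toy_predictor))  # 0.725: missing fact partially credited
```

Replacing `toy_predictor` with scores from a trained KG embedding model, or from an LM asked to judge the plausibility of a candidate triple, would roughly yield the two ranking variants the abstract contrasts.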