Quality assessment of cyber threat intelligence knowledge graph based on adaptive joining of embedding model

Complex & Intelligent Systems (IF 5; Zone 2, Computer Science; JCR Q1, Computer Science, Artificial Intelligence) Pub Date: 2024-11-26 DOI: 10.1007/s40747-024-01661-3
Bin Chen, Hongyi Li, Di Zhao, Yitang Yang, Chengwei Pan
Citations: 0

Abstract

Cyber threat intelligence knowledge graphs often contain erroneous, inconsistent, or missing triples, which makes it difficult for them to meet complex and diverse application requirements. The predominant approach to knowledge graph quality assessment uses word embeddings to evaluate the plausibility of triples. Recent studies have found that splicing (concatenating) different types of embeddings yields better word representations, and this technique has been applied to tasks such as named entity recognition (NER). However, as the number of embedding types grows, selecting the best embeddings for building the concatenated representation has become a pressing problem. In this paper, we propose an adaptive joining of embedding (AJE) model that automatically finds better word embedding representations for knowledge graph quality assessment. The AJE model operates through the interplay of a task model and a selector. The former samples word embeddings generated by various models, while the latter generates rewards based on feedback from current task outcomes to decide whether or not to splice each embedding. Experiments were conducted on two generic datasets and one cybersecurity dataset. The results show that our model outperforms the baseline model on key metrics such as accuracy and F1 score, achieving accuracies of 95.8%, 95.6% and 91.3% on the generic datasets WN11 and FB13 and the cybersecurity dataset CS13K, respectively, which are increases of 1.0%, 0.2% and 0.5% over the AttTucker model.
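
The select-and-splice loop described in the abstract can be pictured with a small, generic sketch: a policy samples which embedding sources to concatenate, and a reward derived from task performance nudges that policy. This is a REINFORCE-style illustration under stated assumptions, not the authors' implementation. The toy scoring function, the number of sources, the learning rate, the moving-average baseline, and the choice to zero-pad unselected sources are all hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def splice(embeddings, mask):
    """Concatenate per-source vectors; unselected sources are zero-padded
    so the spliced vector keeps a fixed width (an assumption of this sketch)."""
    out = []
    for vec, keep in zip(embeddings, mask):
        out.extend(vec if keep else [0.0] * len(vec))
    return out


def sample_mask(logits, rng):
    """Sample one keep/drop decision per embedding source."""
    return [1 if rng.random() < sigmoid(l) else 0 for l in logits]


def update_logits(logits, mask, reward, lr=0.5):
    """REINFORCE-style step: d log p(mask) / d logit_i = mask_i - p_i."""
    return [l + lr * reward * (m - sigmoid(l)) for l, m in zip(logits, mask)]


def toy_task_score(mask):
    """Hypothetical stand-in for training a task model and measuring accuracy.
    Here sources 0 and 2 help and source 1 hurts; these numbers are invented."""
    return 0.5 + 0.2 * mask[0] - 0.1 * mask[1] + 0.2 * mask[2]


def search(num_sources=3, steps=200, seed=0):
    """Return the learned keep-probability for each embedding source."""
    rng = random.Random(seed)
    logits = [0.0] * num_sources
    baseline = 0.0
    for _ in range(steps):
        mask = sample_mask(logits, rng)
        score = toy_task_score(mask)
        # Reward is the score relative to a moving-average baseline.
        logits = update_logits(logits, mask, score - baseline)
        baseline = 0.9 * baseline + 0.1 * score
    return [sigmoid(l) for l in logits]


if __name__ == "__main__":
    print(splice([[1.0, 2.0], [3.0]], [1, 0]))
    print(search())
```

In a real setting, `toy_task_score` would be replaced by training and evaluating a triple-classification model on the spliced embeddings, which is far more expensive than this toy function suggests.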

Source journal: Complex & Intelligent Systems (Computer Science, Artificial Intelligence)
CiteScore: 9.60
Self-citation rate: 10.30%
Articles published: 297
About the journal: Complex & Intelligent Systems aims to provide a forum for presenting and discussing novel approaches, tools and techniques meant for attaining a cross-fertilization between the broad fields of complex systems, computational simulation, and intelligent analytics and visualization. The transdisciplinary research that the journal focuses on will expand the boundaries of our understanding by investigating the principles and processes that underlie many of the most profound problems facing society today.