HKA：用于多模态知识图谱补全的分层知识对齐框架

IF 5.2 3区计算机科学 Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS ACM Transactions on Multimedia Computing Communications and Applications Pub Date : 2024-05-11 DOI:10.1145/3664288

Yunhui Xu, Youru Li, Muhao Xu, Zhenfeng Zhu, Yao Zhao

{"title":"HKA：用于多模态知识图谱补全的分层知识对齐框架","authors":"Yunhui Xu, Youru Li, Muhao Xu, Zhenfeng Zhu, Yao Zhao","doi":"10.1145/3664288","DOIUrl":null,"url":null,"abstract":"Recent years have witnessed the successful application of knowledge graph techniques in structured data processing, while how to incorporate knowledge from visual and textual modalities into knowledge graphs has been given less attention. To better organize them, Multimodal Knowledge Graphs (MKGs), comprising the structural triplets of traditional Knowledge Graphs (KGs) together with entity-related multimodal data (e.g., images and texts), have been introduced consecutively. However, it is still a great challenge to explore MKGs due to their inherent incompleteness. Although most existing Multimodal Knowledge Graph Completion (MKGC) approaches can infer missing triplets based on available factual triplets and multimodal information, they almost ignore the modal conflicts and supervisory effect, failing to achieve a more comprehensive understanding of entities. To address these issues, we propose a novel <underline>H</underline>ierarchical <underline>K</underline>nowledge <underline>A</underline>lignment (HKA) framework for MKGC. Specifically, a macro-knowledge alignment module is proposed to capture global semantic relevance between modalities for dealing with modal conflicts in MKG. Furthermore, a micro-knowledge alignment module is also developed to reveal the local consistency information through inter- and intra-modality supervisory effect more effectively. By integrating different modal predictions, a final decision can be made. Experimental results on three benchmark MKGC tasks have demonstrated the effectiveness of the proposed HKA framework.","PeriodicalId":50937,"journal":{"name":"ACM Transactions on Multimedia Computing Communications and Applications","volume":"2015 1","pages":""},"PeriodicalIF":5.2000,"publicationDate":"2024-05-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"HKA: A Hierarchical Knowledge Alignment Framework for Multimodal Knowledge Graph Completion\",\"authors\":\"Yunhui Xu, Youru Li, Muhao Xu, Zhenfeng Zhu, Yao Zhao\",\"doi\":\"10.1145/3664288\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Recent years have witnessed the successful application of knowledge graph techniques in structured data processing, while how to incorporate knowledge from visual and textual modalities into knowledge graphs has been given less attention. To better organize them, Multimodal Knowledge Graphs (MKGs), comprising the structural triplets of traditional Knowledge Graphs (KGs) together with entity-related multimodal data (e.g., images and texts), have been introduced consecutively. However, it is still a great challenge to explore MKGs due to their inherent incompleteness. Although most existing Multimodal Knowledge Graph Completion (MKGC) approaches can infer missing triplets based on available factual triplets and multimodal information, they almost ignore the modal conflicts and supervisory effect, failing to achieve a more comprehensive understanding of entities. To address these issues, we propose a novel <underline>H</underline>ierarchical <underline>K</underline>nowledge <underline>A</underline>lignment (HKA) framework for MKGC. Specifically, a macro-knowledge alignment module is proposed to capture global semantic relevance between modalities for dealing with modal conflicts in MKG. Furthermore, a micro-knowledge alignment module is also developed to reveal the local consistency information through inter- and intra-modality supervisory effect more effectively. By integrating different modal predictions, a final decision can be made. Experimental results on three benchmark MKGC tasks have demonstrated the effectiveness of the proposed HKA framework.\",\"PeriodicalId\":50937,\"journal\":{\"name\":\"ACM Transactions on Multimedia Computing Communications and Applications\",\"volume\":\"2015 1\",\"pages\":\"\"},\"PeriodicalIF\":5.2000,\"publicationDate\":\"2024-05-11\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"ACM Transactions on Multimedia Computing Communications and Applications\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://doi.org/10.1145/3664288\",\"RegionNum\":3,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, INFORMATION SYSTEMS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACM Transactions on Multimedia Computing Communications and Applications","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1145/3664288","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}

引用次数: 0

摘要

近年来，知识图谱技术在结构化数据处理中得到了成功应用，但如何将视觉和文本模式的知识纳入知识图谱却鲜有人关注。为了更好地组织知识图谱，由传统知识图谱（KG）的结构三元组与实体相关的多模态数据（如图像和文本）组成的多模态知识图谱（MKG）相继问世。然而，由于多模态知识图谱本身的不完整性，探索多模态知识图谱仍然是一项巨大的挑战。虽然现有的多模态知识图谱补全（MKGC）方法大多能根据现有的事实三元组和多模态信息推断出缺失的三元组，但它们几乎忽略了模态冲突和监督效应，无法实现对实体更全面的理解。为了解决这些问题，我们为 MKGC 提出了一个新颖的分层知识对齐（HKA）框架。具体来说，我们提出了一个宏观知识对齐模块，用于捕捉模态之间的全局语义相关性，以处理 MKG 中的模态冲突。此外，还开发了微观知识对齐模块，通过模态间和模态内的监督效应更有效地揭示局部一致性信息。通过整合不同的模态预测，可以做出最终决策。三个基准 MKGC 任务的实验结果证明了所提出的 HKA 框架的有效性。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

HKA: A Hierarchical Knowledge Alignment Framework for Multimodal Knowledge Graph Completion

Recent years have witnessed the successful application of knowledge graph techniques in structured data processing, while how to incorporate knowledge from visual and textual modalities into knowledge graphs has been given less attention. To better organize them, Multimodal Knowledge Graphs (MKGs), comprising the structural triplets of traditional Knowledge Graphs (KGs) together with entity-related multimodal data (e.g., images and texts), have been introduced consecutively. However, it is still a great challenge to explore MKGs due to their inherent incompleteness. Although most existing Multimodal Knowledge Graph Completion (MKGC) approaches can infer missing triplets based on available factual triplets and multimodal information, they almost ignore the modal conflicts and supervisory effect, failing to achieve a more comprehensive understanding of entities. To address these issues, we propose a novel Hierarchical Knowledge Alignment (HKA) framework for MKGC. Specifically, a macro-knowledge alignment module is proposed to capture global semantic relevance between modalities for dealing with modal conflicts in MKG. Furthermore, a micro-knowledge alignment module is also developed to reveal the local consistency information through inter- and intra-modality supervisory effect more effectively. By integrating different modal predictions, a final decision can be made. Experimental results on three benchmark MKGC tasks have demonstrated the effectiveness of the proposed HKA framework.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

ACM Transactions on Multimedia Computing Communications and Applications 工程技术-计算机：理论方法

CiteScore

8.50

自引率

5.90%

发文量

285

审稿时长

7.5 months

期刊介绍： The ACM Transactions on Multimedia Computing, Communications, and Applications is the flagship publication of the ACM Special Interest Group in Multimedia (SIGMM). It is soliciting paper submissions on all aspects of multimedia. Papers on single media (for instance, audio, video, animation) and their processing are also welcome. TOMM is a peer-reviewed, archival journal, available in both print form and digital form. The Journal is published quarterly; with roughly 7 23-page articles in each issue. In addition, all Special Issues are published online-only to ensure a timely publication. The transactions consists primarily of research papers. This is an archival journal and it is intended that the papers will have lasting importance and value over time. In general, papers whose primary focus is on particular multimedia products or the current state of the industry will not be included.