Multi-Modal Correction Network for Recommendation

IEEE Transactions on Knowledge and Data Engineering | IF 8.9 | Zone 2, Computer Science | JCR Q1, Computer Science, Artificial Intelligence | Pub Date: 2024-11-07 | DOI: 10.1109/TKDE.2024.3493374

Zengmao Wang, Yunzhen Feng, Xin Zhang, Renjie Yang, Bo Du

IEEE Transactions on Knowledge and Data Engineering, vol. 37, no. 2, pp. 810-822. Full text: https://ieeexplore.ieee.org/document/10746604/

Citations: 0

Multi-Modal Correction Network for Recommendation

Multi-modal content has proven to be a powerful source of knowledge for recommendation tasks. Most state-of-the-art multi-modal recommendation methods focus on aligning the semantic spaces of different modalities to enhance item representations, and pay little attention to the knowledge within each modality that is actually relevant to recommendation. As a result, the positive effect of that knowledge is reduced and the gain in recommendation performance is limited. In this paper, we propose a multi-modal correction network, termed MMCN, that enhances the item representation with the important semantic knowledge in each modality through a residual structure with attention mechanisms and a hierarchical contrastive learning framework. The residual information is obtained through self-attention and cross-attention, which effectively learn the relevant knowledge across different modalities. Hierarchical contrastive learning further captures the relevant knowledge not only at the feature level but also at the element-wise level with a matrix. Extensive experiments on three large-scale real-world datasets show the superiority of MMCN over state-of-the-art multi-modal recommendation methods.
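The abstract does not give MMCN's equations, so the following is only a rough sketch of the two ideas it names, not the authors' implementation. It assumes unprojected scaled dot-product attention for the residual branch and an InfoNCE-style loss for the contrastive part. Reading the "element-wise level with a matrix" as a contrast over embedding dimensions (columns) is a guess. All function names, the temperature value, and the toy feature matrices are hypothetical. Plain Python lists are used so the sketch runs without dependencies.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    """Swap rows and columns."""
    return [list(col) for col in zip(*a)]

def softmax(row):
    """Numerically stable softmax over one list of scores."""
    top = max(row)
    exps = [math.exp(v - top) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention without learned projections (a simplification)."""
    scale = math.sqrt(len(queries[0]))
    scores = matmul(queries, transpose(keys))
    weights = [softmax([s / scale for s in row]) for row in scores]
    return matmul(weights, values)

def residual_correction(target, other):
    """Add self-attention and cross-attention outputs to `target` as a residual."""
    self_out = attention(target, target, target)   # within the modality
    cross_out = attention(target, other, other)    # queries from target, keys/values from other
    return [[t + s + c for t, s, c in zip(t_row, s_row, c_row)]
            for t_row, s_row, c_row in zip(target, self_out, cross_out)]

def l2_normalize(rows):
    """Scale every row to unit length (all-zero rows are left unchanged)."""
    normalized = []
    for row in rows:
        norm = math.sqrt(sum(v * v for v in row)) or 1.0
        normalized.append([v / norm for v in row])
    return normalized

def info_nce(a, b, temperature=0.2):
    """InfoNCE: row i of `a` should match row i of `b` against every other row of `b`."""
    sims = matmul(l2_normalize(a), transpose(l2_normalize(b)))
    loss = 0.0
    for i, row in enumerate(sims):
        probs = softmax([s / temperature for s in row])
        loss -= math.log(probs[i])
    return loss / len(sims)

def hierarchical_contrastive_loss(a, b, temperature=0.2):
    """Sum of a feature-level term (rows = items) and an assumed
    element-wise term (rows = embedding dimensions, via the transpose)."""
    feature_term = info_nce(a, b, temperature)
    element_term = info_nce(transpose(a), transpose(b), temperature)
    return feature_term + element_term

# Toy data (hypothetical): 4 items with 3-dimensional features per modality.
visual = [[1.0, 0.0, 0.5], [0.0, 1.0, 0.5], [0.5, 0.5, 1.0], [1.0, 1.0, 0.0]]
textual = [[0.9, 0.1, 0.4], [0.1, 0.8, 0.6], [0.4, 0.6, 0.9], [0.8, 0.9, 0.1]]

corrected = residual_correction(visual, textual)
loss = hierarchical_contrastive_loss(corrected, textual)
print(corrected[0], loss)
```

A trained model would use learned query, key, and value projections and would weight the two loss terms. Both are omitted here to keep the structure visible.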


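Source journal

IEEE Transactions on Knowledge and Data Engineering (Engineering and Technology: Engineering, Electrical and Electronic)

CiteScore: 11.70

Self-citation rate: 3.40%

Articles published: 515

Review time: 6 months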

About the journal: The IEEE Transactions on Knowledge and Data Engineering covers knowledge and data engineering aspects of computer science, artificial intelligence, electrical engineering, computer engineering, and related fields. It provides an interdisciplinary platform for disseminating new developments in knowledge and data engineering and explores the practicality of these concepts in both hardware and software. Specific areas covered include knowledge-based and expert systems, AI techniques for knowledge and data management, tools and methodologies, distributed processing, real-time systems, architectures, data management practices, database design, query languages, security, fault tolerance, statistical databases, algorithms, performance evaluation, and applications.

Latest articles from this journal

2024 Reviewers List
Web-FTP: A Feature Transferring-Based Pre-Trained Model for Web Attack Detection
Network-to-Network: Self-Supervised Network Representation Learning via Position Prediction
AEGK: Aligned Entropic Graph Kernels Through Continuous-Time Quantum Walks
Contextual Inference From Sparse Shopping Transactions Based on Motif Patterns