Pub Date : 2024-01-20 DOI: 10.1007/s11280-024-01239-6
Shaohua Teng, Wenbiao Huang, Naiqi Wu, Guanglong Du, Tongbao Chen, Wei Zhang, Luyao Teng
Supervised cross-modal hashing has attracted many researchers. Existing studies seek a common semantic space or directly regress the zero-one label information into the Hamming space. Although they have achieved a great deal, they neglect some issues: 1) some methods designed for classification tasks are not suitable for retrieval tasks, since they do not learn the personalized features of samples; 2) the outcomes of hash retrieval depend on both the length and the encoding method of the hash codes. Because a sample possesses more personalized features than label semantics, in this paper we propose a novel supervised cross-modal hashing collaboration learning method called discrete Cross-modal Hashing with Relaxation and Label Semantic Guidance (CHRLSG). First, we introduce two relaxation variables as latent spaces. One is used to extract text features and label semantic information collaboratively, and the other is used to extract image features and label semantics collaboratively. Second, more accurate hash codes are generated from the latent spaces, since CHRLSG learns feature semantics and label semantics collaboratively, with labels as the dominant signal and features as the auxiliary one. Third, we use labels to strengthen the similarity relationship of inter-modal samples by preserving pairwise closeness. Label semantics are fully exploited to avoid classification errors. Fourth, we introduce class weights to further increase the discrimination between intra-modal samples that belong to different classes while keeping the similarity of samples unchanged. Therefore, the CHRLSG model preserves not only the relationship between samples but also the consistency of label semantics during collaborative optimization. Experimental results on three common benchmark datasets demonstrate that the proposed model is superior to existing advanced methods.
{"title":"Discrete cross-modal hashing with relaxation and label semantic guidance","authors":"Shaohua Teng, Wenbiao Huang, Naiqi Wu, Guanglong Du, Tongbao Chen, Wei Zhang, Luyao Teng","doi":"10.1007/s11280-024-01239-6","DOIUrl":"https://doi.org/10.1007/s11280-024-01239-6","PeriodicalId":501180,"journal":{"name":"World Wide Web"},"publicationDate":"2024-01-20","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"139506544"}
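To make the retrieval setting of the cross-modal hashing abstract above concrete, the following is a minimal sketch of ranking by Hamming distance once hash codes exist. It does not reproduce CHRLSG's learning procedure; the 8-bit codes, the item names, and the functions `hamming_distance` and `rank_by_hamming` are hypothetical and chosen only for illustration.

```python
def hamming_distance(a: int, b: int) -> int:
    """Number of differing bits between two hash codes stored as integers."""
    return bin(a ^ b).count("1")


def rank_by_hamming(query: int, database: dict[str, int]) -> list[tuple[str, int]]:
    """Return (item_id, distance) pairs sorted by ascending Hamming distance."""
    scored = [(item_id, hamming_distance(query, code))
              for item_id, code in database.items()]
    # Ties are broken by item id so the ranking is deterministic.
    return sorted(scored, key=lambda pair: (pair[1], pair[0]))


# Hypothetical 8-bit codes for three images; the query stands for a text sample.
image_codes = {"img_a": 0b10110010, "img_b": 0b10110011, "img_c": 0b01001101}
text_query = 0b10110110
ranking = rank_by_hamming(text_query, image_codes)
```

Because both modalities are mapped into the same Hamming space, a text query can be compared with image codes using only XOR and a bit count, which is why code length and encoding quality directly affect retrieval results.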
Online social networks facilitate the spread of information, but rumors can also propagate widely and quickly, which may mislead some users. Therefore, suppressing the spread of rumors has become a daunting task. One widely used approach is to select users in the social network to spread the truth and compete against the rumor, so that users who receive the truth before receiving the rumor will not trust or propagate it. However, existing works only aim to speed up rumor blocking without considering the pertinence of users. For example, consider a social media platform operator aiming to enhance user online safety. Based on users’ online behavior, the users who are at high risk should be alerted first. Motivated by this, we formally define the rumor blocking with pertinence set (RBP) problem, which aims to find a truth seed set that maximizes the number of nodes influenced by the truth while ensuring that the number of influenced nodes within the pertinence set reaches at least a given threshold. To solve this problem, we design a hybrid greedy framework (HGF) algorithm with local and global phases. We prove that HGF can provide a $(1-1/e-\epsilon)$-approximate solution with high probability while reducing the cost of the sampling process. Extensive experiments on 8 real social networks demonstrate the efficiency and effectiveness of our proposed algorithms.
{"title":"Rumor blocking with pertinence set in large graphs","authors":"Fangsong Xiang, Jinghao Wang, Yanping Wu, Xiaoyang Wang, Chen Chen, Ying Zhang","doi":"10.1007/s11280-024-01235-w","DOIUrl":"https://doi.org/10.1007/s11280-024-01235-w","PeriodicalId":501180,"journal":{"name":"World Wide Web"},"publicationDate":"2024-01-20","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"139506545"}
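The 1-1/e factor in the rumor-blocking abstract above is the classical guarantee of greedy selection on monotone submodular objectives. The sketch below shows only that plain greedy step on fixed coverage sets. It is not the HGF algorithm: it has no sampling, no local and global phases, and no pertinence-set threshold. The `reach` sets, the user names, and the function `greedy_max_coverage` are hypothetical.

```python
def greedy_max_coverage(candidates: dict[str, set[int]],
                        budget: int) -> tuple[list[str], set[int]]:
    """Pick up to `budget` seeds, each time taking the largest marginal gain."""
    chosen: list[str] = []
    covered: set[int] = set()
    for _ in range(budget):
        best = max(
            (c for c in sorted(candidates) if c not in chosen),
            key=lambda c: len(candidates[c] - covered),
            default=None,
        )
        # Stop when no candidate is left or nothing new can be covered.
        if best is None or not candidates[best] - covered:
            break
        chosen.append(best)
        covered |= candidates[best]
    return chosen, covered


# Hypothetical sets of nodes that each candidate seed would reach with the truth.
reach = {"u1": {1, 2, 3, 4}, "u2": {3, 4, 5}, "u3": {5, 6}, "u4": {1, 6}}
seeds, protected = greedy_max_coverage(reach, budget=2)
```

In this toy instance the second pick is `u3` rather than `u2`, because marginal gain is measured against nodes that are already covered, not against the raw set size.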
Pub Date : 2024-01-20 DOI: 10.1007/s11280-024-01244-9
Because it requires processing continuous, unstructured, large streams of data, mining real-time streaming data is a more challenging research issue than mining static data. The privacy issue persists when sensitive data is included in streaming data. In recent years, there has been significant progress in research on the anonymization of static data. For the anonymization of quasi-identifiers, two typical strategies are generalization and suppression. However, the high dynamicity and potentially unbounded nature of streaming data make anonymization a challenging task. To this end, we propose a novel Efficient Approximation and Privacy Preservation Algorithms (EAPPA) framework in this paper to achieve efficient pre-processing of live streaming data and its privacy preservation with minimum Information Loss (IL) and computational requirements. As existing privacy preservation solutions for streaming data suffer from redundant data, we first propose an efficient technique for data approximation with data pre-processing. We design the Flajolet-Martin (FM) algorithm for robust and efficient approximation of unique elements in the data stream with a data cleaning mechanism. We feed the periodically approximated and pre-processed streaming data to the anonymization algorithm. Using adaptive clustering, we propose innovative k-anonymization and l-diversity privacy principles for data streams. The proposed approach scans a stream to detect and reuse clusters that fulfill the k-anonymity and l-diversity criteria, reducing anonymization time and IL. The experimental results reveal the efficiency of the EAPPA framework compared to state-of-the-art methods.
{"title":"Efficient approximation and privacy preservation algorithms for real time online evolving data streams","authors":"","doi":"10.1007/s11280-024-01244-9","DOIUrl":"https://doi.org/10.1007/s11280-024-01244-9","PeriodicalId":501180,"journal":{"name":"World Wide Web"},"publicationDate":"2024-01-20","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"139506721"}
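The abstract above relies on Flajolet-Martin (FM) counting to approximate the number of unique elements in a stream. The following is a minimal textbook-style sketch of that idea, not the EAPPA implementation: each hash function keeps a bitmap of observed trailing-zero positions, and the lowest unset bit position R gives an estimate of roughly 2^R / 0.77351. Using SHA-256 with a salt as the hash family, averaging R over 64 hash functions, and the names `fm_estimate`, `trailing_zeros`, and `lowest_unset_bit` are all assumptions made for illustration.

```python
import hashlib

PHI = 0.77351  # correction constant from the Flajolet-Martin analysis


def trailing_zeros(value: int, width: int = 32) -> int:
    """Position of the least-significant 1 bit; `width` if value is zero."""
    if value == 0:
        return width
    return (value & -value).bit_length() - 1


def lowest_unset_bit(bitmap: int) -> int:
    """Index of the least-significant 0 bit in the bitmap."""
    position = 0
    while bitmap & (1 << position):
        position += 1
    return position


def fm_estimate(stream, num_hashes: int = 64) -> float:
    """Estimate the number of distinct items using num_hashes FM bitmaps."""
    bitmaps = [0] * num_hashes
    for item in stream:
        for i in range(num_hashes):
            # Salting with i simulates a family of independent hash functions.
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            hashed = int.from_bytes(digest[:4], "big")
            bitmaps[i] |= 1 << trailing_zeros(hashed)
    mean_r = sum(lowest_unset_bit(b) for b in bitmaps) / num_hashes
    return 2 ** mean_r / PHI


# Hypothetical stream with heavy repetition: 200 distinct ids, each seen 5 times.
events = [f"sensor-{i % 200}" for i in range(1000)]
approx_unique = fm_estimate(events)
```

The memory cost is one small bitmap per hash function regardless of stream length, and repeated items never change a bitmap, which is what makes the estimate insensitive to redundant data.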
Pub Date : 2024-01-17 DOI: 10.1007/s11280-024-01236-9
Shuqi Ruan, Chao Yang, Dongsheng Li
Sequential recommendation aims to predict the next items that users will interact with according to the sequential dependencies within historical user interactions. Recently, self-attention based sequence modeling methods have become the mainstream approach due to their competitive accuracy. Despite their effectiveness, these methods still have non-trivial limitations: (1) they mainly take the transition patterns between items into consideration but ignore the semantic associations between items, and (2) they mostly focus on dynamic short-term user preferences and fail to consider users' static long-term preferences explicitly. To address these limitations, we propose a Knowledge Enhanced Personalized Hierarchical Attention Network (KPHAN), which incorporates the semantic associations among items by learning from knowledge graphs and captures the fine-grained long- and short-term interests of users through a novel personalized hierarchical attention network. Specifically, we employ the entities and relationships in the knowledge graph to enrich semantic information for items while preserving the structural information of the knowledge graph. The self-attention mechanism then captures semantic associations among items to obtain short-term user preferences more accurately. Finally, a personalized hierarchical attention network is developed to generate the final user preference representations, which fully capture users' static long-term preferences while fusing dynamic short-term preferences. Experimental results on three real-world datasets demonstrate that our method outperforms prior works by 2.7% to 35.5% on HR metrics and 6.7% to 27.9% on NDCG metrics.
{"title":"Knowledge-enhanced personalized hierarchical attention network for sequential recommendation","authors":"Shuqi Ruan, Chao Yang, Dongsheng Li","doi":"10.1007/s11280-024-01236-9","DOIUrl":"https://doi.org/10.1007/s11280-024-01236-9","PeriodicalId":501180,"journal":{"name":"World Wide Web"},"publicationDate":"2024-01-17","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"139481608"}
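The HR and NDCG figures quoted in the sequential recommendation abstract above are ranking metrics. The sketch below shows how they are commonly computed in next-item evaluation, assuming exactly one held-out target item per user, so the ideal DCG is 1. The abstract does not state the exact evaluation protocol, and the ranked lists, item ids, and function names here are hypothetical.

```python
import math


def hit_rate_at_k(ranked: list[str], target: str, k: int) -> float:
    """1.0 if the target item appears in the top-k list, else 0.0."""
    return 1.0 if target in ranked[:k] else 0.0


def ndcg_at_k(ranked: list[str], target: str, k: int) -> float:
    """NDCG with a single relevant item: 1 / log2(rank + 1), rank starting at 1."""
    top_k = ranked[:k]
    if target not in top_k:
        return 0.0
    zero_based_rank = top_k.index(target)
    return 1.0 / math.log2(zero_based_rank + 2)


# Hypothetical model output: (ranked item ids, held-out next item) per user.
predictions = [
    (["i3", "i7", "i1"], "i7"),
    (["i2", "i5", "i9"], "i9"),
]
k = 2
mean_hr = sum(hit_rate_at_k(r, t, k) for r, t in predictions) / len(predictions)
mean_ndcg = sum(ndcg_at_k(r, t, k) for r, t in predictions) / len(predictions)
```

HR only checks whether the target made the cut, while NDCG also rewards placing it nearer the top, which is why the two metrics can improve by different margins.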
Networks in the real world are dynamic and evolving. The most critical process in networks is to determine the structure of the community, based on which we can detect hidden communities in a complex network. The design of strong network structures is of great importance, meaning that a system must maintain its function in the face of attacks and failures and have a strong community structure. In this paper, we propose a robust memetic algorithm, called RDMA_NET (Robust Dynamic Memetic Algorithm), to optimize the detection of dynamic communities in complex networks. The method works on dynamic data, which affects its two main parts: the initialization of the population and the calculation of the evaluation function of each population. There is no need to determine the number of communities in advance. We used two sets of real-world networks and the LFR dataset. The results show that our proposed method, RDMA_NET, can find a better solution than modern approaches and provide near-optimal performance in the search for network topologies with a strong community structure.
{"title":"A novel robust memetic algorithm for dynamic community structures detection in complex networks","authors":"Somayeh Ranjkesh, Behrooz Masoumi, Seyyed Mohsen Hashemi","doi":"10.1007/s11280-024-01238-7","DOIUrl":"https://doi.org/10.1007/s11280-024-01238-7","PeriodicalId":501180,"journal":{"name":"World Wide Web"},"publicationDate":"2024-01-17","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"139481931"}
The privacy-preserving data publishing (PPDP) problem has gained substantial attention from research communities, industries, and governments due to the increasing requirements for data publishing and concerns about data privacy. However, achieving a balance between preserving privacy and maintaining data quality remains a challenging task in PPDP. This paper presents an information-driven distributed genetic algorithm (ID-DGA) that aims to achieve optimal anonymization through attribute generalization and record suppression. The proposed algorithm incorporates various components, including an information-driven crossover operator, an information-driven mutation operator, an information-driven improvement operator, and a two-dimensional selection operator. Furthermore, a distributed population model is utilized to improve population diversity while reducing the running time. Experimental results confirm the superiority of ID-DGA in terms of solution accuracy, convergence speed, and the effectiveness of all the proposed components.
{"title":"Privacy-preserving data publishing: an information-driven distributed genetic algorithm","authors":"Yong-Feng Ge, Hua Wang, Jinli Cao, Yanchun Zhang, Xiaohong Jiang","doi":"10.1007/s11280-024-01241-y","DOIUrl":"https://doi.org/10.1007/s11280-024-01241-y","PeriodicalId":501180,"journal":{"name":"World Wide Web"},"publicationDate":"2024-01-15","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"139469924"}
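Attribute generalization and record suppression, the two operations named in the privacy-preserving data publishing abstract above, can be illustrated with a small k-anonymity sketch. It applies one fixed generalization level and then suppresses records whose quasi-identifier group is smaller than k. It does not search over generalization levels, which is the part ID-DGA optimizes. The records, the choice of quasi-identifiers (age and ZIP code), and all function names are hypothetical.

```python
from collections import Counter


def generalize_age(age: int, width: int = 10) -> str:
    """Replace an exact age with a range such as '30-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def generalize_zip(zip_code: str, keep: int = 3) -> str:
    """Keep the first `keep` characters and mask the rest."""
    return zip_code[:keep] + "*" * (len(zip_code) - keep)


def k_anonymize(records: list[tuple[int, str, str]], k: int) -> list[tuple[str, str, str]]:
    """Generalize quasi-identifiers, then suppress groups with fewer than k rows."""
    generalized = [(generalize_age(age), generalize_zip(zip_code), sensitive)
                   for age, zip_code, sensitive in records]
    group_sizes = Counter((age, zip_code) for age, zip_code, _ in generalized)
    return [row for row in generalized if group_sizes[(row[0], row[1])] >= k]


# Hypothetical records: (age, ZIP code, sensitive attribute).
records = [
    (34, "47677", "flu"),
    (36, "47602", "cold"),
    (38, "47678", "asthma"),
    (52, "47905", "flu"),
]
released = k_anonymize(records, k=2)
```

The fourth record is suppressed because it is alone in its group. Coarser generalization would keep it but lose more detail, which is the privacy versus data-quality balance the abstract describes.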
Pub Date : 2023-12-26 DOI: 10.1007/s11280-023-01212-9
With Bitcoin being universally recognized as the most popular cryptocurrency, more Bitcoin transactions are expected to be submitted to the Bitcoin blockchain system. As a result, many transactions can encounter different confirmation delays. It therefore becomes vital to help a user understand (if possible) how long it may take for a transaction to be confirmed in the Bitcoin blockchain. In this work, we address the issue of predicting confirmation time within a block interval rather than pinpointing a specific timestamp. After dividing the future into a set of block intervals (i.e., classes), the prediction of a transaction’s confirmation is treated as a classification problem. To solve it, we propose a framework, Hybrid Confirmation Time Estimation Network (Hybrid-CTEN), based on neural networks and XGBoost to predict transaction confirmation time in the Bitcoin blockchain system using three different sources of information: historical transactions in the blockchain, unconfirmed transactions in the mempool, and the estimated transaction itself. Finally, experiments on real-world blockchain data demonstrate that, although XGBoost excels in the binary classification case (predicting whether a transaction will be confirmed in the next generated block), our proposed framework Hybrid-CTEN outperforms state-of-the-art methods on precision, recall and F1-score in all the multiclass classification cases (4-class, 6-class and 8-class), which predict in which future block interval a transaction will be confirmed.
{"title":"Enhancing bitcoin transaction confirmation prediction: a hybrid model combining neural networks and XGBoost","authors":"","doi":"10.1007/s11280-023-01212-9","DOIUrl":"https://doi.org/10.1007/s11280-023-01212-9","PeriodicalId":501180,"journal":{"name":"World Wide Web"},"publicationDate":"2023-12-26","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"139052998"}
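The Bitcoin abstract above turns confirmation-time prediction into classification by dividing the future into block intervals. The sketch below shows one way such class labels could be derived from the number of blocks a transaction waited. The interval boundaries are hypothetical, since the abstract does not state them, and `interval_class` is an illustrative helper rather than part of Hybrid-CTEN.

```python
from bisect import bisect_left

# Hypothetical upper bounds (in blocks) for a 4-class setup:
# class 0: next block, class 1: blocks 2-3, class 2: blocks 4-10, class 3: later.
HYPOTHETICAL_UPPER_BOUNDS = [1, 3, 10]


def interval_class(blocks_until_confirmation: int, upper_bounds: list[int]) -> int:
    """Map a block count (>= 1) to the index of the interval that contains it."""
    if blocks_until_confirmation < 1:
        raise ValueError("a confirmed transaction waits at least one block")
    return bisect_left(upper_bounds, blocks_until_confirmation)


observed_waits = [1, 2, 3, 7, 10, 25]
labels = [interval_class(n, HYPOTHETICAL_UPPER_BOUNDS) for n in observed_waits]
```

With a single boundary `[1]` the same helper yields the binary case mentioned in the abstract: class 0 for confirmation in the next block and class 1 otherwise.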
Knowledge representation models have been extensively studied, and they provide an important foundation for artificial intelligence. However, existing knowledge representation models and related knowledge embedding methods mostly target static or temporal knowledge and are not suitable for knowledge with strong spatio-temporal relevance, such as cyber security knowledge. In this paper, we propose a knowledge embedding framework called TSEE to handle this problem, which builds on the MDATA model to represent and utilize dynamic knowledge for cyber security. TSEE is composed of a knowledge extraction module, a knowledge representation module, a knowledge embedding module, and a situational awareness module. These modules can obtain, transform, and embed cyber security knowledge from different sources, improving the detection of various complicated attacks. We conduct experiments on a cyber range for evaluation, and the experimental results show higher prediction accuracy and stronger extensibility than existing embedding methods. The framework can effectively improve cyber security defense capabilities in the future.
{"title":"TSEE: a novel knowledge embedding framework for cyberspace security","authors":"Angxiao Zhao, Zhaoquan Gu, Yan Jia, Wenying Feng, Jianye Yang, Yanchun Zhang","doi":"10.1007/s11280-023-01220-9","DOIUrl":"https://doi.org/10.1007/s11280-023-01220-9","PeriodicalId":501180,"journal":{"name":"World Wide Web"},"publicationDate":"2023-12-20","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"138819912"}
Pub Date : 2023-12-20 DOI: 10.1007/s11280-023-01215-6
Huiwang Zhang, Pengpeng Zhao, Xuefeng Xian, Victor S. Sheng, Yongjing Hao, Zhiming Cui
Reinforcement learning (RL) has achieved strong performance in recommendation systems (RSs) by taking into account both immediate and future rewards from users. However, existing RL-based recommendation methods assume that only a single type of interaction behavior (e.g., clicking) exists between user and item, whereas practical recommendation scenarios involve multiple types of user interaction behaviors (e.g., adding to cart, purchasing). In this paper, we propose a Multi-Task Reinforcement Learning model for multi-behavior Recommendation (MTRL4Rec), which produces different actions for users’ different behaviors with a single agent. Specifically, we first introduce a modular network in which modules can be shared or isolated to capture the commonalities and differences across users’ behaviors. Then a task routing network is used to generate routes in the modular network for each behavior task. We adopt a hierarchical reinforcement learning architecture to improve the efficiency of MTRL4Rec. Finally, a training algorithm and a further improved training algorithm are proposed for training our model. Experiments on two public datasets validate the effectiveness of MTRL4Rec.
"Click is not equal to purchase: multi-task reinforcement learning for multi-behavior recommendation." Huiwang Zhang, Pengpeng Zhao, Xuefeng Xian, Victor S. Sheng, Yongjing Hao, Zhiming Cui. World Wide Web, 2023-12-20. https://doi.org/10.1007/s11280-023-01215-6
A Conversation Recommender System (CRS) engages in multi-turn conversations with users and provides recommendations through responses. As user preferences evolve dynamically during the course of the conversation, it is crucial to understand natural interaction utterances to capture the user’s dynamic preference accurately. Existing research has focused on obtaining user preference at the entity level and the natural language level, and on bridging the semantic gap through techniques such as knowledge augmentation, semantic fusion, and prompt learning. However, the representation at each level remains under-explored. At the entity level, user preference is typically extracted from Knowledge Graphs, while data from other modalities is often overlooked. At the natural language level, user representation is obtained from a fixed language model, disregarding the relationships between different contexts. In this paper, we propose User-augmented Conversation Recommendation via Multi-modal graph learning and Context Mining (UaMC) to address the above limitations. At the entity level, we enrich user preference by leveraging multi-modal knowledge. At the natural language level, we employ contrastive learning to extract user preference from similar contexts. By incorporating the enhanced representation of user preference, we utilize prompt learning techniques to generate responses related to recommended items. We conduct experiments on two public CRS benchmarks, demonstrating the effectiveness of our approach in both the recommendation and conversation subtasks.
"UaMC: user-augmented conversation recommendation via multi-modal graph learning and context mining." Siqi Fan, Yequan Wang, Xiaobing Pang, Lisi Chen, Peng Han, Shuo Shang. World Wide Web, 2023-12-19. https://doi.org/10.1007/s11280-023-01219-2