World Wide Web: Latest Articles

Discrete cross-modal hashing with relaxation and label semantic guidance
Pub Date : 2024-01-20 DOI: 10.1007/s11280-024-01239-6
Shaohua Teng, Wenbiao Huang, Naiqi Wu, Guanglong Du, Tongbao Chen, Wei Zhang, Luyao Teng

Supervised cross-modal hashing has attracted many researchers. Existing studies seek a common semantic space or directly regress the zero-one label information into the Hamming space. Although these methods have achieved good results, they neglect two issues: 1) some methods designed for classification are not suitable for retrieval, since they do not learn the personalized features of each sample; 2) the outcome of hash retrieval depends on both the length and the encoding method of the hash codes. Because a sample possesses more personalized features than label semantics, we propose a novel supervised cross-modal hashing collaborative learning method called discrete Cross-modal Hashing with Relaxation and Label Semantic Guidance (CHRLSG). First, we introduce two relaxation variables as latent spaces. One is used to extract text features and label semantic information collaboratively, and the other is used to extract image features and label semantics collaboratively. Second, more accurate hash codes are generated from the latent spaces, since CHRLSG learns feature semantics and label semantics collaboratively, with labels playing the dominant role and features the auxiliary one. Third, we use labels to strengthen the similarity relationship of inter-modal samples by preserving pairwise closeness; label semantics are fully exploited to avoid classification error. Fourth, we introduce class weights to further increase the discrimination between intra-modal samples that belong to different classes while keeping the similarity of samples unchanged. As a result, the CHRLSG model not only preserves the relationships between samples but also maintains the consistency of label semantics during collaborative optimization. Experimental results on three common benchmark datasets demonstrate that the proposed model is superior to existing advanced methods.
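
To make the role of hash codes in retrieval concrete, the following is a minimal sketch of Hamming-distance ranking over binary codes. It is not the CHRLSG method: the functions `hamming` and `rank_by_hamming`, the item names, and the 4-bit codes are all hypothetical, and each code is assumed to be stored as a Python integer.

```python
def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two binary codes differ."""
    return bin(a ^ b).count("1")


def rank_by_hamming(query: int, database: dict[str, int]) -> list[str]:
    """Return item names ordered by Hamming distance to the query code.

    Ties are broken by item name so the ranking is deterministic.
    """
    return sorted(database, key=lambda name: (hamming(query, database[name]), name))


# Hypothetical 4-bit codes for three images and one text query.
image_codes = {"img_a": 0b1011, "img_b": 0b0011, "img_c": 0b0100}
text_query_code = 0b1010
ranking = rank_by_hamming(text_query_code, image_codes)
```

With only 4 bits there are just five possible distance values, so many items tie; longer codes give the ranking more room to separate items, which is one reason retrieval quality depends on code length.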

Citations: 0
Rumor blocking with pertinence set in large graphs
Pub Date : 2024-01-20 DOI: 10.1007/s11280-024-01235-w
Fangsong Xiang, Jinghao Wang, Yanping Wu, Xiaoyang Wang, Chen Chen, Ying Zhang

Online social networks facilitate the spread of information, while rumors can also propagate widely and fast, which may mislead some users. Therefore, suppressing the spread of rumors has become a daunting task. One of the widely used approaches is to select users in the social network to spread the truth and compete against the rumor, so that users who receive the truth before receiving rumors will not trust or propagate the rumor. However, existing works only aim to speed up rumor blocking without considering the pertinence of users. For example, consider a social media platform operator aiming to enhance user online safety. Based on users' online behavior, the users who are at high risk should be alerted first. Motivated by this, we formally define the rumor blocking with pertinence set (RBP) problem, which aims to find a truth seed set that maximizes the number of nodes affected by truth and ensures that the number of influenced nodes within the pertinence set reaches at least a given threshold. To solve this problem, we design a hybrid greedy framework (HGF) algorithm with local and global phases. We prove that HGF can provide a $(1-1/e-\epsilon)$-approximate solution with high probability while reducing the cost of the sampling process. Extensive experiments on 8 real social networks demonstrate the efficiency and effectiveness of our proposed algorithms.
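
The two-phase idea can be illustrated with a heavily simplified sketch. This is not the authors' HGF algorithm and carries no approximation guarantee: it assumes each candidate seed reaches a fixed, known set of nodes (a deterministic stand-in for sampled influence estimates), and the function `two_phase_greedy` and all data are hypothetical. The local phase greedily covers the pertinence set until the threshold is met, and the global phase spends the remaining budget on overall coverage.

```python
def two_phase_greedy(cover, pertinence, budget, threshold):
    """Pick up to `budget` seeds from `cover` (seed -> set of nodes it reaches).

    Assumption: each seed's reach is a fixed set, standing in for the
    sampled influence estimates a real method would use.
    """
    seeds, covered = [], set()

    def best_candidate(gain):
        # Highest marginal gain wins; ties go to the alphabetically first seed.
        candidates = sorted(c for c in cover if c not in seeds)
        best = max(candidates, key=gain, default=None)
        return best if best is not None and gain(best) > 0 else None

    # Local phase: cover pertinence-set nodes until the threshold is reached.
    while len(seeds) < budget and len(covered & pertinence) < threshold:
        pick = best_candidate(lambda c: len((cover[c] - covered) & pertinence))
        if pick is None:
            break  # no remaining seed reaches new pertinence nodes
        seeds.append(pick)
        covered |= cover[pick]

    # Global phase: use the remaining budget to maximise overall coverage.
    while len(seeds) < budget:
        pick = best_candidate(lambda c: len(cover[c] - covered))
        if pick is None:
            break  # no remaining seed reaches any new node
        seeds.append(pick)
        covered |= cover[pick]

    return seeds, covered


# Hypothetical reach sets; nodes 5, 6 and 9 form the pertinence set.
reach = {"u1": {1, 2, 3, 4}, "u2": {5, 6}, "u3": {6, 7, 8}, "u4": {1, 9}}
chosen, reached = two_phase_greedy(reach, pertinence={5, 6, 9}, budget=3, threshold=2)
```

Here the local phase picks `u2` first because it alone satisfies the threshold, even though `u1` reaches more nodes overall; a purely global greedy would have started with `u1`.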

Citations: 0
Efficient approximation and privacy preservation algorithms for real time online evolving data streams
Pub Date : 2024-01-20 DOI: 10.1007/s11280-024-01244-9

Because it involves processing continuous, unstructured, large streams of data, mining real-time streaming data is a more challenging research issue than mining static data. The privacy issue persists when sensitive data is included in streaming data. In recent years, there has been significant progress in research on the anonymization of static data. For the anonymization of quasi-identifiers, two typical strategies are generalization and suppression. However, the high dynamicity and potentially infinite nature of streaming data make this a challenging task. To this end, we propose a novel Efficient Approximation and Privacy Preservation Algorithms (EAPPA) framework in this paper to achieve efficient pre-processing of live streaming data and its privacy preservation with minimum Information Loss (IL) and computational requirements. As the existing privacy preservation solutions for streaming data suffer from the challenges of redundant data, we first propose an efficient technique of data approximation with data pre-processing. We design the Flajolet-Martin (FM) algorithm for robust and efficient approximation of unique elements in the data stream with a data cleaning mechanism. We feed the periodically approximated and pre-processed streaming data to the anonymization algorithm. Using adaptive clustering, we propose innovative k-anonymization and l-diversity privacy principles for data streams. The proposed approach scans a stream to detect and reuse clusters that fulfill the k-anonymity and l-diversity criteria, reducing anonymization time and IL. The experimental results reveal the efficiency of the EAPPA framework compared to state-of-the-art methods.
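
As background on the Flajolet-Martin idea mentioned above, here is a minimal sketch of a textbook variant, not the EAPPA implementation. It assumes a simplified estimator: for each of several salted hash functions, track the largest number of trailing zero bits seen, estimate the distinct count as 2 to that power, and take the median across hash functions. The function names and the choice of SHA-256 with a 32-bit truncation are assumptions made for illustration.

```python
import hashlib
import statistics


def trailing_zeros(value: int, width: int = 32) -> int:
    """Count trailing zero bits; an all-zero value counts as `width` zeros."""
    if value == 0:
        return width
    return (value & -value).bit_length() - 1


def fm_estimate(stream, num_hashes: int = 32) -> float:
    """Estimate the number of distinct items in `stream`.

    Simplified Flajolet-Martin variant: one max-trailing-zeros counter per
    salted hash, combined by taking the median of the 2**R estimates.
    """
    max_zeros = [0] * num_hashes
    seen_any = False
    for item in stream:
        seen_any = True
        for i in range(num_hashes):
            # Salting with the index i simulates independent hash functions.
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            value = int.from_bytes(digest[:4], "big")
            max_zeros[i] = max(max_zeros[i], trailing_zeros(value))
    if not seen_any:
        return 0.0
    return float(statistics.median([2 ** r for r in max_zeros]))
```

Because the sketch only stores one small counter per hash function, repeated items never change its state, which is why this family of estimators suits streams with heavy redundancy. The estimate is coarse (always a power of two or the midpoint of two), so it should be read as an order-of-magnitude figure.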

Citations: 0
Knowledge-enhanced personalized hierarchical attention network for sequential recommendation
Pub Date : 2024-01-17 DOI: 10.1007/s11280-024-01236-9
Shuqi Ruan, Chao Yang, Dongsheng Li

Sequential recommendation aims to predict the next items that users will interact with according to the sequential dependencies within historical user interactions. Recently, self-attention based sequence modeling methods have become the mainstream method due to their competitive accuracy. Despite their effectiveness, these methods still have non-trivial limitations: (1) they mainly take the transition patterns between items into consideration but ignore the semantic associations between items, and (2) they mostly focus on dynamic short-term user preferences and fail to consider user static long-term preferences explicitly. To address these limitations, we propose a Knowledge Enhanced Personalized Hierarchical Attention Network (KPHAN), which can incorporate the semantic associations among items by learning from knowledge graphs and capture the fine-grained long- and short-term interests of users through a novel personalized hierarchical attention network. Specifically, we employ the entities and relationships in the knowledge graph to enrich semantic information for items while preserving the structural information of the knowledge graph. The self-attention mechanism then captures semantic associations among items to obtain short-term user preferences more accurately. Finally, a personalized hierarchical attention network is developed to generate the final user preference representations, which can fully capture user static long-term preferences while fusing dynamic short-term preferences. Experimental results on three real-world datasets demonstrate that our method can outperform prior works by 2.7% - 35.5% on HR metrics and 6.7% - 27.9% on NDCG metrics.
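
For readers unfamiliar with the HR and NDCG metrics cited above, the sketch below shows how they are commonly computed in next-item recommendation when each test case has exactly one ground-truth item. That single-target setup is an assumption here, as are the function names; the abstract does not state the exact evaluation protocol.

```python
import math


def hit_rate_at_k(ranked_items, target, k):
    """1.0 if the ground-truth item appears in the top-k list, else 0.0."""
    return 1.0 if target in ranked_items[:k] else 0.0


def ndcg_at_k(ranked_items, target, k):
    """NDCG@k with a single relevant item.

    The ideal DCG is 1 (target at the top), so NDCG reduces to
    1 / log2(rank + 2), where rank is the 0-based position of the target.
    """
    top_k = ranked_items[:k]
    if target not in top_k:
        return 0.0
    rank = top_k.index(target)
    return 1.0 / math.log2(rank + 2)


# Hypothetical ranked list produced by a recommender for one user.
recommended = ["item_a", "item_b", "item_c", "item_d"]
hr = hit_rate_at_k(recommended, "item_c", k=3)
ndcg = ndcg_at_k(recommended, "item_c", k=3)
```

HR only records whether the target made the cut, while NDCG also rewards placing it higher; reported scores are the averages of these per-user values.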

Citations: 0
A novel robust memetic algorithm for dynamic community structures detection in complex networks
Pub Date : 2024-01-17 DOI: 10.1007/s11280-024-01238-7
Somayeh Ranjkesh, Behrooz Masoumi, Seyyed Mohsen Hashemi

Networks in the real world are dynamic and evolving. The most critical process in networks is to determine the community structure, based on which we can detect hidden communities in a complex network. The design of strong network structures is of great importance: a system must maintain its function in the face of attacks and failures and have a strong community structure. In this paper, we propose a robust memetic algorithm, called RDMA_NET (Robust Dynamic Memetic Algorithm), to optimize the detection of dynamic communities in complex networks. The method works on dynamic data, which affects its two main parts, the initial population values and the calculation of the evaluation function of each population, and there is no need to determine the number of communities in advance. We used two sets of real-world networks and the LFR dataset. The results show that our proposed method, RDMA_Net, can find a better solution than modern approaches and provide near-optimal performance in the search for network topologies with a strong community structure.
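
Evolutionary community detection needs an evaluation function that scores a candidate partition. The abstract does not say which one RDMA_NET uses, so the sketch below assumes Newman modularity, a common choice, purely for illustration. The function `modularity` and the example graph are hypothetical; the graph is assumed to be undirected, unweighted, and to have at least one edge.

```python
def modularity(edges, communities):
    """Newman modularity Q of a partition of an undirected, unweighted graph.

    Q = sum over communities of (internal_edges / m) - (degree_sum / (2m))**2,
    where m is the total number of edges.
    """
    m = len(edges)
    community_of = {
        node: idx for idx, members in enumerate(communities) for node in members
    }
    internal = [0] * len(communities)
    degree_sum = [0] * len(communities)
    for u, v in edges:
        # Each edge adds one to the degree of both of its endpoints.
        degree_sum[community_of[u]] += 1
        degree_sum[community_of[v]] += 1
        if community_of[u] == community_of[v]:
            internal[community_of[u]] += 1
    return sum(
        inside / m - (deg / (2 * m)) ** 2
        for inside, deg in zip(internal, degree_sum)
    )


# Two triangles joined by a single bridge edge (2, 3).
graph_edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
q_split = modularity(graph_edges, [{0, 1, 2}, {3, 4, 5}])
q_merged = modularity(graph_edges, [{0, 1, 2, 3, 4, 5}])
```

Splitting the graph at the bridge scores 5/14, while lumping all nodes together scores 0, so a search guided by this function would prefer the two-triangle partition without being told the number of communities in advance.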

Citations: 0
Privacy-preserving data publishing: an information-driven distributed genetic algorithm
Pub Date : 2024-01-15 DOI: 10.1007/s11280-024-01241-y
Yong-Feng Ge, Hua Wang, Jinli Cao, Yanchun Zhang, Xiaohong Jiang

The privacy-preserving data publishing (PPDP) problem has gained substantial attention from research communities, industries, and governments due to the increasing requirements for data publishing and concerns about data privacy. However, achieving a balance between preserving privacy and maintaining data quality remains a challenging task in PPDP. This paper presents an information-driven distributed genetic algorithm (ID-DGA) that aims to achieve optimal anonymization through attribute generalization and record suppression. The proposed algorithm incorporates various components, including an information-driven crossover operator, an information-driven mutation operator, an information-driven improvement operator, and a two-dimensional selection operator. Furthermore, a distributed population model is utilized to improve population diversity while reducing the running time. Experimental results confirm the superiority of ID-DGA in terms of solution accuracy, convergence speed, and the effectiveness of all the proposed components.
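
Attribute generalization and record suppression can be shown with a small sketch. It does not reproduce ID-DGA, which searches over many possible generalization schemes; instead it applies one fixed, hypothetical scheme (ages to decades, ZIP codes to a three-digit prefix) and then suppresses records whose group is smaller than k. The function names and sample records are assumptions.

```python
from collections import Counter


def generalize(record):
    """Coarsen the quasi-identifiers of one (age, zip_code) record."""
    age, zip_code = record
    low = (age // 10) * 10
    return (f"{low}-{low + 9}", zip_code[:3] + "**")


def anonymize(records, k):
    """Generalize every record, then suppress groups with fewer than k members.

    Returns the published records and the number of suppressed records.
    """
    generalized = [generalize(r) for r in records]
    group_sizes = Counter(generalized)
    kept = [g for g in generalized if group_sizes[g] >= k]
    return kept, len(generalized) - len(kept)


# Hypothetical quasi-identifier data: (age, ZIP code).
people = [(23, "47677"), (27, "47602"), (29, "47678"), (35, "47905"), (52, "47909")]
published, suppressed = anonymize(people, k=2)
```

The trade-off the abstract describes is visible even here: a coarser scheme would suppress fewer records but blur the data more, and choosing among such schemes is the optimization problem a genetic algorithm can search.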

Citations: 0
Enhancing bitcoin transaction confirmation prediction: a hybrid model combining neural networks and XGBoost
Pub Date : 2023-12-26 DOI: 10.1007/s11280-023-01212-9

With Bitcoin being universally recognized as the most popular cryptocurrency, more Bitcoin transactions are expected to be submitted to the Bitcoin blockchain system. As a result, many transactions can encounter different confirmation delays. Given this, it becomes vital to help a user understand (if possible) how long it may take for a transaction to be confirmed in the Bitcoin blockchain. In this work, we address the issue of predicting confirmation time within a block interval rather than pinpointing a specific timestamp. After dividing the future into a set of block intervals (i.e., classes), the prediction of a transaction's confirmation is treated as a classification problem. To solve it, we propose a framework, Hybrid Confirmation Time Estimation Network (Hybrid-CTEN), based on neural networks and XGBoost to predict transaction confirmation time in the Bitcoin blockchain system using three different sources of information: historical transactions in the blockchain, unconfirmed transactions in the mempool, as well as the estimated transaction itself. Finally, experiments on real-world blockchain data demonstrate that, other than XGBoost excelling in the binary classification case (to predict whether a transaction will be confirmed in the next generated block), our proposed framework Hybrid-CTEN outperforms state-of-the-art methods on precision, recall and F1-score on all the multiclass classification cases (4-class, 6-class and 8-class) to predict in which future block interval a transaction will be confirmed.
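
Turning confirmation time into a classification target amounts to bucketing the number of blocks a transaction waits. The sketch below illustrates that labeling step only; the interval boundaries are hypothetical, since the abstract does not give the ones used for the 4-, 6- and 8-class settings, and `interval_class` is an illustrative name.

```python
from bisect import bisect_left


def interval_class(blocks_until_confirmation, upper_bounds):
    """Map a confirmation delay (in blocks, counted from 1) to a class index.

    `upper_bounds` is a sorted list of inclusive upper bounds, one per
    bounded interval; delays beyond the last bound fall into a final
    open-ended class, so there are len(upper_bounds) + 1 classes.
    """
    if blocks_until_confirmation < 1:
        raise ValueError("a transaction is confirmed in block 1 at the earliest")
    return bisect_left(upper_bounds, blocks_until_confirmation)


# Hypothetical 4-class split: next block, blocks 2-3, blocks 4-6, later.
four_class_bounds = [1, 3, 6]
labels = [interval_class(n, four_class_bounds) for n in (1, 2, 3, 4, 6, 7)]

# The binary case from the abstract: confirmed in the next block or not.
binary_labels = [interval_class(n, [1]) for n in (1, 2, 10)]
```

Once every historical transaction has such a label, any multiclass classifier can be trained on it, which is what allows the neural network and XGBoost components to be compared on precision, recall and F1-score.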

Citations: 0
TSEE: a novel knowledge embedding framework for cyberspace security
Pub Date : 2023-12-20 DOI: 10.1007/s11280-023-01220-9
Angxiao Zhao, Zhaoquan Gu, Yan Jia, Wenying Feng, Jianye Yang, Yanchun Zhang

Knowledge representation models have been extensively studied, and they provide an important foundation for artificial intelligence. However, the existing knowledge representation models and related knowledge embedding methods mostly target static or temporal knowledge, and are not suitable for knowledge with strong spatio-temporal relevance, such as cyber security knowledge. In this paper, we propose a knowledge embedding framework called TSEE to handle this problem, which builds on the MDATA model to represent and utilize dynamic knowledge for cyber security. TSEE is composed of a knowledge extraction module, a knowledge representation module, a knowledge embedding module, and a situational awareness module. These modules can obtain, transform, and embed cyber security knowledge from different sources, improving the detection of various complicated attacks. We conduct experiments on a cyber range for evaluation, and the experimental results show higher prediction accuracy and stronger extensibility than existing embedding methods. The framework can effectively improve cyber security defense capabilities in the future.

Citations: 0
Click is not equal to purchase: multi-task reinforcement learning for multi-behavior recommendation
Pub Date : 2023-12-20 DOI: 10.1007/s11280-023-01215-6
Huiwang Zhang, Pengpeng Zhao, Xuefeng Xian, Victor S. Sheng, Yongjing Hao, Zhiming Cui

Reinforcement learning (RL) has achieved strong performance in recommendation systems (RSs) by accounting for both immediate and future rewards from users. However, existing RL-based recommendation methods assume that only a single type of interaction behavior (e.g., clicking) exists between user and item, whereas practical recommendation scenarios involve multiple types of user interaction behaviors (e.g., adding to cart, purchasing). In this paper, we propose a Multi-Task Reinforcement Learning model for multi-behavior Recommendation (MTRL4Rec), in which a single agent produces different actions for users' different behaviors. Specifically, we first introduce a modular network in which modules can be shared or isolated to capture the commonalities and differences across users' behaviors. A task routing network then generates routes through the modular network for each behavior task. We adopt a hierarchical reinforcement learning architecture to improve the efficiency of MTRL4Rec. Finally, we propose a training algorithm, together with a further improved variant, for training the model. Experiments on two public datasets validate the effectiveness of MTRL4Rec.
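The idea of routing each behavior task through shared modules can be illustrated with a small sketch. This is not the MTRL4Rec implementation: the abstract does not specify how routes are computed, so the soft (softmax-weighted) routing below, the two toy modules, and the names `MODULES`, `ROUTING_LOGITS`, and `route` are all assumptions made for illustration.

```python
import math


def softmax(logits):
    """Turn raw routing logits into weights that sum to 1."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical shared modules: each maps a state vector to a feature vector.
def module_scale(state):
    return [2.0 * s for s in state]


def module_shift(state):
    return [s + 1.0 for s in state]


MODULES = [module_scale, module_shift]

# Hypothetical routing logits, one row per behavior task.
# In a learned model these would come from a task routing network.
ROUTING_LOGITS = {
    "click": [2.0, 0.0],
    "purchase": [0.0, 2.0],
}


def route(state, behavior):
    """Mix the module outputs using the routing weights of one behavior task."""
    weights = softmax(ROUTING_LOGITS[behavior])
    outputs = [module(state) for module in MODULES]
    return [
        sum(w * out[i] for w, out in zip(weights, outputs))
        for i in range(len(state))
    ]


click_features = route([1.0, 2.0], "click")
purchase_features = route([1.0, 2.0], "purchase")
```

Both tasks reuse the same modules, and only the routing weights differ, which is one simple way to share what behaviors have in common while still separating them.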

Citations: 0
UaMC: user-augmented conversation recommendation via multi-modal graph learning and context mining
Pub Date : 2023-12-19 DOI: 10.1007/s11280-023-01219-2
Siqi Fan, Yequan Wang, Xiaobing Pang, Lisi Chen, Peng Han, Shuo Shang

Conversation Recommender Systems (CRS) engage in multi-turn conversations with users and provide recommendations through their responses. Because user preferences evolve dynamically over the course of a conversation, it is crucial to understand natural interaction utterances in order to capture the user's dynamic preference accurately. Existing research has focused on obtaining user preference at the entity level and the natural language level, and on bridging the semantic gap through techniques such as knowledge augmentation, semantic fusion, and prompt learning. However, the representation at each level remains under-explored. At the entity level, user preference is typically extracted from knowledge graphs, while data from other modalities is often overlooked. At the natural language level, user representation is obtained from a fixed language model, disregarding the relationships between different contexts. In this paper, we propose User-augmented Conversation Recommendation via Multi-modal graph learning and Context Mining (UaMC) to address these limitations. At the entity level, we enrich user preference by leveraging multi-modal knowledge. At the natural language level, we employ contrastive learning to extract user preference from similar contexts. By incorporating the enhanced representation of user preference, we use prompt learning techniques to generate responses related to the recommended items. We conduct experiments on two public CRS benchmarks, demonstrating the effectiveness of our approach in both the recommendation and conversation subtasks.
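As a rough illustration of learning from similar contexts, the sketch below computes an InfoNCE-style contrastive loss over toy context embeddings. The abstract does not state which contrastive objective UaMC uses, so the choice of InfoNCE, the cosine similarity, the `temperature` value, and the function names are assumptions for illustration only.

```python
import math


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style loss: low when the anchor is closer to the positive
    (a similar context) than to the negatives (dissimilar contexts)."""
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))


# Toy 2-D context embeddings (hypothetical values).
anchor = [1.0, 0.0]
similar = [0.9, 0.1]
unrelated = [0.0, 1.0]
opposite = [-1.0, 0.0]

# A well-matched positive gives a small loss ...
loss_matched = info_nce(anchor, similar, [unrelated, opposite])
# ... while treating an unrelated context as the positive gives a large one.
loss_mismatched = info_nce(anchor, unrelated, [similar, opposite])
```

Minimizing a loss of this shape pulls the representations of similar contexts together and pushes dissimilar ones apart.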

Citations: 0