
ACM Transactions on Intelligent Systems and Technology: Latest Articles

Robust Structure-Aware Graph-based Semi-Supervised Learning: Batch and Recursive Processing
IF 5 | Zone 4, Computer Science | Q1 Computer Science, Artificial Intelligence | Pub Date: 2024-03-26 | DOI: 10.1145/3653986
Xu Chen

Graph-based semi-supervised learning plays an important role in large-scale image classification tasks. However, the problem becomes very challenging in the presence of noisy labels and outliers. Moreover, traditional robust semi-supervised learning solutions suffer from prohibitive computational burdens and thus cannot be applied to streaming data. Motivated by this, we present a novel unified framework for robust structure-aware semi-supervised learning, called Unified RSSL (URSSL), which supports both batch and recursive processing and is robust to both outliers and noisy labels. In particular, URSSL iteratively applies joint semi-supervised dimensionality reduction with robust estimators and network sparse regularization on the graph Laplacian matrix, preserving the intrinsic graph structure and ensuring robustness to the compound noise. First, to relieve the influence of outliers, a novel semi-supervised robust dimensionality reduction is applied, relying on robust estimators to suppress outliers. Meanwhile, to tackle noisy labels, the denoised graph similarity information is encoded into the network regularization. Moreover, by identifying the strong relevance of dimensionality reduction and network regularization in the context of robust semi-supervised learning (RSSL), a two-step alternating optimization is derived to compute optimal solutions with guaranteed convergence. We further adapt our framework to large-scale semi-supervised learning, making it particularly suitable for large-scale image classification, and demonstrate the model's robustness under different adversarial attacks. For recursive processing, we rely on reparameterization to transform the formulation and unlock the challenging problem of robust streaming-based semi-supervised learning. Last but not least, we extend our solution to the distributed setting to resolve the challenging issue of distributed robust semi-supervised learning when images are captured by multiple cameras at different locations. Extensive experimental results demonstrate the promising performance of this framework relative to state-of-the-art approaches on multiple benchmark datasets, for important applications in image classification and spam data analysis.
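As a point of reference for the graph-based setting this abstract describes, the following is a minimal sketch of plain label propagation on a weighted graph. It is not URSSL: it has no robust estimator, no dimensionality reduction, and no sparse regularization, and the names `propagate_labels`, `weights`, and `labels` are illustrative assumptions. It only shows how graph structure can carry label information from labeled to unlabeled nodes.

```python
def propagate_labels(weights, labels, iterations=200):
    """Plain label propagation (an illustrative baseline, not URSSL).

    weights: symmetric adjacency matrix as a list of lists.
    labels:  dict mapping labeled node index -> score (e.g. +1.0 / -1.0).
    Unlabeled nodes repeatedly take the weighted average of their
    neighbours' scores; labeled nodes stay clamped to their given value.
    """
    n = len(weights)
    scores = [labels.get(i, 0.0) for i in range(n)]
    for _ in range(iterations):
        updated = scores[:]
        for i in range(n):
            if i in labels:
                continue  # clamp labeled nodes
            degree = sum(weights[i])
            if degree > 0:
                updated[i] = sum(w * s for w, s in zip(weights[i], scores)) / degree
        scores = updated
    return scores


# A 5-node path graph; only the two end nodes carry labels.
path = [
    [0, 1, 0, 0, 0],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
]
scores = propagate_labels(path, {0: 1.0, 4: -1.0})
```

On this toy path the interior scores settle at a linear interpolation between the two end labels, which is the harmonic solution associated with the graph Laplacian. A single mislabeled end node would shift every interior score, which is the kind of sensitivity that robust variants aim to reduce.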

Citations: 0
Federated Momentum Contrastive Clustering
IF 5 | Zone 4, Computer Science | Q1 Computer Science, Artificial Intelligence | Pub Date: 2024-03-26 | DOI: 10.1145/3653981
Runxuan Miao, Erdem Koyuncu

Self-supervised representation learning and deep clustering are mutually beneficial for learning high-quality representations and clustering data simultaneously in centralized settings. However, it is not always feasible to gather large amounts of data at a central entity, considering data privacy requirements and computational resources. Federated Learning (FL) has been developed successfully to aggregate a global model while training on distributed local data, respecting the data privacy of edge devices. However, most FL research effort focuses on supervised learning algorithms. A fully unsupervised federated clustering scheme has not been considered in the existing literature. We present federated momentum contrastive clustering (FedMCC), a generic federated clustering framework that can not only cluster data automatically but also extract discriminative representations by training on local data distributed over multiple users. In FedMCC, we demonstrate a two-stage federated learning paradigm in which the first stage aims to learn differentiable instance embeddings and the second stage clusters the data automatically. The experimental results show that FedMCC not only achieves superior clustering performance but also outperforms several existing federated self-supervised methods on linear evaluation and semi-supervised learning tasks. Additionally, FedMCC can easily be adapted to ordinary centralized clustering through what we call momentum contrastive clustering (MCC). We show that MCC achieves state-of-the-art clustering accuracy on certain datasets such as STL-10 and ImageNet-10. We also present a method to reduce the memory footprint of our clustering schemes.
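The two ingredients named in this abstract, momentum contrast and federated aggregation, are commonly realized as an exponential moving average of encoder parameters and a data-size-weighted average of client parameters. The sketch below shows both on flat parameter lists. Whether FedMCC uses exactly these update rules is an assumption here; the function names and the default momentum value are illustrative.

```python
def momentum_update(key_params, query_params, momentum=0.999):
    """Exponential moving average of parameters, as used in momentum contrast.

    The key encoder drifts slowly toward the query encoder:
    key <- momentum * key + (1 - momentum) * query.
    """
    return [momentum * k + (1.0 - momentum) * q
            for k, q in zip(key_params, query_params)]


def federated_average(client_params, client_sizes):
    """Data-size-weighted average of client parameter vectors (FedAvg-style).

    Assumed aggregation rule for illustration; only parameters, never raw
    data, are combined on the server side.
    """
    total = sum(client_sizes)
    dim = len(client_params[0])
    return [sum(p[d] * n for p, n in zip(client_params, client_sizes)) / total
            for d in range(dim)]


# Two clients with 1 and 3 local samples respectively.
global_params = federated_average([[0.0, 2.0], [1.0, 2.0]], [1, 3])
key_params = momentum_update([1.0, 1.0], [0.0, 1.0], momentum=0.9)
```

A momentum close to 1 keeps the key encoder nearly fixed between steps, which is what makes its outputs usable as stable comparison targets.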

Citations: 0
Explainable finite mixture of mixtures of bounded asymmetric generalized Gaussian and Uniform distributions learning for energy demand management
IF 5 | Zone 4, Computer Science | Q1 Computer Science, Artificial Intelligence | Pub Date: 2024-03-26 | DOI: 10.1145/3653980
Hussein Al-Bazzaz, Muhammad Azam, Manar Amayri, Nizar Bouguila

We introduce a mixture of mixtures of bounded asymmetric generalized Gaussian and uniform distributions. Based on this framework, we propose model-based classification and model-based clustering algorithms. We develop an objective function for the minimum message length (MML) model selection criterion to discover the optimal number of clusters for the unsupervised variant of our proposed model. Given the considerable attention that Explainable AI (XAI) has received in recent years, we introduce a method to interpret the predictions obtained from the proposed model in both learning settings by defining their boundaries in terms of the crucial features. Integrating explainability into our proposed algorithm increases the credibility of its predictions, since they can be explained from the user's perspective through simple If-Then statements derived from a small binary decision tree. In this paper, the proposed algorithm demonstrates its reliability and its superiority over several state-of-the-art machine learning algorithms in the following real-world applications: fault detection and diagnosis (FDD) in chillers, occupancy estimation, and categorization of residential energy consumers.
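To make the idea of explaining predictions through If-Then statements from a small binary decision tree concrete, here is a minimal sketch that flattens a hand-written tree into rules. The tree, its feature names (`supply_temp`, `power_kw`), thresholds, and labels are hypothetical and are not taken from the paper; the paper's own procedure for fitting the tree is not reproduced.

```python
def tree_to_rules(node, conditions=()):
    """Flatten a binary decision tree into a list of If-Then strings.

    A leaf is {"label": ...}; an internal node is
    {"feature": str, "threshold": number, "left": node, "right": node},
    where the left branch means feature <= threshold.
    """
    if "label" in node:
        premise = " and ".join(conditions) or "always"
        return [f"If {premise} then {node['label']}"]
    feature, threshold = node["feature"], node["threshold"]
    left = tree_to_rules(node["left"], conditions + (f"{feature} <= {threshold}",))
    right = tree_to_rules(node["right"], conditions + (f"{feature} > {threshold}",))
    return left + right


# Hypothetical two-level tree for a chiller fault example.
toy_tree = {
    "feature": "supply_temp",
    "threshold": 7.5,
    "left": {"label": "normal"},
    "right": {
        "feature": "power_kw",
        "threshold": 120,
        "left": {"label": "sensor fault"},
        "right": {"label": "overload"},
    },
}
rules = tree_to_rules(toy_tree)
```

Each root-to-leaf path becomes one rule, so a tree with few leaves yields a short list that a user can read in full.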

Citations: 0
Mitigating the Impact of Inaccurate Feedback in Dynamic Learning-to-Rank: A Study of Overlooked Interesting Items
IF 5 | Zone 4, Computer Science | Q1 Computer Science, Artificial Intelligence | Pub Date: 2024-03-26 | DOI: 10.1145/3653983
Chenhao Zhang, Weitong Chen, Wei Emma Zhang, Miao Xu

Dynamic Learning-to-Rank (DLTR) is a method of updating a ranking policy in real time based on user feedback, which may not always be accurate. Although previous DLTR work has achieved fair and unbiased ranking under inaccurate feedback, it faces a trade-off between fairness and user utility and also has limitations in the setting of feeding items. Existing DLTR methods improve ranking utility by eliminating bias from inaccurate feedback on observed items, but the impact of another pervasive form of inaccurate feedback, overlooked or ignored interesting items, remains unclear. For example, users may browse the rankings too quickly to catch interesting items, or miss interesting items because the snippets are not optimized enough. This phenomenon raises two questions: i) Will overlooked interesting items affect the ranking results? ii) Is it possible to improve utility without sacrificing fairness if these effects are eliminated? These questions are particularly relevant for small and medium-sized retailers who are just starting out and may have limited data, leading them to update their models with inaccurate feedback. In this paper, we find that inaccurate feedback in the form of overlooked interesting items has a negative impact on DLTR performance in terms of utility. To address this, we treat the overlooked interesting items as noise and propose a novel DLTR method, the Co-teaching Rank (CoTeR), that achieves good utility and fairness when inaccurate feedback is present in the form of overlooked interesting items. Our solution incorporates a co-teaching-based component with a customized loss function and data sampling strategy, as well as a mean pooling strategy to further accommodate newly added products without historical data. Through experiments, we demonstrate that CoTeR not only enhances utility but also preserves ranking fairness, and can smoothly handle newly introduced items.
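The co-teaching idea that CoTeR builds on can be sketched in a few lines: two models each pick the samples on which their own loss is small (likely clean) and hand those samples to the peer for its update. The sketch below shows only this generic exchange step. CoTeR's customized loss, sampling strategy, and mean pooling are not reproduced, and the function names and the `keep_ratio` parameter are illustrative assumptions.

```python
def small_loss_indices(losses, keep_ratio):
    """Indices of the samples with the smallest losses (kept as 'likely clean')."""
    keep = max(1, int(len(losses) * keep_ratio))
    ranked = sorted(range(len(losses)), key=lambda i: losses[i])
    return sorted(ranked[:keep])


def co_teaching_exchange(losses_a, losses_b, keep_ratio):
    """Generic co-teaching step: each model trains on its peer's selection.

    Returns (indices model A should update on, indices model B should update on).
    """
    batch_for_a = small_loss_indices(losses_b, keep_ratio)  # chosen by B
    batch_for_b = small_loss_indices(losses_a, keep_ratio)  # chosen by A
    return batch_for_a, batch_for_b


# Per-sample losses of two models on the same mini-batch of four samples.
losses_a = [0.1, 2.5, 0.3, 0.2]
losses_b = [0.2, 0.1, 3.0, 0.4]
batch_for_a, batch_for_b = co_teaching_exchange(losses_a, losses_b, keep_ratio=0.5)
```

Because the two models make different mistakes, a noisy sample that one model has started to fit can still be filtered out by the other, which is the motivation for exchanging selections instead of letting each model filter for itself.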

Citations: 0
Empowering Predictive Modeling by GAN-based Causal Information Learning
IF 5 | Zone 4, Computer Science | Q1 Computer Science, Artificial Intelligence | Pub Date: 2024-03-20 | DOI: 10.1145/3652610
Jinwei Zeng, Guozhen Zhang, Jian Yuan, Yong Li, Depeng Jin

Generally speaking, we can easily specify many causal relationships in the prediction tasks of ubiquitous computing, such as human activity prediction, mobility prediction, and health prediction. However, most existing methods in these fields fail to take advantage of this prior causal knowledge. They typically make predictions based only on correlations in the data, which hinders prediction performance in real-world scenarios because a distribution shift between training data and testing data generally exists. To fill this gap, we propose a GAN-based Causal Information Learning prediction framework (GCIL), which can effectively leverage causal information to improve the prediction performance of existing ubiquitous computing deep learning models. Specifically, faced with the unique challenge that the treatment variable, that is, the intervention that influences the target in a causal relationship, is generally continuous in ubiquitous computing, the framework employs a representation learning approach with a GAN-based deep learning model. By projecting all variables except the treatment into a latent space, it effectively minimizes confounding bias and leverages the learned latent representation for accurate predictions. In this way, it deals with the continuous treatment challenge, and at the same time it can be easily integrated with existing deep learning models to lift their prediction performance in practical scenarios with causal information. Extensive experiments on two large-scale real-world datasets demonstrate its superior performance over multiple state-of-the-art baselines. We also propose an analytical framework, together with extensive experiments, to show empirically that our framework achieves larger performance gains under two conditions: when the distribution differences between the training data and the testing data are more significant, and when the treatment effects are larger. Overall, this work suggests that learning causal information is a promising way to improve the prediction performance of ubiquitous computing tasks. We open both our dataset and code and call for more research attention in this area.

Citations: 0
A Meta-learning Framework for Tuning Parameters of Protection Mechanisms in Trustworthy Federated Learning
IF 5 | Zone 4, Computer Science | Q1 Computer Science, Artificial Intelligence | Pub Date: 2024-03-18 | DOI: 10.1145/3652612
Xiaojin Zhang, Yan Kang, Lixin Fan, Kai Chen, Qiang Yang

Trustworthy Federated Learning (TFL) typically leverages protection mechanisms to guarantee privacy. However, protection mechanisms inevitably introduce utility loss or efficiency reduction while protecting data privacy. Therefore, protection mechanisms and their parameters should be carefully chosen to strike an optimal trade-off between privacy leakage, utility loss, and efficiency reduction. To this end, federated learning practitioners need tools to measure the three factors and optimize the trade-off between them, so that they can choose the protection mechanism most appropriate to the application at hand. Motivated by this requirement, we propose a framework that (1) formulates TFL as a problem of finding a protection mechanism that optimizes the trade-off between privacy leakage, utility loss, and efficiency reduction and (2) formally defines bounded measurements of the three factors. We then propose a meta-learning algorithm to approximate this optimization problem and find optimal protection parameters for representative protection mechanisms, including Randomization, Homomorphic Encryption, Secret Sharing, and Compression. We further design estimation algorithms to quantify the optimal protection parameters found in a practical horizontal federated learning setting and provide a theoretical analysis of the estimation error.
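The trade-off described in this abstract can be pictured as choosing the protection parameter that minimizes a weighted combination of the three bounded factors. The sketch below does this by exhaustive search over a handful of candidates. It is not the paper's meta-learning algorithm: the weighted-sum objective, the function name, and every number in `toy_measurements` are hypothetical and serve only to show the shape of the selection problem.

```python
def select_protection_parameter(measurements, weights=(1.0, 1.0, 1.0)):
    """Pick the candidate parameter with the lowest weighted total cost.

    measurements: dict mapping a candidate parameter value to a tuple
        (privacy_leakage, utility_loss, efficiency_reduction), each in [0, 1].
    weights: relative importance of the three factors (an assumed
        scalarization, not taken from the paper).
    """
    def cost(item):
        _, factors = item
        return sum(w * f for w, f in zip(weights, factors))

    best_parameter, _ = min(measurements.items(), key=cost)
    return best_parameter


# Hypothetical measurements for three noise levels of a randomization mechanism.
toy_measurements = {
    0.0: (0.9, 0.0, 0.0),  # no noise: high leakage, no utility loss
    0.5: (0.4, 0.2, 0.1),
    1.0: (0.1, 0.7, 0.1),  # heavy noise: low leakage, high utility loss
}
balanced_choice = select_protection_parameter(toy_measurements)
privacy_first_choice = select_protection_parameter(toy_measurements, weights=(3.0, 1.0, 1.0))
```

Changing the weights moves the selected parameter, which illustrates why the three factors need to be measured on comparable, bounded scales before they are traded off against each other.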

Citations: 0
Perceiving Actions via Temporal Video Frame Pairs
IF 5 | Zone 4, Computer Science | Q1 Computer Science, Artificial Intelligence | Pub Date: 2024-03-17 | DOI: 10.1145/3652611
Rongchang Li, Tianyang Xu, Xiao-Jun Wu, Zhongwei Shen, Josef Kittler

Video action recognition aims to classify the action category in given videos. In general, semantically relevant video frame pairs reflect significant action patterns such as object appearance variation and abstract temporal concepts like speed and rhythm. However, existing action recognition approaches tend to extract spatiotemporal features holistically. Though effective, this carries a risk of neglecting crucial action features that occur across frames separated by a long temporal span. Motivated by this, we propose to perceive actions directly via frame pairs and devise a novel Nest Structure with frame pairs as basic units. Specifically, we decompose a video sequence into all possible frame pairs and hierarchically organize them according to temporal frequency and order, thus transforming the original video sequence into a Nest Structure. By naturally decomposing actions, the proposed structure can flexibly adapt to diverse action variations such as speed or rhythm changes. Next, we devise a Temporal Pair Analysis (TPA) module to extract discriminative action patterns based on the proposed Nest Structure. The TPA module consists of a pair calculation part, which computes the pair features, and a pair fusion part, which hierarchically fuses the pair features for recognizing actions. The proposed TPA can be flexibly integrated into existing backbones, serving as a side branch to capture various action patterns from multi-level features. Extensive experiments show that the proposed TPA module achieves consistent improvements over several typical backbones, matching or surpassing CNN-based state-of-the-art results on several challenging action recognition benchmarks.
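The decomposition step in this abstract, splitting a sequence into all possible frame pairs and organizing them hierarchically, can be sketched as follows. Grouping pairs by their temporal gap is an assumption made here to illustrate one plausible reading of "temporal frequency and order"; the paper's actual Nest Structure may be organized differently, and `group_frame_pairs` is a hypothetical name.

```python
from itertools import combinations


def group_frame_pairs(num_frames):
    """Enumerate all ordered frame pairs (i, j) with i < j, grouped by gap j - i.

    Small gaps capture fast, local changes; large gaps capture long-range
    changes. Within each group, pairs appear in temporal order.
    """
    groups = {}
    for first, second in combinations(range(num_frames), 2):
        groups.setdefault(second - first, []).append((first, second))
    return groups


pairs_by_gap = group_frame_pairs(4)
# pairs_by_gap[1] holds adjacent pairs, pairs_by_gap[3] the single longest-range pair.
```

A clip with T frames yields T(T-1)/2 pairs, so the number of pairs grows quadratically with clip length; in practice this is presumably one reason to work on a sparsely sampled set of frames.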

Citations: 0
Ensuring Fairness and Gradient Privacy in Personalized Heterogeneous Federated Learning
IF 5 Zone 4 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2024-03-13 DOI: 10.1145/3652613
Cody Lewis, Vijay Varadharajan, Nasimul Noman, Uday Tupakula

With the increasing tension between conflicting requirements of the availability of large amounts of data for effective machine learning based analysis, and for ensuring their privacy, the paradigm of federated learning has emerged, a distributed machine learning setting where the clients provide only the machine learning model updates to the server rather than the actual data for decision making. However, the distributed nature of federated learning raises specific challenges related to fairness in a heterogeneous setting. This motivates the focus of our paper, on the heterogeneity of client devices having different computational capabilities and their impact on fairness in federated learning. Furthermore, our aim is to achieve fairness in heterogeneity while ensuring privacy. As far as we are aware there are no existing works that address all these three aspects of fairness, device heterogeneity and privacy simultaneously in federated learning. In this paper, we propose a novel federated learning algorithm with personalization in the context of heterogeneous devices while maintaining compatibility with the gradient privacy preservation techniques of secure aggregation. We analyze the proposed federated learning algorithm under different environments with different datasets, and show that it achieves performance close to or greater than the state-of-the-art in heterogeneous device personalized federated learning. We also provide theoretical proofs for the fairness and convergence properties of our proposed algorithm.
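For readers unfamiliar with secure aggregation, the following is a generic sketch of the pairwise-masking idea it commonly relies on. It is not the algorithm proposed in the paper. The integer updates, the modulus, the shared seeded generator standing in for pairwise key agreement, and the function names are all illustrative assumptions.

```python
import random

MODULUS = 2 ** 16  # assumed working modulus for the toy example


def masked_updates(updates, seed=0):
    """Add cancelling pairwise masks to each client's integer update vector."""
    rng = random.Random(seed)  # stand-in for per-pair shared secrets
    dim = len(updates[0])
    masked = [list(u) for u in updates]
    for i in range(len(updates)):
        for j in range(i + 1, len(updates)):
            mask = [rng.randrange(MODULUS) for _ in range(dim)]
            for k in range(dim):
                # Client i adds the mask, client j subtracts the same mask.
                masked[i][k] = (masked[i][k] + mask[k]) % MODULUS
                masked[j][k] = (masked[j][k] - mask[k]) % MODULUS
    return masked


def aggregate(masked):
    """Server-side sum; the pairwise masks cancel modulo MODULUS."""
    dim = len(masked[0])
    return [sum(m[k] for m in masked) % MODULUS for k in range(dim)]


clients = [[1, 2], [3, 4], [5, 6]]
print(aggregate(masked_updates(clients)))
```

The server only ever sees the masked vectors, yet their sum equals the sum of the true updates, which is why an algorithm must be expressible as a plain sum of client contributions to remain compatible with this kind of protocol.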

Citations: 0
Multimodal Dialogue Systems via Capturing Context-aware Dependencies and Ordinal Information of Semantic Elements
IF 5 Zone 4 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2024-03-12 DOI: 10.1145/3645099
Weidong He, Zhi Li, Hao Wang, Tong Xu, Zhefeng Wang, Baoxing Huai, Nicholas Jing Yuan, Enhong Chen

The topic of multimodal conversation systems has recently garnered significant attention across various industries, including travel, retail, and others. While pioneering works in this field have shown promising performance, they often focus solely on context information at the utterance level, overlooking the context-aware dependencies of multimodal semantic elements like words and images. Furthermore, the ordinal information of images, which indicates the relevance between visual context and users’ demands, remains underutilized during the integration of visual content. Additionally, the exploration of how to effectively utilize corresponding attributes provided by users when searching for desired products is still largely unexplored. To address these challenges, we propose a Position-aware Multimodal diAlogue system with semanTic Elements, abbreviated as PMATE. Specifically, to obtain semantic representations at the element-level, we first unfold the multimodal historical utterances and devise a position-aware multimodal element-level encoder. This component considers all images that may be relevant to the current turn and introduces a novel position-aware image selector to choose related images before fusing the information from the two modalities. Finally, we present a knowledge-aware two-stage decoder and an attribute-enhanced image searcher for the tasks of generating textual responses and selecting image responses, respectively. We extensively evaluate our model on two large-scale multimodal dialog datasets, and the results of our experiments demonstrate that our approach outperforms several baseline methods.

Citations: 0
Self-supervised Bipartite Graph Representation Learning: A Dirichlet Max-margin Matrix Factorization Approach
IF 5 Zone 4 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2024-03-08 DOI: 10.1145/3645098
Shenghai Zhong, Shu Guo, Jing Liu, Hongren Huang, Lihong Wang, Jianxin Li, Chen Li, Yiming Hei

Bipartite graph representation learning aims to obtain node embeddings by compressing sparse vectorized representations of interactions between two types of nodes, e.g., users and items. Incorporating structural attributes among homogeneous nodes, such as user communities, improves the identification of similar interaction preferences, namely, user/item embeddings, for downstream tasks. However, existing methods often fail to proactively discover and fully utilize these latent structural attributes. Moreover, the manual collection and labeling of structural attributes is always costly. In this paper, we propose a novel approach called Dirichlet Max-margin Matrix Factorization (DM3F), which adopts a self-supervised strategy to discover latent structural attributes and model discriminative node representations. Specifically, in self-supervised learning, our approach generates pseudo group labels (i.e., structural attributes) as a supervised signal using the Dirichlet process without relying on manual collection and labeling, and employs them in a max-margin classification. Additionally, we introduce a Variational Markov Chain Monte Carlo algorithm (Variational MCMC) to effectively update the parameters. The experimental results on six real datasets demonstrate that, in the majority of cases, the proposed method outperforms existing approaches based on matrix factorization and neural networks. Furthermore, the modularity analysis confirms the effectiveness of our model in capturing structural attributes to produce high-quality user embeddings.
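One standard way to draw group assignments from a Dirichlet process prior is the Chinese restaurant process, sketched below. This illustrates the prior only and is not DM3F itself: the paper's inference (Variational MCMC) and max-margin classifier are not reproduced, and the function name, the concentration parameter `alpha`, and the seed are assumptions.

```python
import random


def crp_labels(n, alpha=1.0, seed=0):
    """Sample pseudo group labels for n nodes from a Chinese restaurant process.

    Node i joins existing group k with probability counts[k] / (i + alpha)
    and opens a new group with probability alpha / (i + alpha).
    """
    rng = random.Random(seed)
    counts = []  # counts[k] = number of nodes currently in group k
    labels = []
    for i in range(n):
        r = rng.uniform(0, i + alpha)
        acc = 0.0
        for k, c in enumerate(counts):
            acc += c
            if r < acc:
                counts[k] += 1
                labels.append(k)
                break
        else:
            # r fell in the alpha-sized tail: open a new group.
            counts.append(1)
            labels.append(len(counts) - 1)
    return labels


print(crp_labels(10, alpha=1.0))
```

The number of groups is not fixed in advance; larger `alpha` values tend to produce more groups, which is what lets such a prior discover structural attributes without manual labeling.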

Citations: 0