Robust Structure-Aware Graph-based Semi-Supervised Learning: Batch and Recursive Processing

ACM Transactions on Intelligent Systems and Technology | IF 7.2 | Zone 4, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-03-26 | DOI: 10.1145/3653986

Xu Chen

Citations: 0

Abstract

Graph-based semi-supervised learning plays an important role in large-scale image classification tasks. However, the problem becomes very challenging in the presence of noisy labels and outliers. Moreover, traditional robust semi-supervised learning solutions suffer from prohibitive computational burdens and thus cannot be applied to streaming data. Motivated by this, we present a novel unified framework for robust structure-aware semi-supervised learning, called Unified RSSL (URSSL), which supports both batch and recursive processing and is robust to both outliers and noisy labels. In particular, URSSL iteratively applies joint semi-supervised dimensionality reduction with robust estimators and network sparse regularization on the graph Laplacian matrix, to preserve the intrinsic graph structure and ensure robustness to the compound noise. First, to reduce the influence of outliers, a novel semi-supervised robust dimensionality reduction is applied, relying on robust estimators to suppress outliers. Meanwhile, to tackle noisy labels, the denoised graph similarity information is encoded into the network regularization. Moreover, by identifying the strong relevance of dimensionality reduction and network regularization in the context of robust semi-supervised learning (RSSL), a two-step alternating optimization is derived to compute optimal solutions with guaranteed convergence. We further adapt our framework to large-scale semi-supervised learning, making it particularly suitable for large-scale image classification, and demonstrate the model's robustness under different adversarial attacks. For recursive processing, we rely on reparameterization to transform the formulation and address the challenging problem of robust streaming-based semi-supervised learning. Last but not least, we extend our solution to the distributed setting to resolve the challenging issue of distributed robust semi-supervised learning when images are captured by multiple cameras at different locations. Extensive experimental results on multiple benchmark datasets demonstrate the promising performance of this framework relative to state-of-the-art approaches for important applications in image classification and spam data analysis.
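For readers unfamiliar with the setting, the sketch below illustrates plain graph-based semi-supervised learning on a toy dataset. It is not the URSSL method described in the abstract: it contains no robust estimators, no sparse network regularization, and no recursive or distributed processing. It is the classical closed-form label spreading baseline, F = (I - alpha * S)^{-1} Y, where S is the symmetrically normalized affinity matrix. The function names, the Gaussian kernel width sigma, the value of alpha, and the toy data are all assumptions chosen for illustration.

```python
import numpy as np


def rbf_affinity(X, sigma=1.0):
    """Dense Gaussian affinity matrix with a zero diagonal (illustrative choice)."""
    sq_dist = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-sq_dist / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    return W


def propagate_labels(X, y, n_classes, alpha=0.9, sigma=1.0):
    """Closed-form label spreading: F = (I - alpha * S)^{-1} Y.

    y holds a class id for each labeled point and -1 for each unlabeled point.
    This is a baseline sketch, not the URSSL algorithm from the abstract.
    """
    W = rbf_affinity(X, sigma)
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    # Symmetric normalization S = D^{-1/2} W D^{-1/2}.
    S = W * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    # One-hot label matrix; rows of unlabeled points stay all-zero.
    Y = np.zeros((len(y), n_classes))
    labeled = y >= 0
    Y[labeled, y[labeled]] = 1.0
    F = np.linalg.solve(np.eye(len(y)) - alpha * S, Y)
    return F.argmax(axis=1)


# Toy data (hypothetical): two well-separated clusters, one label per cluster.
X = np.array([[0.0, 0.0], [0.5, 0.0], [0.0, 0.5], [0.5, 0.5],
              [5.0, 5.0], [5.5, 5.0], [5.0, 5.5], [5.5, 5.5]])
y = np.array([0, -1, -1, -1, 1, -1, -1, -1])

pred = propagate_labels(X, y, n_classes=2)
print(pred)  # [0 0 0 0 1 1 1 1]
```

With clean labels and no outliers, the single label in each cluster spreads to its neighbors. The failure modes the abstract targets arise when a labeled point carries the wrong class or an outlier distorts the affinity matrix, neither of which this baseline guards against.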

Source journal

ACM Transactions on Intelligent Systems and Technology (COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE; COMPUTER SCIENCE, INFORMATION SYSTEMS)

CiteScore

9.30

Self-citation rate

2.00%

Articles published

131

Journal description: ACM Transactions on Intelligent Systems and Technology is a scholarly journal that publishes the highest quality papers on intelligent systems, applicable algorithms and technology with a multi-disciplinary perspective. An intelligent system is one that uses artificial intelligence (AI) techniques to offer important services (e.g., as a component of a larger system) to allow integrated systems to perceive, reason, learn, and act intelligently in the real world. ACM TIST is published bimonthly (six issues a year). Each issue has 8-11 regular papers, with around 20 published journal pages or 10,000 words per paper. Additional references, proofs, graphs or detailed experiment results can be submitted as a separate appendix, while excessively lengthy papers will be rejected automatically. Authors can include online-only appendices for additional content of their published papers and are encouraged to share their code and/or data with other readers.