Jinghao Situ. Advances in Engineering Technology Research. Published 2024-01-25. DOI: 10.56028/aetr.9.1.522.2024 (https://doi.org/10.56028/aetr.9.1.522.2024)
Contrastive Learning Dimensionality Reduction Method Based on Manifold Learning
High-dimensional datasets are increasingly common, and dimensionality reduction has become a primary way to lower the time and space complexity of downstream tasks. Classical dimensionality reduction algorithms are broadly divided into linear and nonlinear methods. Some traditional methods either ignore the nonlinear structure of the original dataset or generalise poorly, so the reduction quality suffers or the model must be recomputed whenever new samples are added. To address these problems, this paper studies a contrastive learning dimensionality reduction method based on manifold learning. Because manifold learning works with geodesic distance, it can account for the nonlinear structure of the original dataset. Contrastive learning serves as the main framework. Once the neural network has been trained, new data only needs to be passed through it as input to obtain the result; the model does not have to be rebuilt, so the method generalises well to new samples. Starting from related work, the paper briefly introduces manifold learning, contrastive learning and neural network algorithms. It then proposes a model with three modules: Isomap to extract nonlinear structure, neighbourhood expansion to generate pseudo-labels, and contrastive learning training. Experiments compare the model with the PCA and LLE algorithms, using the geodesic distance retention rate as the indicator, and the results show that the model's dimensionality reduction is more effective and more general.
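To illustrate why geodesic distance reflects nonlinear structure where straight-line distance does not, the following is a minimal sketch of the idea behind Isomap-style geodesic estimation: build a k-nearest-neighbour graph and take shortest-path lengths along it. It is not the paper's implementation. The function names `knn_graph` and `geodesic_distances`, the toy half-circle data, and the choice `k=2` are all assumptions made for this example.

```python
import heapq
import math


def knn_graph(points, k):
    """Connect each point to its k nearest neighbours (Euclidean), undirected."""
    n = len(points)
    graph = {i: {} for i in range(n)}
    for i in range(n):
        dists = sorted(
            (math.dist(points[i], points[j]), j) for j in range(n) if j != i
        )
        for d, j in dists[:k]:
            graph[i][j] = d
            graph[j][i] = d  # store the edge in both directions
    return graph


def geodesic_distances(graph, source):
    """Dijkstra shortest-path lengths from `source` along the graph edges."""
    dist = {node: math.inf for node in graph}
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry, a shorter path was already found
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist[v]:
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


# Hypothetical toy data: 21 points on a half circle of radius 1.
points = [
    (math.cos(math.pi * t / 20), math.sin(math.pi * t / 20)) for t in range(21)
]
graph = knn_graph(points, k=2)
geo = geodesic_distances(graph, source=0)

euclidean_end_to_end = math.dist(points[0], points[-1])  # straight chord
geodesic_end_to_end = geo[20]  # path that follows the curve

print(f"Euclidean: {euclidean_end_to_end:.3f}")
print(f"Geodesic:  {geodesic_end_to_end:.3f}")
```

On this toy curve the straight-line distance between the two end points is the diameter, 2, while the graph path follows the arc and is close to its true length, π. A method that preserves only Euclidean distances would treat the two ends as nearer than they are along the manifold.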