Hybrid Grasshopper and Chameleon Swarm Optimization Algorithm for Text Feature Selection with Density Peaks Clustering
Authors: R. Purushothaman, S. Selvakumar, S. Rajagopalan
Journal: Int. J. Comput. Intell. Appl. (Journal Article), published 2022-09-26
DOI: 10.1142/s1469026822500183 (https://doi.org/10.1142/s1469026822500183)
Citation count: 0
Abstract
Clustering is used in machine learning, image segmentation, data mining and pattern recognition, and the choice of clustering method matters for feature selection. This paper therefore proposes Text Feature Selection (FS) and Clustering using Grasshopper–Chameleon Swarm Optimization with a Density Peaks Clustering algorithm (TFSC-G-CSOA-DPCA). The input text is first pre-processed into numerical form. The pre-processed text features are passed to the Grasshopper–Chameleon Swarm Optimization Algorithm, which selects the important ones: the Grasshopper Optimization Algorithm selects local features from the text documents, and the Chameleon Swarm Optimization Algorithm selects the best global features from among those local features. The selected features are then tested with the density peaks clustering algorithm, with the aim of maximizing reliability and minimizing computational time. Performance is analyzed on the 20 Newsgroups dataset using accuracy, precision, sensitivity, specificity, execution time and memory usage. In simulation, the proposed TFSC-G-CSOA-DPCA method provides better accuracy of 97.36%, 95.14%, 94.67% and 91.91% and maximum sensitivity of 96.25%, 87.25%, 93.96% and 92.59% compared with the existing methods TFSC-BBA-MCL, TFSC-MVO-K-Means C, TFSC-GWO-GOA-FCM and TFSC-WM-K-Means C, respectively.
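The abstract names density peaks clustering as the final stage but gives no details of it. As a rough illustration only, the sketch below follows the generic density peaks procedure: compute a local density for each point, compute each point's distance to its nearest denser point, take the points where both are large as cluster centres, and let every other point inherit the label of its nearest denser neighbour. It is not the authors' implementation. The function name `density_peaks`, the Gaussian density kernel, the cutoff `d_c` and the fixed `n_clusters` are all assumptions made for this example, and the input is assumed to be already-selected numeric feature vectors.

```python
import math


def density_peaks(points, d_c, n_clusters):
    """Label `points` (equal-length numeric tuples) with n_clusters cluster ids.

    Hypothetical helper for illustration; d_c is an assumed density cutoff.
    """
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]

    # Local density rho_i: Gaussian-weighted count of neighbours around point i.
    rho = [
        sum(math.exp(-((dist[i][j] / d_c) ** 2)) for j in range(n) if j != i)
        for i in range(n)
    ]

    # Visit points from densest to sparsest.
    order = sorted(range(n), key=lambda i: -rho[i])

    # delta_i: distance to the nearest point that ranks denser than i.
    # parent_i: that nearest denser point, used later to propagate labels.
    delta = [0.0] * n
    parent = [-1] * n
    delta[order[0]] = max(dist[order[0]])  # convention for the densest point
    for rank in range(1, n):
        i = order[rank]
        j = min(order[:rank], key=lambda k: dist[i][k])
        delta[i] = dist[i][j]
        parent[i] = j

    # Centres: the densest point plus the points with the largest rho * delta.
    rest = sorted(order[1:], key=lambda i: -rho[i] * delta[i])
    centres = [order[0]] + rest[: n_clusters - 1]

    labels = [-1] * n
    for cluster_id, i in enumerate(centres):
        labels[i] = cluster_id
    # Each parent is denser than its child, so it is labelled before the child.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels, centres


if __name__ == "__main__":
    # Two small, well-separated groups of 2-D points (toy data, not text features).
    toy = [(0, 0), (0.1, 0), (0, 0.1), (0.1, 0.1), (0.05, 0.05),
           (5, 5), (5.1, 5), (5, 5.1), (5.1, 5.1), (5.05, 5.05)]
    labels, centres = density_peaks(toy, d_c=0.2, n_clusters=2)
    print(labels, centres)
```

In this sketch the number of clusters is passed in explicitly. Density peaks clustering is often used with centres chosen by inspecting a plot of density against distance instead, and the abstract does not say which approach the paper takes.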