Saeed Sarbazi-Azad, Mohammad Saniee Abadeh, Mohammad Erfan Mowlaei
Soft Computing Letters, volume 3, Article 100007. Published 2021-12-01. DOI: 10.1016/j.socl.2020.100007
https://www.sciencedirect.com/science/article/pii/S266622212030006X
Using data complexity measures and an evolutionary cultural algorithm for gene selection in microarray data
Cancer detection using gene expression data has been a major research trend over the last decade. Microarray gene expression data is among the most challenging types of data because it combines high dimensionality with a scarcity of available samples. Feature redundancy adds considerably to the difficulty of the prediction task. It is therefore essential to apply feature selection to reduce the number of features passed to the classifier. This paper proposes a novel two-stage framework to confront the curse of dimensionality in microarray data using data complexity measures and a customized cultural algorithm, which incorporates a static belief space into a genetic algorithm in order to reduce the search space and prioritize important genes. Experimental results indicate highly improved accuracy and a reduction in the number of selected genes compared with state-of-the-art methods on the Gli85, Colon, DLBCL, SMK and CNS datasets.
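The abstract gives no implementation details, so the following is only a minimal sketch of how a two-stage scheme of this kind could look, not the authors' method. It assumes that Fisher's discriminant ratio stands in for the data complexity measure, that the static belief space is a fixed probability vector over the top-ranked genes, and that fitness is the leave-one-out accuracy of a nearest-centroid classifier minus a size penalty. All function names and parameters (`fisher_ratio`, `build_belief_space`, `select_genes`, `keep`, `penalty`) are hypothetical.

```python
import random
from statistics import mean, pvariance


def fisher_ratio(values, labels):
    """Fisher's discriminant ratio of one gene for a two-class problem (assumed measure)."""
    a = [v for v, c in zip(values, labels) if c == 0]
    b = [v for v, c in zip(values, labels) if c == 1]
    denom = pvariance(a) + pvariance(b)
    return (mean(a) - mean(b)) ** 2 / denom if denom > 0 else 0.0


def build_belief_space(X, y, keep):
    """Stage 1: score every gene once, keep the best `keep`, normalise scores to probabilities."""
    n_genes = len(X[0])
    scores = [fisher_ratio([row[j] for row in X], y) for j in range(n_genes)]
    top = sorted(range(n_genes), key=lambda j: scores[j], reverse=True)[:keep]
    total = sum(scores[j] for j in top)
    if total > 0:
        belief = [scores[j] / total for j in top]
    else:  # no gene is informative: fall back to a uniform belief
        belief = [1.0 / len(top)] * len(top)
    return top, belief


def loo_accuracy(X, y, genes):
    """Leave-one-out accuracy of a nearest-centroid classifier restricted to `genes`."""
    if not genes:
        return 0.0
    correct = 0
    for i in range(len(X)):
        dist = {}
        for c in (0, 1):
            rows = [X[r] for r in range(len(X)) if r != i and y[r] == c]
            centroid = [mean(row[g] for row in rows) for g in genes]
            dist[c] = sum((X[i][g] - m) ** 2 for g, m in zip(genes, centroid))
        correct += min(dist, key=dist.get) == y[i]
    return correct / len(X)


def select_genes(X, y, keep=10, pop_size=20, generations=15, penalty=0.01, seed=0):
    """Stage 2: genetic search whose initialisation and mutation follow the fixed belief space."""
    rng = random.Random(seed)
    top, belief = build_belief_space(X, y, keep)  # computed once, never updated (static)

    def sample_gene():
        return rng.choices(top, weights=belief, k=1)[0]

    def fitness(ind):
        return loo_accuracy(X, y, sorted(ind)) - penalty * len(ind)

    # Initial population: small gene subsets drawn according to the belief space.
    pop = [frozenset(sample_gene() for _ in range(rng.randint(1, 3)))
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]  # truncation selection
        children = []
        while len(parents) + len(children) < pop_size:
            p, q = rng.sample(parents, 2)
            # Uniform crossover over the union of both parents' genes.
            child = {g for g in sorted(p | q) if rng.random() < 0.5}
            # Belief-guided mutation; also repairs an empty child.
            if rng.random() < 0.3 or not child:
                child.add(sample_gene())
            children.append(frozenset(child))
        pop = parents + children
    return sorted(max(pop, key=fitness))
```

Here `X` is a list of samples (each a list of expression values) and `y` a list of 0/1 class labels with at least two samples per class. A call such as `select_genes(X, y, keep=50)` returns the column indices of the selected genes. Restricting the search to the `keep` top-ranked genes is what shrinks the search space in this sketch, and the `penalty` term is one simple way to favour smaller subsets.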