{"title":"An adaptive feature reduction algorithm for cancer classification using wavelet decomposition of serum proteomic and DNA microarray data","authors":"S. Rashid, G. M. Maruf","doi":"10.1109/BIBMW.2011.6112391","DOIUrl":null,"url":null,"abstract":"A significant challenge in DNA microarray and mass spectrometric data analysis can be attributed to the problem of having a large number of features with a small number of samples or patients in the data set. Particular care is required to deal with such a problem as the low classification accuracy of a model brought about by the small number of features may depict a low predictive capability. To overcome the associated challenges, proper approaches for data preprocessing, feature reduction and identifying the optimal set of features are critical. In this paper, a novel technique has been proposed for feature reduction and cancer classification; which is applicable for two different types of biological data. The proposed method has been implemented on Surface enhanced laser desorption/ionization time-of-flight mass spectrometric (SELDI-TOF-MS) and DNA microarray data sets. This technique is self adaptive and independent of the type data sets. We have developed a two step strategy for feature reduction such as (1) data preprocessing which includes merging and t-testing and (2) wavelet decomposition. For classification purpose, support vector machine (SVM) has been proposed. 
By evaluating the performance of the proposed algorithm on the two types of datasets it has been shown that the classification accuracy, sensitivity and specificity obtained by the features selected by the proposed method consistently give excellent performance.","PeriodicalId":6345,"journal":{"name":"2011 IEEE International Conference on Bioinformatics and Biomedicine Workshops (BIBMW)","volume":"40 1","pages":"305-312"},"PeriodicalIF":0.0000,"publicationDate":"2011-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"5","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2011 IEEE International Conference on Bioinformatics and Biomedicine Workshops (BIBMW)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/BIBMW.2011.6112391","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citation count: 5
Abstract
A significant challenge in DNA microarray and mass spectrometric data analysis is that a data set typically has a large number of features but only a small number of samples or patients. Such data require particular care, because the low classification accuracy of a model brought about by the small number of features may indicate a low predictive capability. To overcome these challenges, proper approaches to data preprocessing, feature reduction and identification of the optimal set of features are critical. This paper proposes a novel technique for feature reduction and cancer classification that is applicable to two different types of biological data. The proposed method has been implemented on surface-enhanced laser desorption/ionization time-of-flight mass spectrometric (SELDI-TOF-MS) and DNA microarray data sets. The technique is self-adaptive and independent of the type of data set. Feature reduction follows a two-step strategy: (1) data preprocessing, which includes merging and t-testing, and (2) wavelet decomposition. A support vector machine (SVM) is used for classification. Evaluation of the proposed algorithm on the two types of data sets shows that the features selected by the proposed method consistently yield high classification accuracy, sensitivity and specificity.
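The two-step feature reduction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the t-test variant, the wavelet family, the decomposition level or the merging rule, so the sketch assumes Welch's t-statistic, an orthonormal Haar wavelet and a fixed number `k` of retained features, and it omits the merging step and the SVM classifier. All function names (`welch_t`, `select_by_t`, `haar_approx`, `reduce_features`) are hypothetical.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t-statistic between two groups (assumed test variant)."""
    va, vb = variance(a), variance(b)
    denom = math.sqrt(va / len(a) + vb / len(b))
    return 0.0 if denom == 0 else (mean(a) - mean(b)) / denom


def select_by_t(samples, labels, k):
    """Return indices, in original order, of the k features with largest |t|.

    samples: list of equal-length feature rows; labels: 1 (case) or 0 (control).
    Each class needs at least two samples so a variance can be computed.
    """
    n_features = len(samples[0])
    scores = []
    for j in range(n_features):
        pos = [row[j] for row, y in zip(samples, labels) if y == 1]
        neg = [row[j] for row, y in zip(samples, labels) if y == 0]
        scores.append(abs(welch_t(pos, neg)))
    top = sorted(range(n_features), key=lambda j: scores[j], reverse=True)[:k]
    return sorted(top)


def haar_approx(signal, levels=1):
    """Approximation coefficients of an orthonormal Haar decomposition."""
    out = list(signal)
    for _ in range(levels):
        if len(out) < 2:
            break
        if len(out) % 2:
            out.append(out[-1])  # pad odd lengths by repeating the last value
        out = [(out[i] + out[i + 1]) / math.sqrt(2)
               for i in range(0, len(out), 2)]
    return out


def reduce_features(samples, labels, k, levels=1):
    """Step 1: keep the k most discriminative features; step 2: Haar approximation."""
    keep = select_by_t(samples, labels, k)
    return [haar_approx([row[j] for j in keep], levels) for row in samples]
```

Each Haar level roughly halves the number of retained features, so `reduce_features(samples, labels, k=64, levels=3)` would hand about 8 coefficients per sample to a downstream classifier. In practice the values of `k` and `levels` would need to be tuned per data set, for example by cross-validation.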