Improving Gene Expression Prediction of Cancer Data Using Nature Inspired Optimization Algorithms
Payal Patel, K. Passi, Chakresh Kumar Jain
2021 International Conference on Computational Science and Computational Intelligence (CSCI), published 2021-12-01
DOI: 10.1109/CSCI54926.2021.00128 (https://doi.org/10.1109/CSCI54926.2021.00128)
Citations: 2
Abstract
Cancer is one of the most serious diseases in medical history, and its causes, symptoms, and detection demand close attention. Many algorithms and software tools have been designed to predict cancer at the cellular level. A central step in identifying cancerous tissue is classifying tissue samples based on gene expression data. Such data contain far more genes (features) than samples, so small sample sizes and high dimensionality are major challenges for researchers. In this work, four cancer microarray datasets are analyzed: breast cancer, lung cancer, leukemia, and colon cancer. The datasets were analyzed using three nature-inspired algorithms: Grasshopper Optimization Algorithm (GOA), Particle Swarm Optimization (PSO), and Interval Value-based Particle Swarm Optimization (IVPSO). To assess prediction accuracy, five classifiers were used: Random Forest, K-Nearest Neighbors (KNN), Neural Network, Naïve Bayes, and Support Vector Machine (SVM). With the SVM classifier, GOA achieves higher accuracy than the other two optimization algorithms on the leukemia, lung, and breast cancer datasets, selecting the genes/attributes that best classify each dataset.
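The abstract does not give implementation details, so the following is only a minimal sketch of the general idea of wrapper-based gene selection: an optimizer searches over gene subsets and a classifier scores each subset. It is not the authors' method. It assumes a generic binary PSO with a sigmoid transfer function rather than GOA or IVPSO, a small hand-written k-NN with leave-one-out accuracy in place of the SVM, and a synthetic dataset. All function names, parameter values, and the penalty weight are hypothetical choices made for illustration.

```python
import math
import random

def make_data(rng, n_samples=30, n_genes=20, n_informative=3):
    """Synthetic stand-in for microarray data: only the first genes carry signal."""
    X, y = [], []
    for i in range(n_samples):
        label = i % 2
        row = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
        for g in range(n_informative):
            row[g] += 3.0 * label  # shift informative genes for class 1
        X.append(row)
        y.append(label)
    return X, y

def loo_knn_accuracy(X, y, mask, k=3):
    """Leave-one-out accuracy of k-NN restricted to genes where mask[g] == 1."""
    genes = [g for g, bit in enumerate(mask) if bit]
    if not genes:
        return 0.0  # an empty gene subset cannot classify anything
    correct = 0
    for i in range(len(X)):
        dists = []
        for j in range(len(X)):
            if j != i:
                d = sum((X[i][g] - X[j][g]) ** 2 for g in genes)
                dists.append((d, y[j]))
        dists.sort()
        votes = sum(label for _, label in dists[:k])
        pred = 1 if votes * 2 > k else 0  # majority vote over binary labels
        correct += pred == y[i]
    return correct / len(X)

def fitness(X, y, mask, penalty=0.01):
    """Accuracy minus a small penalty for larger subsets (assumed weighting)."""
    return loo_knn_accuracy(X, y, mask) - penalty * sum(mask) / len(mask)

def binary_pso(X, y, rng, n_particles=8, n_iters=10, w=0.7, c1=1.5, c2=1.5):
    """Binary PSO: each particle is a 0/1 mask over genes."""
    n_genes = len(X[0])
    pos = [[rng.randint(0, 1) for _ in range(n_genes)] for _ in range(n_particles)]
    vel = [[0.0] * n_genes for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(X, y, p) for p in pos]
    best = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[best][:], pbest_fit[best]
    for _ in range(n_iters):
        for i in range(n_particles):
            for g in range(n_genes):
                # Standard velocity update pulled toward personal and global bests.
                v = (w * vel[i][g]
                     + c1 * rng.random() * (pbest[i][g] - pos[i][g])
                     + c2 * rng.random() * (gbest[g] - pos[i][g]))
                vel[i][g] = max(-6.0, min(6.0, v))  # clamp to keep exp() stable
                # Sigmoid maps velocity to the probability of selecting gene g.
                prob = 1.0 / (1.0 + math.exp(-vel[i][g]))
                pos[i][g] = 1 if rng.random() < prob else 0
            f = fitness(X, y, pos[i])
            if f > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], f
                if f > gbest_fit:
                    gbest, gbest_fit = pos[i][:], f
    return gbest, gbest_fit

rng = random.Random(0)
X, y = make_data(rng)
mask, score = binary_pso(X, y, rng)
print("selected genes:", [g for g, bit in enumerate(mask) if bit])
print("fitness:", round(score, 3))
```

In this sketch the optimizer and the classifier are decoupled through `fitness`, so a different search strategy or classifier could be substituted without changing the rest of the loop. That mirrors the comparison described in the abstract, where several optimizers are paired with several classifiers.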