一种基于集成分类器的仿生混合多过滤器基因选择方法。

IF 4.5 3区 计算机科学 Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Neural Computing & Applications Pub Date : 2023-01-01 DOI:10.1007/s00521-021-06459-9
Babak Nouri-Moghaddam, Mehdi Ghazanfari, Mohammad Fathian
{"title":"一种基于集成分类器的仿生混合多过滤器基因选择方法。","authors":"Babak Nouri-Moghaddam,&nbsp;Mehdi Ghazanfari,&nbsp;Mohammad Fathian","doi":"10.1007/s00521-021-06459-9","DOIUrl":null,"url":null,"abstract":"<p><p>Microarray technology is known as one of the most important tools for collecting DNA expression data. This technology allows researchers to investigate and examine types of diseases and their origins. However, microarray data are often associated with a small sample size, a significant number of genes, imbalanced data, etc., making classification models inefficient. Thus, a new hybrid solution based on a multi-filter and adaptive chaotic multi-objective forest optimization algorithm (AC-MOFOA) is presented to solve the gene selection problem and construct the Ensemble Classifier. In the proposed solution, a multi-filter model (i.e., ensemble filter) is proposed as preprocessing step to reduce the dataset's dimensions, using a combination of five filter methods to remove redundant and irrelevant genes. Accordingly, the results of the five filter methods are combined using a voting-based function. Additionally, the results of the proposed multi-filter indicate that it has good capability in reducing the gene subset size and selecting relevant genes. Then, an AC-MOFOA based on the concepts of non-dominated sorting, crowding distance, chaos theory, and adaptive operators is presented. AC-MOFOA as a wrapper method aimed at reducing dataset dimensions, optimizing KELM, and increasing the accuracy of the classification, simultaneously. Next, in this method, an ensemble classifier model is presented using AC-MOFOA results to classify microarray data. The performance of the proposed algorithm was evaluated on nine public microarray datasets, and its results were compared in terms of the number of selected genes, classification efficiency, execution time, time complexity, hypervolume indicator, and spacing metric with five hybrid multi-objective methods, and three hybrid single-objective methods. According to the results, the proposed hybrid method could increase the accuracy of the KELM in most datasets by reducing the dataset's dimensions and achieve similar or superior performance compared to other multi-objective methods. Furthermore, the proposed Ensemble Classifier model could provide better classification accuracy and generalizability in the seven of nine microarray datasets compared to conventional ensemble methods. Moreover, the comparison results of the Ensemble Classifier model with three state-of-the-art ensemble generation methods indicate its competitive performance in which the proposed ensemble model achieved better results in the five of nine datasets.</p><p><strong>Supplementary information: </strong>The online version contains supplementary material available at 10.1007/s00521-021-06459-9.</p>","PeriodicalId":49766,"journal":{"name":"Neural Computing & Applications","volume":null,"pages":null},"PeriodicalIF":4.5000,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8435304/pdf/","citationCount":"9","resultStr":"{\"title\":\"A novel bio-inspired hybrid multi-filter wrapper gene selection method with ensemble classifier for microarray data.\",\"authors\":\"Babak Nouri-Moghaddam,&nbsp;Mehdi Ghazanfari,&nbsp;Mohammad Fathian\",\"doi\":\"10.1007/s00521-021-06459-9\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><p>Microarray technology is known as one of the most important tools for collecting DNA expression data. This technology allows researchers to investigate and examine types of diseases and their origins. However, microarray data are often associated with a small sample size, a significant number of genes, imbalanced data, etc., making classification models inefficient. Thus, a new hybrid solution based on a multi-filter and adaptive chaotic multi-objective forest optimization algorithm (AC-MOFOA) is presented to solve the gene selection problem and construct the Ensemble Classifier. In the proposed solution, a multi-filter model (i.e., ensemble filter) is proposed as preprocessing step to reduce the dataset's dimensions, using a combination of five filter methods to remove redundant and irrelevant genes. Accordingly, the results of the five filter methods are combined using a voting-based function. Additionally, the results of the proposed multi-filter indicate that it has good capability in reducing the gene subset size and selecting relevant genes. Then, an AC-MOFOA based on the concepts of non-dominated sorting, crowding distance, chaos theory, and adaptive operators is presented. AC-MOFOA as a wrapper method aimed at reducing dataset dimensions, optimizing KELM, and increasing the accuracy of the classification, simultaneously. Next, in this method, an ensemble classifier model is presented using AC-MOFOA results to classify microarray data. The performance of the proposed algorithm was evaluated on nine public microarray datasets, and its results were compared in terms of the number of selected genes, classification efficiency, execution time, time complexity, hypervolume indicator, and spacing metric with five hybrid multi-objective methods, and three hybrid single-objective methods. According to the results, the proposed hybrid method could increase the accuracy of the KELM in most datasets by reducing the dataset's dimensions and achieve similar or superior performance compared to other multi-objective methods. Furthermore, the proposed Ensemble Classifier model could provide better classification accuracy and generalizability in the seven of nine microarray datasets compared to conventional ensemble methods. Moreover, the comparison results of the Ensemble Classifier model with three state-of-the-art ensemble generation methods indicate its competitive performance in which the proposed ensemble model achieved better results in the five of nine datasets.</p><p><strong>Supplementary information: </strong>The online version contains supplementary material available at 10.1007/s00521-021-06459-9.</p>\",\"PeriodicalId\":49766,\"journal\":{\"name\":\"Neural Computing & Applications\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":4.5000,\"publicationDate\":\"2023-01-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8435304/pdf/\",\"citationCount\":\"9\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Neural Computing & Applications\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://doi.org/10.1007/s00521-021-06459-9\",\"RegionNum\":3,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Neural Computing & Applications","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1007/s00521-021-06459-9","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 9

摘要

微阵列技术被认为是收集DNA表达数据最重要的工具之一。这项技术使研究人员能够调查和检查疾病的类型及其起源。然而,微阵列数据往往样本量小,基因数量多,数据不平衡等,使得分类模型效率低下。为此,提出了一种基于多滤波器和自适应混沌多目标森林优化算法(AC-MOFOA)的混合解决方案来解决基因选择问题并构建集成分类器。在该解决方案中,提出了一个多滤波模型(即集成滤波)作为预处理步骤来降低数据集的维数,使用五种滤波方法的组合来去除冗余和不相关的基因。因此,使用基于投票的函数将五种过滤方法的结果组合在一起。此外,所提出的多滤波器在减小基因子集大小和选择相关基因方面具有良好的能力。然后,提出了一种基于非支配排序、拥挤距离、混沌理论和自适应算子的AC-MOFOA算法。AC-MOFOA作为一种包装方法,旨在降低数据集维数,优化KELM,同时提高分类精度。其次,在该方法中,提出了一个集成分类器模型,利用AC-MOFOA结果对微阵列数据进行分类。采用5种混合多目标方法和3种混合单目标方法,在基因选择数、分类效率、执行时间、时间复杂度、超大体积指标和间隔度量等方面对所提算法进行了性能评估。结果表明,本文提出的混合方法可以通过降低数据集的维数来提高KELM在大多数数据集上的准确率,并取得与其他多目标方法相似或更好的性能。此外,与传统集成方法相比,所提出的集成分类器模型在9个微阵列数据集中的7个数据集上具有更好的分类精度和泛化性。此外,集成分类器模型与三种最先进的集成生成方法的比较结果表明了其竞争性能,其中所提出的集成模型在9个数据集中的5个数据集上取得了更好的结果。补充信息:在线版本包含补充资料,下载地址为10.1007/s00521-021-06459-9。
本文章由计算机程序翻译,如有差异,请以英文原文为准。

摘要图片

摘要图片

摘要图片

查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
A novel bio-inspired hybrid multi-filter wrapper gene selection method with ensemble classifier for microarray data.

Microarray technology is known as one of the most important tools for collecting DNA expression data. This technology allows researchers to investigate and examine types of diseases and their origins. However, microarray data are often associated with a small sample size, a significant number of genes, imbalanced data, etc., making classification models inefficient. Thus, a new hybrid solution based on a multi-filter and adaptive chaotic multi-objective forest optimization algorithm (AC-MOFOA) is presented to solve the gene selection problem and construct the Ensemble Classifier. In the proposed solution, a multi-filter model (i.e., ensemble filter) is proposed as preprocessing step to reduce the dataset's dimensions, using a combination of five filter methods to remove redundant and irrelevant genes. Accordingly, the results of the five filter methods are combined using a voting-based function. Additionally, the results of the proposed multi-filter indicate that it has good capability in reducing the gene subset size and selecting relevant genes. Then, an AC-MOFOA based on the concepts of non-dominated sorting, crowding distance, chaos theory, and adaptive operators is presented. AC-MOFOA as a wrapper method aimed at reducing dataset dimensions, optimizing KELM, and increasing the accuracy of the classification, simultaneously. Next, in this method, an ensemble classifier model is presented using AC-MOFOA results to classify microarray data. The performance of the proposed algorithm was evaluated on nine public microarray datasets, and its results were compared in terms of the number of selected genes, classification efficiency, execution time, time complexity, hypervolume indicator, and spacing metric with five hybrid multi-objective methods, and three hybrid single-objective methods. According to the results, the proposed hybrid method could increase the accuracy of the KELM in most datasets by reducing the dataset's dimensions and achieve similar or superior performance compared to other multi-objective methods. Furthermore, the proposed Ensemble Classifier model could provide better classification accuracy and generalizability in the seven of nine microarray datasets compared to conventional ensemble methods. Moreover, the comparison results of the Ensemble Classifier model with three state-of-the-art ensemble generation methods indicate its competitive performance in which the proposed ensemble model achieved better results in the five of nine datasets.

Supplementary information: The online version contains supplementary material available at 10.1007/s00521-021-06459-9.

求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
Neural Computing & Applications
Neural Computing & Applications 工程技术-计算机:人工智能
CiteScore
11.40
自引率
8.30%
发文量
1280
审稿时长
6.9 months
期刊介绍: Neural Computing & Applications is an international journal which publishes original research and other information in the field of practical applications of neural computing and related techniques such as genetic algorithms, fuzzy logic and neuro-fuzzy systems. All items relevant to building practical systems are within its scope, including but not limited to: -adaptive computing- algorithms- applicable neural networks theory- applied statistics- architectures- artificial intelligence- benchmarks- case histories of innovative applications- fuzzy logic- genetic algorithms- hardware implementations- hybrid intelligent systems- intelligent agents- intelligent control systems- intelligent diagnostics- intelligent forecasting- machine learning- neural networks- neuro-fuzzy systems- pattern recognition- performance measures- self-learning systems- software simulations- supervised and unsupervised learning methods- system engineering and integration. Featured contributions fall into several categories: Original Articles, Review Articles, Book Reviews and Announcements.
期刊最新文献
Stress monitoring using wearable sensors: IoT techniques in medical field. A new hybrid model of convolutional neural networks and hidden Markov chains for image classification. Analysing sentiment change detection of Covid-19 tweets. Normal vibration distribution search-based differential evolution algorithm for multimodal biomedical image registration. Special issue on deep learning and big data analytics for medical e-diagnosis/AI-based e-diagnosis.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1