JoMIC: A joint MI-based filter feature selection method

Khumukcham Robindro, Urikhimbam Boby Clinton, Nazrul Hoque, Dhruba K. Bhattacharyya
Journal of Computational Mathematics and Data Science, Volume 6, Article 100075 (2023). DOI: 10.1016/j.jcmds.2023.100075. Citations: 2.

Abstract

Feature selection (FS) is a common preprocessing step in machine learning that selects an informative subset of features, enabling a model to perform better at prediction or classification. It aids the design of intelligent and expert systems used in computer vision, image processing, gene expression data analysis, intrusion detection, and natural language processing. In this paper, we introduce an effective filter method called Joint Mutual Information with Class relevance (JoMIC), which uses multivariate Joint Mutual Information (JMI) and Mutual Information (MI). Our method considers both the JMI and the MI of a non-selected feature with the already-selected features with respect to a given class, selecting features that are highly relevant to the class but non-redundant with respect to the other selected features. We compare our method with seven other filter-based methods using the machine learning classifiers Logistic Regression, Support Vector Machine, K-Nearest Neighbor (KNN), Decision Tree, Random Forest, Naïve Bayes, and Stochastic Gradient Descent on various datasets. Experimental results over 16 benchmark datasets reveal that our method yields better accuracy, Matthews Correlation Coefficient (MCC), and F1-score than the competing methods. The strength of the proposed method lies in its objective function, which combines JMI and MI to choose relevant, non-redundant features.
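To illustrate the general idea behind JMI-based filter selection, the sketch below implements a greedy filter of this family for discrete features: the first feature is the one with the highest MI with the class, and each subsequent feature maximizes the average joint MI with the already-selected features with respect to the class. This is a minimal, generic JMI-style selector, not the authors' exact JoMIC objective (the paper's specific combination of JMI and MI is not reproduced here); the function names are illustrative.

```python
import numpy as np

def mutual_info(x, y):
    """I(X; Y) in nats for two discrete 1-D arrays."""
    mi = 0.0
    for xv in np.unique(x):
        for yv in np.unique(y):
            pxy = np.mean((x == xv) & (y == yv))
            if pxy > 0:
                px = np.mean(x == xv)
                py = np.mean(y == yv)
                mi += pxy * np.log(pxy / (px * py))
    return mi

def joint_mi(f, s, y):
    """I(F, S; Y): encode the pair (F, S) as one discrete variable."""
    pair = f * (int(s.max()) + 1) + s
    return mutual_info(pair, y)

def jmi_select(X, y, k):
    """Greedy JMI-style filter feature selection.

    Start with the single most class-relevant feature (highest MI),
    then repeatedly add the candidate whose average joint MI with the
    already-selected features w.r.t. the class is largest.
    """
    n_features = X.shape[1]
    selected = [int(np.argmax([mutual_info(X[:, j], y)
                               for j in range(n_features)]))]
    while len(selected) < k:
        best_j, best_score = None, -np.inf
        for j in range(n_features):
            if j in selected:
                continue
            score = np.mean([joint_mi(X[:, j], X[:, s], y)
                             for s in selected])
            if score > best_score:
                best_j, best_score = j, score
        selected.append(best_j)
    return selected
```

On a toy dataset where one column equals the class label and the rest are random noise, the informative column is selected first, since its MI with the class equals the class entropy while the noise columns score near zero.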
