A Genetic Asexual Reproduction Optimization Algorithm for Imputing Missing Values

M. Noei, M. S. Abadeh
DOI: 10.1109/ICCKE48569.2019.8964808
Published in: 2019 9th International Conference on Computer and Knowledge Engineering (ICCKE), pp. 214-218
Publication date: 2019-10-01
Citations: 11

Abstract

In this paper, we propose a new technique that significantly reduces the computational time of the genetic algorithm for imputing missing values. Data contain noise and missing values, which make them unreliable for scientific purposes, so they must be preprocessed before use. Researchers either avoid missing data or impute it. Choosing an appropriate imputation method is necessary, and the choice depends on several factors, such as the data types and the number of missing values. When the missing value rate is high, missing value imputation (MVI) can be a suitable way to handle missing data in an incomplete dataset. One MVI method is the genetic algorithm; although it may produce good results, its computational time is very high. The proposed algorithm combines the genetic algorithm with the Asexual Reproduction Optimization (ARO) algorithm. We present an experimental evaluation on the Pima and mammographic mass datasets, collected from the UCI repository. When the percentage of missing values is small, those instances can be imputed by the ARO algorithm, but when it is large, our approach yields much better results; the proposed technique works even better as the rate of missing values increases. The accuracy and computational time of the proposed algorithm are compared with those of other techniques such as Mean, K-Nearest Neighbor, and SVM. On average, our approach improved accuracy by 8% and ROC by 4%, and it requires less computational time than a basic genetic algorithm.
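The abstract does not spell out the hybrid algorithm, so the following is only a minimal sketch of the general idea behind ARO-style imputation, not the authors' method. It assumes numeric data with `None` marking missing cells and at least one fully observed row. The fitness function (negative squared distance from each row to its nearest fully observed row) is a hypothetical stand-in, and the names `mean_impute`, `fitness`, and `aro_impute` are illustrative. A single parent solution starts from mean imputation; at each step a "bud" is created by perturbing some of the imputed cells, and it replaces the parent only if its fitness is higher.

```python
import random


def mean_impute(rows):
    """Baseline: replace each None with the mean of the observed values in its column."""
    means = []
    for col in zip(*rows):
        observed = [v for v in col if v is not None]
        means.append(sum(observed) / len(observed) if observed else 0.0)
    return [[means[j] if v is None else v for j, v in enumerate(r)] for r in rows]


def fitness(rows, complete):
    """Hypothetical fitness: negative total squared distance to the nearest complete row."""
    total = 0.0
    for r in rows:
        total += min(sum((a - b) ** 2 for a, b in zip(r, c)) for c in complete)
    return -total


def aro_impute(rows, iterations=500, step=1.0, seed=0):
    """ARO-style search: one parent, one mutated bud per iteration, keep the fitter one."""
    rng = random.Random(seed)
    missing = [(i, j) for i, r in enumerate(rows) for j, v in enumerate(r) if v is None]
    complete = [r for r in rows if None not in r]  # assumed non-empty
    parent = mean_impute(rows)
    best = fitness(parent, complete)
    for _ in range(iterations):
        # The bud copies the parent and perturbs a random subset of imputed cells only.
        bud = [r[:] for r in parent]
        for i, j in missing:
            if rng.random() < 0.5:
                bud[i][j] += rng.gauss(0.0, step)
        score = fitness(bud, complete)
        if score > best:  # the bud replaces the parent only when it is strictly fitter
            parent, best = bud, score
    return parent


data = [[1.0, 2.0], [1.2, None], [5.0, 6.0], [None, 6.1]]
imputed = aro_impute(data)
```

Because only one candidate is evaluated per iteration, this kind of search is cheaper per step than a population-based genetic algorithm, which is consistent with the abstract's motivation. How the paper combines ARO with genetic operators, and which fitness it uses, is not stated in the abstract.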