A New Fuzzy Adaptive Algorithm to Classify Imbalanced Data

IF 2; Zone 4 (Computer Science); Q3 (COMPUTER SCIENCE, INFORMATION SYSTEMS); Cmc-computers Materials & Continua; Pub Date: 2022-01-01; DOI: 10.32604/cmc.2022.017114
Harshita Patel, D. Rajput, O. Stan, L. Miclea
Citations: 7

Abstract

Classification of imbalanced data, in which one class is heavily outnumbered by the others, is a well-explored problem in the data mining and machine learning community. Imbalanced distributions occur naturally in real-world datasets, so they need to be handled carefully to extract important insights. When datasets are imbalanced, traditional classifiers lose performance and therefore misclassify instances. This paper proposes a weighted nearest neighbor approach in a fuzzy setting to deal with this issue. We adopt the ‘existing algorithm modification solution’ for learning from imbalanced datasets, which classifies data without manipulating its natural distribution, unlike popular data-balancing methods. K nearest neighbor is a non-parametric classification method that is widely used in machine learning problems. Fuzzy classification with nearest neighbors clarifies the degree to which an instance belongs to each class, and optimal weights combined with an improved nearest neighbor concept help to classify imbalanced data correctly. The proposed hybrid approach accounts for the imbalanced nature of the data and reduces the inaccuracies that appear when the original and traditional classifiers are applied. Results show that it outperforms the existing fuzzy nearest neighbor and weighted neighbor strategies for imbalanced learning.
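The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea of combining fuzzy k-nearest-neighbor memberships with class weights. It is not the authors' method. The function name `fuzzy_weighted_knn`, the fuzzifier `m`, the inverse-distance vote, and the inverse-frequency class weights are all assumptions made for illustration; the paper's optimal weights may be computed quite differently.

```python
import numpy as np

def fuzzy_weighted_knn(X_train, y_train, x, k=3, m=2.0, class_weights=None):
    """Return (predicted_label, memberships) for a single query point x.

    Each of the k nearest neighbors votes for its class with strength
    1 / d^(2/(m-1)), as in classic fuzzy k-NN. Votes are then scaled by a
    per-class weight (an assumption of this sketch) and normalized so the
    class memberships sum to 1.
    """
    X_train = np.asarray(X_train, dtype=float)
    y_train = np.asarray(y_train)
    x = np.asarray(x, dtype=float)
    classes = np.unique(y_train)

    if class_weights is None:
        # Assumed heuristic: weight each class by the inverse of its frequency,
        # so that minority-class neighbors count for more.
        counts = np.array([(y_train == c).sum() for c in classes])
        class_weights = dict(zip(classes, len(y_train) / counts))

    dists = np.linalg.norm(X_train - x, axis=1)
    nearest = np.argsort(dists)[:k]
    # Small epsilon avoids division by zero when x coincides with a neighbor.
    votes = 1.0 / (dists[nearest] ** (2.0 / (m - 1.0)) + 1e-12)

    scores = np.array([
        class_weights[c] * votes[y_train[nearest] == c].sum()
        for c in classes
    ])
    memberships = scores / scores.sum()
    return classes[np.argmax(memberships)], dict(zip(classes, memberships))

# Toy imbalanced data (hypothetical): six majority points, two minority points.
X = [[0, 0], [0, 1], [1, 0], [1, 1], [2, 0], [2, 1], [3, 3], [3.5, 3]]
y = [0, 0, 0, 0, 0, 0, 1, 1]

weighted_label, weighted_mem = fuzzy_weighted_knn(X, y, [2.2, 2.0], k=3)
uniform_label, uniform_mem = fuzzy_weighted_knn(
    X, y, [2.2, 2.0], k=3, class_weights={0: 1.0, 1: 1.0}
)
print(weighted_label, weighted_mem)
print(uniform_label, uniform_mem)
```

On this toy data, two of the three nearest neighbors of the query belong to the majority class, so uniform weights assign it to class 0, while the inverse-frequency weights shift the decision to the minority class 1. The training data itself is never resampled, which mirrors the abstract's point about leaving the natural distribution untouched.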
Source journal: Cmc-computers Materials & Continua
Category: Engineering & Technology - Materials Science: Multidisciplinary
CiteScore: 5.30
Self-citation rate: 19.40%
Articles published: 345
Review time: 1 month
Journal description: This journal publishes original research papers in the areas of computer networks, artificial intelligence, big data management, software engineering, multimedia, cyber security, internet of things, materials genome, integrated materials science, data analysis, modeling, and the engineering of design and manufacturing of modern functional and multifunctional materials. Novel high performance computing methods, big data analysis, and artificial intelligence that advance material technologies are especially welcome.