MiNB: Minority Sensitive Naïve Bayesian Algorithm for Multi-Class Classification of Unbalanced Data

Pratik A. Barot, H. Jethva
Journal: Int. Arab J. Inf. Technol.
Publication date: 2022-01-01
Publication type: Journal Article
DOI: 10.34028/iajit/19/4/5 (https://doi.org/10.34028/iajit/19/4/5)
Citation count: 1

Abstract

Class imbalance makes it difficult for classification algorithms to reach the desired performance. A sub-optimal prediction system is not a viable solution, because misclassifying minority events is costly. Accurate classification of imbalanced data could therefore be a game changer for prediction in domains such as medical diagnosis, the judiciary, and disaster management. To date, most existing studies of imbalanced data address binary-class datasets and rely on data sampling techniques, which suffer from loss of information and over-fitting. In this paper, we present a modified naïve Bayesian algorithm for unbalanced data classification that eliminates the need for data-level sampling. We compare the proposed model with data sampling and cost-sensitive techniques. We evaluate with the minority-sensitive TP rate, the class-specific misclassification rate, and overall performance measures such as accuracy, F-measure, and G-mean. The results show that the proposed algorithm performs better on unbalanced data classification, reducing the misclassification rate and improving predictive performance for the minority class.
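The evaluation measures named in the abstract can be illustrated with a minimal sketch. It is not the authors' evaluation code: the function names and the toy labels are hypothetical, and the multi-class G-mean is assumed here to be the geometric mean of the per-class TP rates, which may differ from the definition used in the paper.

```python
import math
from collections import Counter


def per_class_tp_rate(y_true, y_pred):
    """Return {class: TP / (TP + FN)} for every class present in y_true."""
    support = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {c: hits[c] / support[c] for c in support}


def g_mean(tp_rates):
    """Geometric mean of per-class TP rates (assumed multi-class G-mean)."""
    values = list(tp_rates.values())
    return math.prod(values) ** (1.0 / len(values))


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical toy data: class "b" is the minority class (2 of 10 samples).
y_true = ["a"] * 8 + ["b"] * 2
y_pred = ["a"] * 8 + ["a", "b"]

rates = per_class_tp_rate(y_true, y_pred)
print("TP rate per class:", rates)
print("Accuracy:", accuracy(y_true, y_pred))
print("G-mean:", g_mean(rates))
```

On this toy data the accuracy is high while the minority-class TP rate is only one half, which is why the abstract pairs overall measures with minority-sensitive ones.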