CORE: core-based synthetic minority over-sampling and borderline majority under-sampling technique

IF 0.4 | Zone 4, Biology | Q4, MATHEMATICAL & COMPUTATIONAL BIOLOGY | International Journal of Data Mining and Bioinformatics | Pub Date: 2015-01-01 | DOI: 10.1504/ijdmb.2015.068952

Chumphol Bunkhumpornpat, Krung Sinapiromsaran

International Journal of Data Mining and Bioinformatics, vol. 12, no. 1, pp. 44-58. Journal Article.

Citations: 8

Abstract



Class imbalance learning has recently drawn considerable attention among researchers. In this area, the rare class is the class of primary interest for classification. Unfortunately, traditional machine learning algorithms fail to detect this class because a huge majority class overwhelms a tiny minority class. In this paper, we propose a new technique called CORE to handle the class imbalance problem. The objective of CORE is to strengthen the core of a minority class and to reduce the risk of misclassifying minority instances near the borderline of a majority class. These core and borderline regions are defined using a safe level. As a result, the minority class becomes more crowded and dominant. The experiments show that CORE can significantly improve the predictive performance on a minority class when the dataset is imbalanced.
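The abstract does not spell out the algorithm, so the following is only an illustrative sketch of the general idea, not the authors' CORE procedure. It assumes that the safe level of an instance is the number of minority-class points among its k nearest neighbours, that minority points with a high safe level form the "core" to be over-sampled by SMOTE-style interpolation, and that majority points with a high safe level are "borderline" and are removed. The function names, the k / 2 threshold, and the parameters k and n_new are all hypothetical.

```python
import numpy as np


def safe_levels(X, y, k, minority=1):
    """Count the minority-class points among each point's k nearest neighbours."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)            # a point is not its own neighbour
    nn = np.argsort(d, axis=1)[:, :k]      # indices of the k nearest neighbours
    return (y[nn] == minority).sum(axis=1)


def core_like_resample(X, y, k=5, n_new=20, minority=1, seed=0):
    """Over-sample the minority core and drop borderline majority points.

    The k / 2 thresholds below are assumptions made for this sketch.
    """
    rng = np.random.default_rng(seed)
    sl = safe_levels(X, y, k, minority)
    is_min = y == minority

    # Assumed rule: minority points surrounded mostly by minority points are core.
    core = X[is_min & (sl > k / 2)]
    # Assumed rule: majority points surrounded mostly by minority points are borderline.
    keep = ~(~is_min & (sl > k / 2))
    X_kept, y_kept = X[keep], y[keep]

    if len(core) < 2:                      # nothing to interpolate between
        return X_kept, y_kept

    # SMOTE-style interpolation between randomly paired core points.
    a = core[rng.integers(len(core), size=n_new)]
    b = core[rng.integers(len(core), size=n_new)]
    gap = rng.random((n_new, 1))
    synth = a + gap * (b - a)

    X_out = np.vstack([X_kept, synth])
    y_out = np.concatenate([y_kept, np.full(n_new, minority, dtype=y.dtype)])
    return X_out, y_out


# Toy data: a 5 x 10 grid of majority points, a tight minority cluster,
# and one majority point sitting inside the minority cluster.
majority = np.array([[i, j] for i in range(5) for j in range(10)], dtype=float)
cluster = 20.0 + 0.1 * np.array([[i, (3 * i) % 10] for i in range(10)], dtype=float)
intruder = np.array([[20.45, 20.45]])
X_demo = np.vstack([majority, intruder, cluster])
y_demo = np.array([0] * 51 + [1] * 10)

X_res, y_res = core_like_resample(X_demo, y_demo, k=5, n_new=20)
print((y_res == 0).sum(), (y_res == 1).sum())
```

In this toy setup the majority point placed inside the minority cluster is treated as borderline and dropped, while the new minority points are interpolated between existing cluster members, which makes the minority region denser without moving it toward the majority class.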


Source journal

International Journal of Data Mining and Bioinformatics (Biology: Mathematical & Computational Biology)

CiteScore: 1.00

Self-citation rate: 0.00%

Articles published:

Review time: >12 weeks

Journal description: Mining bioinformatics data is an emerging area at the intersection of bioinformatics and data mining. The objective of IJDMB is to facilitate collaboration between data mining researchers and bioinformaticians by presenting cutting-edge research topics and methodologies in the area of data mining for bioinformatics. This perspective acknowledges the interdisciplinary nature of research in data mining and bioinformatics and provides a unified forum for researchers, practitioners, students, and policy makers to share the latest research and developments in this fast-growing multidisciplinary research area.