A Positive Sample Enhancement Algorithm with Fuzzy Nearest Neighbor Hybridization for Imbalance Data

International Journal of Fuzzy Systems; IF 3.6; Zone 3 (Computer Science); JCR Q2 (Automation & Control Systems); Pub Date: 2024-06-03; DOI: 10.1007/s40815-024-01721-3
Jiapeng Yang, Lei Shi, Tielin Lu, Lu Yuan, Nanchang Cheng, Xiaohui Yang, Jia Luo, Mingying Xu
Citations: 0

Abstract

Class imbalance is a critical research area in machine learning and deep learning and has received widespread attention from researchers. Typical current methods address it by using only positive samples to generate synthetic samples similar to the minority class, ignoring the feature information carried by negative samples. As a result, when the positive samples are too few and have highly similar features, the classifier suffers from fitting problems. To address this, we propose a new positive sample enhancement algorithm (PENH) that tackles class imbalance by simulating the process of chromosome cross-fusion. We use the K-nearest neighbor algorithm to select a fuzzy set of negative samples around each positive sample, and adopt Mixup (introduced in "mixup: Beyond Empirical Risk Minimization") to randomly hybridize the positive sample with negative samples from that set. To balance the dataset, we then use a One-class SVM overfitted to the positive samples to select among the newly generated unlabeled samples. We conduct multiple experiments on 20 open datasets. The results show that PENH outperforms six baseline methods on multiple evaluation indicators.
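
The first two steps described above (selecting negative neighbors around a positive sample, then hybridizing the pair with Mixup) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the Euclidean distance, the values of k and alpha, and the toy data are all hypothetical, and the mixing weight is drawn from Beta(alpha, alpha) as in the standard Mixup formulation. The One-class SVM selection step is not shown.

```python
import math
import random


def k_nearest_negatives(positive, negatives, k):
    """Return the k negatives closest to `positive` (Euclidean distance, assumed)."""
    return sorted(negatives, key=lambda neg: math.dist(positive, neg))[:k]


def mixup(a, b, lam):
    """Convex combination lam * a + (1 - lam) * b, feature by feature."""
    return [lam * x + (1.0 - lam) * y for x, y in zip(a, b)]


def generate_candidates(positives, negatives, k=3, n_new=10, alpha=1.0, seed=0):
    """Mix randomly chosen positives with one of their k nearest negatives.

    k, n_new, alpha and seed are illustrative defaults, not values from the paper.
    """
    rng = random.Random(seed)
    candidates = []
    for _ in range(n_new):
        pos = rng.choice(positives)
        neg = rng.choice(k_nearest_negatives(pos, negatives, k))
        lam = rng.betavariate(alpha, alpha)  # standard Mixup: lam ~ Beta(alpha, alpha)
        candidates.append(mixup(pos, neg, lam))
    return candidates


# Toy data (hypothetical): two minority-class points and five majority-class points.
positives = [[1.0, 1.0], [1.2, 0.9]]
negatives = [[0.0, 0.0], [0.2, 0.1], [2.0, 2.0], [5.0, 5.0], [-3.0, 4.0]]
candidates = generate_candidates(positives, negatives, k=3, n_new=6)
```

The generated candidates are unlabeled. In the method as described, a One-class SVM fitted on the positive samples would then decide which of them to keep as new positive samples.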

Source journal
International Journal of Fuzzy Systems
Category: Engineering & Technology, Computer Science: Artificial Intelligence
CiteScore: 7.80
Self-citation rate: 9.30%
Articles published: 188
Review time: 16 months
About the journal: The International Journal of Fuzzy Systems (IJFS) is an official journal of Taiwan Fuzzy Systems Association (TFSA) and is published semi-quarterly. IJFS will consider high quality papers that deal with the theory, design, and application of fuzzy systems, soft computing systems, grey systems, and extension theory systems ranging from hardware to software. Survey and expository submissions are also welcome.