Toward effective SVM sample reduction based on fuzzy membership functions

Chemometrics and Intelligent Laboratory Systems | IF 3.7 | Region 2 (Chemistry) | JCR Q2 (AUTOMATION & CONTROL SYSTEMS) | Pub Date: 2024-09-10 | DOI: 10.1016/j.chemolab.2024.105233

Tinghua Wang, Daili Zhang, Hanming Liu

Chemometrics and Intelligent Laboratory Systems, Volume 254, Article 105233 (Journal Article). Full text: https://www.sciencedirect.com/science/article/pii/S0169743924001734

Citations: 0

Abstract



Support vector machines (SVMs) are known for good generalization performance and are widely applied across many fields. Despite this success, the learning efficiency of an SVM drops significantly as the number of training samples grows. Consequently, a traditional SVM trained with standard optimization methods faces excessive memory requirements and slow training, especially on large-scale training sets. To address this issue, this paper draws inspiration from the fuzzy support vector machine (FSVM). Because samples contribute unequally to the decision plane, we propose an effective SVM sample reduction method based on the fuzzy membership function (FMF). The method uses an FMF to calculate the fuzzy membership of each training sample and then deletes the training samples with low fuzzy memberships. Specifically, we propose six SVM sample reduction algorithms, based respectively on class center distance, kernel target alignment, centered kernel alignment, slack factor, entropy, and bilateral weighted FMFs. Comprehensive experiments on UCI and KEEL datasets demonstrate that the proposed algorithms outperform the compared methods in terms of accuracy, F-measure, and hinge loss.
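The reduction step described in the abstract (score each training sample with a fuzzy membership, then delete the low-scoring ones) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not give the formulas, so the membership used here is one common class-center-distance form from the FSVM literature, s_i = 1 - d_i / (r + delta), where d_i is the distance from sample i to its class mean and r is the largest such distance in that class. The function names, the `delta` constant, and the per-class `keep_ratio` rule are all assumptions made for this example.

```python
import math


def group_by_label(y):
    """Map each class label to the list of sample indices carrying it."""
    groups = {}
    for i, label in enumerate(y):
        groups.setdefault(label, []).append(i)
    return groups


def class_center_membership(X, y, delta=1e-6):
    """Assumed FMF: s_i = 1 - d_i / (r + delta), computed per class.

    d_i is the Euclidean distance of sample i to its own class mean and
    r is the largest such distance within that class. delta keeps the
    membership strictly positive (hypothetical default).
    """
    s = [0.0] * len(y)
    for idx in group_by_label(y).values():
        dim = len(X[idx[0]])
        center = [sum(X[i][k] for i in idx) / len(idx) for k in range(dim)]
        dist = {i: math.dist(X[i], center) for i in idx}
        radius = max(dist.values())
        for i in idx:
            s[i] = 1.0 - dist[i] / (radius + delta)
    return s


def reduce_samples(X, y, keep_ratio=0.8):
    """Keep the keep_ratio fraction of each class with the highest membership.

    The per-class ratio is an assumption of this sketch; a global
    membership threshold would be another reasonable deletion rule.
    """
    s = class_center_membership(X, y)
    keep = []
    for idx in group_by_label(y).values():
        n_keep = max(1, math.ceil(keep_ratio * len(idx)))
        ranked = sorted(idx, key=lambda i: s[i], reverse=True)
        keep.extend(ranked[:n_keep])
    keep.sort()  # preserve the original sample order
    return [X[i] for i in keep], [y[i] for i in keep], s


if __name__ == "__main__":
    X = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [5.0, 5.0],
         [10.0, 10.0], [10.1, 10.0], [10.0, 10.1], [3.0, 3.0]]
    y = [1, 1, 1, 1, -1, -1, -1, -1]
    X_red, y_red, memberships = reduce_samples(X, y, keep_ratio=0.75)
    print(len(X_red), "of", len(X), "samples kept")
```

The reduced set would then be passed to whatever SVM trainer is in use. The other membership functions named in the abstract (kernel target alignment, centered kernel alignment, slack factor, entropy, bilateral weighting) would replace only `class_center_membership` in this sketch, leaving the deletion step unchanged.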


Source journal

Chemometrics and Intelligent Laboratory Systems (Engineering & Technology: Analytical Chemistry)

CiteScore: 7.50

Self-citation rate: 7.70%

Articles published: 169

Review time: 3.4 months

About the journal: Chemometrics and Intelligent Laboratory Systems publishes original research papers, short communications, reviews, tutorials and Original Software Publications reporting on the development of novel statistical, mathematical, or computer techniques in Chemistry and related disciplines. Chemometrics is the chemical discipline that uses mathematical and statistical methods to design or select optimal procedures and experiments, and to provide maximum chemical information by analysing chemical data. The journal deals with the following topics:

1) Development of new statistical, mathematical and chemometrical methods for Chemistry and related fields (Environmental Chemistry, Biochemistry, Toxicology, System Biology, -Omics, etc.).

2) Novel applications of chemometrics to all branches of Chemistry and related fields (typical domains of interest are: process data analysis, experimental design, data mining, signal processing, supervised modelling, decision making, robust statistics, mixture analysis, multivariate calibration, etc.). Routine applications of established chemometrical techniques will not be considered.

3) Development of new software that provides novel tools or truly advances the use of chemometrical methods.

4) Well characterized data sets to test performance for the new methods and software.

The journal complies with the International Committee of Medical Journal Editors' Uniform Requirements for Manuscripts.