An Improved Density Based Support Vector Machine (DBSVM)

K. E. Moutaouakil, Abdellatif el Ouissari, A. Touhafi, N. Aharrane
Published in: 2020 5th International Conference on Cloud Computing and Artificial Intelligence: Technologies and Applications (CloudTech)
Publication date: 2020-11-24
DOI: 10.1109/CloudTech49835.2020.9365893
URL: https://doi.org/10.1109/CloudTech49835.2020.9365893
Citations: 6

Abstract

The Support Vector Machine (SVM) is a classification model based on a dual optimization approach. Non-zero Lagrange multipliers correspond to the data points selected as support vectors, which are used to build the decision margin. Unfortunately, SVM has two major drawbacks: noisy and redundant data cause overfitting, and the number of local minima increases with the size of the data, which is even worse for Big Data. To overcome these shortcomings, we propose a new version of SVM, called Density Based Support Vector Machine (DBSVM), which proceeds in three steps. First, we set two parameters: the radius of the neighborhood and the size of the neighborhood. Second, we determine three types of points: noisy, cord, and interior. Third, we solve the dual problem based on the cord data only. To justify this choice, we demonstrate that the cord points cannot be support vectors. Moreover, we show that kernel functions do not change the nature of the cord points. DBSVM is benchmarked on several datasets and compared with a variety of methods from the literature. The test results show that the proposed algorithm provides very competitive results in terms of time, classification performance, and capacity to handle very large datasets. Finally, to assess the consistency of DBSVM, several tests were performed for different values of the ratio and the neighborhood size.
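The first two steps can be illustrated with a small sketch. The abstract does not define how noisy, cord, and interior points are told apart, so the rule below is purely an assumption borrowed from DBSCAN-style neighbor counting: a point with fewer than `min_size` neighbors within `radius` is tagged noisy, a point whose neighbors all share its class is tagged interior, and the rest are tagged cord. The function names `label_points` and `reduced_training_set` are hypothetical and do not come from the paper.

```python
from math import dist


def label_points(points, labels, radius, min_size):
    """Tag each point as 'noisy', 'cord', or 'interior'.

    The tagging rule is an illustrative assumption, not the paper's definition.
    """
    kinds = []
    for i, p in enumerate(points):
        # Indices of the other points lying within `radius` of p.
        neighbors = [j for j, q in enumerate(points)
                     if j != i and dist(p, q) <= radius]
        if len(neighbors) < min_size:
            kinds.append("noisy")      # neighborhood is too sparse
        elif all(labels[j] == labels[i] for j in neighbors):
            kinds.append("interior")   # surrounded only by its own class
        else:
            kinds.append("cord")       # dense neighborhood with mixed classes
    return kinds


def reduced_training_set(points, labels, radius, min_size):
    """Keep only the points tagged 'cord' as input for an SVM dual solver."""
    kinds = label_points(points, labels, radius, min_size)
    return [(p, y) for p, y, k in zip(points, labels, kinds) if k == "cord"]
```

Under these assumptions, the third step would pass the output of `reduced_training_set` to any standard SVM solver, so the dual problem is solved on a smaller set than the full training data.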