An Improved Density Based Support Vector Machine (DBSVM)
K. E. Moutaouakil, Abdellatif el Ouissari, A. Touhafi, N. Aharrane
2020 5th International Conference on Cloud Computing and Artificial Intelligence: Technologies and Applications (CloudTech)
Published: 2020-11-24
DOI: 10.1109/CloudTech49835.2020.9365893
Citations: 6
Abstract
The Support Vector Machine (SVM) is a classification model based on dual optimization. Non-zero Lagrange multipliers correspond to the data points selected as support vectors, which are used to build the decision margin. Unfortunately, SVM has two major drawbacks: noisy and redundant data cause overfitting; moreover, the number of local minima increases with the size of the data, a problem that becomes even worse with Big Data. To overcome these shortcomings, we propose a new version of SVM, called Density Based Support Vector Machine (DBSVM), which proceeds in three steps. First, we set two parameters: the radius of the neighborhood and the size of that neighborhood. Second, we determine three types of points: noisy, cord, and interior. Third, we solve the dual problem based on the cord data only. To justify this choice, we demonstrate that the cord points cannot be support vectors. Moreover, we show that kernel functions do not change the nature of the cord points. The DBSVM is benchmarked on several datasets and compared with a variety of methods from the literature. The test results show that the proposed algorithm provides very competitive results in terms of time, classification performance, and capacity to handle very large datasets. Finally, to assess the consistency of the DBSVM, several tests were performed for different values of the ratio and the neighborhood size.
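The three steps can be illustrated with a minimal Python sketch. The abstract does not define the three point types, so the rules used here are assumptions made only for illustration: a point is "noisy" if fewer than `min_size` other points lie within `radius`, "interior" if its neighborhood is dense and contains a single class, and "cord" if its neighborhood is dense and contains more than one class. The function names `label_points` and `cord_subset` are hypothetical and do not come from the paper.

```python
from math import dist


def label_points(X, y, radius, min_size):
    """Label each point 'noisy', 'cord', or 'interior'.

    The three rules below are illustrative assumptions, not the
    definitions used by the authors.
    """
    labels = []
    for i, xi in enumerate(X):
        # Indices of the other points lying within `radius` of point i.
        neighbors = [j for j, xj in enumerate(X)
                     if j != i and dist(xi, xj) <= radius]
        if len(neighbors) < min_size:
            labels.append("noisy")      # assumed: sparse neighborhood
        elif all(y[j] == y[i] for j in neighbors):
            labels.append("interior")   # assumed: dense, single-class neighborhood
        else:
            labels.append("cord")       # assumed: dense, mixed-class neighborhood
    return labels


def cord_subset(X, y, radius, min_size):
    """Return only the points labeled 'cord', with their class labels."""
    labels = label_points(X, y, radius, min_size)
    keep = [i for i, lab in enumerate(labels) if lab == "cord"]
    return [X[i] for i in keep], [y[i] for i in keep]
```

Under these assumptions, the reduced set returned by `cord_subset` would then be passed to any standard SVM solver for the third step, so the dual problem is solved over fewer points than the full dataset.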