{"title":"对支持向量机算法进行预处理以适应高斯数据的边缘先验和离群先验","authors":"Shaira Lee L. Pabalan, Louie John D. Vallejo","doi":"10.1063/1.5139170","DOIUrl":null,"url":null,"abstract":"The Support Vector Machine (SVM) Algorithm is one of the most popular classification method in machine learning and statistics. However, in the presence of outliers, the classifier may be adversely affected. In this paper, we experiment on the hinge loss function of the unconstrained SVM Algorithm to suit prior information about nonlinearly separable sets of Gaussian data. First, we determine if an altered hinge loss function x ↦ max(0, α − x) with several positive values of α will be significantly better in classification compared when α = 1. Then, taking an inspiration from Huber’s least informative distribution model to desensitize regression from outliers, we smoothen the hinge loss function to promote insensitivity of the classification to outliers. Using statistical analysis, we determine that at some level of significance, there is a considerable improvement in classification with respect to the number of misclassified data.The Support Vector Machine (SVM) Algorithm is one of the most popular classification method in machine learning and statistics. However, in the presence of outliers, the classifier may be adversely affected. In this paper, we experiment on the hinge loss function of the unconstrained SVM Algorithm to suit prior information about nonlinearly separable sets of Gaussian data. First, we determine if an altered hinge loss function x ↦ max(0, α − x) with several positive values of α will be significantly better in classification compared when α = 1. Then, taking an inspiration from Huber’s least informative distribution model to desensitize regression from outliers, we smoothen the hinge loss function to promote insensitivity of the classification to outliers. Using statistical analysis, we determine that at some level of significance, there is a considerable improvement in classification with respect to the number of misclassified data.","PeriodicalId":209108,"journal":{"name":"PROCEEDINGS OF THE 8TH SEAMS-UGM INTERNATIONAL CONFERENCE ON MATHEMATICS AND ITS APPLICATIONS 2019: Deepening Mathematical Concepts for Wider Application through Multidisciplinary Research and Industries Collaborations","volume":"37 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-12-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Preconditioning the support vector machine algorithm to suit margin and outlier priors of Gaussian data\",\"authors\":\"Shaira Lee L. Pabalan, Louie John D. Vallejo\",\"doi\":\"10.1063/1.5139170\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The Support Vector Machine (SVM) Algorithm is one of the most popular classification method in machine learning and statistics. However, in the presence of outliers, the classifier may be adversely affected. In this paper, we experiment on the hinge loss function of the unconstrained SVM Algorithm to suit prior information about nonlinearly separable sets of Gaussian data. First, we determine if an altered hinge loss function x ↦ max(0, α − x) with several positive values of α will be significantly better in classification compared when α = 1. Then, taking an inspiration from Huber’s least informative distribution model to desensitize regression from outliers, we smoothen the hinge loss function to promote insensitivity of the classification to outliers. 
Using statistical analysis, we determine that at some level of significance, there is a considerable improvement in classification with respect to the number of misclassified data.The Support Vector Machine (SVM) Algorithm is one of the most popular classification method in machine learning and statistics. However, in the presence of outliers, the classifier may be adversely affected. In this paper, we experiment on the hinge loss function of the unconstrained SVM Algorithm to suit prior information about nonlinearly separable sets of Gaussian data. First, we determine if an altered hinge loss function x ↦ max(0, α − x) with several positive values of α will be significantly better in classification compared when α = 1. Then, taking an inspiration from Huber’s least informative distribution model to desensitize regression from outliers, we smoothen the hinge loss function to promote insensitivity of the classification to outliers. Using statistical analysis, we determine that at some level of significance, there is a considerable improvement in classification with respect to the number of misclassified data.\",\"PeriodicalId\":209108,\"journal\":{\"name\":\"PROCEEDINGS OF THE 8TH SEAMS-UGM INTERNATIONAL CONFERENCE ON MATHEMATICS AND ITS APPLICATIONS 2019: Deepening Mathematical Concepts for Wider Application through Multidisciplinary Research and Industries Collaborations\",\"volume\":\"37 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2019-12-19\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"PROCEEDINGS OF THE 8TH SEAMS-UGM INTERNATIONAL CONFERENCE ON MATHEMATICS AND ITS APPLICATIONS 2019: Deepening Mathematical Concepts for Wider Application through Multidisciplinary Research and Industries Collaborations\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1063/1.5139170\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"PROCEEDINGS OF THE 8TH SEAMS-UGM INTERNATIONAL CONFERENCE ON MATHEMATICS AND ITS APPLICATIONS 2019: Deepening Mathematical Concepts for Wider Application through Multidisciplinary Research and Industries Collaborations","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1063/1.5139170","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Preconditioning the support vector machine algorithm to suit margin and outlier priors of Gaussian data
The Support Vector Machine (SVM) algorithm is one of the most popular classification methods in machine learning and statistics. However, in the presence of outliers, the classifier may be adversely affected. In this paper, we experiment with the hinge loss function of the unconstrained SVM algorithm to suit prior information about nonlinearly separable sets of Gaussian data. First, we determine whether an altered hinge loss function x ↦ max(0, α − x) with several positive values of α classifies significantly better than the standard choice α = 1. Then, taking inspiration from Huber's least informative distribution model, which desensitizes regression to outliers, we smooth the hinge loss function to make the classification insensitive to outliers. Using statistical analysis, we determine that, at some level of significance, there is a considerable improvement in classification with respect to the number of misclassified data points.
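To make the two modifications concrete, the sketch below implements the altered hinge loss x ↦ max(0, α − x) and one common Huber-style smoothing of it: a quadratic patch of half-width h around the kink at x = α, which makes the loss continuously differentiable. The particular smoothing shown here, and the names altered_hinge, huberized_hinge, alpha, and h, are illustrative assumptions; the abstract does not specify the paper's exact smoothed loss.

```python
import numpy as np

def altered_hinge(x, alpha=1.0):
    """Altered hinge loss x -> max(0, alpha - x).

    alpha = 1.0 recovers the standard SVM hinge loss; larger values of
    alpha penalize points that sit inside a wider margin.
    """
    return np.maximum(0.0, alpha - x)

def huberized_hinge(x, alpha=1.0, h=0.5):
    """Huber-style smoothing of the altered hinge loss (illustrative).

    Zero for x >= alpha + h, linear with slope -1 for x <= alpha - h,
    and a quadratic patch in between, so the two pieces meet with
    matching value and slope at x = alpha - h and the loss is
    continuously differentiable.
    """
    return np.where(
        x >= alpha + h,
        0.0,
        np.where(
            x <= alpha - h,
            alpha - x,                         # linear tail
            (alpha + h - x) ** 2 / (4.0 * h),  # quadratic patch at the kink
        ),
    )

def svm_objective(w, b, X, y, lam=0.1, alpha=1.0, h=0.5):
    """Unconstrained soft-margin SVM objective with the smoothed loss:

        lam * ||w||^2 + mean_i loss(y_i * (w . x_i + b)),

    where each label y_i is in {-1, +1}.
    """
    margins = y * (X @ w + b)
    return lam * np.dot(w, w) + np.mean(huberized_hinge(margins, alpha, h))
```

Because the smoothed loss is differentiable everywhere, this unconstrained objective can be minimized directly by plain gradient descent, which is the usual appeal of the unconstrained SVM formulation over the constrained quadratic program.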