{"title":"Feature selection using random probes and linear support vector machines","authors":"Hoi-Ming Chi, O. Ersoy, H. Moskowitz","doi":"10.1109/CIMA.2005.1662318","DOIUrl":null,"url":null,"abstract":"A novel feature selection algorithm that combines the ideas of linear support vector machines (SVMs) and random probes is proposed. A random probe is first artificially generated from a Gaussian distribution and appended to the data set as an extra input variable. Next, a standard 2-norm or 1-norm linear support vector machine is trained using this new data set. Each coefficient, or weight, in a linear SVM is compared to that of the random probe feature. Under several statistical assumptions, the probability of each input feature being more relevant than the random probe can be computed easily. The proposed feature selection method is intuitive to use in real-world problems, and it automatically determines the optimal number of features needed. It can also be extended to selecting significant interaction and/or quadratic terms in a 2nd-order polynomial representation","PeriodicalId":306045,"journal":{"name":"2005 ICSC Congress on Computational Intelligence Methods and Applications","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2005-12-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2005 ICSC Congress on Computational Intelligence Methods and Applications","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CIMA.2005.1662318","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 1
Abstract
A novel feature selection algorithm that combines the ideas of linear support vector machines (SVMs) and random probes is proposed. A random probe is first generated artificially from a Gaussian distribution and appended to the data set as an extra input variable. Next, a standard 2-norm or 1-norm linear support vector machine is trained on this augmented data set. Each coefficient, or weight, of the linear SVM is then compared to that of the random probe feature. Under several statistical assumptions, the probability that each input feature is more relevant than the random probe can be computed easily. The proposed feature selection method is intuitive to apply to real-world problems, and it automatically determines the optimal number of features needed. It can also be extended to selecting significant interaction and/or quadratic terms in a second-order polynomial representation.
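As a concrete illustration of the probe idea, the minimal sketch below appends a Gaussian probe column to the data, trains a 2-norm linear SVM, and compares weight magnitudes. It assumes scikit-learn's LinearSVC as the SVM; the repeated-trials frequency estimate is a stand-in for illustration only, since the paper computes the relevance probability in closed form under statistical assumptions not restated in the abstract.

```python
# Sketch of probe-based feature selection with a linear SVM.
# Assumptions (not from the paper): scikit-learn's LinearSVC as the
# 2-norm linear SVM, and an empirical win-frequency over many probe
# draws in place of the paper's closed-form probability.
import numpy as np
from sklearn.svm import LinearSVC

def probe_relevance(X, y, n_trials=100, random_state=0):
    """Estimate, per feature, the probability that its SVM weight
    magnitude exceeds that of a Gaussian random probe."""
    rng = np.random.default_rng(random_state)
    n_samples, n_features = X.shape
    wins = np.zeros(n_features)
    for _ in range(n_trials):
        # Draw one probe from a standard Gaussian and append it
        # to the data set as an extra input variable.
        probe = rng.standard_normal((n_samples, 1))
        X_aug = np.hstack([X, probe])
        # Train a standard 2-norm linear SVM on the augmented data.
        svm = LinearSVC(C=1.0, dual=False).fit(X_aug, y)
        w = np.abs(svm.coef_.ravel())
        # A feature "wins" a trial when its weight magnitude
        # exceeds that of the random probe (the last column).
        wins += (w[:-1] > w[-1])
    return wins / n_trials

# Usage: keep features whose estimated relevance probability is high.
# Standardizing X first is advisable so weight magnitudes are comparable.
# p = probe_relevance(X, y)
# selected = np.where(p > 0.95)[0]
```

Because every feature is judged against the same probe threshold, the number of retained features falls out of the procedure automatically rather than being fixed in advance, which matches the abstract's claim that the method determines the optimal number of features needed.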