{"title":"Minimizing the Misclassification Rate of the Nearest Neighbor Rule Using a Two-stage Method","authors":"Yunlong Gao, Si-Zhe Luo, Jinyan Pan, Baihua Chen, Peng Gao","doi":"10.1145/3318299.3318339","journal":{"name":"International Conference on Machine Learning and Computing"},"publicationDate":"2019-02-22","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","ListUrlMain":"https://doi.org/10.1145/3318299.3318339"}
Minimizing the Misclassification Rate of the Nearest Neighbor Rule Using a Two-stage Method
The classification performance of kNN depends entirely on the selected neighbors. Most previous nearest neighbor (NN)-based methods have focused on learning a distance metric that produces a neighborhood with an approximately constant a posteriori probability, whereas little work has studied the influence of the distribution characteristics of each neighbor. In this paper, we show why the best distance measurement (BDM) is sensitive to malicious samples and propose a robust best distance measurement (RBDM) to address this problem. We also investigate how the distribution characteristics of each neighbor affect classification performance, and on that basis propose a two-stage method, the weighted robust best distance measurement kNN method (WRBDMkNN), which aims to minimize the misclassification rate of the nearest neighbor rule. Extensive experiments on diverse datasets indicate that the proposed method achieves better results than several state-of-the-art NN-based methods.
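
For orientation only, the nearest neighbor rule that the abstract starts from can be sketched as below. This is not WRBDMkNN: the abstract does not define BDM, RBDM, or the per-neighbor weights, so the Euclidean distance and the inverse-distance weighting used here are stand-in assumptions, and the function name `knn_predict` and its parameters are hypothetical. The sketch only illustrates how the choice and weighting of neighbors can change the predicted label.

```python
import math
from collections import defaultdict


def knn_predict(train_points, train_labels, query, k=3, weighted=False):
    """Plain kNN vote; optionally weight each neighbor by inverse distance.

    Assumptions (not from the paper): Euclidean distance stands in for the
    learned distance measure, and 1/distance stands in for the neighbor weight.
    """
    # Distance from the query to every training point, paired with its label.
    dists = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Keep the k closest training points.
    dists.sort(key=lambda pair: pair[0])
    votes = defaultdict(float)
    for dist, label in dists[:k]:
        # Unweighted: one vote per neighbor. Weighted: closer neighbors count more.
        votes[label] += 1.0 / (dist + 1e-12) if weighted else 1.0
    # Return the label with the largest total vote.
    return max(votes, key=votes.get)


# One nearby point of class "a" and two distant points of class "b".
points = [(0.0, 0.0), (3.0, 0.0), (3.1, 0.0)]
labels = ["a", "b", "b"]
query = (0.1, 0.0)

plain = knn_predict(points, labels, query, k=3)
by_distance = knn_predict(points, labels, query, k=3, weighted=True)
```

In this toy case the unweighted vote is decided by the two distant "b" points, while the inverse-distance vote is decided by the single close "a" point, which is the kind of sensitivity to individual neighbors that motivates weighting them.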