Histogram-Based Asymmetric Relabeling for Learning from Only Positive and Unlabeled Data
Authors: Tom Arjannikov, G. Tzanetakis
Venue: 2017 16th IEEE International Conference on Machine Learning and Applications (ICMLA), pp. 1065-1070
Published: 2017-12-01
DOI: 10.1109/ICMLA.2017.000-8
In this paper, we demonstrate how to use asymmetric data relabeling based on feature histograms as a pre-processing step that improves the overall classification performance of different classifiers when only positive and unlabeled data are available. Additionally, this strategy can be used to identify, with some level of confidence, those data instances that should probably be labeled as positive. Moreover, the approach can be adapted to assess the quality of a given dataset in terms of how many positive instances are not labeled. We examine our approach using synthetic data and demonstrate its applicability using real, publicly available data.
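The abstract does not spell out the relabeling procedure, so the following is only a minimal sketch of one way histogram-based asymmetric relabeling could work, not the authors' algorithm. It assumes a per-feature histogram comparison between labeled positive and unlabeled instances, and a score threshold above which an unlabeled instance is relabeled as positive. The function name `histogram_relabel` and the parameters `n_bins` and `threshold` are hypothetical, introduced here for illustration.

```python
import numpy as np


def histogram_relabel(X, y, n_bins=10, threshold=0.6):
    """Illustrative sketch (assumed procedure, not the paper's exact method).

    X: (n_samples, n_features) array of feature values.
    y: 1 for labeled positive, 0 for unlabeled.
    Returns (new_y, scores); scores lie in [0, 1].
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=int)
    pos = y == 1
    unl = y == 0
    scores = np.zeros(X.shape[0])

    for j in range(X.shape[1]):
        col = X[:, j]
        # Shared bin edges so both histograms are directly comparable.
        edges = np.histogram_bin_edges(col, bins=n_bins)
        pos_counts, _ = np.histogram(col[pos], bins=edges)
        unl_counts, _ = np.histogram(col[unl], bins=edges)

        # Normalize counts to per-class fractions.
        pos_frac = pos_counts / max(pos.sum(), 1)
        unl_frac = unl_counts / max(unl.sum(), 1)

        # Per-bin share of the positive histogram; 0.5 where both are empty.
        denom = pos_frac + unl_frac
        ratio = np.divide(pos_frac, denom,
                          out=np.full(n_bins, 0.5), where=denom > 0)

        # Map each instance to its bin and accumulate that bin's ratio.
        idx = np.clip(np.digitize(col, edges[1:-1]), 0, n_bins - 1)
        scores += ratio[idx]

    scores /= X.shape[1]

    # Asymmetric step: unlabeled may become positive, never the reverse.
    new_y = y.copy()
    new_y[unl & (scores >= threshold)] = 1
    return new_y, scores
```

Under these assumptions, the relabeled array can be passed to any downstream classifier, the scores give a rough confidence ranking of unlabeled instances, and the number of changed labels, `(new_y != y).sum()`, gives a crude indication of how many positives may have been left unlabeled.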