{"title":"从流到在线合奏的多类不平衡半监督学习","authors":"P. Vafaie, H. Viktor, W. Michalowski","doi":"10.1109/ICDMW51313.2020.00124","DOIUrl":null,"url":null,"abstract":"Multi-class imbalance, in which the rates of instances in the various classes differ substantially, poses a major challenge when learning from evolving streams. In this setting, minority class instances may arrive infrequently and in bursts, making accurate model construction problematic. Further, skewed streams are not only susceptible to concept drifts, but class labels may also be absent, expensive to obtain, or only arrive after some delay. The combined effects of multi-class skew, concept drift and semi-supervised learning have received limited attention in the online learning community. In this paper, we introduce a multi-class online ensemble algorithm that is suitable for learning in such settings. Specifically, our algorithm uses sampling with replacement while dynamically increasing the weights of underrepresented classes based on recall in order to produce models that benefit all classes. Our approach addresses the potential lack of labels by incorporating a self-training semi-supervised learning method for labeling instances. Our experimental results show that our online ensemble performs well against multi-class imbalanced data containing concept drifts. In addition, our algorithm produces accurate predictions, even in the presence of unlabeled data.","PeriodicalId":426846,"journal":{"name":"2020 International Conference on Data Mining Workshops (ICDMW)","volume":"42 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"5","resultStr":"{\"title\":\"Multi-class imbalanced semi-supervised learning from streams through online ensembles\",\"authors\":\"P. Vafaie, H. Viktor, W. Michalowski\",\"doi\":\"10.1109/ICDMW51313.2020.00124\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Multi-class imbalance, in which the rates of instances in the various classes differ substantially, poses a major challenge when learning from evolving streams. In this setting, minority class instances may arrive infrequently and in bursts, making accurate model construction problematic. Further, skewed streams are not only susceptible to concept drifts, but class labels may also be absent, expensive to obtain, or only arrive after some delay. The combined effects of multi-class skew, concept drift and semi-supervised learning have received limited attention in the online learning community. In this paper, we introduce a multi-class online ensemble algorithm that is suitable for learning in such settings. Specifically, our algorithm uses sampling with replacement while dynamically increasing the weights of underrepresented classes based on recall in order to produce models that benefit all classes. Our approach addresses the potential lack of labels by incorporating a self-training semi-supervised learning method for labeling instances. Our experimental results show that our online ensemble performs well against multi-class imbalanced data containing concept drifts. 
In addition, our algorithm produces accurate predictions, even in the presence of unlabeled data.\",\"PeriodicalId\":426846,\"journal\":{\"name\":\"2020 International Conference on Data Mining Workshops (ICDMW)\",\"volume\":\"42 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2020-11-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"5\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2020 International Conference on Data Mining Workshops (ICDMW)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICDMW51313.2020.00124\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2020 International Conference on Data Mining Workshops (ICDMW)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICDMW51313.2020.00124","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Multi-class imbalanced semi-supervised learning from streams through online ensembles
Multi-class imbalance, in which the proportions of instances in the various classes differ substantially, poses a major challenge when learning from evolving streams. In this setting, minority class instances may arrive infrequently and in bursts, making accurate model construction problematic. Further, skewed streams are not only susceptible to concept drift, but class labels may also be absent, expensive to obtain, or only arrive after some delay. The combined effects of multi-class skew, concept drift and semi-supervised learning have received limited attention in the online learning community. In this paper, we introduce a multi-class online ensemble algorithm that is suitable for learning in such settings. Specifically, our algorithm uses sampling with replacement while dynamically increasing the weights of underrepresented classes based on recall, in order to produce models that benefit all classes. Our approach addresses the potential lack of labels by incorporating a self-training semi-supervised learning method for labeling instances. Our experimental results show that our online ensemble performs well on multi-class imbalanced streams containing concept drift. In addition, our algorithm produces accurate predictions, even in the presence of unlabeled data.
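To make the combination of ideas in the abstract concrete, the following is a minimal, hypothetical sketch (not the authors' algorithm): a Poisson-based online bagging ensemble in which each labeled instance is resampled with replacement for every base model, the Poisson rate is inflated for classes whose running recall is low, and unlabeled instances are pseudo-labeled by self-training when the averaged ensemble confidence clears a threshold. The class name, base learner, weighting rule, and thresholds below are illustrative assumptions.

```python
# Hypothetical sketch only; details are assumptions not given in the abstract.
# Requires numpy and scikit-learn >= 1.1 (for loss="log_loss").
import numpy as np
from sklearn.linear_model import SGDClassifier


class RecallWeightedOnlineEnsemble:
    def __init__(self, classes, n_models=10, base_lambda=1.0,
                 confidence_threshold=0.9, seed=0):
        self.classes = list(np.unique(classes))          # sorted class list
        self.models = [SGDClassifier(loss="log_loss", random_state=seed + i)
                       for i in range(n_models)]
        self.base_lambda = base_lambda
        self.confidence_threshold = confidence_threshold
        self.rng = np.random.default_rng(seed)
        # Smoothed running counts for an online per-class recall estimate.
        self.tp = {c: 1.0 for c in self.classes}
        self.seen = {c: 2.0 for c in self.classes}

    def recall(self, c):
        return self.tp[c] / self.seen[c]

    def predict_proba(self, x):
        X = np.atleast_2d(x)
        fitted = [m for m in self.models if hasattr(m, "classes_")]
        if not fitted:                                   # nothing learned yet
            return np.full(len(self.classes), 1.0 / len(self.classes))
        # Every model was given the full class list, so outputs are aligned.
        return np.mean([m.predict_proba(X)[0] for m in fitted], axis=0)

    def predict(self, x):
        return self.classes[int(np.argmax(self.predict_proba(x)))]

    def learn_labeled(self, x, y):
        # Prequential bookkeeping: test on the instance before training on it.
        self.seen[y] += 1.0
        if self.predict(x) == y:
            self.tp[y] += 1.0
        # Low recall -> larger Poisson rate -> the instance is replicated more
        # often, emphasising classes the ensemble currently serves poorly.
        lam = self.base_lambda / max(self.recall(y), 1e-3)
        X = np.atleast_2d(x)
        for model in self.models:
            for _ in range(self.rng.poisson(lam)):       # sampling with replacement
                model.partial_fit(X, [y], classes=self.classes)

    def learn_unlabeled(self, x):
        # Self-training: adopt the ensemble's own prediction as a pseudo-label,
        # but only when the averaged confidence clears the threshold.
        proba = self.predict_proba(x)
        if proba.max() >= self.confidence_threshold:
            self.learn_labeled(x, self.classes[int(np.argmax(proba))])


# Toy usage on a synthetic skewed stream where 10% of instances arrive unlabeled.
if __name__ == "__main__":
    rng = np.random.default_rng(1)
    ens = RecallWeightedOnlineEnsemble(classes=[0, 1, 2])
    for _ in range(2000):
        y = rng.choice([0, 1, 2], p=[0.85, 0.10, 0.05])  # imbalanced priors
        x = rng.normal(loc=3.0 * y, scale=1.0, size=2)
        if rng.random() < 0.1:
            ens.learn_unlabeled(x)                       # label never arrives
        else:
            ens.learn_labeled(x, y)
    print({c: round(ens.recall(c), 3) for c in ens.classes})
```

This sketch deliberately omits drift handling; a faithful streaming setup would pair such an ensemble with a drift-aware base learner or an explicit drift detector and report prequential, imbalance-sensitive metrics such as per-class recall or G-mean.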