Class imbalance is a major issue when machine learning algorithms are used to build predictive models for fault detection in industrial plants in smart factories. In this paper, we propose a data oversampling method termed WASSKIL. WASSKIL builds on MAHAKIL, an oversampling method that simulates the genetic breeding process, but uses the Wasserstein distance rather than the Mahalanobis distance when partitioning the data into two sets for oversampling. We evaluate WASSKIL on five industrial plants from the PHM 2015 dataset, using both the raw sensor features and statistical features computed over the time series. The results show that WASSKIL can outperform MAHAKIL with both raw features and statistical features. Consequently, the proposed oversampling method has the potential to tame class imbalance and can be used for prognostics and health management in smart factories.
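The abstract does not spell out how the partitioning works, so the following is only a minimal sketch of one MAHAKIL-style breeding step in which the ranking distance is swapped for a Wasserstein distance. Several things here are assumptions rather than the authors' method: the function names `wasserstein_1d` and `breed_once` are hypothetical, each feature vector is treated as a 1-D empirical sample and compared with the class centroid, and the features are assumed to be on comparable scales.

```python
from statistics import fmean


def wasserstein_1d(u, v):
    """1-D Wasserstein-1 distance between two equal-size empirical samples.

    For equal-size samples with uniform weights, this is the mean absolute
    difference between the sorted values.
    """
    if len(u) != len(v):
        raise ValueError("this sketch assumes equal-size samples")
    return fmean(abs(a - b) for a, b in zip(sorted(u), sorted(v)))


def breed_once(minority):
    """One breeding generation over minority-class samples (assumed scheme).

    minority: list of feature vectors (lists of floats) of equal length.
    Returns the synthetic children produced by pairing the two partitions.
    """
    n_features = len(minority[0])
    # Class centroid: the per-feature mean of the minority samples.
    centroid = [fmean(row[j] for row in minority) for j in range(n_features)]
    # Rank samples from farthest to nearest to the centroid.
    ranked = sorted(
        minority,
        key=lambda row: wasserstein_1d(row, centroid),
        reverse=True,
    )
    # Split the ranking at its midpoint into two parent groups.
    # With an odd number of samples, the last ranked sample is left unpaired.
    half = len(ranked) // 2
    parents_a = ranked[:half]
    parents_b = ranked[half:2 * half]
    # Each child is the feature-wise average of one parent from each group.
    return [
        [(a + b) / 2 for a, b in zip(pa, pb)]
        for pa, pb in zip(parents_a, parents_b)
    ]


if __name__ == "__main__":
    faults = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [5.0, 5.0]]
    print(breed_once(faults))
```

Pairing a sample far from the centroid with one close to it keeps each child inside the region spanned by the existing minority samples, which is the intuition behind the breeding analogy. Repeating the step on parents and children together would grow the minority class further; that loop is omitted here.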
Title: WASSKIL: An Oversampling Method for Fault Detection of Industrial Plants
Authors: Jiawen Yan, Weiwen Zhang, Yuxiang Peng
Venue: 2020 2nd Symposium on Signal Processing Systems
Publication date: 2020-07-11
DOI: 10.1145/3421515.3421535 (https://doi.org/10.1145/3421515.3421535)