I. Alkhawaldeh, Ibrahem Albalkhi, Abdulqadir J Nashwan
World Journal of Methodology. Journal Article. Published 2023-12-20. DOI: 10.5662/wjm.v13.i5.373 (https://doi.org/10.5662/wjm.v13.i5.373)
Challenges and limitations of synthetic minority oversampling techniques in machine learning
Oversampling is the most widely used approach for handling class-imbalanced datasets, as reflected in the many oversampling methods developed over the last two decades. In this editorial, we discuss the problems with oversampling that stem from the risk of overfitting and from the generation of synthetic cases that may not accurately represent the minority class. These limitations should be considered whenever oversampling techniques are applied. We also propose several alternative strategies for handling imbalanced data and offer a perspective on future work.
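To make the concern about unrepresentative synthetic cases concrete, the following is a minimal sketch of SMOTE-style interpolation, not the editorial's own code. The function name `smote_like_oversample`, its parameters, and the toy data are assumptions introduced for illustration. Each synthetic point is placed on the line segment between a real minority sample and one of its nearest minority neighbours, so a synthetic case can land in a region where no real minority case exists.

```python
import random


def smote_like_oversample(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours (SMOTE-style).

    Hypothetical helper for illustration; assumes at least two minority points.
    """
    rng = random.Random(seed)

    def sq_dist(a, b):
        # Squared Euclidean distance between two points
        return sum((x - y) ** 2 for x, y in zip(a, b))

    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point, excluding itself
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sq_dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        t = rng.random()  # interpolation weight in [0, 1)
        synthetic.append(tuple(b + t * (o - b) for b, o in zip(base, other)))
    return synthetic


# Toy minority class: four points at the corners of the unit square
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_points = smote_like_oversample(minority, n_new=6)
```

In this toy setup every synthetic point falls on an edge of the square between two real cases. If majority-class cases occupied those edges, the synthetic points would overlap them, which is one way oversampling can blur the class boundary rather than clarify it.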