NoSimple: Data Bias Evaluation Metrics
S. Rahardja, P. Fränti
2024 18th International Conference on Ubiquitous Information Management and Communication (IMCOM), pp. 1-5
Published: 2024-01-03
DOI: 10.1109/IMCOM60618.2024.10418419 (https://doi.org/10.1109/IMCOM60618.2024.10418419)
Citations: 0
Abstract
Simple objects are defined as objects that are correctly classified by every outlier detector. Their presence distorts the measures used to evaluate binary classifiers, such as ROC or F1 score: a large number of simple objects falsely inflates the measured performance, which makes classifier evaluation less reliable. This manuscript proposes evaluation without simple objects (NoSimple). NoSimple accounts for simple objects by removing them from the data before the evaluation phase. Experiments with 30 real-world datasets demonstrate that NoSimple significantly reduced the average ROC of all classifiers by $0.04 \sim 0.06$. NoSimple is most effective when the percentage of simple objects exceeds $30\%$. By introducing a new method to reliably evaluate outlier classifiers, NoSimple has the potential to revolutionize evaluation metrics and has a multitude of applications in data science research.
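The idea can be illustrated with a minimal sketch. It is not the authors' implementation, and the abstract does not specify how detector scores are turned into classifications. The sketch assumes that each detector outputs an outlier score, that a score of at least 0.5 counts as a predicted outlier, and that "ROC" means the area under the ROC curve. The function names `roc_auc`, `simple_mask`, and `nosimple_auc`, the threshold, and the toy data are all hypothetical.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison; ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("ROC AUC needs at least one outlier and one inlier")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def simple_mask(labels, predictions):
    """True for objects that every detector classifies correctly."""
    return [all(pred[i] == y for pred in predictions)
            for i, y in enumerate(labels)]


def nosimple_auc(labels, scores, mask):
    """ROC AUC computed only on the objects that are not simple."""
    kept_labels = [y for y, m in zip(labels, mask) if not m]
    kept_scores = [s for s, m in zip(scores, mask) if not m]
    return roc_auc(kept_labels, kept_scores)


# Toy data (hypothetical): 1 = outlier, 0 = inlier.
labels = [0, 0, 0, 0, 0, 0, 1, 1]
detector_scores = {
    "A": [0.1, 0.2, 0.1, 0.6, 0.3, 0.7, 0.9, 0.4],
    "B": [0.2, 0.1, 0.3, 0.4, 0.6, 0.2, 0.8, 0.7],
}

# Assumed decision rule: score >= 0.5 means "predicted outlier".
THRESHOLD = 0.5
predictions = [[int(s >= THRESHOLD) for s in scores]
               for scores in detector_scores.values()]

mask = simple_mask(labels, predictions)
simple_fraction = sum(mask) / len(mask)
print(f"simple objects: {simple_fraction:.0%}")

for name, scores in detector_scores.items():
    full = roc_auc(labels, scores)
    reduced = nosimple_auc(labels, scores, mask)
    print(f"detector {name}: ROC {full:.3f} -> {reduced:.3f} without simple objects")
```

In this toy data half of the objects are simple. Removing them lowers detector A's ROC from about 0.83 to about 0.33, while detector B stays at 1.0, which shows how simple objects can mask differences between detectors. Note that `nosimple_auc` raises an error if removing the simple objects leaves only one class, a case the abstract does not address.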