Data transformation and attribute subset selection: Do they help make differences in software failure prediction?
Hao Jia, Fengdi Shu, Ye Yang, Qi Li
2009 IEEE International Conference on Software Maintenance, published 2009-10-30
DOI: 10.1109/ICSM.2009.5306382 (https://doi.org/10.1109/ICSM.2009.5306382)
Citations: 2
Abstract
Data transformation and attribute subset selection have both been adopted to improve software defect/failure prediction methods, but little consensus has been reached on their effectiveness. This paper reports a comparative study of these two kinds of techniques, combined with four classifiers and applied to datasets from two projects. The results indicate that data transformation has no obvious influence on prediction performance, while attribute subset selection methods produce markedly inconsistent results. The paper also discusses the consistency of these findings across releases and the discrepancies between the open-source and in-house maintenance projects in the evaluation of these methods.