{"title":"Identifying Similar Software Datasets through Fuzzy Inference System","authors":"S. Anwar, Z. Rana, M. Awais","doi":"10.1109/FIT.2012.40","DOIUrl":null,"url":null,"abstract":"Similar software have similar software measurements. Defect data from one software can be used to anticipate defects in a similar software. Although, not many defect datasets are made public in software engineering domain, PROMISE repository is a reasonable collection of software data. This paper presents a two step approach to identify similar software and applies the proposed technique to find similar datasets in PROMISE repository. As step 1, the approach generates associations rules for each dataset to determine dataset's behavior in terms of frequent patterns. As step 2, overlap between the association rules is calculated using Fuzzy Inference Systems (FIS). The FIS generated for the study have been expert-based as well as auto-generated. Similarity between 28 dataset pairs has been found KC2 and PC1 turned out to be most similar datasets with 86% similarity using Mamdani, 92% with Sugeno models. Results from expert-based and auto generated FIS have been comparable.","PeriodicalId":166149,"journal":{"name":"2012 10th International Conference on Frontiers of Information Technology","volume":" 4","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2012-12-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2012 10th International Conference on Frontiers of Information Technology","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/FIT.2012.40","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0
Abstract
Similar software have similar software measurements. Defect data from one software can be used to anticipate defects in a similar software. Although, not many defect datasets are made public in software engineering domain, PROMISE repository is a reasonable collection of software data. This paper presents a two step approach to identify similar software and applies the proposed technique to find similar datasets in PROMISE repository. As step 1, the approach generates associations rules for each dataset to determine dataset's behavior in terms of frequent patterns. As step 2, overlap between the association rules is calculated using Fuzzy Inference Systems (FIS). The FIS generated for the study have been expert-based as well as auto-generated. Similarity between 28 dataset pairs has been found KC2 and PC1 turned out to be most similar datasets with 86% similarity using Mamdani, 92% with Sugeno models. Results from expert-based and auto generated FIS have been comparable.