Rough sets used in the measurement of similarity of mixed mode data
S. Coppock, L. Mazlack
22nd International Conference of the North American Fuzzy Information Processing Society, NAFIPS 2003
Published: 2003-07-24
DOI: 10.1109/NAFIPS.2003.1226781 (https://doi.org/10.1109/NAFIPS.2003.1226781)
Citation count: 8
Abstract
Similarity is important in knowledge discovery. Cluster analysis, classification, and granulation each involve some notion or definition of similarity. A similarity measure is selected based on the domain and distribution of the data, and even within a specific domain some similarity metrics may be considered more useful than others. Quantitatively measuring the similarity between records of mixed data involves some uncertainty, which arises from the lack of scale in both nominal and ordinal data. Rough set theory is one tool developed for handling uncertainty, and rough sets can be used in dissimilarity analysis of qualitative data. It would therefore seem that rough sets could be applied to measuring similarity between records containing both quantitative and qualitative data for the purpose of clustering the records.