Nazanin Vafaei, Rita Almeida Ribeiro, L. Camarinha-Matos
International Journal of Decision Support System Technology, pages 1-17, published 2022-01-01. DOI: https://doi.org/10.4018/ijdsst.286184
Comparison of Normalization Techniques on Data Sets With Outliers
With the rapid growth of data-rich systems, dealing with complex decision problems whose input data sets are skewed and contain outliers is unavoidable. Generally, data skewness refers to a non-uniform distribution in a dataset, i.e. a dataset that contains asymmetries and/or outliers. Normalization is the first step of most multi-criteria decision making (MCDM) problems: it produces dimensionless data from heterogeneous input data sets, which enables aggregation of criteria and thereby ranking of alternatives. Therefore, when outliers are present in criteria datasets, finding a suitable normalization technique is of utmost importance. In this work, we compare seven normalization techniques (Max, Max-Min, Vector, Sum, Logarithmic, Target-based, and Fuzzification) on criteria datasets that contain outliers, and analyse their results for MCDM problems. A numerical example illustrates the behaviour of the chosen normalization techniques, and an (ongoing) evaluation assessment framework is used to recommend the best normalization technique for this type of criteria.
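
To make the effect of an outlier concrete, the following is a minimal sketch of five of the named techniques (Max, Max-Min, Sum, Vector, Logarithmic) for a single benefit criterion. The formulas are the forms commonly given in the MCDM literature and are an assumption here, since the abstract does not state them; the function names and the sample values are hypothetical and are not taken from the paper. Target-based normalization and fuzzification are omitted because they need extra parameters (a target value, membership functions) that the abstract does not specify.

```python
import math

def max_norm(xs):
    # Max: divide each value by the largest value.
    m = max(xs)
    return [x / m for x in xs]

def max_min_norm(xs):
    # Max-Min: rescale so the smallest value maps to 0 and the largest to 1.
    lo, hi = min(xs), max(xs)
    return [(x - lo) / (hi - lo) for x in xs]

def sum_norm(xs):
    # Sum: divide each value by the sum of all values.
    s = sum(xs)
    return [x / s for x in xs]

def vector_norm(xs):
    # Vector: divide each value by the Euclidean norm of the column.
    n = math.sqrt(sum(x * x for x in xs))
    return [x / n for x in xs]

def log_norm(xs):
    # Logarithmic: ln(x) divided by ln of the product of all values
    # (computed as a sum of logs). Requires strictly positive values.
    denom = sum(math.log(x) for x in xs)
    return [math.log(x) / denom for x in xs]

# Hypothetical criterion values; 100.0 plays the role of the outlier.
values = [10.0, 12.0, 11.0, 13.0, 100.0]

for name, fn in [("Max", max_norm), ("Max-Min", max_min_norm),
                 ("Sum", sum_norm), ("Vector", vector_norm),
                 ("Logarithmic", log_norm)]:
    print(name, [round(v, 3) for v in fn(values)])
```

With these made-up values, the ratio-based techniques push the four ordinary alternatives close together near the bottom of the scale, while the logarithmic form compresses the gap to the outlier. This is the kind of behaviour the comparison in the paper is concerned with.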