Comparison of Normalization Techniques on Data Sets With Outliers

International Journal of Decision Support System Technology (IF 0.6; JCR Q4, Computer Science, Information Systems); Pub Date: 2022-01-01; DOI: 10.4018/ijdsst.286184
Nazanin Vafaei, Rita Almeida Ribeiro, L. Camarinha-Matos
Volume: 2 1; Pages: 1-17; Publication type: Journal Article; Open access: no; Indexed via: Semantic Scholar
Citations: 2

Abstract

With the rapid growth of data-rich systems, complex decision problems whose input data sets are skewed and contain outliers have become unavoidable. Generally, data skewness refers to a non-uniform distribution in a dataset, i.e. a dataset that contains asymmetries and/or outliers. Normalization is the first step of most multi-criteria decision making (MCDM) problems: it converts heterogeneous input data sets into dimensionless data, which enables the aggregation of criteria and thereby the ranking of alternatives. Therefore, when outliers are present in criteria datasets, finding a suitable normalization technique is of utmost importance. In this work, we compare seven normalization techniques (Max, Max-Min, Vector, Sum, Logarithmic, Target-based, and Fuzzification) on criteria datasets that contain outliers, in order to analyse their results for MCDM problems. A numerical example illustrates the behaviour of the chosen normalization techniques, and an (ongoing) evaluation assessment framework is used to recommend the best normalization technique for this type of criteria.
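
To make the comparison concrete, the following is a minimal sketch of five of the seven named techniques applied to a single criterion that contains one outlier. The abstract does not give the formulas, so the forms used here are an assumption: they are the commonly cited benefit-criterion variants (larger is better). The function names and the sample values are hypothetical, and Target-based and Fuzzification are left out because they need extra parameters (a target value, a membership function) that the abstract does not specify.

```python
import math

# Assumed benefit-criterion forms (larger is better); not taken from the paper.

def max_norm(xs):
    """Max: divide each value by the largest value."""
    m = max(xs)
    return [x / m for x in xs]

def max_min_norm(xs):
    """Max-Min: rescale linearly so min -> 0 and max -> 1."""
    lo, hi = min(xs), max(xs)
    return [(x - lo) / (hi - lo) for x in xs]

def sum_norm(xs):
    """Sum: divide each value by the sum of all values."""
    s = sum(xs)
    return [x / s for x in xs]

def vector_norm(xs):
    """Vector: divide each value by the Euclidean norm of the column."""
    n = math.sqrt(sum(x * x for x in xs))
    return [x / n for x in xs]

def log_norm(xs):
    """Logarithmic: ln(x) divided by the sum of ln over the column.

    Requires strictly positive values whose logs do not sum to zero.
    """
    s = sum(math.log(x) for x in xs)
    return [math.log(x) / s for x in xs]

# Hypothetical criterion values; the last one is the outlier.
values = [10.0, 12.0, 11.0, 13.0, 100.0]

techniques = {
    "Max": max_norm,
    "Max-Min": max_min_norm,
    "Sum": sum_norm,
    "Vector": vector_norm,
    "Logarithmic": log_norm,
}

for name, fn in techniques.items():
    print(f"{name:12s}", [round(v, 3) for v in fn(values)])
```

Running the sketch shows how a single large value compresses the remaining normalized values toward the low end under the linear techniques, while the logarithmic form dampens that effect. This is an illustration of the kind of behaviour the numerical example in the paper examines, not a reproduction of its results.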
Source journal: International Journal of Decision Support System Technology (Computer Science, Information Systems)
CiteScore: 2.20
Self-citation rate: 18.20%
Articles published: 40