Multivariate interdependent discretization for continuous attribute
S. Chao, Yiping Li
Third International Conference on Information Technology and Applications (ICITA'05), 2005-07-04
DOI: 10.1109/ICITA.2005.188 (https://doi.org/10.1109/ICITA.2005.188)
Decision trees are among the most widely used and practical methods in data mining and machine learning. However, many discretization algorithms developed in this field are univariate only, which is inadequate for the critical problems that are particular to the medical domain. In this paper, we propose a new multivariate discretization method called multivariate interdependent discretization for continuous attributes (MIDCA). Our algorithm minimizes the uncertainty between the interdependent attribute and the continuous-valued attribute while maximizing their correlation. The empirical results compare the performance of various decision tree algorithms on twelve real-life datasets from the UCI repository.
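
The abstract does not describe MIDCA's actual procedure, so the following is only a rough sketch of the general idea of discretizing a continuous attribute with respect to a second, related attribute rather than in isolation. It assumes the "uncertainty" is measured as Shannon entropy and that a single binary cut is wanted; the function names `entropy` and `best_cut` and the parameter `partner` are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of discrete labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return sum((c / n) * math.log2(n / c) for c in Counter(labels).values())


def best_cut(values, partner):
    """Pick one cut point on `values` (continuous) that leaves the least
    uncertainty about `partner` (a discrete attribute assumed to be
    interdependent with it). Returns (cut, weighted_entropy).

    Illustrative assumption only: this is a plain entropy-minimizing
    binary split, not the MIDCA algorithm itself.
    """
    pairs = sorted(zip(values, partner), key=lambda p: p[0])
    n = len(pairs)
    best = (None, float("inf"))
    for i in range(1, n):
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue  # equal values cannot be separated by a cut
        left = [p for _, p in pairs[:i]]
        right = [p for _, p in pairs[i:]]
        # Entropy of the partner attribute, weighted by partition size.
        h = (len(left) * entropy(left) + len(right) * entropy(right)) / n
        if h < best[1]:
            best = ((lo + hi) / 2, h)
    return best
```

For example, `best_cut([1.0, 2.0, 3.0, 10.0, 11.0, 12.0], ["a", "a", "a", "b", "b", "b"])` places the cut at 6.5, the midpoint between the two groups, because that split leaves no remaining uncertainty about the partner attribute. A multivariate method in the spirit of the abstract would presumably also have to choose which attribute plays the partner role for each continuous attribute; that step is not sketched here.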