{"title":"基于树的监督分类学习算法的诊断工具","authors":"G. Holmes, L. Trigg","doi":"10.1109/ICONIP.1999.845647","DOIUrl":null,"url":null,"abstract":"The process of developing applications of machine learning and data mining that employ supervised classification algorithms includes the important step of knowledge verification. Interpretable output is presented to a user so that they can verify that the knowledge contained in the output makes sense for the given application. As the development of an application is an iterative process it is quite likely that a user would wish to compare models constructed at various times or stages. One crucial stage where comparison of models is important is when the accuracy of a model is being estimated, typically using some form of cross-validation. This stage is used to establish an estimate of how well a model will perform on unseen data. This is vital information to present to a user, but it is also important to show the degree of variation between models obtained from the entire dataset and models obtained during cross-validation. In this way it can be verified that the cross-validation models are at least structurally aligned with the model garnered from the entire dataset. This paper presents a diagnostic tool for the comparison of tree-based supervised classification models. The method is adapted from work on approximate tree matching and applied to decision trees. The tool is described together with experimental results on standard datasets.","PeriodicalId":237855,"journal":{"name":"ICONIP'99. ANZIIS'99 & ANNES'99 & ACNN'99. 6th International Conference on Neural Information Processing. Proceedings (Cat. No.99EX378)","volume":"199 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"1999-11-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"11","resultStr":"{\"title\":\"A diagnostic tool for tree based supervised classification learning algorithms\",\"authors\":\"G. Holmes, L. Trigg\",\"doi\":\"10.1109/ICONIP.1999.845647\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The process of developing applications of machine learning and data mining that employ supervised classification algorithms includes the important step of knowledge verification. Interpretable output is presented to a user so that they can verify that the knowledge contained in the output makes sense for the given application. As the development of an application is an iterative process it is quite likely that a user would wish to compare models constructed at various times or stages. One crucial stage where comparison of models is important is when the accuracy of a model is being estimated, typically using some form of cross-validation. This stage is used to establish an estimate of how well a model will perform on unseen data. This is vital information to present to a user, but it is also important to show the degree of variation between models obtained from the entire dataset and models obtained during cross-validation. In this way it can be verified that the cross-validation models are at least structurally aligned with the model garnered from the entire dataset. This paper presents a diagnostic tool for the comparison of tree-based supervised classification models. The method is adapted from work on approximate tree matching and applied to decision trees. The tool is described together with experimental results on standard datasets.\",\"PeriodicalId\":237855,\"journal\":{\"name\":\"ICONIP'99. ANZIIS'99 & ANNES'99 & ACNN'99. 6th International Conference on Neural Information Processing. Proceedings (Cat. No.99EX378)\",\"volume\":\"199 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"1999-11-16\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"11\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"ICONIP'99. ANZIIS'99 & ANNES'99 & ACNN'99. 6th International Conference on Neural Information Processing. Proceedings (Cat. No.99EX378)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICONIP.1999.845647\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"ICONIP'99. ANZIIS'99 & ANNES'99 & ACNN'99. 6th International Conference on Neural Information Processing. Proceedings (Cat. No.99EX378)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICONIP.1999.845647","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
A diagnostic tool for tree based supervised classification learning algorithms
The process of developing applications of machine learning and data mining that employ supervised classification algorithms includes the important step of knowledge verification. Interpretable output is presented to a user so that they can verify that the knowledge contained in the output makes sense for the given application. As the development of an application is an iterative process it is quite likely that a user would wish to compare models constructed at various times or stages. One crucial stage where comparison of models is important is when the accuracy of a model is being estimated, typically using some form of cross-validation. This stage is used to establish an estimate of how well a model will perform on unseen data. This is vital information to present to a user, but it is also important to show the degree of variation between models obtained from the entire dataset and models obtained during cross-validation. In this way it can be verified that the cross-validation models are at least structurally aligned with the model garnered from the entire dataset. This paper presents a diagnostic tool for the comparison of tree-based supervised classification models. The method is adapted from work on approximate tree matching and applied to decision trees. The tool is described together with experimental results on standard datasets.