Statistical techniques vs. SEE5 algorithm: an application to a small business environment
Authors: J. Andrés
Journal: The International Journal of Digital Accounting Research
DOI: 10.4192/1577-8517-V1_8 (https://doi.org/10.4192/1577-8517-V1_8)
Citations: 17
Abstract
The aim of this research is to compare the accuracy of a rule induction classifier system, Quinlan's SEE5, with that of linear discriminant analysis and logit. The classification task is to distinguish the most efficient companies from the least efficient ones on the basis of a set of financial variables. The sample consists of a database containing the annual accounts of companies located in the Principality of Asturias (Spain), most of which are small businesses. The main results indicate that SEE5 outperforms logit but is not clearly better than discriminant analysis. However, SEE5 models suffer larger increases in error rates when tested on validation samples. Another interesting finding is that, in SEE5 systems, both the number of variables selected and the number of rules inferred grow as sample size increases.
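As a rough illustration of the kind of comparison the abstract describes, the sketch below fits two classifiers on a training sample and reports how much each one's error rate rises on a separate validation sample. It is not the study's method or data. The observations are synthetic, the two features stand in for hypothetical financial ratios, the rule inducer is a one-rule decision stump used as a toy stand-in (it is not SEE5), and the discriminant is a two-feature Fisher linear discriminant. All function names are assumptions made for this example.

```python
import random

def error_rate(predict, X, y):
    """Fraction of samples where predict(x) differs from the true label."""
    wrong = sum(1 for xi, yi in zip(X, y) if predict(xi) != yi)
    return wrong / len(y)

def fit_stump(X, y):
    """Induce one rule 'if x[j] > t then hi else 1 - hi' with minimum training error.
    Toy stand-in for a rule inducer; this is NOT the SEE5 algorithm."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({x[j] for x in X}):
            for hi in (0, 1):
                pred = lambda x, j=j, t=t, hi=hi: hi if x[j] > t else 1 - hi
                err = error_rate(pred, X, y)
                if best is None or err < best[0]:
                    best = (err, pred)
    return best[1]

def fit_lda(X, y):
    """Two-class Fisher linear discriminant for exactly two features,
    with the cut-off at the midpoint between the class means."""
    groups = {k: [x for x, yi in zip(X, y) if yi == k] for k in (0, 1)}
    means = {k: [sum(col) / len(g) for col in zip(*g)] for k, g in groups.items()}
    # Pooled within-class scatter matrix (2x2).
    s = [[0.0, 0.0], [0.0, 0.0]]
    for k, g in groups.items():
        for x in g:
            d = [x[0] - means[k][0], x[1] - means[k][1]]
            for a in range(2):
                for b in range(2):
                    s[a][b] += d[a] * d[b]
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    diff = [means[1][0] - means[0][0], means[1][1] - means[0][1]]
    # w = S^{-1} (m1 - m0), using the closed-form inverse of a 2x2 matrix.
    w = [(s[1][1] * diff[0] - s[0][1] * diff[1]) / det,
         (-s[1][0] * diff[0] + s[0][0] * diff[1]) / det]
    mid = [(means[0][i] + means[1][i]) / 2 for i in range(2)]
    c = w[0] * mid[0] + w[1] * mid[1]
    return lambda x: 1 if w[0] * x[0] + w[1] * x[1] > c else 0

def make_sample(n, rng):
    """Synthetic data: class 0 centred at (0, 0), class 1 at (1, 1)."""
    X, y = [], []
    for i in range(n):
        label = i % 2
        X.append([rng.gauss(label, 1.0), rng.gauss(label, 1.0)])
        y.append(label)
    return X, y

rng = random.Random(0)  # fixed seed so repeated runs give the same samples
X_train, y_train = make_sample(100, rng)
X_valid, y_valid = make_sample(100, rng)

results = {}
for name, fit in (("stump", fit_stump), ("LDA", fit_lda)):
    model = fit(X_train, y_train)
    e_train = error_rate(model, X_train, y_train)
    e_valid = error_rate(model, X_valid, y_valid)
    results[name] = (e_train, e_valid)
    print(f"{name}: train={e_train:.3f} valid={e_valid:.3f} "
          f"increase={e_valid - e_train:+.3f}")
```

The gap between training and validation error is the quantity behind the abstract's remark about validation samples. Whatever this toy example prints says nothing about how SEE5, logit, or discriminant analysis rank on the Asturias data.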