The sensitivity of machine learning techniques to variations in sample size: a comparative analysis
J. Andrés, P. L. Fernández, E. Combarro
The International Journal of Digital Accounting Research
DOI: 10.4192/1577-8517-V2_5 (https://doi.org/10.4192/1577-8517-V2_5)
Citations: 3
Abstract
This paper compares the performance of several machine learning techniques: Quinlan's See5, ARNI, FAN, and SVM. The classification task is to forecast the efficiency level of Spanish commercial and industrial companies. Firms are assigned to classes on the basis of a set of financial ratios, which form a high-dimensional feature space with a low degree of separability. The study measures how variations in the estimation sample size affect the accuracy of each technique. The main results suggest that ARNI and See5 achieve the highest accuracy, even with small sample sizes.
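
As a rough illustration of the kind of experiment the abstract describes, the sketch below estimates test accuracy as a function of estimation (training) sample size. Everything in it is an assumption made for illustration: the data are synthetic overlapping Gaussian classes rather than financial ratios of Spanish firms, the classifier is a simple nearest-centroid rule rather than See5, ARNI, FAN, or SVM, and the function names (`make_sample`, `fit_centroids`, `learning_curve`) are hypothetical. It is not the authors' procedure.

```python
import random


def make_sample(n, n_features=10, shift=0.3, rng=None):
    """Draw n labelled points from two overlapping Gaussian classes.

    Synthetic stand-in for a high-dimensional, poorly separable feature
    space; it is NOT the financial-ratio data used in the paper.
    """
    rng = rng or random.Random(0)
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        x = [rng.gauss(shift * label, 1.0) for _ in range(n_features)]
        data.append((x, label))
    return data


def fit_centroids(train):
    """Fit a nearest-centroid classifier: one mean vector per class."""
    sums, counts = {}, {}
    for x, y in train:
        if y not in sums:
            sums[y] = [0.0] * len(x)
            counts[y] = 0
        sums[y] = [s + v for s, v in zip(sums[y], x)]
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}


def predict(centroids, x):
    """Return the class whose centroid is closest in squared distance."""
    return min(
        centroids,
        key=lambda y: sum((a - b) ** 2 for a, b in zip(x, centroids[y])),
    )


def accuracy(centroids, test):
    """Fraction of test points classified correctly."""
    hits = sum(predict(centroids, x) == y for x, y in test)
    return hits / len(test)


def learning_curve(sizes, n_repeats=20, seed=0):
    """Mean test accuracy for each estimation sample size in `sizes`."""
    rng = random.Random(seed)
    test = make_sample(500, rng=rng)              # fixed hold-out set
    pool = make_sample(max(sizes) * 5, rng=rng)   # pool to subsample from
    curve = {}
    for n in sizes:
        scores = []
        for _ in range(n_repeats):
            train = rng.sample(pool, n)           # random subsample of size n
            scores.append(accuracy(fit_centroids(train), test))
        curve[n] = sum(scores) / len(scores)
    return curve


if __name__ == "__main__":
    for n, acc in learning_curve([10, 20, 50, 100, 200]).items():
        print(f"n={n:4d}  mean accuracy={acc:.3f}")
```

Averaging over repeated random subsamples at each size reduces the noise that a single small training draw would introduce. Comparing several techniques would mean running the same loop once per classifier on identical subsamples.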