Independent validation as a validation method for classification

Quantitative and computational methods in behavioral sciences. Pub Date: 2023-12-22. DOI: 10.5964/qcmb.12069

Tina Braun, Hannes Eckert, Timo von Oertzen

Abstract

The use of classifiers provides an alternative to conventional statistical methods. This involves taking the accuracy with which the classifier assigns data to the correct group and applying statistical tests to that accuracy to compare the performance of classifiers. The conventional validation methods for determining the accuracy of classifiers have the disadvantage that the number of correct classifications does not follow any known distribution, so applying statistical tests is problematic. Independent validation circumvents this problem and allows the use of binomial tests to assess the performance of classifiers. However, independent validation accuracy is subject to bias for small training datasets. The present study shows that a hyperbolic function can be used to estimate the loss in classifier accuracy for independent validation. This function is used to develop three new methods that estimate the classifier accuracy for small training sets more precisely. These methods are compared to two existing methods in a simulation study. The results show small overall errors in the estimated classifier accuracy and indicate that independent validation can be used with small samples. A least-squares estimation approach seems best suited to estimate the classifier accuracy.
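The abstract does not spell out the procedure, so the following is only a minimal sketch under stated assumptions. It assumes that independent validation classifies each new data point with a classifier trained solely on earlier points and only then adds that point to the training set, so that the correct/incorrect outcomes can be treated as independent and tested with a binomial test. It also assumes a hyperbola of the form accuracy(n) = a - b / n for the accuracy at training size n, which may differ from the function used in the paper. The nearest-centroid classifier and all function names are hypothetical and chosen for illustration.

```python
import math
import random


def binomial_p_value(successes, trials, chance=0.5):
    """One-sided exact binomial test: P(X >= successes) for X ~ Bin(trials, chance)."""
    return sum(
        math.comb(trials, k) * chance**k * (1 - chance) ** (trials - k)
        for k in range(successes, trials + 1)
    )


def nearest_centroid_predict(train, x):
    """Toy 1-D classifier: predict the label whose training mean is closest to x."""
    means = {}
    for label in {lab for _, lab in train}:
        values = [v for v, lab in train if lab == label]
        means[label] = sum(values) / len(values)
    return min(means, key=lambda label: abs(means[label] - x))


def independent_validation(data, start_size):
    """Classify each point with a model trained only on earlier points,
    then move that point into the training set (assumed procedure).
    Returns a list of (training_size, was_correct) pairs."""
    train = list(data[:start_size])
    outcomes = []
    for x, label in data[start_size:]:
        outcomes.append((len(train), nearest_centroid_predict(train, x) == label))
        train.append((x, label))
    return outcomes


def fit_hyperbola(outcomes):
    """Least-squares fit of accuracy(n) = a - b / n to the 0/1 outcomes.
    This is ordinary linear regression on the predictor 1 / n."""
    xs = [1.0 / n for n, _ in outcomes]
    ys = [1.0 if ok else 0.0 for _, ok in outcomes]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    a = mean_y - slope * mean_x  # estimated accuracy as n grows large
    b = -slope                   # size of the accuracy loss at small n
    return a, b


# Simulated two-group data: group means 0.0 and 1.5, unit standard deviation.
rng = random.Random(0)
data = [(rng.gauss(1.5 * label, 1.0), label) for label in (0, 1) * 100]
rng.shuffle(data)

outcomes = independent_validation(data, start_size=4)
correct = sum(ok for _, ok in outcomes)
print("accuracy:", correct / len(outcomes))
print("p-value against chance:", binomial_p_value(correct, len(outcomes)))
print("hyperbola (a, b):", fit_hyperbola(outcomes))
```

In this sketch the fitted `a` plays the role of the accuracy the classifier would reach with a large training set, while the raw proportion of correct classifications also includes the early predictions made from very few training points.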
