Independent validation as a validation method for classification

Tina Braun, Hannes Eckert, Timo von Oertzen
Journal: Quantitative and Computational Methods in Behavioral Sciences
Published: 2023-12-22
DOI: 10.5964/qcmb.12069 (https://doi.org/10.5964/qcmb.12069)

Abstract

The use of classifiers provides an alternative to conventional statistical methods. In this approach, the accuracy with which a classifier assigns data to the correct group serves as the basis for statistical tests that compare the performance of classifiers. Conventional validation methods for determining classifier accuracy have the disadvantage that the number of correct classifications does not follow any known distribution, which makes the application of statistical tests problematic. Independent validation circumvents this problem and allows the use of binomial tests to assess the performance of classifiers. However, independent validation accuracy is subject to bias for small training datasets. The present study shows that a hyperbolic function can be used to estimate the loss in classifier accuracy for independent validation. This function is used to develop three new methods that estimate classifier accuracy for small training sets more precisely. These methods are compared with two existing methods in a simulation study. The results indicate that errors in the estimated classifier accuracy are small overall and that independent validation can be used with small samples. A least squares estimation approach seems best suited to estimating classifier accuracy.
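The procedure summarized in the abstract can be illustrated with a minimal sketch. The abstract does not spell out the algorithm, so the code below rests on assumptions: independent validation is taken to mean that each data point is classified exactly once, before it is added to the training set; the hyperbolic accuracy curve is assumed to have the form acc(n) = a - b/n; and the toy nearest-mean classifier and all function names are hypothetical. It is not a reproduction of the paper's three estimation methods.

```python
import math
import random


def predict_nearest_mean(train_x, train_y, x):
    """Toy 1-D classifier: return the label whose training mean is closest to x."""
    means = {}
    for label in set(train_y):
        members = [v for v, l in zip(train_x, train_y) if l == label]
        means[label] = sum(members) / len(members)
    return min(means, key=lambda label: abs(x - means[label]))


def independent_validation(xs, ys, n_start):
    """Test each point exactly once, before it is added to the training set.

    Assumed procedure (not stated in the abstract); requires n_start >= 1.
    Returns a list of (training_set_size_at_test_time, was_correct) pairs.
    """
    train_x, train_y = list(xs[:n_start]), list(ys[:n_start])
    outcomes = []
    for x, y in zip(xs[n_start:], ys[n_start:]):
        correct = predict_nearest_mean(train_x, train_y, x) == y
        outcomes.append((len(train_x), correct))
        train_x.append(x)  # only now may the point be used for training
        train_y.append(y)
    return outcomes


def binomial_p_value(hits, n, p0=0.5):
    """One-sided exact binomial test: P(X >= hits) for X ~ Binomial(n, p0)."""
    return sum(math.comb(n, k) * p0**k * (1 - p0) ** (n - k)
               for k in range(hits, n + 1))


def fit_hyperbola(outcomes):
    """Least-squares fit of the ASSUMED curve acc(n) = a - b / n.

    Regresses the 0/1 outcomes on z = 1/n; a estimates the asymptotic accuracy.
    Needs at least two outcomes with different training set sizes.
    """
    zs = [1.0 / n for n, _ in outcomes]
    ys = [1.0 if correct else 0.0 for _, correct in outcomes]
    mean_z, mean_y = sum(zs) / len(zs), sum(ys) / len(ys)
    var_z = sum((z - mean_z) ** 2 for z in zs)
    slope = sum((z - mean_z) * (y - mean_y) for z, y in zip(zs, ys)) / var_z
    return mean_y - slope * mean_z, -slope  # (a, b)


# Illustrative simulated data: two groups with means 0 and 2, unit variance.
rng = random.Random(0)
labels = [i % 2 for i in range(60)]
values = [rng.gauss(2.0 * label, 1.0) for label in labels]

outcomes = independent_validation(values, labels, n_start=4)
hits = sum(correct for _, correct in outcomes)
print("raw accuracy:", hits / len(outcomes))
print("p-value vs. chance:", binomial_p_value(hits, len(outcomes)))
print("fitted (a, b):", fit_hyperbola(outcomes))
```

Because no point is used for training before it is tested, the test outcomes can be treated as independent trials, which is what makes the binomial test applicable. The pooled raw accuracy mixes results from small and large training sets, which is the small-sample bias the abstract mentions; the fitted `a` is one simple way to estimate the accuracy that would be reached with a large training set under the assumed curve.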