Estimating the probabilities of misclassification using CV when the dimension and the sample sizes are large
Author: Tomoyuki Nakagawa
Journal: Hiroshima Mathematical Journal
Published: 2018-07-19
DOI: 10.32917/HMJ/1544238034 (https://doi.org/10.32917/HMJ/1544238034)
In this paper, we study the estimation of misclassification probabilities for high-dimensional data. Cross-validation (CV) is often used to estimate these probabilities. CV provides a nearly unbiased estimate from the original data when the sample sizes are large. However, the properties of CV are not well understood when the dimension is large relative to the sample sizes. We therefore investigate the asymptotic properties of CV when both the dimension and the sample sizes tend to be large. Furthermore, we propose three CV-based bias-correction methods that remain usable for high-dimensional data. We evaluate the performance of the estimators in simulation studies.
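As a rough illustration of the quantity discussed in the abstract, the sketch below computes leave-one-out CV estimates of the two misclassification probabilities for a two-group problem. The abstract does not name the discriminant rule, the simulation design, or the three correction methods, so everything here is an assumption made for illustration: the nearest-centroid (Euclidean distance) rule, the helper names, and the Gaussian toy data are hypothetical, and no bias correction is attempted.

```python
import random


def centroid(rows):
    """Coordinate-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((u - v) ** 2 for u, v in zip(a, b))


def classify(x, c1, c2):
    """Assumed rule: assign x to the group with the nearer centroid."""
    return 1 if sq_dist(x, c1) <= sq_dist(x, c2) else 2


def loo_cv_error_rates(g1, g2):
    """Leave-one-out CV estimates of the two misclassification probabilities.

    Each observation is held out, the centroid of its own group is
    recomputed without it, and the held-out point is then classified.
    Returns (share of group 1 assigned to 2, share of group 2 assigned to 1).
    """
    c2_full = centroid(g2)
    wrong1 = sum(
        classify(x, centroid(g1[:i] + g1[i + 1:]), c2_full) != 1
        for i, x in enumerate(g1)
    )
    c1_full = centroid(g1)
    wrong2 = sum(
        classify(x, c1_full, centroid(g2[:j] + g2[j + 1:])) != 2
        for j, x in enumerate(g2)
    )
    return wrong1 / len(g1), wrong2 / len(g2)


def simulate(n, p, shift, rng):
    """Hypothetical toy data: n vectors of dimension p, i.i.d. N(shift, 1) entries."""
    return [[rng.gauss(shift, 1.0) for _ in range(p)] for _ in range(n)]


rng = random.Random(0)
n1, n2, p = 20, 20, 50  # dimension larger than each sample size
g1 = simulate(n1, p, 0.5, rng)
g2 = simulate(n2, p, 0.0, rng)
e1, e2 = loo_cv_error_rates(g1, g2)
print(f"LOO-CV estimates: e(2|1) = {e1:.3f}, e(1|2) = {e2:.3f}")
```

Rerunning the sketch with different values of `p`, `n1`, and `n2` gives a feel for how the CV estimates change as the dimension grows relative to the sample sizes, which is the regime the paper studies asymptotically.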
About the journal:
Hiroshima Mathematical Journal (HMJ) is a continuation of Journal of Science of the Hiroshima University, Series A, Vol. 1-24 (1930-1960), and Journal of Science of the Hiroshima University, Series A-I, Vol. 25-34 (1961-1970).
Starting with Volume 4 (1974), each volume of HMJ consists of three numbers annually. This journal publishes original papers in pure and applied mathematics. HMJ is an (electronically) open access journal from Volume 36, Number 1.