{"title":"Asymptotically optimal model selection and neural nets","authors":"A. Barron","doi":"10.1109/WITS.1994.513871","DOIUrl":null,"url":null,"abstract":"A minimum description length criterion for inference of functions in both parametric and nonparametric settings is determined. By adapting the parameter precision, a description length criterion can take on the form log(likelihood)+const/spl middot/m instead of the familiar -log(likelihood)+(m/2)log n where m is the number of parameters and n is the sample size. For certain regular models the criterion yields asymptotically optimal rates for coding redundancy and statistical risk. Moreover, the convergence is adaptive in the sense that the rates are simultaneously minimax optimal in various parametric and nonparametric function classes without prior knowledge of which function class contains the true function. This one criterion combines positive benefits of information-theoretic criteria proposed by Rissanen, Akaike, and Schwarz. A reviewed is also includes of how the minimum description length principle provides accurate estimates in irregular models such as neural nets.","PeriodicalId":423518,"journal":{"name":"Proceedings of 1994 Workshop on Information Theory and Statistics","volume":"93 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"1994-10-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of 1994 Workshop on Information Theory and Statistics","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/WITS.1994.513871","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 1
Abstract
A minimum description length criterion for inference of functions in both parametric and nonparametric settings is determined. By adapting the parameter precision, a description length criterion can take the form -log(likelihood) + const · m instead of the familiar -log(likelihood) + (m/2) log n, where m is the number of parameters and n is the sample size. For certain regular models the criterion yields asymptotically optimal rates for coding redundancy and statistical risk. Moreover, the convergence is adaptive in the sense that the rates are simultaneously minimax optimal over various parametric and nonparametric function classes without prior knowledge of which class contains the true function. This single criterion combines the positive benefits of the information-theoretic criteria proposed by Rissanen, Akaike, and Schwarz. A review is also included of how the minimum description length principle provides accurate estimates in irregular models such as neural nets.
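As an illustration of the two penalized-likelihood forms contrasted in the abstract, here is a minimal sketch (not from the paper) that selects a polynomial degree by minimizing -log(likelihood) + const · m versus the familiar -log(likelihood) + (m/2) log n. The Gaussian regression setup, the synthetic data, and the constant `C` are assumptions chosen purely for illustration.

```python
import numpy as np

# Synthetic regression data (illustrative assumption, not from the paper).
rng = np.random.default_rng(0)
n = 200
x = np.linspace(-1, 1, n)
y = np.sin(np.pi * x) + 0.3 * rng.standard_normal(n)

def neg_log_likelihood(y, y_hat):
    """Gaussian -log(likelihood), with the variance set to its MLE (mean squared residual)."""
    sigma2 = np.mean((y - y_hat) ** 2)
    return 0.5 * n * (np.log(2 * np.pi * sigma2) + 1)

def fit_poly(degree):
    """Least-squares polynomial fit; m = degree + 1 free parameters."""
    coeffs = np.polyfit(x, y, degree)
    return np.polyval(coeffs, x), degree + 1

C = 1.0  # hypothetical constant for the const * m penalty

results = []
for degree in range(10):
    y_hat, m = fit_poly(degree)
    nll = neg_log_likelihood(y, y_hat)
    crit_const = nll + C * m               # -log(likelihood) + const * m
    crit_bic = nll + 0.5 * m * np.log(n)   # -log(likelihood) + (m/2) log n
    results.append((degree, crit_const, crit_bic))

best_const = min(results, key=lambda r: r[1])[0]
best_bic = min(results, key=lambda r: r[2])[0]
print(f"degree chosen by const * m penalty:   {best_const}")
print(f"degree chosen by (m/2) log n penalty: {best_bic}")
```

Note the design difference the abstract highlights: the (m/2) log n penalty grows with the sample size, while the const · m penalty (made possible by adapting the parameter precision) does not, which is what allows the latter to achieve the adaptive rate behavior claimed for regular models.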