The Cross-Validated Adaptive Epsilon-Net Estimator

M. van der Laan, S. Dudoit, A. van der Vaart
Statistics & Decisions. DOI: 10.1524/STND.2006.24.3.373. Citations: 138.

Abstract

Suppose that we observe a sample of independent and identically distributed realizations of a random variable, and that a parameter of interest can be defined as the minimizer, over a suitably defined parameter set, of the expectation of a (loss) function of a candidate parameter value and the random variable. Examples include squared-error loss in regression and negative log-density loss in density estimation. Minimizing the empirical risk (i.e., the empirical mean of the loss function) over the entire parameter set may result in ill-defined or overly variable estimators of the parameter of interest. In this article, we propose a cross-validated ε-net estimation method, which uses a collection of submodels and a collection of ε-nets over each submodel. For each submodel s and each resolution level ε, the minimizer of the empirical risk over the corresponding ε-net is a candidate estimator. We then select from these estimators (i.e., select the pair (s, ε)) by multi-fold cross-validation. We derive a finite-sample inequality showing that the resulting estimator performs as well as an oracle estimator that uses the best submodel and resolution level for the unknown true parameter. We also address the implementation of the estimation procedure, and, in the context of a linear regression model, we present results of a preliminary simulation study comparing the cross-validated ε-net estimator to the cross-validated L1-penalized least squares estimator (LASSO) and the least angle regression estimator (LARS).
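The procedure described above — minimize the empirical risk over an ε-net within each submodel, then pick the pair (s, ε) by V-fold cross-validation — can be sketched in code. The sketch below is a simplified illustration for linear regression with squared-error loss, not the paper's implementation: the submodels are taken to be the first s covariates, and each ε-net is a naive grid of spacing ε over a bounded coefficient box (the paper allows far more general submodels and nets). All function and parameter names (`epsilon_net`, `cv_epsilon_net`, `bound`, `n_folds`) are hypothetical.

```python
import itertools
import numpy as np

def epsilon_net(dim, bound, eps):
    """A crude epsilon-net: grid points with spacing eps over [-bound, bound]^dim."""
    axis = np.arange(-bound, bound + eps / 2, eps)
    return [np.array(b) for b in itertools.product(axis, repeat=dim)]

def empirical_risk(beta, X, y):
    """Empirical mean of the squared-error loss for a candidate beta."""
    return np.mean((y - X[:, :len(beta)] @ beta) ** 2)

def cv_epsilon_net(X, y, submodels, eps_grid, bound=2.0, n_folds=5, seed=0):
    """Select (s, eps) by V-fold cross-validation, then refit on the full sample.

    submodels: list of submodel sizes s (number of leading covariates used).
    eps_grid:  list of resolution levels eps.
    """
    rng = np.random.default_rng(seed)
    folds = rng.permutation(len(y)) % n_folds
    best = (None, None, np.inf)
    for s in submodels:
        for eps in eps_grid:
            net = epsilon_net(s, bound, eps)
            cv_risk = 0.0
            for v in range(n_folds):
                tr, va = folds != v, folds == v
                # candidate estimator: empirical risk minimizer over the net,
                # computed on the training fold only
                beta_hat = min(net, key=lambda b: empirical_risk(b, X[tr], y[tr]))
                # validated risk on the held-out fold
                cv_risk += empirical_risk(beta_hat, X[va], y[va]) / n_folds
            if cv_risk < best[2]:
                best = (s, eps, cv_risk)
    s_star, eps_star, _ = best
    # refit: minimize empirical risk over the selected net on the full sample
    net = epsilon_net(s_star, bound, eps_star)
    beta_final = min(net, key=lambda b: empirical_risk(b, X, y))
    return s_star, eps_star, beta_final
```

The brute-force grid search over the net is exponential in s and is only feasible here because the example is low-dimensional; the paper's implementation section is concerned precisely with making this minimization tractable.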