Binary classification with covariate selection through ℓ0-penalised empirical risk minimisation

IF 2.9 | Zone 4, Economics | Q1 ECONOMICS | Econometrics Journal | Pub Date: 2020-06-20 | DOI: 10.1093/ectj/utaa017
Le‐Yu Chen, S. Lee
Citations: 7

Abstract

We consider the problem of binary classification with covariate selection. We construct a classification procedure by minimising the empirical misclassification risk with a penalty on the number of selected covariates. This optimisation problem is equivalent to obtaining an ℓ0-penalised maximum score estimator. We derive probability bounds on the estimated sparsity as well as on the excess misclassification risk. These theoretical results are nonasymptotic and established in a high-dimensional setting. In particular, we show that our method yields a sparse solution whose ℓ0-norm can be arbitrarily close to true sparsity with high probability and obtain the rates of convergence for the excess misclassification risk. We implement the proposed procedure via the method of mixed-integer linear programming. Its numerical performance is illustrated in Monte Carlo experiments and a real data application of the work-trip transportation mode choice.
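
The penalised objective described in the abstract can be illustrated with a small brute-force sketch. This is not the authors' mixed-integer linear programming implementation: it assumes a linear classifier that predicts 1 when x'b >= 0, searches exhaustively over covariate subsets and a coarse coefficient grid, and is only feasible for a handful of covariates. The function names, the grid, and the toy data are hypothetical and chosen purely for illustration.

```python
from itertools import combinations, product


def misclassification_risk(X, y, b):
    """Share of observations whose label differs from 1{x_i'b >= 0}."""
    errors = 0
    for x_i, y_i in zip(X, y):
        score = sum(x_ij * b_j for x_ij, b_j in zip(x_i, b))
        prediction = 1 if score >= 0 else 0
        errors += int(prediction != y_i)
    return errors / len(y)


def l0_penalised_fit(X, y, lam, grid=(-1.0, -0.5, 0.5, 1.0)):
    """Minimise empirical risk + lam * (number of nonzero coefficients).

    Assumption: coefficients are restricted to a small grid so that the
    search is finite; the paper instead solves the problem as a MILP.
    """
    p = len(X[0])
    best_b, best_obj = None, float("inf")
    for k in range(p + 1):                       # candidate sparsity level
        for support in combinations(range(p), k):
            for values in product(grid, repeat=k):
                b = [0.0] * p
                for j, v in zip(support, values):
                    b[j] = v
                obj = misclassification_risk(X, y, b) + lam * k
                if obj < best_obj:               # keep the first minimiser found
                    best_obj, best_b = obj, b
    return best_b, best_obj


# Toy data (hypothetical): the label depends only on the sign of the first covariate.
X = [
    (1.0, 0.3, -0.2),
    (2.0, -0.5, 0.4),
    (0.5, 0.9, 0.1),
    (-1.0, 0.2, 0.3),
    (-2.0, -0.4, -0.6),
    (-0.5, 0.7, -0.1),
]
y = [1, 1, 1, 0, 0, 0]

b_hat, objective = l0_penalised_fit(X, y, lam=0.1)
print(b_hat, objective)
```

On this toy data, a penalty of `lam=0.1` selects only the first covariate, while a penalty as large as `lam=1.0` outweighs any gain in fit and returns the empty model. The exhaustive search grows exponentially in the number of covariates, which is one reason a mixed-integer programming formulation is attractive in practice.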

Source journal
Econometrics Journal (Management Science / Mathematics, Interdisciplinary Applications)
CiteScore: 4.20
Self-citation rate: 5.30%
Articles published: 25
Review time: >12 weeks
About the journal: The Econometrics Journal was established in 1998 by the Royal Economic Society with the aim of creating a top international field journal for the publication of econometric research with a standard of intellectual rigour and academic standing similar to those of the pre-existing top field journals in econometrics. The Econometrics Journal is committed to publishing first-class papers in macro-, micro- and financial econometrics. It is a general journal for econometric research, open to applied, computational, methodological and theoretical contributions in all areas of econometrics.