Sophie Potts, Elisabeth Bergherr, Constantin Reinke, Colin Griesbach
{"title":"Prediction-based variable selection for component-wise gradient boosting.","authors":"Sophie Potts, Elisabeth Bergherr, Constantin Reinke, Colin Griesbach","doi":"10.1515/ijb-2023-0052","DOIUrl":null,"url":null,"abstract":"<p><p>Model-based component-wise gradient boosting is a popular tool for data-driven variable selection. In order to improve its prediction and selection qualities even further, several modifications of the original algorithm have been developed, that mainly focus on different stopping criteria, leaving the actual variable selection mechanism untouched. We investigate different prediction-based mechanisms for the variable selection step in model-based component-wise gradient boosting. These approaches include Akaikes Information Criterion (AIC) as well as a selection rule relying on the component-wise test error computed via cross-validation. We implemented the AIC and cross-validation routines for Generalized Linear Models and evaluated them regarding their variable selection properties and predictive performance. An extensive simulation study revealed improved selection properties whereas the prediction error could be lowered in a real world application with age-standardized COVID-19 incidence rates.</p>","PeriodicalId":1,"journal":{"name":"Accounts of Chemical Research","volume":null,"pages":null},"PeriodicalIF":16.4000,"publicationDate":"2023-11-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Accounts of Chemical Research","FirstCategoryId":"100","ListUrlMain":"https://doi.org/10.1515/ijb-2023-0052","RegionNum":1,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2024/5/1 0:00:00","PubModel":"eCollection","JCR":"Q1","JCRName":"CHEMISTRY, MULTIDISCIPLINARY","Score":null,"Total":0}
引用次数: 0
Abstract
Model-based component-wise gradient boosting is a popular tool for data-driven variable selection. In order to improve its prediction and selection qualities even further, several modifications of the original algorithm have been developed, that mainly focus on different stopping criteria, leaving the actual variable selection mechanism untouched. We investigate different prediction-based mechanisms for the variable selection step in model-based component-wise gradient boosting. These approaches include Akaikes Information Criterion (AIC) as well as a selection rule relying on the component-wise test error computed via cross-validation. We implemented the AIC and cross-validation routines for Generalized Linear Models and evaluated them regarding their variable selection properties and predictive performance. An extensive simulation study revealed improved selection properties whereas the prediction error could be lowered in a real world application with age-standardized COVID-19 incidence rates.
基于模型的组件梯度增强是一种流行的数据驱动变量选择工具。为了进一步提高其预测和选择质量,对原始算法进行了一些修改,主要关注不同的停止准则,而没有改变实际的变量选择机制。我们研究了基于模型的组件梯度增强中变量选择步骤的不同基于预测的机制。这些方法包括赤池氏信息准则(Akaikes Information Criterion, AIC)以及依赖于通过交叉验证计算的组件测试误差的选择规则。我们实现了广义线性模型的AIC和交叉验证例程,并评估了它们的变量选择特性和预测性能。一项广泛的模拟研究揭示了改进的选择特性,而在年龄标准化的COVID-19发病率的现实世界应用中,预测误差可以降低。
期刊介绍:
Accounts of Chemical Research presents short, concise and critical articles offering easy-to-read overviews of basic research and applications in all areas of chemistry and biochemistry. These short reviews focus on research from the author’s own laboratory and are designed to teach the reader about a research project. In addition, Accounts of Chemical Research publishes commentaries that give an informed opinion on a current research problem. Special Issues online are devoted to a single topic of unusual activity and significance.
Accounts of Chemical Research replaces the traditional article abstract with an article "Conspectus." These entries synopsize the research affording the reader a closer look at the content and significance of an article. Through this provision of a more detailed description of the article contents, the Conspectus enhances the article's discoverability by search engines and the exposure for the research.