Sparse Boosting Based Machine Learning Methods for High-Dimensional Data

Computational Statistics [Working Title]. Pub Date: 2021-10-19. DOI: 10.5772/intechopen.100506

Mu Yue



Abstract

In high-dimensional data, penalized regression is often used for variable selection and parameter estimation. However, these methods typically require time-consuming cross-validation to select tuning parameters, and they tend to retain more false positives under high dimensionality. This chapter discusses sparse boosting based machine learning methods for three high-dimensional problems. First, a sparse boosting method is studied for selecting important biomarkers in right-censored survival data with high-dimensional biomarkers. Second, a two-step sparse boosting method is studied for variable selection and model-based prediction with high-dimensional longitudinal observations measured repeatedly over time. Finally, a multi-step sparse boosting method is studied for identifying patient subgroups that exhibit different treatment effects in high-dimensional dense longitudinal observations. The chapter addresses how to improve the accuracy and computational speed of variable selection and parameter estimation in high-dimensional data. It aims to broaden the application scope of sparse boosting and to develop new methods for high-dimensional survival analysis, longitudinal data analysis, and subgroup analysis, all of which have broad practical application.
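To make the general idea of sparse boosting concrete, here is a minimal sketch of componentwise L2 boosting in which each step picks the predictor that minimizes a penalized criterion, and the final model is the iterate with the smallest criterion value, so no cross-validation is needed. Everything in it is an assumption for illustration: the function name `sparse_l2_boost`, the BIC-type criterion, the step size `nu`, and the simulated data. It uses squared-error loss for a plain linear model, not the censored survival or longitudinal settings described in the abstract, and it is not the chapter's own implementation.

```python
import numpy as np

def sparse_l2_boost(X, y, nu=0.1, max_iter=150):
    """Componentwise L2 boosting with a BIC-type penalized criterion (illustrative)."""
    n, p = X.shape
    x_mean = X.mean(axis=0)
    Xc = X - x_mean                          # centred predictors
    norms = (Xc ** 2).sum(axis=0)            # squared column norms
    intercept = y.mean()
    resid = y - intercept                    # residual after the intercept-only fit
    # R = I - B, where B is the boosting (hat) operator; B starts as the mean operator.
    R = np.eye(n) - np.ones((n, n)) / n
    beta = np.zeros(p)
    best_crit, best_beta = np.inf, beta.copy()

    for _ in range(max_iter):
        score = Xc.T @ resid                 # x_j' r for every predictor
        # Residual sum of squares if predictor j received a shrunken update.
        rss_j = resid @ resid - (2 * nu - nu ** 2) * score ** 2 / norms
        # Degrees of freedom trace(B) after updating with predictor j.
        quad = np.einsum("ij,ij->j", Xc, R @ Xc)   # x_j' R x_j
        df_j = n - (np.trace(R) - nu * quad / norms)
        # Assumed BIC-type criterion: goodness of fit plus complexity penalty.
        crit_j = n * np.log(rss_j / n) + np.log(n) * df_j

        j = int(np.argmin(crit_j))           # penalized selection step
        step = nu * score[j] / norms[j]
        beta[j] += step
        resid = resid - step * Xc[:, j]
        R = R - nu * np.outer(Xc[:, j], Xc[:, j] @ R) / norms[j]

        if crit_j[j] < best_crit:            # keep the iterate with the smallest criterion
            best_crit, best_beta = crit_j[j], beta.copy()

    return intercept - x_mean @ best_beta, best_beta

# Simulated example: 50 predictors, only the first three carry signal.
rng = np.random.default_rng(0)
n, p = 80, 50
X = rng.standard_normal((n, p))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 2.0 * X[:, 2] + 0.5 * rng.standard_normal(n)

b0, beta_hat = sparse_l2_boost(X, y)
selected = np.flatnonzero(beta_hat)
print("selected predictors:", selected)
```

The penalty enters both the choice of predictor and the choice of stopping iteration, which is what distinguishes this sketch from plain boosting with a fixed number of steps. Storing the full n-by-n operator `R` is only practical for small samples.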


