Optimal subsampling algorithms for composite quantile regression in massive data
Authors: Jun Jin, Shuangzhe Liu, Tiefeng Ma
Journal: Statistics, pp. 811–843 (JCR Q2, Statistics & Probability; impact factor 1.2)
Published: 2023-07-04
DOI: https://doi.org/10.1080/02331888.2023.2239507
Citations: 0
Abstract
Massive datasets have gained increasing prominence across various fields, but their analysis is often impeded by computational limitations. In response, Wang and Ma (Optimal subsampling for quantile regression in big data. Biometrika. 2021;108:99–112) proposed an optimal subsampling method for quantile regression in massive datasets. Composite quantile regression, a robust and efficient alternative to ordinary least squares regression and quantile regression in linear models, presents further complexities due to its distinct loss function. This paper extends the optimal subsampling method to accommodate composite quantile regression problems. We begin by deriving two new optimal subsampling probabilities for composite quantile regression, considering both the L- and A-optimality criteria. Subsequently, we develop an adaptive two-step method based on these probabilities. The resulting estimators exhibit desirable asymptotic properties. In addition, to estimate the variance-covariance matrix without explicitly estimating the densities of the responses, we propose a combining subsamples method. Numerical studies on simulated and real data are conducted to assess and showcase the practical performance of our proposed methods.
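As a rough illustration of the loss function the abstract refers to, the sketch below evaluates the standard composite quantile regression objective (one shared slope vector, one intercept per quantile level) on a subsample, weighting each selected point by the inverse of its sampling probability. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the normalisation by the subsample size, and the sampling probabilities passed in are all hypothetical, and the paper's L- and A-optimal probabilities are not reproduced here.

```python
import random


def check_loss(u, tau):
    """Quantile check loss: rho_tau(u) = u * (tau - 1{u < 0})."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def weighted_cqr_objective(xs, ys, probs, intercepts, beta, taus):
    """Inverse-probability-weighted CQR loss on a subsample.

    xs         : covariate vectors of the selected points
    ys         : responses of the selected points
    probs      : sampling probability of each selected point (assumed given)
    intercepts : one intercept b_k per quantile level tau_k
    beta       : slope vector shared across all quantile levels
    taus       : quantile levels, e.g. [0.25, 0.5, 0.75]
    """
    total = 0.0
    for x, y, p in zip(xs, ys, probs):
        fitted = sum(b * xj for b, xj in zip(beta, x))
        for b_k, tau in zip(intercepts, taus):
            # Each point contributes one check loss per quantile level,
            # up-weighted by 1/p to offset unequal sampling.
            total += check_loss(y - b_k - fitted, tau) / p
    return total / len(ys)


def draw_subsample(n, probs, r, seed=0):
    """Draw r indices with replacement from range(n) with the given weights."""
    rng = random.Random(seed)
    return rng.choices(range(n), weights=probs, k=r)
```

In a two-step scheme of the kind the abstract describes, a pilot subsample would presumably be drawn with uniform probabilities to obtain a preliminary estimate, which is then used to compute the refined probabilities for a second call to `draw_subsample`; minimising `weighted_cqr_objective` over `intercepts` and `beta` is left to any convex or linear-programming solver.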
About the journal:
Statistics publishes papers developing and analysing new methods for any active field of statistics, motivated by real-life problems. Papers submitted for consideration should provide interesting and novel contributions to statistical theory and its applications, with rigorous mathematical results and proofs. Moreover, numerical simulations and applications to real data sets can improve the quality of papers and should be included where appropriate. Statistics does not publish papers that merely apply existing procedures to case studies; papers are required to contain methodological or theoretical innovation. Topics of interest include, for example, nonparametric statistics, time series, and the analysis of topological or functional data. Furthermore, the journal welcomes submissions in the field of theoretical econometrics and its links to mathematical statistics.