{"title":"EDITORIAL: RECENT ADVANCES IN SPARSE STATISTICAL MODELING","authors":"K. Hirose","doi":"10.5183/JJSCS.1510002_225","DOIUrl":null,"url":null,"abstract":"The first term L(β) is a loss function and the second term λ ∑p j=1 |βj | is a penalty term. Here λ (λ > 0) is a tuning parameter which controls the sparsity and the model fitting. Because the penalty term consists of the sum of absolute values of the parameter, we can carry out the sparse estimation, that is, some of the elements of β are estimated by exactly zeros. It is well-known that we cannot often obtain the analytical solutions of the minimization problem (1), because the penalty term λ ∑p j=1 |βj | is indifferentiable when βj = 0 (j = 1, . . . , p). Therefore, it is important to develop efficient computational algorithms. This special issue includes six interesting papers related to sparse estimation. These papers cover a wide variety of topics, such as statistical modeling, computation, theoretical analysis, and applications. In particular, all of the papers deal with the issue of statistical computation. Kawasaki and Ueki (the first paper of this issue) apply smooth-threshold estimating equations (STEE, Ueki, 2009) to telemarketing success data collected from a Portuguese retail bank. In STEE, the penalty term consists of a quadratic form ∑p j=1 wjβ 2 j instead of ∑p j=1 |βj |, where wj (j = 1, . . . , p) are positive values allowed to be ∞, so that we do not need to implement a computational algorithm that is used in the L1 regularization. Kawano, Hoshina, Shimamura and Konishi (the second paper) propose a model selection criterion for choosing tuning parameters in the Bayesian lasso (Park and Casella, 2008). They use an efficient sparse estimation algorithm in the Bayesian lasso, referred to as the sparse algorithm. Matsui (the third paper) considers the problem of bi-level selection, which allows the selection of groups of variables and individuals simultaneously. The parameter estimation procedure is based on the coordinate descent algorithm, which is known as a remarkably fast algorithm (Friedman et al., 2010). Suzuki (the fourth paper) focuses attention on the alternating direction method of multipliers algorithm (ADMM algorithm, Boyd et al., 2011), which is applicable to various complex penalties such as the overlapping group lasso (Jacob et al., 2009). He reviews a stochastic version of the ADMM algorithm that allows the online learning. Hino and Fujiki (the fifth paper) propose a penalized linear discriminant analysis that adheres to the normal discriminant model. They apply the Majorize-Minimization algorithm (MM algorithm, Hunter and Lange 2004), which is often used to replace a non-convex optimization problem with a reweighted convex optimization","PeriodicalId":338719,"journal":{"name":"Journal of the Japanese Society of Computational Statistics","volume":"18 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2015-12-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of the Japanese Society of Computational Statistics","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.5183/JJSCS.1510002_225","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract
The first term $L(\beta)$ is a loss function and the second term $\lambda \sum_{j=1}^{p} |\beta_j|$ is a penalty term. Here $\lambda$ ($\lambda > 0$) is a tuning parameter that controls the trade-off between sparsity and goodness of model fit. Because the penalty term is the sum of the absolute values of the parameters, sparse estimation can be carried out; that is, some of the elements of $\beta$ are estimated to be exactly zero. It is well known that an analytical solution of the minimization problem (1) is usually unavailable, because the penalty term $\lambda \sum_{j=1}^{p} |\beta_j|$ is non-differentiable at $\beta_j = 0$ ($j = 1, \dots, p$). Therefore, it is important to develop efficient computational algorithms.

This special issue includes six interesting papers related to sparse estimation. These papers cover a wide variety of topics, such as statistical modeling, computation, theoretical analysis, and applications. In particular, all of the papers deal with the issue of statistical computation.

Kawasaki and Ueki (the first paper of this issue) apply smooth-threshold estimating equations (STEE; Ueki, 2009) to telemarketing success data collected from a Portuguese retail bank. In STEE, the penalty term is the quadratic form $\sum_{j=1}^{p} w_j \beta_j^2$ instead of $\sum_{j=1}^{p} |\beta_j|$, where the weights $w_j$ ($j = 1, \dots, p$) are positive values that are allowed to be $\infty$; as a result, the specialized computational algorithms required for $L_1$ regularization are not needed.

Kawano, Hoshina, Shimamura and Konishi (the second paper) propose a model selection criterion for choosing the tuning parameters in the Bayesian lasso (Park and Casella, 2008). They use an efficient sparse estimation algorithm for the Bayesian lasso, referred to as the sparse algorithm.

Matsui (the third paper) considers the problem of bi-level selection, which allows the selection of groups of variables and of individual variables within groups simultaneously. The parameter estimation procedure is based on the coordinate descent algorithm, which is known to be remarkably fast (Friedman et al., 2010).

Suzuki (the fourth paper) focuses on the alternating direction method of multipliers (ADMM; Boyd et al., 2011), which is applicable to various complex penalties such as the overlapping group lasso (Jacob et al., 2009). He reviews a stochastic version of the ADMM algorithm that allows online learning.

Hino and Fujiki (the fifth paper) propose a penalized linear discriminant analysis that adheres to the normal discriminant model. They apply the majorize-minimization algorithm (MM algorithm; Hunter and Lange, 2004), which is often used to replace a non-convex optimization problem with a reweighted convex optimization problem.
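To make the computational point above concrete, the following is a minimal sketch, not taken from any of the papers in this issue, of how the non-differentiable $L_1$ penalty is handled in practice: each coordinate-wise minimization of a squared-error loss plus $\lambda |\beta_j|$ has a closed-form solution given by the soft-thresholding operator, and cycling through the coordinates gives a coordinate descent algorithm of the kind popularized by Friedman et al. (2010). The function names and the choice of squared-error loss are illustrative assumptions.

```python
import numpy as np

def soft_threshold(z, t):
    """Closed-form minimizer of 0.5 * (b - z)**2 + t * |b|.
    Returns exactly zero whenever |z| <= t."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_coordinate_descent(X, y, lam, n_iter=100):
    """Minimize (1/(2n)) * ||y - X b||^2 + lam * sum_j |b_j|
    by cyclic coordinate descent."""
    n, p = X.shape
    beta = np.zeros(p)
    residual = y.copy()                    # residual for beta = 0
    col_norm2 = (X ** 2).sum(axis=0) / n   # (1/n) * x_j^T x_j
    for _ in range(n_iter):
        for j in range(p):
            # partial residual that excludes coordinate j
            r_j = residual + X[:, j] * beta[j]
            z_j = X[:, j] @ r_j / n
            beta_new = soft_threshold(z_j, lam) / col_norm2[j]
            residual = r_j - X[:, j] * beta_new
            beta[j] = beta_new
    return beta
```

Because soft_threshold returns exactly zero whenever the correlation of a predictor with the partial residual falls below $\lambda$, the resulting estimates are sparse without any post-hoc truncation.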
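By contrast, the quadratic penalty used in STEE keeps the objective smooth, so no soft-thresholding step is needed. The sketch below only illustrates that point under a squared-error loss; it is not the STEE procedure of Ueki (2009) itself, which chooses the weights $w_j$ adaptively from the data. Coordinates whose weight is $\infty$ are simply fixed at zero.

```python
import numpy as np

def weighted_ridge(X, y, lam, w):
    """Solve min_b 0.5 * ||y - X b||^2 + lam * sum_j w_j * b_j**2,
    whose stationarity condition gives the linear system
    (X^T X + 2 * lam * diag(w)) b = X^T y.
    Coordinates with w_j = inf are excluded, i.e. set exactly to zero."""
    p = X.shape[1]
    w = np.asarray(w, dtype=float)
    active = np.isfinite(w)               # finitely penalized coordinates
    beta = np.zeros(p)
    Xa = X[:, active]
    A = Xa.T @ Xa + 2.0 * lam * np.diag(w[active])
    beta[active] = np.linalg.solve(A, Xa.T @ y)
    return beta
```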
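Finally, the ADMM algorithm reviewed by Suzuki handles penalties far more complex than the plain lasso; the sketch below only illustrates the basic variable-splitting idea on the simplest case of a squared-error loss with an $L_1$ penalty, following the generic scaled-form recipe in Boyd et al. (2011). The stochastic, online variant discussed in the paper is not shown, and all names and default values are illustrative.

```python
import numpy as np

def admm_lasso(X, y, lam, rho=1.0, n_iter=200):
    """Minimize 0.5 * ||y - X b||^2 + lam * ||b||_1 via ADMM in scaled form.
    The variable is split as b = z, so the L1 part reduces to
    soft-thresholding while the smooth part is a ridge-type solve."""
    n, p = X.shape
    z = np.zeros(p)
    u = np.zeros(p)                        # scaled dual variable
    Xty = X.T @ y
    A = X.T @ X + rho * np.eye(p)          # reused in every iteration
    for _ in range(n_iter):
        beta = np.linalg.solve(A, Xty + rho * (z - u))          # smooth update
        v = beta + u
        z = np.sign(v) * np.maximum(np.abs(v) - lam / rho, 0.0) # prox of L1
        u = u + beta - z                                         # dual update
    return z
```

The same splitting template applies to structured penalties such as the overlapping group lasso by replacing the soft-thresholding step with the proximal operator of the chosen penalty.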