Information criteria are widely used for model selection and have been shown to possess desirable theoretical properties. For classification, Claeskens et al. (2008) proposed a support vector machine information criterion for feature selection and provided encouraging numerical evidence, but no theoretical justification was given there. This work aims to fill that gap by providing theoretical justification for the support vector machine information criterion in both fixed and diverging model spaces. We first derive a uniform convergence rate for the support vector machine solution and then show that a modification of the support vector machine information criterion achieves model selection consistency even when the number of features diverges at an exponential rate in the sample size. This consistency result can further be applied to selecting the optimal tuning parameter for various penalized support vector machine methods. Finite-sample performance of the proposed information criterion is investigated through Monte Carlo studies and a real-world gene selection problem.
{"title":"A Consistent Information Criterion for Support Vector Machines in Diverging Model Spaces.","authors":"Xiang Zhang, Yichao Wu, Lan Wang, Runze Li","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>Information criteria have been popularly used in model selection and proved to possess nice theoretical properties. For classification, Claeskens et al. (2008) proposed support vector machine information criterion for feature selection and provided encouraging numerical evidence. Yet no theoretical justification was given there. This work aims to fill the gap and to provide some theoretical justifications for support vector machine information criterion in both fixed and diverging model spaces. We first derive a uniform convergence rate for the support vector machine solution and then show that a modification of the support vector machine information criterion achieves model selection consistency even when the number of features diverges at an exponential rate of the sample size. This consistency result can be further applied to selecting the optimal tuning parameter for various penalized support vector machine methods. Finite-sample performance of the proposed information criterion is investigated using Monte Carlo studies and one real-world gene selection problem.</p>","PeriodicalId":50161,"journal":{"name":"Journal of Machine Learning Research","volume":"17 16","pages":"1-26"},"PeriodicalIF":6.0,"publicationDate":"2016-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4883123/pdf/nihms733772.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"34435261","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
As data sets from related studies become more easily accessible, combining data sets from similar studies is often undertaken in practice to achieve a larger sample size and higher power. A major challenge arising from data integration is data heterogeneity, in terms of study population, study design, or study coordination. Ignoring such heterogeneity in data analysis may result in biased estimation and misleading inference. Traditional remedies for data heterogeneity include interaction terms and random effects, which often fail to achieve desirable statistical power or to provide a meaningful interpretation, especially when a large number of smaller data sets are combined. In this paper, we propose a regularized fusion method that identifies and merges inter-study homogeneous parameter clusters in regression analysis, without resorting to hypothesis testing. Using the fused lasso, we establish a computationally efficient procedure for large-scale integrated data. Incorporating the estimated parameter ordering in the fused lasso improves computing speed with no loss of statistical power. We conduct extensive simulation studies and provide an application example to demonstrate the performance of the new method in comparison with conventional methods.
{"title":"Fused Lasso Approach in Regression Coefficients Clustering - Learning Parameter Heterogeneity in Data Integration.","authors":"Lu Tang, Peter X K Song","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>As data sets of related studies become more easily accessible, combining data sets of similar studies is often undertaken in practice to achieve a larger sample size and higher power. A major challenge arising from data integration pertains to data heterogeneity in terms of study population, study design, or study coordination. Ignoring such heterogeneity in data analysis may result in biased estimation and misleading inference. Traditional techniques of remedy to data heterogeneity include the use of interactions and random effects, which are inferior to achieving desirable statistical power or providing a meaningful interpretation, especially when a large number of smaller data sets are combined. In this paper, we propose a regularized fusion method that allows us to identify and merge inter-study homogeneous parameter clusters in regression analysis, without the use of hypothesis testing approach. Using the fused lasso, we establish a computationally efficient procedure to deal with large-scale integrated data. Incorporating the estimated parameter ordering in the fused lasso facilitates computing speed with no loss of statistical power. We conduct extensive simulation studies and provide an application example to demonstrate the performance of the new method with a comparison to the conventional methods.</p>","PeriodicalId":50161,"journal":{"name":"Journal of Machine Learning Research","volume":"17 ","pages":""},"PeriodicalIF":6.0,"publicationDate":"2016-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5647925/pdf/nihms872528.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"35531942","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We present new methodology based on Multi-Objective Markov Decision Processes for developing sequential decision support systems from data. Our approach uses sequential decision-making data to provide support that is useful to many different decision-makers, each with different, potentially time-varying preferences. To accomplish this, we develop an extension of fitted-Q iteration for multiple objectives that computes policies for all scalarization functions, i.e. preference functions, simultaneously from continuous-state, finite-horizon data. We identify and address several conceptual and computational challenges along the way, and we introduce a new solution concept that is appropriate when different actions have similar expected outcomes. Finally, we demonstrate an application of our method using data from the Clinical Antipsychotic Trials of Intervention Effectiveness and show that our approach offers decision-makers increased choice through a larger class of optimal policies.
{"title":"Multi-Objective Markov Decision Processes for Data-Driven Decision Support.","authors":"Daniel J Lizotte, Eric B Laber","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>We present new methodology based on Multi-Objective Markov Decision Processes for developing sequential decision support systems from data. Our approach uses sequential decision-making data to provide support that is useful to many different decision-makers, each with different, potentially time-varying preference. To accomplish this, we develop an extension of fitted-<i>Q</i> iteration for multiple objectives that computes policies for all scalarization functions, i.e. preference functions, simultaneously from continuous-state, finite-horizon data. We identify and address several conceptual and computational challenges along the way, and we introduce a new solution concept that is appropriate when different actions have similar expected outcomes. Finally, we demonstrate an application of our method using data from the Clinical Antipsychotic Trials of Intervention Effectiveness and show that our approach offers decision-makers increased choice by a larger class of optimal policies.</p>","PeriodicalId":50161,"journal":{"name":"Journal of Machine Learning Research","volume":"17 ","pages":""},"PeriodicalIF":6.0,"publicationDate":"2016-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5179144/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141297118","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Learning probabilistic models over strings is an important issue for many applications. Spectral methods propose elegant solutions to the problem of inferring weighted automata from finite samples ...
{"title":"Dimension-free concentration bounds on hankel matrices for spectral learning","authors":"DenisFrançois, GybelsMattias, HabrardAmaury","doi":"10.5555/2946645.2946676","DOIUrl":"https://doi.org/10.5555/2946645.2946676","url":null,"abstract":"Learning probabilistic models over strings is an important issue for many applications. Spectral methods propose elegant solutions to the problem of inferring weighted automata from finite samples ...","PeriodicalId":50161,"journal":{"name":"Journal of Machine Learning Research","volume":"1 1","pages":""},"PeriodicalIF":6.0,"publicationDate":"2016-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"71138777","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Systematic reviews underpin Evidence Based Medicine (EBM) by addressing precise clinical questions via comprehensive synthesis of all relevant published evidence. Authors of systematic reviews typically define a Population/Problem, Intervention, Comparator, and Outcome (the PICO criteria) of interest, and then retrieve, appraise, and synthesize results from all reports of clinical trials that meet these criteria. Identifying PICO elements in the full texts of trial reports is thus a critical yet time-consuming step in the systematic review process. We seek to expedite evidence synthesis by developing machine learning models to automatically extract sentences from articles relevant to PICO elements. Collecting a large corpus of training data for this task would be prohibitively expensive. Therefore, we derive distant supervision (DS) with which to train models using previously conducted reviews. DS entails heuristically deriving 'soft' labels from an available structured resource. However, we have access only to unstructured, free-text summaries of PICO elements for corresponding articles; we must derive from these the desired sentence-level annotations. To this end, we propose a novel method - supervised distant supervision (SDS) - that uses a small amount of direct supervision to better exploit a large corpus of distantly labeled instances by learning to pseudo-annotate articles using the available DS. We show that this approach tends to outperform existing methods with respect to automated PICO extraction.
{"title":"Extracting PICO Sentences from Clinical Trial Reports using <i>Supervised Distant Supervision</i>.","authors":"Byron C Wallace, Joël Kuiper, Aakash Sharma, Mingxi Brian Zhu, Iain J Marshall","doi":"","DOIUrl":"","url":null,"abstract":"<p><p><i>Systematic reviews</i> underpin Evidence Based Medicine (EBM) by addressing precise clinical questions via comprehensive synthesis of all relevant published evidence. Authors of systematic reviews typically define a Population/Problem, Intervention, Comparator, and Outcome (a <i>PICO</i> criteria) of interest, and then retrieve, appraise and synthesize results from all reports of clinical trials that meet these criteria. Identifying PICO elements in the full-texts of trial reports is thus a critical yet time-consuming step in the systematic review process. We seek to expedite evidence synthesis by developing machine learning models to automatically extract sentences from articles relevant to PICO elements. Collecting a large corpus of training data for this task would be prohibitively expensive. Therefore, we derive <i>distant supervision</i> (DS) with which to train models using previously conducted reviews. DS entails heuristically deriving 'soft' labels from an available structured resource. However, we have access only to unstructured, free-text summaries of PICO elements for corresponding articles; we must derive from these the desired sentence-level annotations. To this end, we propose a novel method - <i>supervised distant supervision</i> (SDS) - that uses a small amount of direct supervision to better exploit a large corpus of distantly labeled instances by <i>learning</i> to pseudo-annotate articles using the available DS. We show that this approach tends to outperform existing methods with respect to automated PICO extraction.</p>","PeriodicalId":50161,"journal":{"name":"Journal of Machine Learning Research","volume":"17 ","pages":""},"PeriodicalIF":6.0,"publicationDate":"2016-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5065023/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140289407","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Undirected graphical models, or Markov networks, are a popular class of statistical models used in a wide variety of applications. Popular instances of this class include Gaussian graphical models and Ising models. In many settings, however, it might not be clear which subclass of graphical models to use, particularly for non-Gaussian and non-categorical data. In this paper, we consider a general subclass of graphical models where the node-wise conditional distributions arise from exponential families. This allows us to derive multivariate graphical model distributions from univariate exponential family distributions, such as the Poisson, negative binomial, and exponential distributions. Our key contributions include a class of M-estimators for fitting these graphical model distributions, and a rigorous statistical analysis showing that these M-estimators recover the true graphical model structure exactly, with high probability. We provide examples of genomic and proteomic networks learned via instances of our class of graphical models derived from Poisson and exponential distributions.
{"title":"Graphical Models via Univariate Exponential Family Distributions.","authors":"Eunho Yang, Pradeep Ravikumar, Genevera I Allen, Zhandong Liu","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>Undirected graphical models, or Markov networks, are a popular class of statistical models, used in a wide variety of applications. Popular instances of this class include Gaussian graphical models and Ising models. In many settings, however, it might not be clear which subclass of graphical models to use, particularly for non-Gaussian and non-categorical data. In this paper, we consider a general sub-class of graphical models where the node-wise conditional distributions arise from exponential families. This allows us to derive <i>multivariate</i> graphical model distributions from <i>univariate</i> exponential family distributions, such as the Poisson, negative binomial, and exponential distributions. Our key contributions include a class of M-estimators to fit these graphical model distributions; and rigorous statistical analysis showing that these M-estimators recover the true graphical model structure exactly, with high probability. We provide examples of genomic and proteomic networks learned via instances of our class of graphical models derived from Poisson and exponential distributions.</p>","PeriodicalId":50161,"journal":{"name":"Journal of Machine Learning Research","volume":"16 ","pages":"3813-3847"},"PeriodicalIF":6.0,"publicationDate":"2015-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4998206/pdf/nihms808903.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"34398019","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We propose a calibrated multivariate regression method, named CMR, for fitting high dimensional multivariate regression models. Compared with existing methods, CMR calibrates the regularization for each regression task with respect to its noise level, so that it simultaneously attains improved finite-sample performance and insensitivity to tuning. Theoretically, we provide sufficient conditions under which CMR achieves the optimal rate of convergence in parameter estimation. Computationally, we propose an efficient smoothed proximal gradient algorithm with a worst-case numerical rate of convergence O(1/ϵ), where ϵ is a pre-specified accuracy of the objective function value. We conduct thorough numerical simulations to illustrate that CMR consistently outperforms other high dimensional multivariate regression methods. We also apply CMR to a brain activity prediction problem and find that it is as competitive as a handcrafted model created by human experts. The R package camel implementing the proposed method is available on the Comprehensive R Archive Network http://cran.r-project.org/web/packages/camel/.
{"title":"Calibrated Multivariate Regression with Application to Neural Semantic Basis Discovery.","authors":"Han Liu, Lie Wang, Tuo Zhao","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>We propose a calibrated multivariate regression method named CMR for fitting high dimensional multivariate regression models. Compared with existing methods, CMR calibrates regularization for each regression task with respect to its noise level so that it simultaneously attains improved finite-sample performance and tuning insensitiveness. Theoretically, we provide sufficient conditions under which CMR achieves the optimal rate of convergence in parameter estimation. Computationally, we propose an efficient smoothed proximal gradient algorithm with a worst-case numerical rate of convergence <i>O</i>(1/<i>ϵ</i>), where <i>ϵ</i> is a pre-specified accuracy of the objective function value. We conduct thorough numerical simulations to illustrate that CMR consistently outperforms other high dimensional multivariate regression methods. We also apply CMR to solve a brain activity prediction problem and find that it is as competitive as a handcrafted model created by human experts. The R package camel implementing the proposed method is available on the Comprehensive R Archive Network http://cran.r-project.org/web/packages/camel/.</p>","PeriodicalId":50161,"journal":{"name":"Journal of Machine Learning Research","volume":"16 ","pages":"1579-1606"},"PeriodicalIF":6.0,"publicationDate":"2015-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5354374/pdf/nihms-752268.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"34832057","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The statistical comparison of multiple algorithms over multiple data sets is fundamental in machine learning, and is typically carried out with the Friedman test. When the Friedman test rejects the null hypothesis, multiple comparisons are carried out to establish which differences among algorithms are significant. These multiple comparisons are usually performed with the mean-ranks test. The aim of this technical note is to discuss the inconsistencies of the mean-ranks post-hoc test, with the goal of discouraging its use in machine learning as well as in medicine, psychology, etc. We show that the outcome of the mean-ranks test depends on the pool of algorithms originally included in the experiment: the outcome of the comparison between algorithms A and B also depends on the performance of the other algorithms included in the original experiment. This can lead to paradoxical situations. For instance, the difference between A and B could be declared significant if the pool comprises algorithms C, D, and E, and not significant if the pool comprises algorithms F, G, and H. To overcome these issues, we suggest instead performing the multiple comparisons with a test whose outcome depends only on the two algorithms being compared, such as the sign test or the Wilcoxon signed-rank test.
{"title":"Should we really use post-hoc tests based on mean-ranks?","authors":"BenavoliAlessio, CoraniGiorgio, MangiliFrancesca","doi":"10.5555/2946645.2946650","DOIUrl":"https://doi.org/10.5555/2946645.2946650","url":null,"abstract":"The statistical comparison of multiple algorithms over multiple data sets is fundamental in machine learning. This is typically carried out by the Friedman test. When the Friedman test rejects the null hypothesis, multiple comparisons are carried out to establish which are the significant differences among algorithms. The multiple comparisons are usually performed using the mean-ranks test. The aim of this technical note is to discuss the inconsistencies of the mean-ranks post-hoc test with the goal of discouraging its use in machine learning as well as in medicine, psychology, etc. We show that the outcome of the mean-ranks test depends on the pool of algorithms originally included in the experiment. In other words, the outcome of the comparison between algorithms A and B depends also on the performance of the other algorithms included in the original experiment. This can lead to paradoxical situations. For instance the difference between A and B could be declared significant if the pool comprises algorithms C, D, E and not significant if the pool comprises algorithms F, G, H. To overcome these issues, we suggest instead to perform the multiple comparison using a test whose outcome only depends on the two algorithms being compared, such as the sign-test or the Wilcoxon signed-rank test.","PeriodicalId":50161,"journal":{"name":"Journal of Machine Learning Research","volume":"1 1","pages":"5:1-5:10"},"PeriodicalIF":6.0,"publicationDate":"2015-05-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"71138343","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This paper describes an R package named flare, which implements a family of new high dimensional regression methods (LAD Lasso, SQRT Lasso, ℓq Lasso, and Dantzig selector) and their extensions to sparse precision matrix estimation (TIGER and CLIME). These methods exploit different nonsmooth loss functions to gain modeling flexibility, estimation robustness, and insensitivity to tuning. The developed solver is based on the alternating direction method of multipliers (ADMM), which is further accelerated by a multistage screening approach. The package flare is coded in double precision C and called from R through a user-friendly interface. The memory usage is optimized by using the sparse matrix output. The experiments show that flare is efficient and can scale up to large problems.
{"title":"The flare Package for High Dimensional Linear Regression and Precision Matrix Estimation in R.","authors":"Xingguo Li, Tuo Zhao, Xiaoming Yuan, Han Liu","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>This paper describes an R package named flare, which implements a family of new high dimensional regression methods (LAD Lasso, SQRT Lasso, ℓ <sub><i>q</i></sub> Lasso, and Dantzig selector) and their extensions to sparse precision matrix estimation (TIGER and CLIME). These methods exploit different nonsmooth loss functions to gain modeling exibility, estimation robustness, and tuning insensitiveness. The developed solver is based on the alternating direction method of multipliers (ADMM), which is further accelerated by the multistage screening approach. The package flare is coded in double precision C, and called from R by a user-friendly interface. The memory usage is optimized by using the sparse matrix output. The experiments show that flare is efficient and can scale up to large problems.</p>","PeriodicalId":50161,"journal":{"name":"Journal of Machine Learning Research","volume":"16 ","pages":"553-557"},"PeriodicalIF":6.0,"publicationDate":"2015-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5360104/pdf/nihms752270.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"34850328","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
With the increasing size of today's data sets, finding the right parameter configuration in model selection via cross-validation can be an extremely time-consuming task. In this paper we propose an i...
{"title":"Fast cross-validation via sequential testing","authors":"KruegerTammo, PankninDanny, BraunMikio","doi":"10.5555/2789272.2886786","DOIUrl":"https://doi.org/10.5555/2789272.2886786","url":null,"abstract":"With the increasing size of today's data sets, nding the right parameter configuration in model selection via cross-validation can be an extremely time-consuming task. In this paper we propose an i...","PeriodicalId":50161,"journal":{"name":"Journal of Machine Learning Research","volume":"1 1","pages":""},"PeriodicalIF":6.0,"publicationDate":"2015-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"71135413","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}