高维广义线性模型的基于正则化的自适应检验。

IF 5.2 3区计算机科学 Q1 AUTOMATION & CONTROL SYSTEMS Journal of Machine Learning Research Pub Date : 2020-01-01 Epub Date: 2020-07-26

Chong Wu, Gongjun Xu, Xiaotong Shen, Wei Pan

{"title":"高维广义线性模型的基于正则化的自适应检验。","authors":"Chong Wu, Gongjun Xu, Xiaotong Shen, Wei Pan","doi":"","DOIUrl":null,"url":null,"abstract":"In spite of its urgent importance in the era of big data, testing high-dimensional parameters in generalized linear models (GLMs) in the presence of high-dimensional nuisance parameters has been largely under-studied, especially with regard to constructing powerful tests for general (and unknown) alternatives. Most existing tests are powerful only against certain alternatives and may yield incorrect Type I error rates under high-dimensional nuisance parameter situations. In this paper, we propose the adaptive interaction sum of powered score (aiSPU) test in the framework of penalized regression with a non-convex penalty, called truncated Lasso penalty (TLP), which can maintain correct Type I error rates while yielding high statistical power across a wide range of alternatives. To calculate its p-values analytically, we derive its asymptotic null distribution. Via simulations, its superior finite-sample performance is demonstrated over several representative existing methods. In addition, we apply it and other representative tests to an Alzheimer's Disease Neuroimaging Initiative (ADNI) data set, detecting possible gene-gender interactions for Alzheimer's disease. We also put R package \"aispu\" implementing the proposed test on GitHub.","PeriodicalId":50161,"journal":{"name":"Journal of Machine Learning Research","volume":"21 ","pages":""},"PeriodicalIF":5.2000,"publicationDate":"2020-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7425805/pdf/","citationCount":"0","resultStr":"{\"title\":\"A Regularization-Based Adaptive Test for High-Dimensional Generalized Linear Models.\",\"authors\":\"Chong Wu, Gongjun Xu, Xiaotong Shen, Wei Pan\",\"doi\":\"\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In spite of its urgent importance in the era of big data, testing high-dimensional parameters in generalized linear models (GLMs) in the presence of high-dimensional nuisance parameters has been largely under-studied, especially with regard to constructing powerful tests for general (and unknown) alternatives. Most existing tests are powerful only against certain alternatives and may yield incorrect Type I error rates under high-dimensional nuisance parameter situations. In this paper, we propose the adaptive interaction sum of powered score (aiSPU) test in the framework of penalized regression with a non-convex penalty, called truncated Lasso penalty (TLP), which can maintain correct Type I error rates while yielding high statistical power across a wide range of alternatives. To calculate its p-values analytically, we derive its asymptotic null distribution. Via simulations, its superior finite-sample performance is demonstrated over several representative existing methods. In addition, we apply it and other representative tests to an Alzheimer's Disease Neuroimaging Initiative (ADNI) data set, detecting possible gene-gender interactions for Alzheimer's disease. We also put R package \\\"aispu\\\" implementing the proposed test on GitHub.\",\"PeriodicalId\":50161,\"journal\":{\"name\":\"Journal of Machine Learning Research\",\"volume\":\"21 \",\"pages\":\"\"},\"PeriodicalIF\":5.2000,\"publicationDate\":\"2020-01-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7425805/pdf/\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Journal of Machine Learning Research\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"\",\"RegionNum\":3,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"2020/7/26 0:00:00\",\"PubModel\":\"Epub\",\"JCR\":\"Q1\",\"JCRName\":\"AUTOMATION & CONTROL SYSTEMS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Machine Learning Research","FirstCategoryId":"94","ListUrlMain":"","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2020/7/26 0:00:00","PubModel":"Epub","JCR":"Q1","JCRName":"AUTOMATION & CONTROL SYSTEMS","Score":null,"Total":0}

引用次数: 0

摘要

尽管在大数据时代具有紧迫的重要性，但在存在高维干扰参数的情况下测试广义线性模型（GLM）中的高维参数在很大程度上被研究不足，尤其是在为一般（和未知）替代方案构建强大的测试方面。大多数现有的测试仅针对某些替代方案是强大的，并且在高维干扰参数情况下可能产生不正确的I型错误率。在本文中，我们提出了在具有非凸惩罚的惩罚回归框架下的自适应交互和幂分数（aiSPU）检验，称为截断Lasso惩罚（TLP），它可以保持正确的I型错误率，同时在广泛的备选方案中产生高统计幂。为了解析地计算它的p值，我们导出了它的渐近零分布。通过仿真，与几种具有代表性的现有方法相比，证明了其优越的有限样本性能。此外，我们将其和其他具有代表性的测试应用于阿尔茨海默病神经成像倡议（ADNI）数据集，检测阿尔茨海默病可能的基因-性别相互作用。我们还在GitHub上放了R包“aispu”来实现所提出的测试。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

摘要图片

微信好友朋友圈 QQ好友复制链接

本刊更多论文

A Regularization-Based Adaptive Test for High-Dimensional Generalized Linear Models.

In spite of its urgent importance in the era of big data, testing high-dimensional parameters in generalized linear models (GLMs) in the presence of high-dimensional nuisance parameters has been largely under-studied, especially with regard to constructing powerful tests for general (and unknown) alternatives. Most existing tests are powerful only against certain alternatives and may yield incorrect Type I error rates under high-dimensional nuisance parameter situations. In this paper, we propose the adaptive interaction sum of powered score (aiSPU) test in the framework of penalized regression with a non-convex penalty, called truncated Lasso penalty (TLP), which can maintain correct Type I error rates while yielding high statistical power across a wide range of alternatives. To calculate its p-values analytically, we derive its asymptotic null distribution. Via simulations, its superior finite-sample performance is demonstrated over several representative existing methods. In addition, we apply it and other representative tests to an Alzheimer's Disease Neuroimaging Initiative (ADNI) data set, detecting possible gene-gender interactions for Alzheimer's disease. We also put R package "aispu" implementing the proposed test on GitHub.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Journal of Machine Learning Research 工程技术-计算机：人工智能

CiteScore

18.80

自引率

0.00%

发文量

审稿时长

3 months

期刊介绍： The Journal of Machine Learning Research (JMLR) provides an international forum for the electronic and paper publication of high-quality scholarly articles in all areas of machine learning. All published papers are freely available online. JMLR has a commitment to rigorous yet rapid reviewing. JMLR seeks previously unpublished papers on machine learning that contain: new principled algorithms with sound empirical validation, and with justification of theoretical, psychological, or biological nature; experimental and/or theoretical studies yielding new insight into the design and behavior of learning in intelligent systems; accounts of applications of existing techniques that shed light on the strengths and weaknesses of the methods; formalization of new learning tasks (e.g., in the context of new applications) and of methods for assessing performance on those tasks; development of new analytical frameworks that advance theoretical studies of practical learning methods; computational models of data from natural learning systems at the behavioral or neural level; or extremely well-written surveys of existing work.