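Null Distribution of the Test Statistic for Model Selection via Marginal Screening: Implications for Multivariate Regression Analysis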

A. V. Rubanovich, V. Saenko
Journal: Global Journal of Science Frontier Research
Published: 2021-10-21
DOI: https://doi.org/10.34257/gjsfrgvol21is1pg23

Abstract

Marginal screening (MS) is a computationally simple and commonly used dimension reduction procedure. In MS, a linear model is constructed from a few top predictors, chosen by the absolute value of their marginal correlations with the dependent variable. Importantly, when k predictors are selected out of m primary covariates, standard regression analysis may yield false-positive results if m >> k (Freedman's paradox). In this work, we provide analytical expressions describing the null distribution of the test statistic for model selection via MS. Using the theory of order statistics, we show that under MS the common F-statistic is distributed as the mean of the k largest of m independent random variables, each having a chi-square distribution with one degree of freedom (χ²₁). Based on this finding, we estimated critical p-values for multiple regression models after MS; comparing the p-values obtained in real studies against these critical values will help researchers avoid false-positive results. The analytical solutions obtained in this work are implemented in a free Excel spreadsheet program.
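
The effect described in the abstract can be illustrated with a small Monte Carlo sketch. This is not the authors' Excel implementation or their analytical solution. It only simulates the stated null distribution (the mean of the k largest of m independent χ²₁ variables) and compares it with the mean of k unscreened χ²₁ variables, which is used here as an assumed large-sample stand-in for the usual F reference distribution. All function names and the values m = 50, k = 5 are hypothetical choices made for illustration.

```python
import random


def top_k_mean_chi2(m, k, rng):
    """Mean of the k largest of m independent chi-square(1) draws.

    A squared standard normal draw has a chi-square distribution
    with one degree of freedom.
    """
    draws = sorted((rng.gauss(0.0, 1.0) ** 2 for _ in range(m)), reverse=True)
    return sum(draws[:k]) / k


def simulate_null(m, k, n_sims=2000, seed=0):
    """Sorted Monte Carlo sample of the top-k mean under the null."""
    rng = random.Random(seed)
    return sorted(top_k_mean_chi2(m, k, rng) for _ in range(n_sims))


def empirical_quantile(sorted_values, q):
    """Simple empirical quantile of an already sorted sample."""
    idx = min(len(sorted_values) - 1, int(q * len(sorted_values)))
    return sorted_values[idx]


def exceed_rate(values, threshold):
    """Fraction of simulated statistics that exceed a threshold."""
    return sum(v > threshold for v in values) / len(values)


if __name__ == "__main__":
    m, k = 50, 5  # hypothetical: 5 predictors screened from 50 candidates
    screened = simulate_null(m, k)    # k best of m candidates
    unscreened = simulate_null(k, k)  # no selection: mean of k chi-square(1) draws
    naive_crit = empirical_quantile(unscreened, 0.95)
    ms_crit = empirical_quantile(screened, 0.95)
    print("naive 5% critical value:", round(naive_crit, 2))
    print("critical value after screening:", round(ms_crit, 2))
    print("null rejection rate at the naive threshold:",
          round(exceed_rate(screened, naive_crit), 3))
```

Under these assumptions, a threshold calibrated without accounting for the screening step is exceeded by most pure-noise models. This is the false-positive mechanism the abstract attributes to Freedman's paradox.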