A modified Nadaraya–Watson procedure for variable selection and nonparametric prediction with missing data

IF 0.8 · Zone 4 (Mathematics) · Q3 STATISTICS & PROBABILITY · Journal of Nonparametric Statistics · Pub Date: 2023-10-18 · DOI: 10.1080/10485252.2023.2270079

Kin Yap Cheung, Stephen M. S. Lee


Citations: 0

Abstract

We propose a new method for variable selection and prediction under a nonparametric regression setting, where a covariate may be missing either because its value is hidden from the observer or because it is inapplicable to the particular subject being observed. Despite its practical relevance, the problem has received little attention in the literature and its solutions are largely non-existent. Our proposal hinges on the construction of a modified Nadaraya–Watson estimator of the conditional mean regression function, with its bandwidths regularised to select variables and its weights adapted to accommodate different types of missingness. The method allows for information sharing across different missing data patterns without affecting consistency of the estimator. Unlike other conventional methods such as those based on imputations or likelihoods, our method requires only mild assumptions on the model and the missingness mechanism. For prediction we focus on finding relevant variables for predicting mean responses, conditional on covariate vectors subject to a given type of missingness. Our theoretical and numerical results show that the new method is consistent in variable selection and yields better prediction accuracy compared to existing methods.

KEYWORDS: Nadaraya–Watson estimator; missing data; nonparametric regression; variable selection

Disclosure statement: No potential conflict of interest was reported by the author(s).
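For orientation, the following is a minimal sketch of the classical Nadaraya–Watson estimator with one bandwidth per covariate, not the authors' modified procedure. It only illustrates why bandwidths can act as a variable-selection device: as a covariate's bandwidth grows without bound, that covariate stops influencing the kernel weights. The function name `nadaraya_watson`, the Gaussian product kernel, and the simulated data are assumptions made for illustration; the regularisation of bandwidths and the missingness-adapted weights described in the abstract are not implemented here.

```python
import numpy as np


def nadaraya_watson(x0, X, y, bandwidths):
    """Classical Nadaraya-Watson estimate of E[Y | X = x0].

    Uses a Gaussian product kernel with one bandwidth per covariate.
    Illustrative only: this is the textbook estimator, not the modified
    version proposed in the paper.
    """
    X = np.asarray(X, dtype=float)            # shape (n, d)
    y = np.asarray(y, dtype=float)            # shape (n,)
    h = np.asarray(bandwidths, dtype=float)   # shape (d,)
    # Scaled differences; an infinite bandwidth yields zeros in that column,
    # so the corresponding covariate drops out of the weights.
    u = (X - np.asarray(x0, dtype=float)) / h
    log_w = -0.5 * np.sum(u ** 2, axis=1)
    w = np.exp(log_w - log_w.max())           # subtract max for stability
    return float(np.sum(w * y) / np.sum(w))


# Hypothetical data: the response depends on the first covariate only.
rng = np.random.default_rng(0)
X = rng.uniform(size=(200, 2))
y = np.sin(2 * np.pi * X[:, 0]) + 0.1 * rng.normal(size=200)

# Infinite bandwidth on the second covariate removes it from the fit.
est_both = nadaraya_watson([0.25, 0.5], X, y, [0.1, np.inf])
est_first_only = nadaraya_watson([0.25], X[:, :1], y, [0.1])
print(est_both, est_first_only)
```

In this sketch the two printed estimates coincide, because the second covariate contributes nothing to the weights once its bandwidth is infinite. A data-driven procedure would have to choose the bandwidths rather than fix them by hand, which is where a regularisation scheme such as the one the abstract mentions would come in.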




Source journal

Journal of Nonparametric Statistics (Mathematics: Statistics & Probability)

CiteScore

1.50

Self-citation rate

8.30%

Articles published

Review time

6-12 weeks

About the journal: Journal of Nonparametric Statistics provides a medium for the publication of research and survey work in nonparametric statistics and related areas. The scope includes, but is not limited to, the following topics: Nonparametric modeling, Nonparametric function estimation, Rank and other robust and distribution-free procedures, Resampling methods, Lack-of-fit testing, Multivariate analysis, Inference with high-dimensional data, Dimension reduction and variable selection, Methods for errors in variables, missing, censored, and other incomplete data structures, Inference of stochastic processes, Sample surveys, Time series analysis, Longitudinal and functional data analysis, Nonparametric Bayes methods and decision procedures, Semiparametric models and procedures, Statistical methods for imaging and tomography, Statistical inverse problems, Financial statistics and econometrics, Bioinformatics and comparative genomics, Statistical algorithms and machine learning. Both the theory and applications of nonparametric statistics are covered in the journal. Research applying nonparametric methods to medicine, engineering, technology, science and humanities is welcomed, provided the novelty and quality level are of the highest order. Authors are encouraged to submit supplementary technical arguments, computer code, data analysed in the paper or any additional information for online publication along with the published paper.