A modified Nadaraya–Watson procedure for variable selection and nonparametric prediction with missing data

Journal of Nonparametric Statistics (IF 0.8; Region 4, Mathematics; JCR Q3, Statistics & Probability). Pub Date: 2023-10-18. DOI: 10.1080/10485252.2023.2270079
Kin Yap Cheung, Stephen M. S. Lee
Citations: 0

Abstract

We propose a new method for variable selection and prediction under a nonparametric regression setting, where a covariate may be missing either because its value is hidden from the observer or because it is inapplicable to the particular subject being observed. Despite its practical relevance, the problem has received little attention in the literature and its solutions are largely non-existent. Our proposal hinges on the construction of a modified Nadaraya–Watson estimator of the conditional mean regression function, with its bandwidths regularised to select variables and its weights adapted to accommodate different types of missingness. The method allows for information sharing across different missing data patterns without affecting consistency of the estimator. Unlike other conventional methods such as those based on imputations or likelihoods, our method requires only mild assumptions on the model and the missingness mechanism. For prediction we focus on finding relevant variables for predicting mean responses, conditional on covariate vectors subject to a given type of missingness. Our theoretical and numerical results show that the new method is consistent in variable selection and yields better prediction accuracy compared to existing methods.

KEYWORDS: Nadaraya–Watson estimator; missing data; nonparametric regression; variable selection

Disclosure statement: No potential conflict of interest was reported by the author(s).
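As background for the abstract above, the following is a minimal sketch of a standard Nadaraya–Watson estimator with one bandwidth per covariate. It is not the authors' modified procedure: the function name `nw_predict`, the Gaussian product kernel, and the simple rule for missing entries (use only the coordinates observed in the query, and only training rows observed on those coordinates) are assumptions made for illustration. It shows why bandwidths can act as a selection device: an infinite bandwidth makes a covariate's kernel factor constant, so that covariate no longer influences the prediction.

```python
import numpy as np


def nw_predict(X, y, x0, h):
    """Nadaraya-Watson estimate of E[y | x0] with a Gaussian product kernel.

    X  : (n, d) training covariates, NaN marks a missing entry
    y  : (n,) responses
    x0 : (d,) query point, NaN marks a missing entry
    h  : (d,) per-covariate bandwidths; np.inf drops a covariate

    Illustrative simplification, not the paper's weighting scheme.
    """
    X, y, x0, h = (np.asarray(a, dtype=float) for a in (X, y, x0, h))
    use = ~np.isnan(x0) & np.isfinite(h)        # coordinates entering the kernel
    Xu = X[:, use]
    usable = ~np.isnan(Xu).any(axis=1)          # rows observed on those coordinates
    z = (Xu[usable] - x0[use]) / h[use]
    w = np.exp(-0.5 * np.sum(z ** 2, axis=1))   # product of Gaussian kernel factors
    if w.sum() == 0.0:
        return np.nan                           # no usable neighbours
    return float(np.sum(w * y[usable]) / w.sum())


# Synthetic data: the response depends on the first covariate only.
rng = np.random.default_rng(0)
n = 400
X = rng.uniform(-1, 1, size=(n, 2))
y = np.sin(np.pi * X[:, 0]) + 0.1 * rng.standard_normal(n)
X[rng.random(n) < 0.2, 1] = np.nan              # hide some values of covariate 2

x0 = np.array([0.5, np.nan])                    # query with covariate 2 missing
pred = nw_predict(X, y, x0, h=np.array([0.1, np.inf]))
flat = nw_predict(X, y, x0, h=np.array([np.inf, np.inf]))
```

Here `pred` should lie close to sin(π/2) = 1, while `flat`, with every bandwidth infinite, reduces to the sample mean of `y`. A regularised procedure of the kind the abstract describes would choose the bandwidths from the data instead of fixing them by hand.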
Source journal
Journal of Nonparametric Statistics (Mathematics: Statistics & Probability)
CiteScore: 1.50
Self-citation rate: 8.30%
Publication volume: 42
Review time: 6-12 weeks
About the journal: Journal of Nonparametric Statistics provides a medium for the publication of research and survey work in nonparametric statistics and related areas. The scope includes, but is not limited to, the following topics: Nonparametric modeling, Nonparametric function estimation, Rank and other robust and distribution-free procedures, Resampling methods, Lack-of-fit testing, Multivariate analysis, Inference with high-dimensional data, Dimension reduction and variable selection, Methods for errors in variables, missing, censored, and other incomplete data structures, Inference of stochastic processes, Sample surveys, Time series analysis, Longitudinal and functional data analysis, Nonparametric Bayes methods and decision procedures, Semiparametric models and procedures, Statistical methods for imaging and tomography, Statistical inverse problems, Financial statistics and econometrics, Bioinformatics and comparative genomics, Statistical algorithms and machine learning. Both the theory and applications of nonparametric statistics are covered in the journal. Research applying nonparametric methods to medicine, engineering, technology, science and humanities is welcomed, provided the novelty and quality level are of the highest order. Authors are encouraged to submit supplementary technical arguments, computer code, data analysed in the paper or any additional information for online publication along with the published paper.