Journal of Applied Statistics: Latest Articles

Comparison of two statistical methodologies for a binary classification problem of two-dimensional images
IF 1.5; Zone 4 (Mathematics); Q2 Mathematics; Pub Date: 2023-11-15; DOI: 10.1080/02664763.2023.2279012
Deniz A. Sanchez S., Rubén D. Guevara G., Sergio A. Calderón V.
The present work intends to compare two statistical classification methods using images as covariates and under the comparison criterion of the ROC curve. The first implemented procedure is based o...
Citations: 0
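
The ROC-based comparison mentioned in the abstract above can be illustrated with a small sketch. The listing truncates the abstract, so the two classification procedures themselves are not reproduced here; the function name `auc` and the score lists are hypothetical, and the area under the ROC curve is computed with the standard Mann-Whitney pair-counting identity.

```python
from typing import Sequence


def auc(pos_scores: Sequence[float], neg_scores: Sequence[float]) -> float:
    """Area under the ROC curve via the Mann-Whitney pair-counting identity."""
    if not pos_scores or not neg_scores:
        raise ValueError("both classes need at least one score")
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # a tie counts as half a correct ordering
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical scores from two classifiers evaluated on the same labelled images.
method_a = auc([0.9, 0.8], [0.1, 0.85])
method_b = auc([0.6, 0.7], [0.2, 0.3])
```

Under this criterion, the method with the larger area would be preferred; a full comparison would also plot the two curves and account for sampling variability.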
GWR-assisted integrated estimator of finite population total under two-phase sampling: a model-assisted approach
Zone 4 (Mathematics); Q2 Mathematics; Pub Date: 2023-11-14; DOI: 10.1080/02664763.2023.2280879
Nobin Chandra Paul, Anil Rai, Tauqueer Ahmad, Ankur Biswas, Prachi Misra Sahoo
Abstract: In survey sampling, auxiliary information is used to precisely estimate the finite population parameters. There are several approaches available in the literature that provide a practical method for incorporating auxiliary information during the estimation stage. In order to effectively utilize the auxiliary information, a geographically weighted regression (GWR) model-assisted integrated estimator of finite population total under a two-phase sampling design has been proposed in this article. Spatial simulation studies have been conducted to empirically assess the statistical properties of the proposed estimator. In the presence of spatial non-stationarity, empirical findings reveal that the proposed estimator outperforms all existing estimators such as two-phase HT, ratio, and regression estimators, demonstrating the importance of spatial information in survey sampling.
Keywords: Data integration; geographically weighted regression; model-assisted approach; spatial non-stationarity; two-phase regression
Acknowledgement: The authors are thankful to the blind reviewers for providing valuable suggestions that have greatly enhanced the quality of the article. The first author would like to express his heartfelt gratitude to the ICAR-Indian Agricultural Statistics Research Institute, New Delhi, India, for providing the real CCE survey data, lab facilities, and overall support to conduct the research work during the Ph.D. programme.
Disclosure statement: No potential conflict of interest was reported by the authors.
Data availability statement: Data sharing is not applicable.
Citations: 0
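
As a rough illustration of the geographically weighted regression idea behind the entry above, the sketch below fits a local weighted least-squares line at one target location. It is not the paper's integrated two-phase estimator, whose form the abstract does not give. The Gaussian distance kernel, the single covariate, and the names `gwr_local_fit`, `coords`, and `bandwidth` are all assumptions made for the example.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def gwr_local_fit(
    coords: Sequence[Point],
    x: Sequence[float],
    y: Sequence[float],
    target: Point,
    bandwidth: float,
) -> Tuple[float, float]:
    """Local (intercept, slope) at `target` from kernel-weighted least squares."""
    # Gaussian kernel: nearby sample points get weights close to 1.
    weights = [math.exp(-0.5 * (math.dist(c, target) / bandwidth) ** 2) for c in coords]
    total = sum(weights)
    mean_x = sum(w * xi for w, xi in zip(weights, x)) / total
    mean_y = sum(w * yi for w, yi in zip(weights, y)) / total
    sxx = sum(w * (xi - mean_x) ** 2 for w, xi in zip(weights, x))
    sxy = sum(w * (xi - mean_x) * (yi - mean_y) for w, xi, yi in zip(weights, x, y))
    if sxx == 0.0:
        raise ValueError("covariate has no weighted variation at this location")
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Hypothetical data lying exactly on y = 2x + 1 at four locations.
coords = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
intercept, slope = gwr_local_fit(coords, [1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0], (0.5, 0.5), 1.0)
```

When the relationship varies over space, repeating the fit at each location yields location-specific coefficients, which is the spatial non-stationarity the abstract refers to.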
Forecasting of the true satellite carbon monoxide data with ensemble empirical mode decomposition, singular value decomposition and moving average
Zone 4 (Mathematics); Q2 Mathematics; Pub Date: 2023-11-14; DOI: 10.1080/02664763.2023.2277115
Sameer Poongadan, M. C. Lineesh
Abstract: Forecasting carbon monoxide in the atmosphere is essential, as it pollutes the atmosphere and hence causes severe health problems for humans. This study proposes a time-series forecasting technique, EEMD-SVD-MA, which combines Ensemble Empirical Mode Decomposition, Singular Value Decomposition and Moving Average, to forecast carbon monoxide data taken from the Indian region. The collected data are non-linear. The technique can be applied to non-stationary and non-linear data. The approach has three levels: the EEMD level, the SVD level and the MA level. The first level uses EEMD to decompose the data series into a limited number of Intrinsic Mode Function (IMF) components along with a residue. In the second level, SVD is used to denoise each IMF component. In the third level, each denoised IMF component is predicted by MA. The future values of the original data are obtained by adding all the predicted series of the components. In this study, we proposed two variants of the model, EEMD-SVD-MA(3) and EEMD-SVD-MA(4), and compared the results with other forecasting techniques, namely LSTM (Long Short Term Memory network), EMD-LSTM, EMD-MA, EEMD-MA and CEEMDAN-MA. The results show that the proposed EEMD-SVD-MA model is more efficient than other models.
Keywords: Intrinsic mode function; empirical mode decomposition; ensemble empirical mode decomposition; singular value decomposition; moving average; long short term memory network
Mathematics Subject Classifications: 37M10; 68T07; 15A18
Acknowledgments: The author's deep appreciation goes out to NASA's teams for AIRS/AMSU, MODIS and MOPITT data for tropospheric CO.
Disclosure statement: No potential conflict of interest was reported by the author(s).
Citations: 0
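
The third level and the final recombination step described in the abstract above can be sketched as follows. This is a minimal illustration that assumes MA(k) means a simple k-point moving average applied recursively (the abstract does not say whether an MA(q) time-series model is meant), and that the denoised components have already been produced by the EEMD and SVD levels, which are not implemented here. The names `ma_forecast` and `combine_component_forecasts` are hypothetical.

```python
from typing import List, Sequence


def ma_forecast(series: Sequence[float], window: int, steps: int) -> List[float]:
    """Recursive moving-average forecast: each new value is the mean of the
    last `window` values, whether observed or previously forecast."""
    if window < 1 or len(series) < window:
        raise ValueError("need at least `window` observations")
    history = list(series)
    forecasts = []
    for _ in range(steps):
        nxt = sum(history[-window:]) / window
        forecasts.append(nxt)
        history.append(nxt)
    return forecasts


def combine_component_forecasts(
    components: Sequence[Sequence[float]], window: int, steps: int
) -> List[float]:
    """Forecast every component separately, then add the forecasts pointwise."""
    per_component = [ma_forecast(c, window, steps) for c in components]
    return [sum(values) for values in zip(*per_component)]


# Hypothetical stand-ins for two denoised IMF components (or an IMF and the residue).
forecast = combine_component_forecasts([[1.0, 2.0, 3.0], [10.0, 10.0, 10.0]], window=3, steps=2)
```

With `window=3` this corresponds to the MA(3) variant under the stated assumption; `window=4` would give the MA(4) variant.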
Phase II control charts for monitoring the depth-ratio of ball-bearings involving three normal variables
Zone 4 (Mathematics); Q2 Mathematics; Pub Date: 2023-11-08; DOI: 10.1080/02664763.2023.2279015
Li Jin, Amitava Mukherjee, Zhi Song, Jiujun Zhang
Abstract: This paper investigates the problem of monitoring the ratio involving three variables, jointly distributed as trivariate normal. The Shewhart-type and two exponentially weighted moving average (EWMA) type schemes for monitoring depth ratio are proposed. The ratio of a normal variable to the average of two other normal variables has wide applications in natural science, production, and engineering. It is defined with slightly different terminology in various contexts, such as depth or aspect ratios. In modern bearing manufacturing, the aspect ratio of width to the average of inner and outer diameters can be an essential indicator of product quality and process stability. While there are many helpful existing charts for monitoring the three components separately or jointly when these characteristics follow a normal distribution, the ratio aspect is often ignored. The Shewhart-type schemes' exact and approximated control limits are considered and analyzed. Monte Carlo numerical studies are conducted using the average run length as a metric, with different values of the in-control ratio and of the correlation between the three variables. An application based on the parts manufacturing data illustrates the implementation design of the two control charts. The real-life data analysis shows the efficacy of the proposed monitoring schemes in practice.
Keywords: Charting schemes; parts manufacturing; ratio involving three variables; phase-II process monitoring; trivariate normal
Acknowledgments: The authors are grateful to the Editor-in-chief, Associate Editor and three reviewers for various constructive comments and suggestions.
Disclosure statement: No potential conflict of interest was reported by the author(s).
Notes: 1. https://products.emersonbearing.com/viewitems/deep-groove-radial-ball-bearings/6300-series-deep-groove-radial-ball-bearings
Funding: This work was supported by the National Natural Science Foundation of China [Grant Nos. 12171328, 12201429]; the Beijing Natural Science Foundation [Grant No. Z210003]; the Liaoning BaiQianWan Talents Program; the Natural Science Foundation of Liaoning Province [Grant Nos. 2020-MS-139, 2023-MS-142]; the Scientific Research Fund of Liaoning Provincial Education Department of China [Grant No. LJC202006]; the Project of Science and Research of Hebei Educational Department of China [Grant No. ZD2022020]; the Doctoral Research Start-up Fund of Liaoning Province [Grant No. 2021-BS-142]; the Research on Humanities and Social Sciences of the Ministry of Education [Grant No. 22YJC910009]; and the Research of economic and social development in Liaoning Province [Grant No. 20231s1ybkt-103].
Citations: 0
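
The monitored statistic and the EWMA recursion described in the abstract above can be sketched as follows. The exact and approximate control limits derived in the paper are not given in this listing, so the limits here are supplied by the caller and the numbers are purely illustrative. The names `depth_ratio`, `ewma`, and `first_signal` are hypothetical.

```python
from typing import List, Optional, Sequence


def depth_ratio(width: float, inner: float, outer: float) -> float:
    """Ratio of one variable to the average of the two others."""
    return width / ((inner + outer) / 2.0)


def ewma(values: Sequence[float], lam: float, start: float) -> List[float]:
    """EWMA recursion z_t = lam * v_t + (1 - lam) * z_{t-1}, with z_0 = start."""
    z = start
    out = []
    for v in values:
        z = lam * v + (1.0 - lam) * z
        out.append(z)
    return out


def first_signal(stats: Sequence[float], lcl: float, ucl: float) -> Optional[int]:
    """1-based index of the first statistic outside [lcl, ucl], or None."""
    for t, s in enumerate(stats, start=1):
        if s < lcl or s > ucl:
            return t
    return None


# Hypothetical measurements (width, inner diameter, outer diameter) per sampled part.
parts = [(10.0, 15.0, 25.0), (10.0, 15.0, 25.0), (14.0, 15.0, 25.0)]
ratios = [depth_ratio(*p) for p in parts]
smoothed = ewma(ratios, lam=0.5, start=0.5)
signal_at = first_signal(smoothed, lcl=0.45, ucl=0.55)
```

A Shewhart-type chart would apply `first_signal` to `ratios` directly; the run length of a simulated sequence is the value this function returns, and averaging it over many simulated sequences gives the average run length used as the metric in the abstract.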
Smoothing level selection for density estimators based on the moments
Zone 4 (Mathematics); Q2 Mathematics; Pub Date: 2023-11-07; DOI: 10.1080/02664763.2023.2277125
Rosa M. García-Fernández, Federico Palacios-González
Abstract: This paper introduces an approach to select the bandwidth or smoothing parameter in multiresolution (MR) density estimation and nonparametric density estimation. It is based on the evolution of the second, third and fourth central moments and the shape of the estimated densities for different bandwidths and resolution levels. The proposed method has been applied to density estimation by means of multiresolution densities as well as kernel density estimation (MRDE and KDE respectively). The results of the simulations and the empirical application demonstrate that the level of resolution resulting from the moments method performs better with multimodal densities than the Bayesian Information Criterion (BIC) for multiresolution densities estimation and the plug-in for kernel densities estimation.
Keywords: Multiresolution density estimation; kernel density estimation; bandwidth; moments and level of resolution
Disclosure statement: No potential conflict of interest was reported by the author(s).
Notes:
1. The multiresolution densities are a particular case of semiparametric models (see [12, 14]).
2. This is a well-known fact underlying all the bandwidth selection methods.
3. Recall that these intervals form a partition of the real line and their amplitude converges to zero as j increases.
4. Unless this is done parametrically using the EM algorithm on a mixture model of three double exponential distributions. But for a sample of size 10,000 the process time is too long.
5. Note that the values for the Gini coefficient can differ from other publications since our illustration is based on gross income instead of net income.
6. The expected value of the density is zero and the central and non-central moments are equal.
Citations: 0
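
To illustrate what "evolution of the central moments" with the bandwidth means in the abstract above, the sketch below computes the second, third and fourth central moments of a Gaussian kernel density estimate in closed form. The Gaussian kernel is an assumption; the paper's actual selection rule is not stated in the abstract and is not implemented. A Gaussian KDE with bandwidth h is the distribution of X + hZ, with X drawn from the sample and Z standard normal, which gives m2 + h^2, m3, and m4 + 6 h^2 m2 + 3 h^4. The function names are hypothetical.

```python
from typing import Sequence, Tuple


def central_moments(xs: Sequence[float]) -> Tuple[float, float, float]:
    """Second, third and fourth central moments of the sample (divisor n)."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    return m2, m3, m4


def gaussian_kde_moments(xs: Sequence[float], h: float) -> Tuple[float, float, float]:
    """Central moments of the Gaussian KDE built on `xs` with bandwidth `h`."""
    m2, m3, m4 = central_moments(xs)
    return m2 + h ** 2, m3, m4 + 6.0 * h ** 2 * m2 + 3.0 * h ** 4


# Track how the moments change over a hypothetical grid of bandwidths.
sample = [0.0, 2.0]
path = {h: gaussian_kde_moments(sample, h) for h in (0.0, 0.5, 1.0)}
```

At h = 0 the moments equal the sample moments, and the variance and fourth moment inflate as h grows while the third moment stays fixed for a symmetric kernel.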
A computationally efficient sequential regression imputation algorithm for multilevel data
Zone 4 (Mathematics); Q2 Mathematics; Pub Date: 2023-11-06; DOI: 10.1080/02664763.2023.2277669
Tugba Akkaya Hocagil, Recai M. Yucel
Abstract: Due to the computational burden, especially in high-dimensional settings, sequential imputation may not be practical. In this paper, we adopt computationally advantageous methods by sampling the missing data from their respective predictive distributions, which leads to significantly improved computation time in the class of variable-by-variable imputation algorithms. We assess the computational performance in a comprehensive simulation study. We then compare and contrast the performance of our algorithm with commonly used alternatives. The results show that our method has a significant advantage over the commonly used alternatives with respect to computational efficiency and inferential quality. Finally, we demonstrate our methods in a substantive problem aimed at investigating the effects of area-level behavioral, socioeconomic, and demographic characteristics on poor birth outcomes in New York State among singleton births.
Keywords: Sequential regression imputation; multilevel data; computational efficiency; fast variable by variable imputation; multiple imputation by chained equations
Acknowledgments: We thank Dr. Tabassum Insaf for providing assistance in accessing the New York State Vital Records Registry data.
Disclosure statement: No potential conflict of interest was reported by the author(s).
Citations: 0
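
The general variable-by-variable idea named in the abstract above can be sketched for the simplest case. This is a deliberately reduced illustration, not the paper's algorithm: it is single-level rather than multilevel, uses two variables with ordinary linear regressions, and draws only residual noise around the fitted line without drawing the regression parameters themselves. The names `fit_line` and `chained_impute` are hypothetical.

```python
import random
from typing import List, Optional, Sequence, Tuple


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float, float]:
    """Least-squares intercept, slope and residual standard deviation."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    intercept = my - slope * mx
    rss = sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys))
    return intercept, slope, (rss / max(n - 2, 1)) ** 0.5


def chained_impute(
    x: Sequence[Optional[float]],
    y: Sequence[Optional[float]],
    sweeps: int = 5,
    seed: int = 0,
) -> Tuple[List[float], List[float]]:
    """Alternate between imputing y given x and x given y."""
    rng = random.Random(seed)
    n = len(x)
    obs_x = [v for v in x if v is not None]
    obs_y = [v for v in y if v is not None]
    # Start from the observed means, then refine in sweeps.
    cur_x = [v if v is not None else sum(obs_x) / len(obs_x) for v in x]
    cur_y = [v if v is not None else sum(obs_y) / len(obs_y) for v in y]
    for _ in range(sweeps):
        for target, cur_t, cur_p in ((y, cur_y, cur_x), (x, cur_x, cur_y)):
            # Fit on rows where the target variable was actually observed.
            rows = [i for i in range(n) if target[i] is not None]
            a, b, sd = fit_line([cur_p[i] for i in rows], [cur_t[i] for i in rows])
            for i in range(n):
                if target[i] is None:
                    cur_t[i] = a + b * cur_p[i] + rng.gauss(0.0, sd)
    return cur_x, cur_y


# Hypothetical data with one missing value in each variable.
x_raw = [1.0, 2.0, 3.0, 4.0, None]
y_raw = [2.1, 3.9, None, 8.2, 9.8]
x_imp, y_imp = chained_impute(x_raw, y_raw)
```

Observed values are never overwritten; only the missing cells are redrawn on each sweep. Running the function with several seeds would give multiple completed data sets.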
An exact projection pursuit-based algorithm for multivariate two-sample nonparametric testing applicable to retrospective and group sequential studies
Zone 4 (Mathematics); Q2 Mathematics; Pub Date: 2023-11-06; DOI: 10.1080/02664763.2023.2277118
Li Zou, Gregory Gurevich, Albert Vexler
Abstract: Nonparametric tests for equality of multivariate distributions are frequently desired in research. It is commonly required that test-procedures based on relatively small samples of vectors accurately control the corresponding Type I Error (TIE) rates. Often, in the multivariate testing, extensions of null-distribution-free univariate methods, e.g., Kolmogorov-Smirnov and Cramér-von Mises type schemes, are not exact, since their null distributions depend on underlying data distributions. The present paper extends the density-based empirical likelihood technique in order to nonparametrically approximate the most powerful test for the multivariate two-sample (MTS) problem, yielding an exact finite-sample test statistic. We rigorously apply one-to-one-mapping between the equality of vectors' distributions and the equality of distributions of relevant univariate linear projections. We establish a general algorithm that simplifies the use of projection pursuit, employing only a few of the infinitely many linear combinations of observed vectors' components. The displayed distribution-free strategy is employed in retrospective and group sequential manners. A novel MTS nonparametric procedure in the group sequential manner is proposed. The asymptotic consistency of the proposed technique is shown. Monte Carlo studies demonstrate that the proposed procedures exhibit extremely high and stable power characteristics across a variety of settings. Supplementary materials for this article are available online.
Keywords: Density-based empirical likelihood; exact test; multivariate two-sample test; nonparametric test; projection pursuit
Acknowledgement: We are grateful to the Editor, the AE and two reviewers for helpful comments.
Disclosure statement: No potential conflict of interest was reported by the author(s).
Citations: 0
Robust estimation and bias-corrected empirical likelihood in generalized linear models with right censored data
Zone 4 Mathematics Q2 Mathematics Pub Date : 2023-11-03 DOI: 10.1080/02664763.2023.2277117
Liugen Xue, Junshan Xie, Xiaohui Yang
Abstract: In this paper, we study robust estimation and empirical likelihood for the regression parameter in generalized linear models with right censored data. A robust estimating equation is proposed to estimate the regression parameter, and the resulting estimator is consistent and asymptotically normal. A bias-corrected empirical log-likelihood ratio statistic for the regression parameter is constructed, and it is shown that the statistic converges weakly to a standard χ² distribution. The result can be used directly to construct a confidence region for the regression parameter. We use the bias correction method to calibrate the empirical log-likelihood ratio directly, so it does not need to be multiplied by an adjustment factor. We also propose a method for selecting the tuning parameters in the loss function. Simulation studies show that the estimator of the regression parameter is robust and that the bias-corrected empirical likelihood performs better than the normal approximation method. An example with a real dataset from Alzheimer's disease studies shows that the proposed method can be applied to practical problems. Keywords: generalized linear model; right censored data; robust estimation; empirical likelihood; regression parameter. Acknowledgments: The authors thank the Editor, Associate Editor and two referees for their helpful comments. The dataset used was provided by Dr. Chunling Liu of the Hong Kong Polytechnic University. The source of this dataset is available on https://adni.loni.usc.edu/about/. Disclosure statement: No potential conflict of interest was reported by the author(s). Funding: The research was supported by the National Natural Science Foundation of China (11971001), the Natural Science Foundation of Henan (222300420417), and the Science and Technology Project (2103004).
Citations: 0
Estimation of time-varying kernel densities and chronology of the impact of COVID-19 on financial markets
Zone 4 Mathematics Q2 Mathematics Pub Date : 2023-10-30 DOI: 10.1080/02664763.2023.2272226
Matthieu Garcin, Jules Klein, Sana Laaribi
Time-varying kernel density estimation relies on two free parameters: the bandwidth and the discount factor. We propose to select these parameters so as to minimize a criterion consistent with the traditional requirements for validating a probability density forecast. These requirements are the uniformity and the independence of the so-called probability integral transforms, which are the forecast time-varying cumulative distributions applied to the observations. We thus build a new numerical criterion incorporating both the uniformity and the independence properties by means of an adapted Kolmogorov-Smirnov statistic. We apply this method to financial markets during the COVID-19 crisis. We determine the time-varying density of daily price returns of several stock indices and, using various divergence statistics, we are able to describe the chronology of the crisis as well as regional disparities. For instance, we observe a more limited impact of COVID-19 on financial markets in China, a strong impact in the US, and a slow recovery in Europe.
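The parameter selection described in this abstract can be sketched in Python under strong simplifying assumptions. The sketch assumes a Gaussian kernel and exponentially decaying weights governed by a discount factor `omega`, and it scores a parameter pair only by the uniformity of the probability integral transforms (PITs), measured with a standard Kolmogorov-Smirnov distance. The authors' adapted statistic also accounts for independence, which is not reproduced here. All function names and the parameter grid are hypothetical, and the data are synthetic.

```python
import math
import random


def normal_cdf(u):
    """Standard normal CDF, used as the integrated Gaussian kernel."""
    return 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))


def discounted_weights(t, omega):
    """Normalized weights omega**(t-1-i) for observations i = 0..t-1 (newest is heaviest)."""
    raw = [omega ** (t - 1 - i) for i in range(t)]
    total = sum(raw)
    return [r / total for r in raw]


def forecast_cdf(history, x, bandwidth, omega):
    """Discounted Gaussian-kernel estimate of the CDF at x, using past data only."""
    weights = discounted_weights(len(history), omega)
    return sum(w * normal_cdf((x - xi) / bandwidth) for w, xi in zip(weights, history))


def pit_series(data, bandwidth, omega, burn_in=20):
    """PITs: each observation passed through the CDF forecast made just before it."""
    return [forecast_cdf(data[:t], data[t], bandwidth, omega)
            for t in range(burn_in, len(data))]


def ks_uniform(z):
    """Kolmogorov-Smirnov distance between the sample z and the uniform law on [0, 1]."""
    z = sorted(z)
    n = len(z)
    return max(max((i + 1) / n - v, v - i / n) for i, v in enumerate(z))


def select_parameters(data, bandwidths, omegas):
    """Grid search for the (bandwidth, omega) pair whose PITs look most uniform."""
    return min(((h, w) for h in bandwidths for w in omegas),
               key=lambda p: ks_uniform(pit_series(data, p[0], p[1])))


rng = random.Random(0)
returns = [rng.gauss(0.0, 0.01) for _ in range(150)]  # synthetic data, not market returns
print(select_parameters(returns, [0.002, 0.005, 0.01], [0.95, 0.99]))
```

A discount factor close to 1 weights the whole history almost equally, while a smaller value makes the estimated density react faster to recent observations; the grid search simply picks the pair whose one-step-ahead forecasts are best calibrated on the sample.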
Citations: 11
A longitudinal study of the influence of air pollutants on children: a robust multivariate approach
Zone 4 Mathematics Q2 Mathematics Pub Date : 2023-10-30 DOI: 10.1080/02664763.2023.2272228
Ian Meneghel Danilevicz, Pascal Bondon, Valdério Anselmo Reisen, Faradiba Sarquis Serpa
Citations: 0