Canadian Journal of Statistics-Revue Canadienne De Statistique: Latest Articles

Smoothed model-assisted small area estimation of proportions
IF 0.6, Zone 4 (Mathematics), Q4 Mathematics, Pub Date: 2023-07-30, DOI: 10.1002/cjs.11787
Peter A. Gao, Jon Wakefield

In countries where population census data are limited, generating accurate subnational estimates of health and demographic indicators is challenging. Existing model-based geostatistical methods leverage covariate information and spatial smoothing to reduce the variability of estimates but often ignore the survey design, while traditional small area estimation approaches may not incorporate both unit-level covariate information and spatial smoothing in a design consistent way. We propose a smoothed model-assisted estimator that accounts for survey design and leverages both unit-level covariates and spatial smoothing. Under certain regularity assumptions, this estimator is both design consistent and model consistent. We compare it with existing design-based and model-based estimators using real and simulated data.

Citations: 0
A calibration method to stabilize estimation with missing data
IF 0.6, Zone 4 (Mathematics), Q4 Mathematics, Pub Date: 2023-07-30, DOI: 10.1002/cjs.11788
Baojiang Chen, Ao Yuan, Jing Qin

The augmented inverse weighting (AIW) estimator is commonly used to estimate the marginal mean of an outcome because of its doubly robust property. However, the AIW estimator can be severely biased if both the propensity score (PS) and the outcome regression (OR) models are misspecified. One possible reason is that misspecification of the PS or OR model yields extreme values in these models, which can have a great influence on the marginal mean estimate. In this article, we propose a calibrated AIW estimator for the marginal mean, which can control the influence of these extreme values and provide a stable marginal mean estimator. The proposed estimator also enjoys the doubly robust property. We also extend this method to handle high-dimensional covariates in PS and OR models. Asymptotic results are also developed. Extensive simulation studies show that the proposed method performs better in most cases than existing approaches by providing a more stable estimate. We apply this method to an AIDS clinical trial study.
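
For orientation, the standard (uncalibrated) AIW estimator that this abstract takes as its starting point can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' calibrated estimator: the function name `aiw_mean` and its arguments (`ps` for fitted propensity scores, `m` for outcome-regression predictions) are hypothetical, and both fitted models are assumed to be supplied by the caller.

```python
def aiw_mean(y, observed, ps, m):
    """Standard augmented inverse weighting (AIW) estimate of E[Y].

    y[i]        outcome; may be None when observed[i] is False
    observed[i] True if y[i] was observed
    ps[i]       fitted propensity score P(observed | x_i), assumed > 0
    m[i]        outcome-regression prediction of E[Y | x_i]
    """
    n = len(y)
    total = 0.0
    for yi, ri, pi, mi in zip(y, observed, ps, m):
        r = 1.0 if ri else 0.0
        y_val = yi if ri else 0.0  # a missing outcome contributes only via m[i]
        # Inverse-weighted outcome minus the augmentation term.
        total += r * y_val / pi - (r - pi) / pi * mi
    return total / n


if __name__ == "__main__":
    y = [1.0, 2.0, None, 4.0]
    observed = [True, True, False, True]
    ps = [0.75, 0.75, 0.75, 0.75]
    m = [1.0, 2.0, 3.0, 4.0]
    print(aiw_mean(y, observed, ps, m))
```

Each observed unit is weighted by `1 / ps[i]`, so a fitted propensity score close to zero inflates that unit's term. This appears to be the kind of extreme value the abstract refers to when it describes instability under model misspecification.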

Citations: 0
Oscillating neural circuits: Phase, amplitude, and the complex normal distribution
IF 0.6, Zone 4 (Mathematics), Q4 Mathematics, Pub Date: 2023-07-22, DOI: 10.1002/cjs.11790
Konrad N. Urban, Heejong Bong, Josue Orellana, Robert E. Kass

Multiple oscillating time series are typically analyzed in the frequency domain, where coherence is usually said to represent the magnitude of the correlation between two signals at a particular frequency. The correlation being referenced is complex-valued and is similar to the real-valued Pearson correlation in some ways but not others. We discuss the dependence among oscillating series in the context of the multivariate complex normal distribution, which plays a role for vectors of complex random variables analogous to the usual multivariate normal distribution for vectors of real-valued random variables. We emphasize special cases that are valuable for the neural data we are interested in and provide new variations on existing results. We then introduce a complex latent variable model for narrowly band-pass-filtered signals at some frequency, and show that the resulting maximum likelihood estimate produces a latent coherence that is equivalent to the magnitude of the complex canonical correlation at the given frequency. We also derive an equivalence between partial coherence and the magnitude of complex partial correlation, at a given frequency. Our theoretical framework leads to interpretable results for an interesting multivariate dataset from the Allen Institute for Brain Science.
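
The idea that coherence is the magnitude of a complex-valued correlation can be illustrated with a small sketch. It assumes the two signals have already been reduced to complex samples at the frequency of interest (for example, after narrow band-pass filtering); the helper names `complex_correlation` and `coherence_and_phase` are hypothetical, and this is a plain sample correlation, not the latent variable model proposed in the article.

```python
import cmath


def complex_correlation(z, w):
    """Sample complex correlation between two equal-length complex series.

    Assumes neither series is constant (otherwise the denominator is zero).
    """
    n = len(z)
    mz = sum(z) / n
    mw = sum(w) / n
    # Cross term uses the complex conjugate of the second series.
    cov = sum((a - mz) * (b - mw).conjugate() for a, b in zip(z, w))
    var_z = sum(abs(a - mz) ** 2 for a in z)
    var_w = sum(abs(b - mw) ** 2 for b in w)
    return cov / (var_z * var_w) ** 0.5


def coherence_and_phase(z, w):
    """Return (magnitude, phase) of the complex correlation."""
    rho = complex_correlation(z, w)
    return abs(rho), cmath.phase(rho)


if __name__ == "__main__":
    z = [1 + 0j, 1j, -1 + 0j, -1j]
    # w is z scaled by 2 and rotated by 0.5 radians.
    w = [2 * cmath.exp(0.5j) * a for a in z]
    print(coherence_and_phase(z, w))
```

Unlike the real-valued Pearson correlation, the result carries a phase as well as a magnitude: scaling and rotating one signal leaves the magnitude at 1 and moves all of the change into the phase.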

Citations: 2
On the correlation analysis of stocks with zero returns
IF 0.6, Zone 4 (Mathematics), Q4 Mathematics, Pub Date: 2023-07-09, DOI: 10.1002/cjs.11785
Hamdi Raïssi

The purpose of this article is to study serial correlations, allowing for unconditional heteroscedasticity and time-varying probabilities of zero financial returns. Depending on the set-up, we investigate how the standard autocorrelations can be accommodated to deliver an accurate representation of the serial correlations of stock price changes. We shed light on the properties of the different serial correlations measures by means of Monte Carlo experiments. Theoretical results are also illustrated on shares from the Chilean stock market and Facebook stock intraday data.

Citations: 0
A combined moment equation approach for spatial autoregressive models
IF 0.6, Zone 4 (Mathematics), Q4 Mathematics, Pub Date: 2023-07-08, DOI: 10.1002/cjs.11784
Jiaxin Liu, Hongliang Liu, Yi Li, Huazhen Lin

Existing methods for fitting spatial autoregressive models have various strengths and weaknesses. For example, the maximum likelihood estimation (MLE) approach yields efficient estimates but is computationally burdensome. Computationally efficient methods, such as generalized method of moments (GMMs) and spatial two-stage least squares (2SLS), typically require exogenous covariates to be significant, a restrictive assumption that may fail in practice. We propose a new estimating equation approach, termed combined moment equation (COME), which combines the first moment with covariance conditions on the residual terms. The proposed estimator is less computationally demanding than MLE and does not need the restrictive exogenous conditions as required by GMM and 2SLS. We show that the proposed estimator is consistent and establish its asymptotic distribution. Extensive simulations demonstrate that the proposed method outperforms the competitors in terms of bias, efficiency, and computation. We apply the proposed method to analyze an air pollution study, and obtain some interesting results about the spatial distribution of PM2.5 concentrations in Beijing.

Citations: 0
Analysis of Multivariate Survival Data under Semiparametric Copula Models
IF 0.6, Zone 4 (Mathematics), Q4 Mathematics, Pub Date: 2023-07-03, DOI: 10.1002/cjs.11776
Wenqing He, Grace Y. Yi, Ao Yuan

Modelling multivariate survival data is complicated by the complex association structure among the responses. To balance model flexibility and interpretability, we propose a semiparametric copula model to modulate multivariate survival data, with the marginal distributions of the response components described by semiparametric linear transformation models. To conduct inference about the model parameters, we develop a two-stage maximum likelihood method and a three-stage pseudo-likelihood estimation procedure. We investigate the impact of model misspecification on the estimation of covariate effects and identify a scenario in which consistent estimation of the marginal parameters is retained even when the copula model is misspecified. The proposed methods are justified both theoretically and empirically. An application to a real dataset is provided to demonstrate the utility of the proposed method.

Citations: 0
Rerandomization and optimal matching
IF 0.6, Zone 4 (Mathematics), Q4 Mathematics, Pub Date: 2023-07-03, DOI: 10.1002/cjs.11783
John D. Kalbfleisch, Zhenzhen Xu

On average, randomization achieves balance in covariate distributions between treatment groups; yet in practice, chance imbalance exists post randomization, which increases the error in estimating treatment effects. This is an important issue, especially in cluster randomized trials, where the experimental units (the clusters) are highly heterogeneous and relatively few in number. To address this, several restricted randomization designs have been proposed to balance on a few covariates of particular interest. More recently, approaches involving rerandomization have been proposed that aim to achieve simultaneous balance on several important prognostic factors. In this article, we comment on some properties of rerandomized designs and propose a new design for comparing two or more treatments. This design combines optimal nonbipartite matching of the subjects together with rerandomization, both aimed at minimizing a measure of distance between elements in blocks to achieve reductions in the mean squared error of estimated treatment effects. Compared with the existing alternatives, the proposed design can substantially reduce the mean squared error of the estimated treatment effect. This enhanced efficiency is evaluated both theoretically and empirically, and robustness properties are also noted. The design is generalized to three or more treatment arms.
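
Basic rerandomization, the building block this abstract refers to, can be sketched as follows: draw balanced random assignments repeatedly and accept the first one whose covariate imbalance falls below a threshold. This is only an illustration with a single covariate and an absolute difference in means as the balance measure; it does not include the optimal nonbipartite matching step of the proposed design, and the names `mean_difference`, `rerandomize`, `threshold`, and `max_draws` are hypothetical.

```python
import random


def mean_difference(x, assign):
    """Absolute difference in covariate means between the two arms."""
    treated = [xi for xi, a in zip(x, assign) if a == 1]
    control = [xi for xi, a in zip(x, assign) if a == 0]
    return abs(sum(treated) / len(treated) - sum(control) / len(control))


def rerandomize(x, threshold, max_draws=1000, seed=0):
    """Redraw balanced assignments until the imbalance is <= threshold.

    Returns the accepted assignment and its imbalance. If no draw meets
    the threshold within max_draws, the best draw seen is returned.
    """
    rng = random.Random(seed)
    n = len(x)
    base = [1] * (n // 2) + [0] * (n - n // 2)
    best, best_imbalance = None, float("inf")
    for _ in range(max_draws):
        assign = base[:]
        rng.shuffle(assign)
        imbalance = mean_difference(x, assign)
        if imbalance < best_imbalance:
            best, best_imbalance = assign, imbalance
        if imbalance <= threshold:
            break
    return best, best_imbalance


if __name__ == "__main__":
    covariate = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
    print(rerandomize(covariate, threshold=0.5))
```

With several prognostic factors, the scalar difference in means would typically be replaced by a multivariate distance; which distance the article uses is not stated in the abstract.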

Citations: 1
Nonparametric simulation extrapolation for measurement-error models
IF 0.6, Zone 4 (Mathematics), Q4 Mathematics, Pub Date: 2023-06-27, DOI: 10.1002/cjs.11777
Dylan Spicker, Michael P. Wallace, Grace Y. Yi

The presence of measurement error is a widespread issue, which, when ignored, can render the results of an analysis unreliable. Numerous corrections for the effects of measurement error have been proposed and studied, often under the assumption of a normally distributed, additive measurement-error model. One such method is simulation extrapolation (SIMEX). In many situations, observed data are nonsymmetric, heavy-tailed, or otherwise highly non-normal. In these settings, correction techniques relying on the assumption of normality are undesirable. We propose an extension of simulation extrapolation that is nonparametric in the sense that no specific distributional assumptions are required on the error terms. The technique can be implemented when either validation data or replicate measurements are available, and is designed to be immediately accessible to those familiar with simulation extrapolation.
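
For readers unfamiliar with simulation extrapolation, the classical version that this article extends can be sketched for a simple linear regression slope. The sketch assumes normally distributed additive error with a known standard deviation `sigma_u` and a quadratic extrapolant, which is exactly the parametric setting the article relaxes; it is not the nonparametric method. All function names and the simulated data are hypothetical.

```python
import random


def ols_slope(x, y):
    """Least-squares slope of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def quadratic_fit(ts, vs):
    """Least-squares (c0, c1, c2) for v ~ c0 + c1*t + c2*t^2."""
    rows = [[1.0, t, t * t] for t in ts]
    a = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    b = [sum(r[i] * v for r, v in zip(rows, vs)) for i in range(3)]
    # Gaussian elimination on the 3x3 normal equations.
    for i in range(3):
        for k in range(i + 1, 3):
            f = a[k][i] / a[i][i]
            a[k] = [akj - f * aij for akj, aij in zip(a[k], a[i])]
            b[k] -= f * b[i]
    c = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        c[i] = (b[i] - sum(a[i][j] * c[j] for j in range(i + 1, 3))) / a[i][i]
    return c


def simex_slope(w, y, sigma_u, lambdas=(0.0, 0.5, 1.0, 1.5, 2.0), reps=20, seed=0):
    """Classical SIMEX: add extra error, track the slope, extrapolate to -1."""
    rng = random.Random(seed)
    means = []
    for lam in lambdas:
        slopes = []
        for _ in range(reps):
            # Simulation step: inflate the error variance by a factor (1 + lam).
            w_lam = [wi + (lam ** 0.5) * sigma_u * rng.gauss(0.0, 1.0) for wi in w]
            slopes.append(ols_slope(w_lam, y))
        means.append(sum(slopes) / reps)
    c0, c1, c2 = quadratic_fit(lambdas, means)
    # Extrapolation step: evaluate the fitted curve at lambda = -1 (no error).
    return c0 - c1 + c2


def simulate_data(n, seed):
    """Toy data: true slope 2, covariate measured with unit-variance error."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    w = [xi + rng.gauss(0.0, 1.0) for xi in x]
    y = [2.0 * xi + rng.gauss(0.0, 0.5) for xi in x]
    return w, y


if __name__ == "__main__":
    w, y = simulate_data(4000, seed=1)
    print("naive slope:", ols_slope(w, y))
    print("SIMEX slope:", simex_slope(w, y, sigma_u=1.0))
```

In this toy setting the naive slope is attenuated toward zero and the quadratic extrapolation recovers only part of the gap to the true slope, a known limitation of the quadratic extrapolant. The simulation step is where the normality assumption enters, since the extra error is drawn from a normal distribution.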
Citations: 0
Objective model selection with parallel genetic algorithms using an eradication strategy
IF 0.6, Zone 4 (Mathematics), Q4 Mathematics, Pub Date: 2023-06-05, DOI: 10.1002/cjs.11775
Jean-François Plante, Maxime Larocque, Michel Adès

In supervised learning, feature selection methods identify the most relevant predictors to include in a model. For linear models, the inclusion or exclusion of each variable may be represented as a vector of bits playing the role of the genetic material that defines the model. Genetic algorithms reproduce the strategies of natural selection on a population of models to identify the best. We derive the distribution of the importance scores for parallel genetic algorithms under the null hypothesis that none of the features has predictive power. They, hence, provide an objective threshold for feature selection that does not require the visual inspection of a bubble plot. We also introduce the eradication strategy, akin to forward stepwise selection, where the genes of useful variables are sequentially forced into the models. The method is illustrated on real data, and simulation studies are run to describe its performance.

Citations: 0
Finite sample and asymptotic distributions of a statistic for sufficient follow-up in cure models
IF 0.6, Zone 4 (Mathematics), Q4 Mathematics, Pub Date: 2023-04-19, DOI: 10.1002/cjs.11771
Ross Maller, Sidney Resnick, Soudabeh Shemehsavar

The existence of immune or cured individuals in a population and whether there is sufficient follow-up in a sample of censored observations on their lifetimes to be confident of their presence are questions of major importance in medical survival analysis. Here we give a detailed analysis of a statistic designed to test for sufficient follow-up in a sample. Assuming an i.i.d. censoring model, we obtain exact finite-sample and asymptotic distributions for the statistic, and use these to calculate the power of a test based on it. A particularly useful finding is that the asymptotic distribution of the test statistic is parameter-free in the null case when follow-up is insufficient. The methods are illustrated with application to a glioma cancer dataset.

Citations: 0