
International Statistical Review: Latest Articles

An Optimised Optional Randomised Response Technique
IF 2 Zone 3 (Mathematics) Q1 Mathematics Pub Date: 2024-05-29 DOI: 10.1111/insr.12581
Kavya Pushadapu, Sarjinder Singh, Stephen A. Sedory
Summary: In this paper, we begin by reviewing the optional randomised response technique estimator (ORRTE) developed by Chaudhuri and Mukerjee for estimating the proportion of a sensitive characteristic in a population. We show that their estimator is unbiased and has smaller variance than Warner's estimator. We then develop an optimised optional randomised response technique estimator (OORRTE), which is shown to be more efficient than the ORRTE. Findings from simulation studies are discussed and interpreted for various situations. Sample sizes for Warner's estimator, the ORRTE and the OORRTE are computed based on the power analysis introduced by Ulrich, Schroter, Striegel and Simon. Finally, we include an application to real data on COVID‐19, treating it as a partially sensitive variable; that is, sensitive to some respondents but not to others. The data used are included in the paper, and the R code used in the simulation study is documented in the online material.
Citations: 0
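As background for the comparison described in the entry above, here is a minimal sketch of the classical Warner randomised response estimator. It is not the ORRTE or the OORRTE from the paper. It assumes that each of n respondents answers the sensitive statement with known probability p and its complement otherwise, so that P(yes) = pπ + (1 − p)(1 − π). The function name `warner_estimate` and the example numbers are illustrative assumptions.

```python
def warner_estimate(yes_count, n, p):
    """Warner's estimator of a sensitive proportion (illustrative sketch).

    yes_count: number of "yes" answers observed
    n:         number of respondents
    p:         known probability that the randomising device selects
               the sensitive statement (must differ from 0.5)
    Returns (pi_hat, var_hat), where var_hat is a simple plug-in
    variance estimate based on the observed "yes" proportion.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0 <= yes_count <= n:
        raise ValueError("yes_count must lie between 0 and n")
    if abs(p - 0.5) < 1e-12:
        raise ValueError("p = 0.5 makes the proportion unidentifiable")

    lam_hat = yes_count / n                       # observed "yes" proportion
    pi_hat = (lam_hat - (1 - p)) / (2 * p - 1)    # invert P(yes) = p*pi + (1-p)*(1-pi)
    var_hat = lam_hat * (1 - lam_hat) / (n * (2 * p - 1) ** 2)
    return pi_hat, var_hat


if __name__ == "__main__":
    # Hypothetical survey: 40 "yes" answers out of 100, device probability 0.7.
    pi_hat, var_hat = warner_estimate(40, 100, 0.7)
    print(f"pi_hat = {pi_hat:.4f}, var_hat = {var_hat:.4f}")
```

The estimate can fall outside [0, 1] in small samples; the sketch returns it unclipped so that this behaviour stays visible.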
Joint Robust Variable Selection of Mean and Covariance Model via Shrinkage Methods
IF 2 Zone 3 (Mathematics) Q1 Mathematics Pub Date: 2024-05-24 DOI: 10.1111/insr.12577
Y. Güney, Fulya Gokalp Yavuz, Olcay Arslan
A valuable and robust extension of the traditional joint mean and covariance models, when data are subject to outliers and/or heavy‐tailed outcomes, can be achieved by jointly modelling the location and scatter matrix of the multivariate t‐distribution. This model encompasses three models in itself, and the number of unknown parameters in the covariance model increases quadratically with the matrix size. As a result, selecting the important variables becomes crucial. In this context, variable selection combined with parameter estimation is considered under the normality assumption. However, because of the non‐robustness of the normal distribution, the resulting estimators will be sensitive to outliers and/or heavy‐tailedness in the data. This paper has two objectives to overcome these problems. The first is to obtain the maximum likelihood estimates of the parameters and to propose an expectation‐maximisation type algorithm as an alternative to the Fisher scoring algorithm in the literature. We also consider simultaneous parameter estimation and variable selection in the multivariate t joint location and scatter matrix models. The consistency and oracle properties of the regularised estimators are also established. Simulation studies and real data analysis are provided to assess the performance of the proposed methods.
Citations: 0
A Survey of Monte Carlo Methods for Noisy and Costly Densities With Application to Reinforcement Learning and ABC
IF 2 Zone 3 (Mathematics) Q1 Mathematics Pub Date: 2024-05-17 DOI: 10.1111/insr.12573
Fernando Llorente, Luca Martino, Jesse Read, David Delgado‐Gómez
Summary: This survey gives an overview of Monte Carlo methodologies that use surrogate models to deal with densities that are intractable, costly, and/or noisy. This type of problem can be found in numerous real‐world scenarios, including stochastic optimisation and reinforcement learning, where each evaluation of a density function may incur a computationally expensive or even physical (real‐world activity) cost and is likely to give different results each time. The surrogate model does not incur this cost, but there are important trade‐offs and considerations involved in the choice and design of such methodologies. We classify the different methodologies into three main classes and describe specific instances of algorithms under a unified notation. A modular scheme that encompasses the considered methods is also presented. A range of application scenarios is discussed, with special attention to the likelihood‐free setting and reinforcement learning. Several numerical comparisons are also provided.
Citations: 0
ODC and ROC Curves, Comparison Curves and Stochastic Dominance
IF 1.7 Zone 3 (Mathematics) Q1 STATISTICS & PROBABILITY Pub Date: 2024-05-08 DOI: 10.1111/insr.12571
Teresa Ledwina, Adam Zagdański

We discuss two novel approaches to inter-distributional comparisons in the classical two-sample problem. Our starting point is the ordinal dominance and receiver characteristic curves, denoted by ODC and ROC, respectively, which are very popular in several areas of statistics and data analysis and which we properly standardise and combine. The proposed new curves are termed the comparison curves. Their estimates, being weighted rank processes on (0,1), form the basis of inference. These weighted processes are intuitive, well-suited for visual inspection of the data at hand and also useful for constructing some formal inferential procedures. They can be applied to several variants of the two-sample problem. Their use can help improve some existing procedures both in terms of power and the ability to identify the sources of departures from the postulated model. To simplify interpretation of finite sample results, we restrict attention to values of the processes on a finite grid of points. This results in the so-called bar plots (B-plots), which readably summarise the information contained in the data. What is more, we show that B-plots along with adjusted simultaneous acceptance regions provide principled information about where the model departs from the data. This leads to a framework that facilitates identification of regions with locally significant differences.

We apply the considered techniques to a standard stochastic dominance testing problem. Some min-type statistics are introduced and investigated. A simulation study compares two tests pertinent to the comparison curves to well-established tests in the literature and demonstrates the strong and competitive performance of the former in many typical situations. Some real data applications illustrate the simplicity and practical usefulness of the proposed approaches. A range of other applications of the considered weighted processes is also briefly discussed.

Citations: 0
The Raise Regression: Justification, Properties and Application
IF 2 Zone 3 (Mathematics) Q1 Mathematics Pub Date: 2024-05-02 DOI: 10.1111/insr.12575
Román Salmerón‐Gómez, Catalina B. García‐García, José García‐Pérez
Summary: Multicollinearity inflates the variance of the ordinary least squares estimators because of the correlation between two or more independent variables (including the constant term). A widely applied solution is to estimate with penalised estimators such as the ridge estimator, which trade off some bias in the estimators to gain a reduction in their variance. Although the variance diminishes with these procedures, all seem to indicate that the inference and goodness of fit are controversial. Alternatively, the raise regression allows mitigation of the problems associated with multicollinearity without the loss of inference or the coefficient of determination. This paper completely formalises the raise estimator. For the first time, the norm of the estimator, the behaviour of the individual and joint significance, the behaviour of the mean squared error and the coefficient of variation are analysed. We also present the generalisation of the estimation and the relation between the raise and the residualisation estimators. To give a better understanding of raise regression, previous contributions are also summarised: its mean squared error, the variance inflation factor, the condition number, adequate selection of the variable to be raised, the successive raising, and the relation between the raise and the ridge estimator. The usefulness of the raise regression as an alternative to mitigate multicollinearity is illustrated with two empirical applications.
Citations: 0
One-Inflation and Zero-Truncation Count Data Modelling Revisited With a View on Horvitz–Thompson Estimation of Population Size
IF 1.7 Zone 3 (Mathematics) Q1 STATISTICS & PROBABILITY Pub Date: 2024-04-30 DOI: 10.1111/insr.12570
Dankmar Böhning, Herwig Friedl

Estimating the size of a hard-to-count population is a challenging matter. We consider uni-list approaches in which the count of identifications per unit is the basis of analysis. Unseen units have a zero count and do not occur in the sample, leading to a zero-truncated setting. Because of various mechanisms, one-inflation is a frequently occurring phenomenon that can lead to seriously biased estimates of population size. The current work reviews some recent advances on one-inflation and zero-truncation modelling, and focuses on the impact one-inflation has on population size estimation. The zero-truncated one-inflated and the one-inflated zero-truncated models are compared (also with the model ignoring one-inflation) in terms of Horvitz–Thompson estimation of population size. The simulation work shows clearly the biasing effect of ignoring one-inflation. Both models, the zero-truncated one-inflated and the one-inflated zero-truncated one, are suitable to model ongoing one-inflation. It is also important to choose an appropriate base-line distributional model. Finally, all models derived in the paper are illustrated on a number of case studies.

Citations: 0
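To make the zero-truncated setting in the entry above concrete, here is a minimal sketch of Horvitz–Thompson population size estimation under a plain zero-truncated Poisson base-line. This is the simple model that ignores one-inflation, not either of the one-inflated models compared in the paper. It assumes the observed counts are all at least 1, fits the Poisson rate by solving the likelihood equation λ / (1 − e^(−λ)) = mean count with bisection, and then divides the number of observed units by the estimated probability of being seen. The function name `zt_poisson_ht` and the example data are illustrative assumptions.

```python
import math


def zt_poisson_ht(counts):
    """Horvitz-Thompson size estimate under a zero-truncated Poisson.

    counts: observed identification counts per unit (each >= 1).
    Returns (lam_hat, n_hat): the fitted Poisson rate and the
    estimated population size n / (1 - exp(-lam_hat)).
    One-inflation is deliberately ignored in this baseline sketch.
    """
    n = len(counts)
    if n == 0 or min(counts) < 1:
        raise ValueError("need at least one count, all counts >= 1")
    mean = sum(counts) / n
    if mean <= 1.0:
        raise ValueError("mean count must exceed 1 for a finite estimate")

    def score(lam):
        # Zero-truncated Poisson mean minus the observed mean.
        # -expm1(-lam) equals 1 - exp(-lam), computed stably for small lam.
        return lam / (-math.expm1(-lam)) - mean

    # score is increasing in lam, negative near 0 and positive at lam = mean.
    lo, hi = 1e-12, mean
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if score(mid) < 0:
            lo = mid
        else:
            hi = mid
    lam_hat = 0.5 * (lo + hi)

    p_seen = -math.expm1(-lam_hat)   # estimated probability of a non-zero count
    return lam_hat, n / p_seen


if __name__ == "__main__":
    # Hypothetical data: 50 units seen once, 30 twice, 15 three times, 5 four times.
    data = [1] * 50 + [2] * 30 + [3] * 15 + [4] * 5
    lam_hat, n_hat = zt_poisson_ht(data)
    print(f"lambda_hat = {lam_hat:.4f}, N_hat = {n_hat:.1f}")
```

When the data contain an excess of ones, this baseline tends to fit a rate that is too small and hence a population size that is too large, which is the kind of bias the abstract attributes to ignoring one-inflation.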
Small Sample Inference for Two‐Way Capture‐Recapture Experiments
IF 2 Zone 3 (Mathematics) Q1 Mathematics Pub Date: 2024-04-24 DOI: 10.1111/insr.12574
Louis‐Paul Rivest, Mamadou Yauck
Summary: The properties of the generalised Waring distribution defined on the non‐negative integers are reviewed. Formulas for its moments and its mode are given. A construction as a mixture of negative binomial distributions is also presented. Then we turn to the Petersen model for estimating the population size in a two‐way capture‐recapture experiment. We construct a Bayesian model for the population size by combining a Waring prior with the hypergeometric distribution for the number of units caught twice in the experiment. Credible intervals for the population size are obtained using quantiles of the posterior, a generalised Waring distribution. The standard confidence interval for the population size constructed using the asymptotic variance of the Petersen estimator and the 0.5 logit transformed interval are shown to be special cases of the generalised Waring credible interval. The true coverage of this interval is shown to be bigger than or equal to its nominal coverage in small populations, regardless of the capture probabilities. In addition, its length is substantially smaller than that of the 0.5 logit transformed interval. Thus, the proposed generalised Waring credible interval appears to be the best way to quantify the uncertainty of the Petersen estimator for population size.
Citations: 0
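For orientation, the standard interval mentioned in the entry above can be sketched as follows. This is only the classical Petersen point estimate with a Wald-type interval, not the generalised Waring credible interval proposed in the paper. It assumes n1 units are marked on the first occasion, n2 are caught on the second, and m of those are recaptures, and it uses one commonly quoted large-sample variance approximation, n1·n2·(n1 − m)·(n2 − m) / m³. The function name `petersen_estimate` and the example numbers are illustrative assumptions.

```python
import math


def petersen_estimate(n1, n2, m, z=1.96):
    """Petersen estimate of population size with a Wald-type interval.

    n1: units caught (and marked) on the first occasion
    n2: units caught on the second occasion
    m:  units caught on both occasions (recaptures)
    z:  normal quantile for the interval (1.96 gives roughly 95%)
    Returns (n_hat, var_hat, (lower, upper)).
    """
    if m <= 0:
        raise ValueError("the Petersen estimator needs at least one recapture")
    if m > min(n1, n2):
        raise ValueError("recaptures cannot exceed either sample size")

    n_hat = n1 * n2 / m
    # Large-sample variance approximation; unreliable when m is small.
    var_hat = n1 * n2 * (n1 - m) * (n2 - m) / m ** 3
    se = math.sqrt(var_hat)
    return n_hat, var_hat, (n_hat - z * se, n_hat + z * se)


if __name__ == "__main__":
    # Hypothetical experiment: 100 marked, 80 caught later, 20 of them marked.
    n_hat, var_hat, (lower, upper) = petersen_estimate(100, 80, 20)
    print(f"N_hat = {n_hat:.0f}, interval = ({lower:.1f}, {upper:.1f})")
```

With few recaptures the symmetric interval can dip below the number of distinct units actually seen, which is one reason small-sample alternatives such as the one discussed in the abstract are of interest.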
Nonresponse Bias Analysis in Longitudinal Studies: A Comparative Review with an Application to the Early Childhood Longitudinal Study
IF 1.7 Zone 3 (Mathematics) Q1 STATISTICS & PROBABILITY Pub Date: 2024-03-19 DOI: 10.1111/insr.12566
Yajuan Si, Roderick J.A. Little, Ya Mo, Nell Sedransk

Longitudinal studies are subject to nonresponse when individuals fail to provide data for entire waves or particular questions of the survey. We compare approaches to nonresponse bias analysis (NRBA) in longitudinal studies and illustrate them on the Early Childhood Longitudinal Study, Kindergarten Class of 2010–2011 (ECLS-K:2011). Wave nonresponse with attrition often yields a monotone missingness pattern, and the missingness mechanism can be missing at random (MAR) or missing not at random (MNAR). We discuss weighting, multiple imputation (MI), incomplete data modelling and Bayesian approaches to NRBA for monotone patterns. Weighting adjustments can be effective when the constructed weights are correlated with the survey outcome of interest. MI allows for variables with missing values to be included in the imputation model, yielding potentially less biased and more efficient estimates. We add offsets in the MAR results to provide sensitivity analyses to assess MNAR deviations. We conduct NRBA for descriptive summaries and analytic model estimates in the ECLS-K:2011 application. The strength of evidence about our NRBA depends on the strength of the relationship between the fully observed variables and the key survey outcomes, so the key to a successful NRBA is to include strong predictors.

Citations: 0
Multidimensional Stationary Time Series: Dimension Reduction and Prediction. Marianna Bolla, Tamás Szabados. Routledge, 2023, xiv + 318 pages, $59.95, paperback, ISBN: 9780367619701
IF 2 Zone 3 Mathematics Q1 Mathematics Pub Date: 2024-03-12 DOI: 10.1111/insr.12567
Brian W. Sloboda
Citations: 0
A Slicing-Free Perspective to Sufficient Dimension Reduction: Selective Review and Recent Developments
IF 1.7 Zone 3 Mathematics Q1 STATISTICS & PROBABILITY Pub Date: 2024-03-07 DOI: 10.1111/insr.12565
Lu Li, Xiaofeng Shao, Zhou Yu

Since the pioneering work on sliced inverse regression, sufficient dimension reduction has grown into a mature field in statistics, with broad applications to regression diagnostics, data visualisation, image processing and machine learning. In this paper, we review several popular inverse regression methods, including the sliced inverse regression (SIR) method and the principal Hessian directions (PHD) method. In addition, we adopt a conditional characteristic function approach and develop a new class of slicing-free methods, which parallel the classical SIR and PHD and are named weighted inverse regression ensemble (WIRE) and weighted PHD (WPHD), respectively. The relationship with the recently developed martingale difference divergence matrix is also revealed. Numerical studies and a real data example show that the proposed slicing-free alternatives outperform SIR and PHD.
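
For readers unfamiliar with the classical method that the abstract takes as its starting point, here is a minimal sketch of sliced inverse regression. It implements only the textbook slicing-based SIR, not the authors' slicing-free WIRE or WPHD estimators. The function name `sliced_inverse_regression` and its parameters are hypothetical, and the predictor covariance matrix is assumed to be non-singular.

```python
import numpy as np

def sliced_inverse_regression(X, y, n_slices=10, n_directions=1):
    """Estimate dimension-reduction directions with classical SIR.

    Returns a (p, n_directions) array whose columns are unit-norm directions
    on the original predictor scale.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape

    # Standardise the predictors: Z = (X - mean) @ cov^{-1/2}.
    cov = np.cov(X, rowvar=False)
    evals, evecs = np.linalg.eigh(cov)
    inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T
    Z = (X - X.mean(axis=0)) @ inv_sqrt

    # Split the observations into slices of roughly equal size, ordered by y.
    slices = np.array_split(np.argsort(y), n_slices)

    # Weighted covariance of the slice means of Z.
    M = np.zeros((p, p))
    for idx in slices:
        slice_mean = Z[idx].mean(axis=0)
        M += (len(idx) / n) * np.outer(slice_mean, slice_mean)

    # Leading eigenvectors of M, mapped back to the original X scale.
    _, vecs = np.linalg.eigh(M)            # eigenvalues in ascending order
    top = vecs[:, ::-1][:, :n_directions]  # take the largest ones
    B = inv_sqrt @ top
    return B / np.linalg.norm(B, axis=0)
```

The result depends on the choice of `n_slices`, which is the tuning step that slicing-free approaches are designed to avoid.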

Citations: 0