This note presents a refined local approximation for the logarithm of the ratio between the negative multinomial probability mass function and a multivariate normal density, both having the same mean–covariance structure. This approximation, which is derived using Stirling's formula and a meticulous treatment of Taylor expansions, yields an upper bound on the Hellinger distance between the jittered negative multinomial distribution and the corresponding multivariate normal distribution. Upper bounds on the Le Cam distance between negative multinomial and multivariate normal experiments ensue.
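As a rough numerical illustration of the kind of distance the abstract bounds, the sketch below computes a Hellinger distance between a jittered negative binomial law (the one-dimensional special case of the negative multinomial) and the normal law with the same mean and variance. Everything here is an assumption made for illustration and is not taken from the paper: the function names, the parametrization (r, p), the jittering by Uniform(-1/2, 1/2) noise, and the convention H^2 = 1 - ∫ sqrt(f g).

```python
import math


def nb_logpmf(k, r, p):
    """Log-pmf of a negative binomial: k failures before the r-th success (assumed parametrization)."""
    return (math.lgamma(k + r) - math.lgamma(k + 1) - math.lgamma(r)
            + r * math.log(p) + k * math.log1p(-p))


def normal_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def hellinger_jittered_nb_vs_normal(r, p, grid=20):
    """Hellinger distance, with H^2 = 1 - integral of sqrt(f * g).

    f is the density of K + U, where K is negative binomial and U is
    Uniform(-1/2, 1/2), so f equals pmf(k) on [k - 1/2, k + 1/2).
    g is the normal density with the same mean and variance as K.
    """
    mu = r * (1.0 - p) / p
    sigma = math.sqrt(r * (1.0 - p)) / p
    k_max = int(mu + 12.0 * sigma) + 1  # truncate the negligible upper tail
    h = 1.0 / grid
    affinity = 0.0
    for k in range(k_max + 1):
        sqrt_pmf = math.exp(0.5 * nb_logpmf(k, r, p))
        # Midpoint rule for the integral of sqrt(g) over [k - 1/2, k + 1/2].
        cell = sum(
            math.sqrt(normal_pdf(k - 0.5 + (j + 0.5) * h, mu, sigma))
            for j in range(grid)
        ) * h
        affinity += sqrt_pmf * cell
    return math.sqrt(max(0.0, 1.0 - affinity))


if __name__ == "__main__":
    for r in (5, 50, 500):
        print(r, hellinger_jittered_nb_vs_normal(r, 0.5))
```

Running the script for increasing r shows the distance shrinking, which matches the asymptotic flavour of the result; the sketch says nothing about the rate or the constants established in the paper.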
{"title":"Asymptotic comparison of negative multinomial and multivariate normal experiments","authors":"Christian Genest, Frédéric Ouimet","doi":"10.1111/stan.12328","DOIUrl":"https://doi.org/10.1111/stan.12328","url":null,"abstract":"This note presents a refined local approximation for the logarithm of the ratio between the negative multinomial probability mass function and a multivariate normal density, both having the same mean–covariance structure. This approximation, which is derived using Stirling's formula and a meticulous treatment of Taylor expansions, yields an upper bound on the Hellinger distance between the jittered negative multinomial distribution and the corresponding multivariate normal distribution. Upper bounds on the Le Cam distance between negative multinomial and multivariate normal experiments ensue.","PeriodicalId":51178,"journal":{"name":"Statistica Neerlandica","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136019643","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In May 2023 a new editorial team, consisting of Edwin van den Heuvel, Veronica Vinciotti and myself, Ernst Wit, currently with the help of Casper Albers, took over from our predecessors, Nan van Geloven, Marijtje van Duijn, and Miroslav Ristic. We immediately moved to a new format, about which we informed you in the previous issue of Statistica Neerlandica, consisting of a fast-turnaround system with the aim of going from first submission to actual publication in two (!) months' time. Despite the efforts required on the part of the editors, associate editors and the publisher Wiley, the system seems to be working as intended. After the first half year, we are seeing the first fruits of this new approach. The first publications of the new regime are already available online, well within the 2-month target, and the number of quality submissions to Statistica Neerlandica is up. Clearly, we need to see how the system will develop in the future, but the first signs are encouraging. Besides the fast-turnaround system, we also want to diversify our offerings in the journal to better connect to the statistical society and the readers of Statistica Neerlandica. For eight decades we have only had the research paper, in which statistical researchers presented their research in a typical 20-page (or so) format. This trusted article type will obviously stay, as it is a universal scientific currency with value for researchers and readers alike. From now on, however, we will also accept three new article types, namely Short Notes, Tutorials and State-of-the-Art reviews. Short Notes are short contributions that present a single, original, and significant discovery for rapid dissemination. They are aimed at informing colleagues about a finding that is useful for the application, computation, methodology or theory of statistics. They should not be longer than 6–7 pages (2,500 words).
These short notes can avoid long introductions or conclusions, and they do not need a lot of contextualization. They are little statistical gems, nice insights, useful mathematical tools that can aid the rest of the statistical community. Tutorials are aimed at introducing a general area of statistical theory, statistical methodology, statistical computing, statistical software, or an important application area of statistics. The tutorial should act as an introduction to those not already familiar with the area, and as a review to others. The mathematical level of each depends upon the topic. In all cases, however, the tutorial will strive for the broadest possible audience of researchers who comprise the readership of Statistica Neerlandica. Whereas tutorials are introductory, State-of-the-Art reviews focus on the latest developments. State-of-the-Art reviews aim at bringing an interested statistical audience up-to-date in an important and topical subject in the field of statistics with a focus on summarizing the latest advances. Although they present advanced material, they should be accessible to dedicated and interested statisticians, not only to specialists. They should give interested researchers a way to get up to speed quickly on the state of the art of a statistical topic. State-of-the-Art articles should therefore be limited to about 12–13 pages (5,000 words). In our effort to make Statistica Neerlandica a Q1 journal again, we invite the community to send us suggestions, to give us feedback on how we are doing and, of course, to submit manuscripts to the journal, either as research papers or in one of the three new formats: Short Notes, Tutorials or State-of-the-Art reviews. We look forward to receiving many more high-quality submissions.
{"title":"New article types in <i>Statistica Neerlandica</i>","authors":"Ernst‐Jan Camiel Wit","doi":"10.1111/stan.12324","DOIUrl":"https://doi.org/10.1111/stan.12324","url":null,"abstract":"In May 2023 a new editorial team, consisting of Edwin van den Heuvel, Veronica Vinciotti and myself, Ernst Wit, currently with the help of Casper Albers, have taken over from our predecessors, Nan van Geloven, Marijtje van Duijn, and Miroslav Ristic. We have immediately moved to a new format, of which we have informed you in the previous issue of Statistica Neerlandica, consisting of a fast-turnaround system with the aim of going from first submission to actual publication in two (!) months' time. Despite the efforts required on the part of the editors, associate editors and the publisher Wiley, the system seems to be working as intended. After the first half year, we are seeing the first fruits of this new approach. The first publications of the new regime are already available online, well within the 2-month target, and the number of quality submissions to Statistica Neerlandica is up. Clearly, we need to see how the system will develop in the future, but the first signs are encouraging. Besides the fast-turnaround system, we also want to diversify our offerings in the journal to better connect to the statistical society and the readers of Statistica Neerlandica. For eight decades we have only had the research paper, in which statistical researchers presented their research in a typical 20-page (or so) format. This trusted article type will obviously stay, as it a universal scientific currency with value for researchers and readers alike. From now on, however, we will also accept three new article types, in particular, Short Notes, Tutorials and State-of-the-Art reviews. Short Notes are short contributions that present a single, original, and significant discovery for rapid dissemination. 
They are aimed at informing colleagues about a finding, useful for the application, computation, methodology or theory of statistics. They should not be longer than 6–7 pages (2,500 words). These short notes can avoid long introductions or conclusions, and they do not need a lot of contextualization. They are little statistical gems, nice insights, useful mathematical tools that can aid the rest of the statistical community. Tutorials are aimed at introducing a general area of statistical theory, statistical methodology, statistical computing, statistical software, or an important application area of statistics. The tutorial should act as an introduction to those not already familiar with the area, and as a review to others. The mathematical level of each depends upon the topic. In all cases, however, the tutorial will strive for the broadest possible audience of researchers who comprise the readership of Statistica Neerlandica. Whereas tutorials are introductory, State-of-the-Art reviews focus on the latest developments. State-of-the-Art reviews aim at bringing an interested statistical audience up-to-date in an important and topical subject in the field of statistics with a focus on summarizing the latest advances. Although they present advanced material, th","PeriodicalId":51178,"journal":{"name":"Statistica Neerlandica","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135995322","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The single‐index model is a very popular and powerful semiparametric model. As an improvement of the maximum rank correlation estimator, [[spiapacite]]bib1[[/spiapacite]] proposed the linearized maximum rank correlation estimator. We show that this estimator has some interesting connections with the distribution‐transformed least‐squares estimator for single‐index models. We also propose a rescaled distribution‐transformed least‐squares estimator, which is mathematically equivalent to the linearized maximum rank correlation estimator when the distribution of the response is absolutely continuous. Despite these nontrivial connections, the two estimation procedures differ in their motivations, interpretations, and applications, and we discuss some of these differences. This article is protected by copyright. All rights reserved.
{"title":"Connections between two classes of estimators for single‐index models","authors":"Weichao Yang, Xu Guo, Niwen Zhou, Changliang Zou","doi":"10.1111/stan.12329","DOIUrl":"https://doi.org/10.1111/stan.12329","url":null,"abstract":"Single‐index model is a very popular and powerful semiparametric model. As an improvement of the maximum rank correlation estimator, [[spiapacite]]bib1[[/spiapacite]] proposed the linearized maximum rank correlation estimator. We show that this estimator has some interesting connections with the distribution‐transformed least‐squares estimator for single‐index models. We also propose a rescaled distribution‐transformed least‐squares estimator, which is mathematically equivalent to the linearized maximum rank correlation estimator when the distribution of the response is absolutely continuous. Despite some nontrivial connections, the two estimation procedures are different in terms of motivations, interpretations, and applications. We discuss some of the differences between the two estimation procedures. This article is protected by copyright. All rights reserved.","PeriodicalId":51178,"journal":{"name":"Statistica Neerlandica","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-10-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135918330","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In this paper, we propose a new procedure to test the conditional independence assumption in causal inference for time series data. The conditional independence assumption is transformed into a nonparametric conditional moment test with the help of auxiliary variables, which are allowed to affect the policy choice provided that the dependence can be fully captured by potential outcomes and observable controls. When the policy choice is binary, a nonparametric test statistic is further developed for testing the conditional independence assumption conditional on the policy propensity score. Under some regularity conditions, we show that the proposed test statistics are asymptotically normal under the null hypotheses for time series data. In addition, the performance of the proposed methods is illustrated through Monte Carlo simulations and a real example considered in Angrist and Kuersteiner (2011).
{"title":"Testing Conditional Independence in Casual Inference for Time Series Data<sup>†</sup>","authors":"Zongwu Cai, Ying Fang, Ming Lin, Shengfang Tang","doi":"10.1111/stan.12323","DOIUrl":"https://doi.org/10.1111/stan.12323","url":null,"abstract":"In this paper, we propose a new procedure to test conditional independence assumption in studying casual inference for time series data. The conditional independence assumption is transformed to a nonparametric conditional moment test with the help of auxiliary variables which are allowed to affect policy choice but the dependence can be fully captured by potential outcomes and observable controls. When the policy choice is binary, a nonparametric statistic test is developed further for testing the conditional independence assumption conditional on policy propensity score. Under some regular conditions, we show that the proposed test statistics are asymptotically normal under the null hypotheses for time series data. In addition, the performances of the proposed methods are illustrated through Monte Carlo simulations and a real example considered in Angrist and Kuersteiner (2011).","PeriodicalId":51178,"journal":{"name":"Statistica Neerlandica","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-09-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135477036","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We provide a prior distribution for a functional parameter so that its trajectories are smooth and vanish on a given subset. This distribution can be interpreted as the distribution of an initial Gaussian process conditioned to be zero on a given subset. Precisely, we show that, with probability one, the initial Gaussian process is the sum of the conditioned process and an independent process, and that all the processes have the same almost sure regularity. This prior distribution is used to provide an interpretable estimate of the coefficient function in the linear scalar‐on‐function regression; by interpretable, we mean a smooth function that may be zero on some intervals. We apply our model in a simulation and in real case studies with two different priors for the null region of the coefficient function. In one case, the null region is known to be a single interval whose location is unknown. In the other case, it can be any unknown union of intervals.
{"title":"An Informative Prior distribution on Functions with Application to Functional Regression","authors":"C. Abraham","doi":"10.1111/stan.12322","DOIUrl":"https://doi.org/10.1111/stan.12322","url":null,"abstract":"We provide a prior distribution for a functional parameter so that its trajectories are smooth and vanish on a given subset. This distribution can be interpreted as the distribution of an initial Gaussian process conditioned to be zero on a given subset. Precisely, we show that the initial Gaussian process is the sum of the conditioned process and an independent process with probability one and that all the processes have the same almost sure regularity. This prior distribution is use to provide an interpretable estimate of the coefficient function in the linear scalar‐on‐function regression; by interpretable, we mean a smooth function that may possibly be zero on some intervals. We apply our model in a simulation and real case studies with two different priors for the null region of the coefficient function. In one case, the null region is known to be an unknown single interval. In the other case, it can be any unknown unions of intervals.This article is protected by copyright. All rights reserved.","PeriodicalId":51178,"journal":{"name":"Statistica Neerlandica","volume":null,"pages":null},"PeriodicalIF":1.5,"publicationDate":"2023-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90644063","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Sufficient dimension reduction (SDR) methods are effective tools for handling high dimensional data. Classical SDR methods are developed under the assumption that the data are completely observed. When the data are incomplete due to missing values, SDR has only been considered when the data are randomly missing, but not when they are non‐ignorably missing, which is arguably more difficult to handle because the missingness depends on the unobserved values themselves. The purpose of this paper is to fill this void. We propose an intuitive, easy‐to‐implement SDR estimator based on a semiparametric propensity score function for response data with non‐ignorable missing values. We refer to it as the dimension reduction‐based imputed estimator. We establish the theoretical properties of this estimator and examine its empirical performance via an extensive numerical study on real and simulated data. We also compare the performance of our proposed dimension reduction‐based imputed estimator with two competing estimators, the fusion refined estimator and the cumulative slicing estimator. A distinguishing feature of our method is that it requires no validation sample. The SDR theory developed in this paper is a non‐trivial extension of the existing literature, due to the technical challenges posed by non‐ignorable missingness. All the technical proofs of the theorems are given in the Online Supplementary Material.
{"title":"Semiparametric Recovery of Central Dimension Reduction Space with Nonignorable Nonresponse","authors":"Siming Zheng, Alan T.K. Wan, Yong Zhou","doi":"10.1111/stan.12321","DOIUrl":"https://doi.org/10.1111/stan.12321","url":null,"abstract":"Sufficient dimension reduction (SDR) methods are effective tools for handling high dimensional data. Classical SDR methods are developed under the assumption that the data are completely observed. When the data are incomplete due to missing values, SDR has only been considered when the data are randomly missing, but not when they are non‐ignorably missing, which is arguably more difficult to handle due to the missing values' dependence on the reasons they are missing. The purpose of this paper is to fill this void. We propose an intuitive, easy‐to‐implement SDR estimator based on a semiparametric propensity score function for response data with non‐ignorable missing values. We refer to it as the dimension reduction‐based imputed estimator. We establish the theoretical properties of this estimator and examine its empirical performance via an extensive numerical study on real and simulated data. As well, we compare the performance of our proposed dimension reduction‐based imputed estimator with two competing estimators, including the fusion refined estimator and cumulative slicing estimator. A distinguishing feature of our method is that it requires no validation sample. The SDR theory developed in this paper is a non‐trivial extension of the existing literature, due to the technical challenges posed by non‐ignorable missingness. All the technical proofs of the theorems are given in the Online Supplementary Material.This article is protected by copyright. 
All rights reserved.","PeriodicalId":51178,"journal":{"name":"Statistica Neerlandica","volume":null,"pages":null},"PeriodicalIF":1.5,"publicationDate":"2023-09-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89912126","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A. Martín Andrés, Álvarez Hernández M, Gayá Moreno F
Asymptotic inferences about the difference, ratio or odds‐ratio of two independent proportions are very common in diverse fields. This article defines for each parameter eight conditional inference methods. These methods depend on: (1) using a chi‐squared type statistic or a z type one; (2) using the classic Yates continuity correction or the less well‐known Conover one; and (3) whether the p‐value of the test is determined by doubling the one‐tailed p‐value or by the Mantel method (asymmetrical approach). In all cases, the conclusions are: (i) the methods based on the chi‐squared statistic should not be used, as they are too liberal; (ii) for those in favour of using the criterion of doubling the p‐value, the best method is using the z statistic with Conover continuity correction; and (iii) for those in favour of the asymmetrical approach, the best method is based on the z statistic with Conover continuity correction and the Mantel p‐value.
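To make one of the ingredients above concrete, here is a minimal sketch of a conditional z-type statistic for a 2×2 table with the classic Yates continuity correction, with the p-value obtained by doubling the one-tailed p-value. It assumes the textbook construction: conditioning on the margins, the top-left cell count has a hypergeometric mean and variance, and 0.5 is subtracted from the absolute deviation. The function name and the exact form of the statistic are assumptions for illustration and need not coincide with the article's definitions; the Conover correction and the Mantel p-value are not implemented here.

```python
import math


def yates_conditional_z(a, b, c, d):
    """Yates-corrected conditional z statistic for the 2x2 table [[a, b], [c, d]].

    Returns (z, p), where p doubles the one-tailed normal p-value.
    Illustrative sketch only; not the article's exact definition.
    """
    n = a + b + c + d
    r1, r2 = a + b, c + d          # row totals
    c1, c2 = a + c, b + d          # column totals
    expected = r1 * c1 / n         # hypergeometric mean of cell a
    variance = r1 * r2 * c1 * c2 / (n * n * (n - 1))  # hypergeometric variance
    # Yates correction: shrink the absolute deviation by 1/2, never below zero.
    corrected = max(abs(a - expected) - 0.5, 0.0)
    z = corrected / math.sqrt(variance)
    # 2 * (1 - Phi(z)) equals erfc(z / sqrt(2)) for z >= 0.
    p = min(1.0, math.erfc(z / math.sqrt(2.0)))
    return z, p


if __name__ == "__main__":
    print(yates_conditional_z(15, 5, 5, 15))
```

A perfectly balanced table such as (10, 10, 10, 10) gives z = 0 and p = 1, which is a quick sanity check on the correction being clipped at zero.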
{"title":"The Yates, Conover and Mantel statistics in 2×2 tables revisited (and extended)","authors":"A. Martín Andrés, Álvarez Hernández M, Gayá Moreno F","doi":"10.1111/stan.12320","DOIUrl":"https://doi.org/10.1111/stan.12320","url":null,"abstract":"Asymptotic inferences about the difference, ratio or odds‐ratio of two independent proportions are very common in diverse fields. This article defines for each parameter eight conditional inference methods. These methods depend on: (1) using a chi‐squared type statistic or a z type one; (2) using the classic Yates continuity correction or the less well‐known Conover one; and (3) whether the p‐value of the test is determined by doubling the one‐tailed p‐value or by the Mantel method (asymmetrical approach). In all cases, the conclusions are: (i) the methods based on the chi‐squared statistic should not be used, as they are too liberal; (ii) for those in favour of using the criterion of doubling the p‐value, the best method is using the z statistic with Conover continuity correction; and (iii) for those in favour of the asymmetrical approach, the best method is based on the z statistic with Conover continuity correction and the Mantel p‐value.This article is protected by copyright. All rights reserved.","PeriodicalId":51178,"journal":{"name":"Statistica Neerlandica","volume":null,"pages":null},"PeriodicalIF":1.5,"publicationDate":"2023-08-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84502447","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Estimation of the average treatment effect is one of the crucial problems in clinical trials with two or more treatments. Covariate‐adaptive randomization methods, such as minimization and the stratified permuted blocks method, are often applied to balance treatment assignments across prognostic factors in clinical trials. We propose a model‐free estimator of average treatment effects under covariate‐adaptive randomization methods, which is a least‐squares adjustment for the estimator of outcome models. The proposed estimator is applicable not only to the case of binary treatment but can also be extended to the case of multiple treatments. The proposed estimator is consistent and asymptotically normally distributed. Simulation studies show that the proposed estimator is comparable to Ye's estimator and performs better than Bugni's estimator when the outcome model is linear. When the outcome model is nonlinear, the proposed estimator has some advantages over the targeted maximum likelihood estimator, Bugni's estimator and Ye's estimator in terms of standard error and root mean squared error. The proposed estimator is stable with respect to the form of the outcome model. Finally, we apply the proposed methodology to a data set that studies the causal effect of the promotional videos mode on school‐age children's educational attainment in Peru.
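For readers unfamiliar with the stratified permuted blocks method mentioned above, the following sketch shows the basic mechanism for two arms: within each stratum, units are assigned by working through randomly permuted blocks that contain each arm equally often. The function name, the block size and the seeding are illustrative assumptions; this covers only the randomization step, not the estimator proposed in the article.

```python
import random
from collections import defaultdict


def stratified_permuted_blocks(strata, block_size=4, seed=0):
    """Assign arm 0 or 1 to each unit, in arrival order, by stratum.

    strata: sequence of hashable stratum labels, one per unit.
    Each stratum draws from its own shuffled block with equal numbers
    of 0s and 1s; a new block is opened when the current one is used up.
    """
    if block_size <= 0 or block_size % 2 != 0:
        raise ValueError("block_size must be a positive even integer")
    rng = random.Random(seed)
    pending = defaultdict(list)  # unused assignments of the open block, per stratum
    assignments = []
    for label in strata:
        if not pending[label]:
            block = [0, 1] * (block_size // 2)
            rng.shuffle(block)
            pending[label] = block
        assignments.append(pending[label].pop())
    return assignments


if __name__ == "__main__":
    strata = ["young", "old", "young", "young", "old", "young", "old", "old"]
    print(stratified_permuted_blocks(strata))
```

Whenever a stratum's sample size is a multiple of the block size, the two arms are exactly balanced within that stratum, which is the balancing property the abstract refers to.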
{"title":"Improved estimation of average treatment effects under covariate‐adaptive randomization methods","authors":"Jun Wang, Yahe Yu","doi":"10.1111/stan.12319","DOIUrl":"https://doi.org/10.1111/stan.12319","url":null,"abstract":"Estimation of the average treatment effect is one of the crucial problems in clinical trials for two or multiple treatments. The covariate‐adaptive randomization methods are often applied to balance treatment assignments across prognostic factors in clinical trials, such as the minimization and stratified permuted blocks method. We propose a model‐free estimator of average treatment effects under covariate‐adaptive randomization methods, which is least square adjustment for the estimator of outcome models. The proposed estimator is not only applicable to the case of binary treatment, but also can be extended to the case of multiple treatment. The proposed estimator is consistent and asymptotically normally distributed. Simulation studies show that the proposed estimator and Ye's estimator are comparable, and it performs better than Bugni's estimator when the outcome model is linear. The proposed estimator has some advantages over targeted maximum likelihood estimator, Bugni's estimator and Ye's estimator in terms of the standard error and root mean squared error when the outcome model is nonlinear. The proposed estimator is stable for the from of outcome model. Finally, we apply the proposed methodology to a data set that studies the causal effect promotional videos mode on the school‐age children's educational attainment in Peru.This article is protected by copyright. 
All rights reserved.","PeriodicalId":51178,"journal":{"name":"Statistica Neerlandica","volume":null,"pages":null},"PeriodicalIF":1.5,"publicationDate":"2023-08-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90073386","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Christopher Aguirre‐Hamilton, Stephen A. Sedory, Sarjinder Singh
We propose two types of estimators that are analogous to Franklin's model. One estimator is derived by concentrating on the row averages of the responses, and another is obtained by concentrating on the column averages of the observed responses. In the latter case we have two responses per respondent from a bivariate normal distribution. The proposed estimator based on row averages, by making use of negatively correlated random numbers from a multivariate density, is always more efficient than the corresponding Franklin's estimator. In the case of the proposed estimator based on column averages, we found that the use of positively correlated random numbers from a bivariate density can lead to the most efficient estimator. We also discuss results obtained by making use of three responses per respondent. When the three responses are recorded, three independent normal densities are derived from three correlated variables. The findings are supported by analytical, numerical and simulation studies. A simulation study was done to determine the minimum sample size required to produce non‐negative estimates of the population proportion of a sensitive characteristic, and to investigate the 95% nominal coverage by the interval estimates. Finally, a single best estimator is suggested. Derivations of the theoretical results and a discussion of the numerical and simulation studies are documented in the online supplementary material.
{"title":"Franklin's Randomized Response Model With Correlated Scrambled Variables","authors":"Christopher Aguirre‐Hamilton, Stephen A. Sedory, Sarjinder Singh","doi":"10.1111/stan.12318","DOIUrl":"https://doi.org/10.1111/stan.12318","url":null,"abstract":"We propose two types of estimators that are analogous to Franklin's model. One estimator is derived by concentrating on the row averages of the responses, and another is obtained by concentrating on the column averages of the observed responses. In the latter case we have two responses per respondent from a bi‐variate normal distribution. The proposed estimator based on row averages, by making use of negatively correlated random numbers from a multivariate density, is always more efficient than the corresponding Franklin's estimator. In the case of the proposed estimator based on column averages, we found that the use of positively correlated random numbers from a bivariate density can lead to the most efficient estimator. We also discuss results which are observed by making use of three responses per respondent. When the three responses are recorded, three independent normal densities are derived from three correlated variables. The findings are supported based on analytical, numerical and simulation studies. A simulation study was done to determine the minimum sample size required to produce non‐negative estimates of the population proportion of a sensitive characteristic, and to investigate the 95% nominal coverage by the interval estimates. Ultimately at the end, one best estimator is suggested. A very neat and clean derivations of theoretical results and discussion of numerical and simulation studies are documented in online supplementary material.This article is protected by copyright. 
All rights reserved.","PeriodicalId":51178,"journal":{"name":"Statistica Neerlandica","volume":null,"pages":null},"PeriodicalIF":1.5,"publicationDate":"2023-08-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89826934","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}