
International Statistical Review: Latest Articles

Who Is My Neighbour? A Discussion of ‘A Socio-Demographic Latent Space Approach to Spatial Data When Geography Is Important but Not All-Important’
IF 1.8 Zone 3 (Mathematics) Q1 STATISTICS & PROBABILITY Pub Date: 2025-09-02 DOI: 10.1111/insr.70011 Vol. 93(3), pp. 377-380
Lance A. Waller
Citations: 0
Invited Discussion on Article by Nandy, Holan and Schweinberger
IF 1.8 Zone 3 (Mathematics) Q1 STATISTICS & PROBABILITY Pub Date: 2025-08-18 DOI: 10.1111/insr.70008 Vol. 93(3), pp. 374-376
Jonathan Bradley
Citations: 0
Rejoinder: A Socio-Demographic Latent Space Approach to Spatial Data When Geography is Important but Not All-Important
IF 1.8 Zone 3 (Mathematics) Q1 STATISTICS & PROBABILITY Pub Date: 2025-08-07 DOI: 10.1111/insr.70009 Vol. 93(3), pp. 380-384
Saikat Nandy, Scott H. Holan, Michael Schweinberger
Citations: 0
A Socio-demographic Latent Space Approach to Spatial Data When Geography Is Important But not All-Important
IF 1.8 Zone 3 (Mathematics) Q1 STATISTICS & PROBABILITY Pub Date: 2025-08-07 DOI: 10.1111/insr.70004 Vol. 93(3), pp. 351-373
Saikat Nandy, Scott H. Holan, Michael Schweinberger

Many models for spatial and spatio-temporal data assume that ‘near things are more related than distant things’, which is known as the first law of geography. While geography may be important, it may not be all-important, for at least two reasons. First, technology helps bridge distance, so that regions separated by large distances may be more similar than would be expected based on geographical distance. Second, geographical, political and social divisions can make neighbouring regions dissimilar. We develop a flexible Bayesian approach for learning from spatial data which units are close in an unobserved socio-demographic space, and hence which units are similar. While classic approaches based on nearest-neighbour adjacency matrices may not fully capture all of the spatial correlation, the proposed approach learns neighbourhoods from data, and averages over all possible neighbourhood structures. We demonstrate the advantages of the proposed approach by presenting simulations along with applications to county-level American Community Survey data on median household income in the US states of Florida, North Carolina and South Carolina.

Citations: 0
Statistical Depth Meets Machine Learning: Kernel Mean Embeddings and Depth in Functional Data Analysis
IF 1.8 Zone 3 (Mathematics) Q1 STATISTICS & PROBABILITY Pub Date: 2025-03-16 DOI: 10.1111/insr.12611 Vol. 93(2), pp. 317-348
George Wynne, Stanislav Nagy

Statistical depth is the act of gauging how representative a point is compared with a reference probability measure. Depth allows rankings and orderings to be introduced for data living in multivariate or function spaces. Though widely applied and with much experimental success, little theoretical progress has been made in analysing functional depths. This article highlights how the common h-depth and related depths from functional data analysis can be viewed as a kernel mean embedding, widely used in statistical machine learning. This facilitates answers to several open questions regarding the statistical properties of functional depths. We show that (i) h-depth has the interpretation of a kernel-based method; (ii) several h-depths possess explicit expressions, without the need to estimate them using Monte Carlo procedures; (iii) under minimal assumptions, h-depths and their maximisers are uniformly strongly consistent and asymptotically Gaussian (also in infinite-dimensional spaces and for imperfectly observed functional data); and (iv) several h-depths uniquely characterise probability distributions in separable Hilbert spaces. In addition, we also provide a link between the depth and empirical characteristic function based procedures for functional data. Finally, the unveiled connections enable the design of an extension of the h-depth to regression problems.
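
As a rough illustration of the kernel view described in this abstract, the sketch below computes an empirical h-depth as the average of a Gaussian kernel between one curve and each curve in a sample, which has the same form as an empirical kernel mean embedding evaluated at that curve. The Gaussian kernel, the L2 distance approximated on a uniform grid, and the names `l2_distance` and `h_depth` are assumptions made for this example and are not taken from the paper.

```python
import math

def l2_distance(f, g, dt):
    """Approximate the L2 distance between two curves sampled on the
    same uniform grid with spacing dt (simple Riemann sum)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(f, g)) * dt)

def h_depth(x, sample, h, dt):
    """Empirical h-depth of curve x with respect to a sample of curves.

    Assumes a Gaussian kernel with bandwidth h. The value is the sample
    mean of k(x, X_i), i.e. an empirical kernel mean embedding at x.
    """
    total = 0.0
    for f in sample:
        d = l2_distance(x, f, dt)
        total += math.exp(-d ** 2 / (2.0 * h ** 2))
    return total / len(sample)

# Toy data: three constant curves on an 11-point grid.
grid_size, dt = 11, 0.1
sample = [[c] * grid_size for c in (-1.0, 0.0, 1.0)]
centre = [0.0] * grid_size   # lies in the middle of the sample
far = [3.0] * grid_size      # lies well outside the sample

print(h_depth(centre, sample, h=1.0, dt=dt))
print(h_depth(far, sample, h=1.0, dt=dt))
```

With this toy sample the central curve receives a larger depth than the outlying one, which is the ordering behaviour a depth is meant to provide.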

Citations: 0
Feature Screening for Ultrahigh Dimensional Mixed Data via Wasserstein Distance
IF 1.8 Zone 3 (Mathematics) Q1 STATISTICS & PROBABILITY Pub Date: 2025-02-15 DOI: 10.1111/insr.12609 Vol. 93(2), pp. 267-287
Bing Tian, Hong Wang

This article develops a novel feature screening procedure for ultrahigh dimensional mixed data based on Wasserstein distance, termed Wasserstein-SIS. To handle the mixture of continuous and discrete data, we use Wasserstein distance as a new marginal utility to measure the difference between the joint distribution and the product of marginal distributions. In theory, we establish the sure screening property under less restrictive assumptions on data types. The proposed procedure does not require model specification, gives a more effective geometric measure to compare the discrepancy between distributions and avoids introducing biases caused by the choice of slicing rules for continuous data. Numerical comparison indicates that the proposed Wasserstein-SIS method performs better than existing methods in various models. A real data application also validates the better practicability of Wasserstein-SIS.

Citations: 0
How Do Applied Researchers Use the Causal Forest? A Methodological Review
IF 1.8 Zone 3 (Mathematics) Q1 STATISTICS & PROBABILITY Pub Date: 2025-02-12 DOI: 10.1111/insr.12610 Vol. 93(2), pp. 288-316
Patrick Rehill

This methodological review examines the use of the causal forest method by applied researchers across 133 peer-reviewed papers. It shows that the emerging best practice relies heavily on the approach and tools created by the original authors of the causal forest such as their grf package and the approaches given by them in examples. Generally, researchers use the causal forest on a relatively low-dimensional dataset relying on observed controls or in some cases experiments to identify effects. There are several common ways to then communicate results: by mapping out the univariate distribution of individual-level treatment effect estimates, displaying variable importance results for the forest and graphing the distribution of treatment effects across covariates that are important either for theoretical reasons or because they have high variable importance. Some deviations from this common practice are interesting and deserve further development and use. Others are unnecessary or even harmful. The paper concludes by reflecting on the emerging best practice for causal forest use and paths for future research.

Citations: 0
On the Number of Components for Matrix-Variate Mixtures: A Comparison Among Information Criteria
IF 1.8 Zone 3 (Mathematics) Q1 STATISTICS & PROBABILITY Pub Date: 2025-01-08 DOI: 10.1111/insr.12607 Vol. 93(2), pp. 222-245
Salvatore D. Tomarchio, Antonio Punzo

This study explores the crucial task of determining the optimal number of components in mixture models, known as mixture order, when considering matrix-variate data. Despite the growing interest in this data type among practitioners and researchers, the effectiveness of information criteria in selecting the mixture order remains largely unexplored in this branch of the literature. Although the Bayesian information criterion (BIC) is commonly utilised, its effectiveness is only marginally tested in this context, and several other potentially valuable criteria exist. An extensive simulation study evaluates the performance of 10 information criteria across various data structures, specifically focusing on matrix-variate normal mixtures.

Citations: 0
Chance-Corrected Interrater Agreement Statistics for Two-Rater Dichotomous Responses: A Method Review With Comparative Assessment Under Possibly Correlated Decisions
IF 1.8 Zone 3 (Mathematics) Q1 STATISTICS & PROBABILITY Pub Date: 2025-01-06 DOI: 10.1111/insr.12606 Vol. 93(2), pp. 199-221
Zizhong Tian, Vernon M. Chinchilli, Chan Shen, Shouhao Zhou

Measurement of the interrater agreement (IRA) is critical for assessing the reliability and validity of ratings in various disciplines. While numerous IRA statistics have been developed, there is a lack of guidance on selecting appropriate measures especially when raters' decisions could be correlated. To address this gap, we review a family of chance-corrected IRA statistics for two-rater dichotomous-response cases, a fundamental setting that not only serves as the theoretical foundation for categorical-response or multirater IRA methods but is also practically dominant in most empirical studies, and we propose a novel data-generating framework to simulate correlated decision processes between raters. Subsequently, a new estimand, which calibrates the ‘true’ chance-corrected IRA, is introduced while accounting for the potential ‘probabilistic certainty’. Extensive simulations were conducted to evaluate the performance of the reviewed IRA methods under various practical scenarios and were summarised by an agglomerative hierarchical clustering analysis. Finally, we provide recommendations for selecting appropriate IRA statistics based on outcome prevalence and rater characteristics and highlight the need for further advancements in IRA estimation methodologies.
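
For readers unfamiliar with chance correction in the two-rater dichotomous setting, the sketch below computes Cohen's kappa from a 2x2 table of counts. Cohen's kappa is a widely used statistic of this kind; the abstract does not say which statistics the paper reviews or how it treats them, so this is only a generic illustration, and the function name `cohen_kappa` and the cell labels are chosen for this example.

```python
def cohen_kappa(both_yes, r1_yes_r2_no, r1_no_r2_yes, both_no):
    """Cohen's kappa for two raters and a yes/no response.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement
    and p_e is the agreement expected if the raters decided independently
    with their own marginal 'yes' rates.
    """
    n = both_yes + r1_yes_r2_no + r1_no_r2_yes + both_no
    if n == 0:
        raise ValueError("the table contains no ratings")
    p_o = (both_yes + both_no) / n
    p1 = (both_yes + r1_yes_r2_no) / n   # rater 1 'yes' rate
    p2 = (both_yes + r1_no_r2_yes) / n   # rater 2 'yes' rate
    p_e = p1 * p2 + (1 - p1) * (1 - p2)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_o - p_e) / (1 - p_e)

# 50 subjects: observed agreement 0.7, chance agreement 0.5.
print(cohen_kappa(20, 5, 10, 15))
```

In the example the raters agree on 70% of subjects, but half of that agreement would be expected by chance, so kappa is about 0.4.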

Citations: 0
Revisiting Estimation of Number of Trials in Binomial Distribution
IF 1.8 Zone 3 (Mathematics) Q1 STATISTICS & PROBABILITY Pub Date: 2025-01-06 DOI: 10.1111/insr.12608 Vol. 93(2), pp. 246-266
Mina Georgieva, Brani Vidakovic

Estimating the parameter n when p is known or simultaneous estimation of n and p of the binomial distribution based on k ≥ 1 independent observations has been considered by many authors over the last several decades. A range of estimators have been proposed, and questions regarding asymptotic and small sample properties received adequate treatment. In this paper, we provide an extensive review and a comprehensive performance comparison of the estimators from the literature. We propose a conceptually simple estimator of n that uses the marginal likelihood when p is integrated out by simultaneous optimisation w.r.t. n and the hyperparameters. We compare the proposed estimator with various existing estimators and find its performance competitive and, in some scenarios, superior.
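
The idea of integrating p out and optimising jointly over n and the hyperparameters can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' estimator: it assumes a Beta(a, b) prior on p, so that each observation has a beta-binomial marginal distribution, and it replaces a proper optimiser with a coarse grid search over a bounded range of n. The names `marginal_loglik`, `estimate_n`, `n_max` and `grid` are hypothetical.

```python
import math

def log_beta(a, b):
    """Logarithm of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def log_choose(n, x):
    """Logarithm of the binomial coefficient C(n, x)."""
    return math.lgamma(n + 1) - math.lgamma(x + 1) - math.lgamma(n - x + 1)

def marginal_loglik(n, a, b, data):
    """Beta-binomial log-likelihood of the counts in data: the binomial
    likelihood with p integrated out against an assumed Beta(a, b) prior."""
    return sum(
        log_choose(n, x) + log_beta(x + a, n - x + b) - log_beta(a, b)
        for x in data
    )

def estimate_n(data, n_max, grid=(0.5, 1.0, 2.0, 5.0, 10.0)):
    """Grid search for (n, a, b) maximising the marginal log-likelihood.

    n ranges from max(data) to n_max; a and b range over grid. The upper
    bound n_max is an assumption of this sketch.
    """
    best_ll, best = -math.inf, None
    for n in range(max(data), n_max + 1):
        for a in grid:
            for b in grid:
                ll = marginal_loglik(n, a, b, data)
                if ll > best_ll:
                    best_ll, best = ll, (n, a, b)
    return best

counts = [3, 5, 4, 6, 5]   # hypothetical observed counts
n_hat, a_hat, b_hat = estimate_n(counts, n_max=50)
print(n_hat, a_hat, b_hat)
```

The search starts at max(data) because n cannot be smaller than the largest observed count. A coarse hyperparameter grid keeps the example short; results near the edge of the grid or at n_max would suggest widening the search.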

Citations: 0