
Australian & New Zealand Journal of Statistics: Latest Articles

Population Size Estimation Using Covariates Having Missing Values and Measurement Error: Estimating Ethnic Group Sizes in New Zealand
IF 0.8, Zone 4 (Mathematics), Q3 STATISTICS & PROBABILITY, Pub Date: 2025-06-19, DOI: 10.1111/anzs.70014
Paul A. Smith, Peter G.M. van der Heijden, Maarten Cruyff, Francesco Pantalone, Hannes Diener, Kim Dunstan

We investigate the use of multiple linked lists for population size estimation and to estimate the relationships between covariates appearing on the lists. Over the lists, the covariates aim to measure the same concept. The relationships between the covariates are not fully known because of missing values on the covariates: some cases do not appear in some lists; some cases are on one or more of the lists but have missing covariate values on some of the lists; and some cases are not observed in any list. In earlier work, multiple system estimation has been combined with latent class analysis to give a consensus estimate where an underlying dichotomous categorical covariate is measured differently in different lists. This was applied to ethnicity covariates in New Zealand with two levels, Māori and non-Māori. In this paper, we apply this approach to ethnicity covariates with a larger number of categories, and find that it produces satisfactory results with four categories. We assess the purity of the latent classes using entropy and conditional probability measures. We also examine the evolution of annual estimates from multiple lists (where one list is the population census) over 2013–2020, finding that the estimated latent class proportions are very stable. We assess the impact of disclosure control measures on the outputs.
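
As a point of reference for the abstract above, the simplest instance of multiple system estimation is the two-list (dual-system) estimator. The sketch below is not the authors' latent class model: the function names and the counts are hypothetical, and the two lists are assumed to be independent. It also shows Shannon entropy as one possible way to summarise how "pure" a vector of latent class probabilities is.

```python
import math


def lincoln_petersen(n_a: int, n_b: int, n_both: int) -> float:
    """Two-list population size estimate, assuming the lists are independent."""
    if n_both <= 0:
        raise ValueError("at least one case must appear on both lists")
    return n_a * n_b / n_both


def chapman(n_a: int, n_b: int, n_both: int) -> float:
    """Chapman's bias-reduced variant of the two-list estimator."""
    return (n_a + 1) * (n_b + 1) / (n_both + 1) - 1


def class_entropy(probabilities) -> float:
    """Shannon entropy of class probabilities; 0 means a perfectly pure class."""
    return -sum(p * math.log(p) for p in probabilities if p > 0)


# Hypothetical counts: 600 cases on list A, 500 on list B, 300 on both.
n_a, n_b, n_both = 600, 500, 300
total = lincoln_petersen(n_a, n_b, n_both)   # 1000.0
observed = n_a + n_b - n_both                # cases seen on at least one list
unobserved = total - observed                # estimated cases on neither list
```

With more than two lists and covariates, the same idea is usually carried out with log-linear models rather than a closed-form ratio.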

Australian & New Zealand Journal of Statistics, vol. 67, no. 3, pp. 432-453. https://doi.org/10.1111/anzs.70014
Citations: 0
Homogeneity and Sparsity Pursuit Using Robust Adaptive Fused Lasso
IF 0.8, Zone 4 (Mathematics), Q3 STATISTICS & PROBABILITY, Pub Date: 2025-06-19, DOI: 10.1111/anzs.70010
Le Chang, Yanlin Shi

Fused lasso regression is a popular method for identifying homogeneous groups and sparsity patterns in regression coefficients based on either the presumed order or a more general graph structure of the covariates. However, the traditional fused lasso may yield misleading outcomes in the presence of outliers. In this paper, we propose an extension of the fused lasso, namely the robust adaptive fused lasso (RAFL), which pursues homogeneity and sparsity patterns in regression coefficients while accounting for potential outliers within the data. By using Huber's loss or Tukey's biweight loss, RAFL can resist outliers in the responses or in both the responses and the covariates. We also demonstrate that when the adaptive weights are properly chosen, the proposed RAFL achieves consistency in variable selection, consistency in grouping and asymptotic normality. Furthermore, a novel optimization algorithm, which employs the alternating direction method of multipliers, embedded with an accelerated proximal gradient algorithm, is developed to solve RAFL efficiently. Our simulation study shows that RAFL offers substantial improvements in terms of both grouping accuracy and prediction accuracy compared with the fused lasso, particularly when dealing with contaminated data. Additionally, a real analysis of cookie data demonstrates the effectiveness of RAFL.
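
To make the ingredients named in the abstract above concrete, here is a minimal sketch of a robust loss combined with adaptive sparsity and fusion penalties. It is an illustration under stated assumptions, not the paper's RAFL formulation: the abstract does not give the exact objective, so the chain ordering of coefficients, the function names, and the default tuning constants (the conventional 1.345 and 4.685) are all assumptions.

```python
def huber(r: float, delta: float = 1.345) -> float:
    """Huber loss: quadratic near zero, linear in the tails."""
    a = abs(r)
    return 0.5 * r * r if a <= delta else delta * (a - 0.5 * delta)


def tukey_biweight(r: float, c: float = 4.685) -> float:
    """Tukey's biweight loss: bounded, so gross outliers stop adding cost."""
    if abs(r) > c:
        return c * c / 6
    u = 1 - (r / c) ** 2
    return c * c / 6 * (1 - u ** 3)


def penalised_objective(y, X, beta, lam_sparse, lam_fuse,
                        w_sparse=None, w_fuse=None, loss=huber):
    """Robust loss + weighted L1 penalty + weighted fusion penalty.

    Assumes coefficients are ordered, so fusion acts on neighbours
    beta[j] and beta[j + 1]; a general graph would use edge pairs instead.
    """
    p = len(beta)
    w_sparse = w_sparse or [1.0] * p          # adaptive weights (hypothetical defaults)
    w_fuse = w_fuse or [1.0] * (p - 1)
    residuals = [yi - sum(x * b for x, b in zip(row, beta))
                 for yi, row in zip(y, X)]
    fit = sum(loss(r) for r in residuals)
    sparsity = sum(w * abs(b) for w, b in zip(w_sparse, beta))
    fusion = sum(w * abs(beta[j + 1] - beta[j]) for j, w in enumerate(w_fuse))
    return fit + lam_sparse * sparsity + lam_fuse * fusion
```

Passing `loss=tukey_biweight` swaps in the bounded loss, which is the usual choice when outliers may also sit in the covariates.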

Australian & New Zealand Journal of Statistics, vol. 67, no. 2, pp. 157-174. https://doi.org/10.1111/anzs.70010
Citations: 0
High-dimensional graphical inference via partially penalised regression
IF 0.8, Zone 4 (Mathematics), Q3 STATISTICS & PROBABILITY, Pub Date: 2025-05-07, DOI: 10.1111/anzs.70005
Ni Zhao, Zemin Zheng, Yang Li

Graphical models are important tools to characterise the conditional independence structure among a set of variables. Despite the rapid development of statistical inference for high-dimensional graphical models, existing methods typically need a stringent constraint on the sample size. In this paper, we develop a new graphical projection estimator (GPE) for statistical inference in Gaussian graphical models via partially penalised regression. The suggested inference procedure takes advantage of the strong signals, which can be identified in advance, and utilises partially penalised regression to avoid the penalisation on them when constructing the GPE. It leads to enhanced inference efficiency by removing the impacts of strong signals that contribute to the bias term. We show that the proposed GPE can enjoy asymptotic normality under a relaxed constraint on the sample size, which is of the same order as that needed for consistent estimation. The usefulness of our method is demonstrated through simulations and a prostate tumour gene expression dataset.

Australian & New Zealand Journal of Statistics, vol. 67, no. 2, pp. 265-291. https://doi.org/10.1111/anzs.70005
Citations: 0
How data or error covariance can change and still retain BLUEs as well as their covariance or the sum of squares of errors
IF 0.8, Zone 4 (Mathematics), Q3 STATISTICS & PROBABILITY, Pub Date: 2025-04-29, DOI: 10.1111/anzs.70003
Stephen J. Haslett, Jarkko Isotalo, Augustyn Markiewicz, Simo Puntanen

Misspecification of the error covariance in linear models usually leads to incorrect inference and conclusions. We consider two linear models, $\mathcal{A}$ and $\mathcal{B}$, with the same design matrix but different error covariance matrices. The conditions under which every representation of the best linear unbiased estimator (BLUE) of any estimable parametric vector under $\mathcal{A}$ remains BLUE under $\mathcal{B}$ have been well known since C.R. Rao's paper in 1971: Unified theory of linear estimation, Sankhyā Ser. A, Vol. 33, pp. 371–394. However, there are no previously published results on retaining the weighted sum of squares of errors (SSE) for non-full-rank design or error covariance matrices, and the question of when the covariance matrix of the BLUEs is also retained has been partially explored only recently. For change in any specified error covariance matrix, we provide necessary and sufficient conditions (nasc) for both BLUEs and their covariance matrix to remain unaltered and to retain this property for all submodels. We also consider nasc for SSE to be unchanged. We decompose SSE under error covariance changes, and derive nasc under which error covariance change leaves hypothesis tests for fixed-effect deletion under normality unaltered. We also show that simultaneous retention of BLUEs and both their covariance and SSE is not possible. We outline the effects of weak and strong error covariance singularity. We provide applications (via data cloning) to maintaining data confidentiality in Official Statistics without using Confidentialised Unit Record Files (CURFs), to certain types of experimental design and to estimation of fixed parameters for linear models for single nucleotide polymorphisms (SNPs) in genetics.

Australian & New Zealand Journal of Statistics, vol. 67, no. 2, pp. 175-201. https://doi.org/10.1111/anzs.70003
Citations: 0
Bernoulli's Fallacy: Statistical Illogic and the Crisis of Modern Science. By Aubrey Clayton, New York, Columbia University Press, 1st ed., 2021. 368 pages. AU$ 57.95 (hardcover). ISBN: 10:0231199945.
IF 0.8, Zone 4 (Mathematics), Q3 STATISTICS & PROBABILITY, Pub Date: 2025-04-29, DOI: 10.1111/anzs.70007
Mahdi Nouraie
Australian & New Zealand Journal of Statistics, vol. 67, no. 2, pp. 344-345. https://doi.org/10.1111/anzs.70007
Citations: 0
Autocovariance function estimation via difference schemes for a semiparametric change point model with $m$-dependent errors
IF 0.8, Zone 4 (Mathematics), Q3 STATISTICS & PROBABILITY, Pub Date: 2025-04-29, DOI: 10.1111/anzs.70002
Michael Levine, Inder Tecuapetla-Gómez

We discuss a broad class of difference-based estimators of the autocovariance function in a semiparametric regression model where the signal consists of the sum of a smooth function and another stepwise function whose number of jumps and locations are unknown (change points) while the errors are stationary and $m$-dependent. We establish that the influence of the smooth part of the signal over the bias of our estimators is negligible; this is a general result as it does not depend on the distribution of the errors. We show that the influence of the unknown smooth function is negligible also in the mean squared error (MSE) of our estimators. Although we assumed Gaussian errors to derive the latter result, our finite sample studies suggest that the class of proposed estimators still shows small MSE when the errors are not Gaussian. Our simulation study also demonstrates that, when the error process is mis-specified as an AR(1) instead of an $m$-dependent process, our proposed method can estimate autocovariances about as well as some methods specifically designed for the AR(1) case, and sometimes even better than them. We also allow both the number of change points and the magnitude of the largest jump to grow with the sample size $n$. In this case, we provide conditions on the interplay between the growth rate of these two quantities as well as the vanishing rate of the modulus of continuity (of the signal's smooth part) that ensure $\sqrt{n}$-consistency of our autocovariance estimators. As an application, we use our approach to provide a better understanding of the possible autocovariance structure of a time series of global averaged annual temperature anomalies. Finally, the R package dbacf complements this article.
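
The idea behind difference-based autocovariance estimation can be sketched in a few lines. The example below is a deliberately simple member of this family and is not the estimator class studied in the paper or the interface of the dbacf package: the function names, the simulated MA(1) errors, and the test signal are all assumptions. It relies on the fact that, when the signal changes slowly, half the mean squared lag-h difference of the observations is approximately gamma(0) - gamma(h), and gamma(h) = 0 for h > m under m-dependence.

```python
import math
import random


def half_mean_sq_diff(y, lag: int) -> float:
    """Half the average squared difference between observations `lag` apart."""
    n = len(y) - lag
    return sum((y[i + lag] - y[i]) ** 2 for i in range(n)) / (2 * n)


def diff_autocov(y, m: int):
    """Estimate gamma(0), ..., gamma(m) for m-dependent errors.

    gamma(0) comes from lag m + 1, where the errors are uncorrelated;
    gamma(h) is then gamma(0) minus the lag-h half mean squared difference.
    """
    gamma0 = half_mean_sq_diff(y, m + 1)
    return [gamma0] + [gamma0 - half_mean_sq_diff(y, h) for h in range(1, m + 1)]


# Simulated data: smooth trend + one jump + MA(1) errors (1-dependent).
rng = random.Random(0)
n = 200_000
z = [rng.gauss(0.0, 1.0) for _ in range(n + 1)]
errors = [z[t + 1] + 0.5 * z[t] for t in range(n)]   # gamma(0) = 1.25, gamma(1) = 0.5
signal = [math.sin(2 * math.pi * t / n) + (1.0 if t >= n // 2 else 0.0)
          for t in range(n)]
y = [s + e for s, e in zip(signal, errors)]

gamma_hat = diff_autocov(y, m=1)   # should be close to [1.25, 0.5]
```

The signal never has to be estimated: differencing removes the smooth part almost entirely, and a single jump affects only a handful of the differences.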

Australian & New Zealand Journal of Statistics, vol. 67, no. 2, pp. 202-223. https://doi.org/10.1111/anzs.70002
Citations: 0
Functional Data Analysis with R. By C. M. Crainiceanu, J. Goldsmith, A. Leroux, and E. Cui, Boca Raton, FL: Chapman and Hall/CRC. 2024. 338 pages. AU$ 138.40 (hardback). ISBN: 978-1-032-24471-6.
IF 0.8, Zone 4 (Mathematics), Q3 STATISTICS & PROBABILITY, Pub Date: 2025-04-29, DOI: 10.1111/anzs.70006
Faïcel Chamroukhi
Australian & New Zealand Journal of Statistics, vol. 67, no. 2, pp. 341-343. https://doi.org/10.1111/anzs.70006
Citations: 0
Statistical balancing as an unconstrained optimisation problem
IF 0.8, Zone 4 (Mathematics), Q3 STATISTICS & PROBABILITY, Pub Date: 2025-04-28, DOI: 10.1111/anzs.70004
N. T. Longford

Within the potential outcomes framework, balancing the treatment groups is a key step in estimating the average treatment effect in an observational study. Methods for optimal matching or weighting solve nonlinear programming problems. We present an alternative, related to ridge regression. Its solution has a closed form and is a smooth function of a set of tuning parameters. The method is accompanied by a simple way of exploring the sensitivity with respect to bias due to an unobserved confounder. It is applied to retrospective studies in neonatal research, concerned with clinical care for preterm born babies in the first few weeks of their lives.

Australian & New Zealand Journal of Statistics, vol. 67, no. 2, pp. 292-319. https://doi.org/10.1111/anzs.70004
Citations: 0
The incremental progression from fixed to random factors in the analysis of variance: a new synthesis
IF 0.8, Zone 4 (Mathematics), Q3 STATISTICS & PROBABILITY, Pub Date: 2025-04-01, DOI: 10.1111/anzs.70001
Marti J. Anderson, Ray N. Gorley, Antonio Terlizzi

Classically, the distinction between a fixed versus a random factor in analysis of variance has been considered a binary choice. Here we consider that any given factor can also occur along an incremental series of steps between these two extremes, depending on the sampling fraction of its levels from the wider population. Fixed factors occur where all possible levels are drawn, and random factors occur in the limit as the population of possible levels approaches infinity. When some identifiable fraction of a finite population of possible levels is drawn, the factor can be thought of as something in between fixed and random, and can be analysed explicitly as finite directly within the analysis of variance (ANOVA) framework. Requiring explicit specification of the population size from which observed levels are drawn for each factor, we provide a unified approach to derive expectations of mean squares (EMS) in ANOVA for any types of factors along the entire graded progression from fixed to random, inclusive, that may be nested within or crossed with one another, from balanced, asymmetrical or unbalanced designs, including multi-level hierarchical sampling designs, mixed models and interactions. Implications for estimation of variance components, tailored bootstrap methods and tests of hypotheses under minimal assumptions of exchangeability are described and further extended to multivariate dissimilarity-based settings.
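
For orientation, the classical finite-population result for a balanced two-factor crossed design (usually attributed to Cornfield and Tukey) shows how the sampling fraction of one factor enters the expected mean squares of the other. The notation below is an assumption made for this illustration and is not taken from the paper: $a$ of $A$ possible levels of factor A and $b$ of $B$ possible levels of factor B are observed, with $n$ replicates per cell.

```latex
\begin{align*}
E(MS_A)    &= \sigma^2_e + n\left(1 - \frac{b}{B}\right)\sigma^2_{AB} + nb\,\sigma^2_A \\
E(MS_B)    &= \sigma^2_e + n\left(1 - \frac{a}{A}\right)\sigma^2_{AB} + na\,\sigma^2_B \\
E(MS_{AB}) &= \sigma^2_e + n\,\sigma^2_{AB}
\end{align*}
```

When factor B is fixed ($b = B$) the interaction term drops out of $E(MS_A)$; when it is random ($B \to \infty$) the term enters in full; intermediate sampling fractions give the graded cases in between.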

Australian & New Zealand Journal of Statistics, vol. 67, no. 1, pp. 3-30. Published 2025-04-01. DOI: https://doi.org/10.1111/anzs.70001
Citations: 0
Least-squares estimators of the linear-by-linear association parameter from an ordinal log-linear model
IF 0.8 Zone 4 (Mathematics) Q3 STATISTICS & PROBABILITY Pub Date: 2025-03-13 DOI: 10.1111/anzs.70000
Eric J. Beh, Sidra Zafar, Irene L. Hudson

When modelling the association between the ordinal categorical variables of a contingency table, ordinal log-linear models are typically used; these models are a variation of the more popular log-linear model, which has attracted considerable attention in the statistics and allied literature since the late 1960s. Estimating the parameters of an ordinal log-linear model usually involves the use of iterative techniques, typically Newton's method and iterative proportional fitting. However, the early 2000s brought with it more direct estimation methods that do not require the use of iterative techniques. When the focus is on the parameter that reflects the linear-by-linear association between the variables, these methods have proven to provide unbiased, consistent and normally distributed estimates. Despite this new work, no attention has been given to the estimation of the least-squares estimator. Therefore, this article derives the least-squares estimator of the linear-by-linear association parameter and shows it to be equivalent to one of the existing non-iterative estimators recently described. We also derive two further least-squares estimators based on the Box-Cox transformation and derive their variance.
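As a brief illustration of what the linear-by-linear association parameter measures, the sketch below builds a table from an ordinal log-linear model of the form log m_ij = alpha_i + beta_j + phi * u_i * v_j with unit-spaced integer scores, then checks the well-known property that every local log-odds ratio equals phi. This is not the least-squares estimator derived in the article. The function names, the choice of scores and the example values are assumptions made for illustration only.

```python
import math


def linear_by_linear_probs(phi, alpha, beta):
    """Cell probabilities under log m_ij = alpha_i + beta_j + phi * u_i * v_j.

    Row and column scores are assumed to be the integers 1, 2, 3, ...
    """
    m = [[math.exp(a + b + phi * (i + 1) * (j + 1))
          for j, b in enumerate(beta)]
         for i, a in enumerate(alpha)]
    total = sum(sum(row) for row in m)
    return [[cell / total for cell in row] for row in m]


def local_log_odds_ratios(p):
    """Log odds ratios of every 2x2 block formed by adjacent rows and columns."""
    return [[math.log(p[i][j] * p[i + 1][j + 1] / (p[i][j + 1] * p[i + 1][j]))
             for j in range(len(p[0]) - 1)]
            for i in range(len(p) - 1)]


# Hypothetical 3x4 table with association parameter phi = 0.3.
p = linear_by_linear_probs(0.3, [0.0, 0.2, -0.1], [0.1, 0.0, 0.4, -0.3])
lor = local_log_odds_ratios(p)
```

Because adjacent scores differ by one, each entry of `lor` equals phi up to floating-point error. With real data the local log-odds ratios vary from cell to cell, which is why an estimator of phi is needed.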

Australian & New Zealand Journal of Statistics, vol. 67, no. 2, pp. 137-156. Published 2025-03-13. DOI: https://doi.org/10.1111/anzs.70000
Citations: 0