
Australian & New Zealand Journal of Statistics: latest articles

Proportional inverse Gaussian distribution: A new tool for analysing continuous proportional data
IF 1.1 | CAS Tier 4 (Mathematics) | Q3 STATISTICS & PROBABILITY | Pub Date: 2021-11-23 | DOI: 10.1111/anzs.12345
Pengyi Liu, Guo-Liang Tian, Kam Chuen Yuen, Chi Zhang, Man-Lai Tang

Outcomes in the form of rates, fractions, proportions and percentages often appear in various fields. Existing beta and simplex distributions frequently fail to fit such continuous data satisfactorily. This paper aims to develop the normalised inverse Gaussian (N-IG) distribution proposed by Lijoi, Mena & Prünster (2005, Journal of the American Statistical Association, 100, 1278–1291) as a new tool for analysing continuous proportional data in (0,1), and renames the N-IG the proportional inverse Gaussian (PIG) distribution. Our main contributions include: (i) to overcome the difficulty of an integral in the PIG density function, we propose a novel minorisation–maximisation (MM) algorithm, via the continuous version of Jensen's inequality, to calculate the maximum likelihood estimates of the parameters of the PIG distribution; (ii) we also develop an MM algorithm aided by gradient descent for the PIG regression model, which allows us to explore the relationship between a set of covariates and the mean parameter; (iii) both the comparative studies and the real data analyses show that the PIG distribution outperforms the beta and simplex distributions in terms of the AIC, the Cramér–von Mises and the Kolmogorov–Smirnov tests. In addition, bootstrap confidence intervals and hypothesis tests on the symmetry of the PIG density are presented. Simulation studies are conducted, and the hospital stay data of Barcelona in 1988 and 1990 are analysed to illustrate the proposed methods.
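The PIG density itself is not available in standard statistical libraries, but the comparison workflow the abstract describes (maximum-likelihood fits assessed via AIC and a Kolmogorov–Smirnov check) can be sketched. A minimal Python illustration on synthetic proportions, with the beta distribution standing in for the PIG candidate; everything here is illustrative rather than the authors' code.

```python
# A minimal sketch, not the authors' code: fit a candidate distribution to synthetic
# proportions by maximum likelihood and assess it via AIC and a Kolmogorov-Smirnov
# check. The beta distribution stands in for the PIG density, which is not available
# in standard libraries.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
y = rng.beta(2.0, 5.0, size=500)              # synthetic proportions in (0, 1)

# Maximum likelihood fit with the support fixed to (0, 1).
a_hat, b_hat, _, _ = stats.beta.fit(y, floc=0, fscale=1)
loglik = np.sum(stats.beta.logpdf(y, a_hat, b_hat))
aic = 2 * 2 - 2 * loglik                      # two free shape parameters

# Goodness of fit against the fitted CDF.
ks = stats.kstest(y, lambda t: stats.beta.cdf(t, a_hat, b_hat))
print(f"AIC = {aic:.1f}, KS statistic = {ks.statistic:.3f}, p-value = {ks.pvalue:.3f}")
```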

Volume 63(4), pages 579–605.
Citations: 1
BNPdensity: Bayesian nonparametric mixture modelling in R
IF 1.1 | CAS Tier 4 (Mathematics) | Q3 STATISTICS & PROBABILITY | Pub Date: 2021-11-17 | DOI: 10.1111/anzs.12342
J. Arbel, G. Kon Kam King, A. Lijoi, L. Nieto-Barajas, I. Prünster

Robust statistical data modelling under potential model mis-specification often requires leaving the parametric world for the nonparametric. In the latter, parameters are infinite dimensional objects such as functions, probability distributions or infinite vectors. In the Bayesian nonparametric approach, prior distributions are designed for these parameters, which provide a handle to manage the complexity of nonparametric models in practice. However, most modern Bayesian nonparametric models often seem out of reach to practitioners, as inference algorithms need careful design to deal with the infinite number of parameters. The aim of this work is to facilitate the journey by providing computational tools for Bayesian nonparametric inference. The article describes a set of functions available in the R package BNPdensity to carry out density estimation with an infinite mixture model, including all types of censored data. The package provides access to a large class of such models based on normalised random measures, which represent a generalisation of the popular Dirichlet process mixture. One striking advantage of this generalisation is that it offers much more robust priors on the number of clusters than the Dirichlet. Another crucial advantage is the complete flexibility in specifying the prior for the scale and location parameters of the clusters, because conjugacy is not required. Inference is performed using a theoretically grounded approximate sampling methodology known as the Ferguson & Klass algorithm. The package also offers several goodness-of-fit diagnostics, such as QQ plots, as well as a cross-validation criterion, the conditional predictive ordinate. The proposed methodology is illustrated on a classical ecological risk assessment method called the species sensitivity distribution problem, showcasing the benefits of the Bayesian nonparametric framework.
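BNPdensity is an R package, so the sketch below is not its API; it is a rough Python analogue of Dirichlet-process mixture density estimation, using scikit-learn's truncated variational implementation to convey the same idea of an infinite mixture with a data-driven number of occupied clusters.

```python
# A rough Python analogue, not the BNPdensity R package: density estimation with a
# truncated Dirichlet-process Gaussian mixture via scikit-learn's variational
# implementation.
import numpy as np
from sklearn.mixture import BayesianGaussianMixture

rng = np.random.default_rng(1)
x = np.concatenate([rng.normal(-2, 0.5, 300), rng.normal(3, 1.0, 200)]).reshape(-1, 1)

dpm = BayesianGaussianMixture(
    n_components=20,                                   # truncation level, not the true K
    weight_concentration_prior_type="dirichlet_process",
    max_iter=500,
    random_state=0,
).fit(x)

grid = np.linspace(-5, 7, 200).reshape(-1, 1)
density = np.exp(dpm.score_samples(grid))              # estimated density on a grid
print("occupied clusters:", int(np.sum(dpm.weights_ > 1e-2)))
```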

Volume 63(3), pages 542–564.
Citations: 1
Experimental design in practice: The importance of blocking and treatment structures
IF 1.1 | CAS Tier 4 (Mathematics) | Q3 STATISTICS & PROBABILITY | Pub Date: 2021-11-08 | DOI: 10.1111/anzs.12343
E.R. Williams, C.G. Forde, J. Imaki, K. Oelkers

Experimental design and analysis have evolved substantially over the last 100 years, driven to a large extent by the power and availability of the computer. To demonstrate this development and encourage the use of experimental design in practice, three experiments from different research areas are presented. In these examples multiple blocking factors have been employed, and they show how extraneous variation can be accommodated and interpreted. The examples are used to discuss the importance of blocking and treatment structures in the conduct of designed experiments.
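A minimal sketch of the kind of blocked analysis the paper advocates: a randomised complete block layout on simulated data, with the block factor absorbing extraneous variation before the treatment structure is assessed. The factor names and effect sizes below are invented for illustration.

```python
# A minimal sketch on simulated data: a randomised complete block analysis in which
# the block factor absorbs extraneous variation before treatments are assessed.
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)
n_blocks, n_treatments = 6, 4
df = pd.DataFrame(
    [(b, t) for b in range(n_blocks) for t in range(n_treatments)],
    columns=["block", "treatment"],
)
block_eff = rng.normal(0, 2, n_blocks)            # extraneous block-to-block variation
treat_eff = np.array([0.0, 1.0, 1.5, 3.0])        # treatment effects of interest
df["y"] = block_eff[df["block"]] + treat_eff[df["treatment"]] + rng.normal(0, 1, len(df))

# Blocking enters the linear model as a nuisance factor alongside the treatment structure.
fit = smf.ols("y ~ C(block) + C(treatment)", data=df).fit()
print(sm.stats.anova_lm(fit, typ=2))
```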

Volume 63(3), pages 455–467.
Citations: 1
Accelerating adaptation in the adaptive Metropolis–Hastings random walk algorithm
IF 1.1 | CAS Tier 4 (Mathematics) | Q3 STATISTICS & PROBABILITY | Pub Date: 2021-11-03 | DOI: 10.1111/anzs.12344
Simon E.F. Spencer

The Metropolis–Hastings random walk algorithm remains popular with practitioners due to the wide variety of situations in which it can be successfully applied and the extreme ease with which it can be implemented. Adaptive versions of the algorithm use information from the early iterations of the Markov chain to improve the efficiency of the proposal. The aim of this paper is to reduce the number of iterations needed to adapt the proposal to the target, which is particularly important when the likelihood is time-consuming to evaluate. First, the accelerated shaping algorithm is a generalisation of both the adaptive proposal and adaptive Metropolis algorithms. It is designed to remove, from the estimate of the covariance matrix of the target, misleading information from the start of the chain. Second, the accelerated scaling algorithm rapidly changes the scale of the proposal to achieve a target acceptance rate. The usefulness of these approaches is illustrated with a range of examples.
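A minimal sketch of the baseline the paper builds on: a plain adaptive Metropolis random walk whose proposal covariance is estimated from the chain history (Haario-style adaptation). The paper's accelerated shaping and scaling algorithms are not reproduced here; the target and tuning constants are illustrative.

```python
# A minimal sketch of the baseline being accelerated: an adaptive Metropolis random
# walk whose proposal covariance is estimated from the chain history.
import numpy as np

def log_target(x):
    # Example target: a strongly correlated bivariate normal (up to a constant).
    cov = np.array([[1.0, 0.9], [0.9, 1.0]])
    return -0.5 * x @ np.linalg.solve(cov, x)

rng = np.random.default_rng(3)
d, n_iter = 2, 20000
x = np.zeros(d)
samples = np.empty((n_iter, d))
prop_cov = np.eye(d)                         # initial proposal covariance
scale = 2.38**2 / d                          # classic random-walk scaling constant

for i in range(n_iter):
    prop = rng.multivariate_normal(x, scale * prop_cov)
    if np.log(rng.uniform()) < log_target(prop) - log_target(x):
        x = prop
    samples[i] = x
    if i >= 1000 and i % 100 == 0:           # adapt using the chain so far
        prop_cov = np.cov(samples[:i].T) + 1e-6 * np.eye(d)

print("posterior mean estimate:", samples[n_iter // 2:].mean(axis=0))
```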

Volume 63(3), pages 468–484. Open access PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1111/anzs.12344
Citations: 6
Variable selection using penalised likelihoods for point patterns on a linear network
IF 1.1 | CAS Tier 4 (Mathematics) | Q3 STATISTICS & PROBABILITY | Pub Date: 2021-10-18 | DOI: 10.1111/anzs.12341
Suman Rakshit, Greg McSwiggan, Gopalan Nair, Adrian Baddeley

Motivated by the analysis of a comprehensive database of road traffic accidents, we investigate methods of variable selection for spatial point process models on a linear network. The original data may include explanatory spatial covariates, such as road curvature, and 'mark' variables attributed to individual accidents, such as accident severity. The treatment of mark variables is new. Variable selection is applied to the canonical covariates, which may include spatial covariate effects, mark effects and mark–covariate interactions. We approximate the likelihood of the point process model by that of a generalised linear model, in such a way that spatial covariates and marks are both associated with canonical covariates. We impose a convex penalty on the log-likelihood, principally the elastic-net penalty, and maximise the penalised log-likelihood by cyclic coordinate ascent. A simulation study compares the lasso, ridge regression and elastic-net methods of variable selection in terms of their ability to select variables correctly, and in terms of their bias and standard error. Standard techniques for selecting the regularisation parameter γ often yielded unsatisfactory results. We propose two new rules for selecting γ which are designed to have better performance. The methods are tested on a small dataset on crimes in a Chicago neighbourhood, and applied to a large dataset of road traffic accidents in Western Australia.
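As a rough illustration of the selection step only, the sketch below runs elastic-net variable selection on simulated covariates with scikit-learn. It uses a Gaussian working likelihood as a generic analogue; the paper instead penalises a point-process likelihood approximated by a weighted Poisson regression and proposes its own rules for choosing the regularisation parameter.

```python
# A generic analogue, not the paper's method: elastic-net variable selection with a
# Gaussian working likelihood via scikit-learn.
import numpy as np
from sklearn.linear_model import ElasticNetCV

rng = np.random.default_rng(4)
n, p = 500, 20
X = rng.normal(size=(n, p))
beta = np.zeros(p)
beta[:3] = [1.5, -1.0, 0.8]                   # only the first three covariates matter
y = X @ beta + rng.normal(scale=0.5, size=n)

# l1_ratio mixes lasso (1.0) and ridge (0.0); cross-validation picks the penalty weight.
enet = ElasticNetCV(l1_ratio=[0.2, 0.5, 0.8, 1.0], cv=5).fit(X, y)
selected = np.flatnonzero(enet.coef_ != 0)
print("selected covariates:", selected, "| chosen penalty:", round(enet.alpha_, 4))
```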

Volume 63(3), pages 417–454.
Citations: 4
ECM algorithm for estimating vector ARMA model with variance gamma distribution and possible unbounded density
IF 1.1 | CAS Tier 4 (Mathematics) | Q3 STATISTICS & PROBABILITY | Pub Date: 2021-10-18 | DOI: 10.1111/anzs.12340
Thanakorn Nitithumbundit, Jennifer S.K. Chan

The simultaneous analysis of several financial time series is salient in portfolio settings and risk management. This paper proposes a novel alternating expectation conditional maximisation (AECM) algorithm to estimate the vector autoregressive moving average (VARMA) model with variance gamma (VG) error distribution in the multivariate skewed setting. We explain why the VARMA-VG model is suitable for high-frequency returns (HFRs): the VG distribution provides thick tails to capture the high kurtosis in the data, and its unbounded central density further captures the majority of near-zero HFRs. The distribution can also be expressed as a normal-mean-variance mixture to facilitate model implementation using the Bayesian or expectation maximisation (EM) approach. We adopt the EM approach to avoid time-consuming Markov chain Monte Carlo sampling and to solve the unbounded density problem in classical maximum likelihood estimation. We conduct extensive simulation studies to evaluate the accuracy of the proposed AECM estimator and apply the models to analyse the dependency between two HFR series from time zones that differ by only one hour.
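The normal-mean-variance mixture representation mentioned in the abstract is easy to illustrate: a variance gamma draw can be generated as X = μ + θW + σ√W·Z with a unit-mean gamma mixing variable W. The parameter values in the sketch below are arbitrary.

```python
# A small illustration of the normal-mean-variance mixture representation of the
# variance gamma distribution: X = mu + theta*W + sigma*sqrt(W)*Z, W ~ Gamma with mean 1.
import numpy as np

rng = np.random.default_rng(5)
n = 100_000
mu, theta, sigma, nu = 0.0, -0.1, 0.2, 0.5    # location, skew, scale, mixing shape

W = rng.gamma(shape=1.0 / nu, scale=nu, size=n)    # E[W] = 1
Z = rng.normal(size=n)
X = mu + theta * W + sigma * np.sqrt(W) * Z        # variance gamma draws

# Thick tails show up as excess kurtosis relative to the normal value of 3.
kurtosis = np.mean((X - X.mean()) ** 4) / np.var(X) ** 2
print(f"sample kurtosis: {kurtosis:.2f} (normal = 3)")
```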

Volume 63(3), pages 485–516.
Citations: 0
The Inverse G-Wishart distribution and variational message passing
IF 1.1 | CAS Tier 4 (Mathematics) | Q3 STATISTICS & PROBABILITY | Pub Date: 2021-10-07 | DOI: 10.1111/anzs.12339
Luca Maestrini, Matt P. Wand

Message passing on a factor graph is a powerful paradigm for the coding of approximate inference algorithms for arbitrarily large graphical models. The notion of a factor graph fragment allows for compartmentalisation of algebra and computer code. We show that the Inverse G-Wishart family of distributions enables fundamental variational message passing factor graph fragments to be expressed elegantly and succinctly. Such fragments arise in models for which approximate inference concerning covariance matrix or variance parameters is made, and are ubiquitous in contemporary statistics and machine learning.

Volume 63(3), pages 517–541.
Citations: 5
An adequacy approach for deciding the number of clusters for OTRIMLE robust Gaussian mixture-based clustering
IF 1.1 | CAS Tier 4 (Mathematics) | Q3 STATISTICS & PROBABILITY | Pub Date: 2021-09-03 | DOI: 10.1111/anzs.12338
Christian Hennig, Pietro Coretto

We introduce a new approach to deciding the number of clusters. The approach is applied to Optimally Tuned Robust Improper Maximum Likelihood Estimation (OTRIMLE; Coretto & Hennig, Journal of the American Statistical Association 111, 1648–1659) of a Gaussian mixture model allowing for observations to be classified as 'noise', but it can be applied to other clustering methods as well. The quality of a clustering is assessed by a statistic Q that measures how close the within-cluster distributions are to elliptical unimodal distributions whose only mode is at the mean. This non-parametric measure allows for non-Gaussian clusters as long as they have a good quality according to Q. The simplicity of a model is assessed by a measure S that prefers a smaller number of clusters unless additional clusters can reduce the estimated noise proportion substantially. The simplest model that is adequate for the data is then chosen, in the sense that its observed value of Q is not significantly larger than what would be expected for data truly generated from the fitted model, as assessed by parametric bootstrap. The approach is compared with model-based clustering using the Bayesian information criterion (BIC) and the integrated complete likelihood (ICL) in a simulation study and on two real data sets.
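A minimal sketch of the parametric-bootstrap adequacy idea, using a plain Gaussian mixture and a Kolmogorov–Smirnov statistic in place of OTRIMLE and the paper's Q and S measures: keep the smallest number of components whose observed statistic is unexceptional under data simulated from the fitted model. Everything below is illustrative rather than the authors' procedure.

```python
# A minimal sketch of a parametric-bootstrap adequacy check for the number of mixture
# components, with a KS statistic standing in for the paper's Q measure.
import numpy as np
from scipy import stats
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(6)
x = np.concatenate([rng.normal(-3, 1, 200), rng.normal(3, 1, 200)])

def fit_and_stat(data, k):
    """Fit a k-component mixture; return the KS distance to its CDF and the fit."""
    gm = GaussianMixture(n_components=k, random_state=0).fit(data.reshape(-1, 1))
    w, mu, sd = gm.weights_, gm.means_.ravel(), np.sqrt(gm.covariances_.ravel())

    def cdf(t):
        t = np.atleast_1d(t)
        return np.sum(w * stats.norm.cdf((t[:, None] - mu) / sd), axis=1)

    return stats.kstest(data, cdf).statistic, gm

for k in (1, 2, 3):
    observed, gm = fit_and_stat(x, k)
    # Parametric bootstrap: simulate from the fitted model and recompute the statistic.
    boot = [fit_and_stat(gm.sample(len(x))[0].ravel(), k)[0] for _ in range(50)]
    p_value = np.mean(np.array(boot) >= observed)
    print(f"k = {k}: KS = {observed:.3f}, bootstrap p = {p_value:.2f}")
```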

Volume 64(2), pages 230–254.
Citations: 3
What is the effective sample size of a spatial point process?
IF 1.1 | CAS Tier 4 (Mathematics) | Q3 STATISTICS & PROBABILITY | Pub Date: 2021-07-21 | DOI: 10.1111/anzs.12337
Ian W. Renner, David I. Warton, Francis K.C. Hui

Point process models are a natural approach for modelling data that arise as point events. In the case of Poisson counts, these may be fitted easily as a weighted Poisson regression. Point processes lack the notion of sample size. This is problematic for model selection, because various classical criteria such as the Bayesian information criterion (BIC) are a function of the sample size, n, and are derived in an asymptotic framework where n tends to infinity. In this paper, we develop an asymptotic result for Poisson point process models in which the observed number of point events, m, plays the role that sample size does in the classical regression context. Following from this result, we derive a version of BIC for point process models, and when fitted via penalised likelihood, conditions for the LASSO penalty that ensure consistency in estimation and the oracle property. We discuss challenges extending these results to the wider class of Gibbs models, of which the Poisson point process model is a special case.
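A minimal sketch of the paper's central point, namely that the observed number of events m plays the role of sample size: fit a one-dimensional Poisson process with log-linear intensity by maximum likelihood and weight the BIC penalty by log(m). The intensity form and window below are invented for illustration.

```python
# A minimal sketch: for a Poisson point process, the observed number of events m
# replaces the notional sample size in the BIC penalty.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(7)
a_true, b_true = 4.0, 1.5
lam_max = np.exp(a_true + b_true)                       # bound for rejection thinning
n_prop = rng.poisson(lam_max)
props = rng.uniform(0, 1, n_prop)
keep = rng.uniform(0, 1, n_prop) < np.exp(a_true + b_true * props) / lam_max
pts = props[keep]                                       # observed point pattern on [0, 1]
m = len(pts)

def neg_loglik(par):
    a, b = par
    integral = (np.exp(a + b) - np.exp(a)) / b if abs(b) > 1e-8 else np.exp(a)
    return -(np.sum(a + b * pts) - integral)            # sum log-intensity minus integral

fit = minimize(neg_loglik, x0=[0.0, 0.0])
n_par = 2
bic = 2 * neg_loglik(fit.x) + n_par * np.log(m)         # penalty uses m, not a notional n
print(f"m = {m}, (a, b) = {np.round(fit.x, 2)}, BIC = {bic:.1f}")
```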

Volume 63(1), pages 144–158.
Citations: 4
Anna Karenina and the two envelopes problem
IF 1.1 | CAS Tier 4 (Mathematics) | Q3 STATISTICS & PROBABILITY | Pub Date: 2021-07-21 | DOI: 10.1111/anzs.12329
R. D. Gill

The Anna Karenina principle is named after the opening sentence in the eponymous novel: Happy families are all alike; every unhappy family is unhappy in its own way. The two envelopes problem (TEP) is a much-studied paradox in probability theory, mathematical economics, logic and philosophy. Time and again a new analysis is published in which an author claims finally to explain what actually goes wrong in this paradox. Each author (the present author included) emphasises what is new in their approach and concludes that earlier approaches did not get to the root of the matter. We observe that though a logical argument is only correct if every step is correct, an apparently logical argument which goes astray can be thought of as going astray at different places. This leads to a comparison between the literature on TEP and a successful movie franchise: it generates a succession of sequels, and even prequels, each with a different director who approaches the same basic premise in a personal way. We survey resolutions in the literature with a view to synthesis, correct common errors, and give a new theorem on order properties of an exchangeable pair of random variables, at the heart of most TEP variants and interpretations. A theorem on asymptotic independence between the amount in your envelope and the question whether it is smaller or larger shows that the pathological situation of improper priors or infinite expectation values has consequences as we merely approach such a situation.
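A small simulation makes the underlying point concrete: once the pair of amounts has a proper prior, the probability that the envelope you hold is the smaller one depends on the amount you observe, which the naive "always switch" computation ignores. The geometric prior below is only an example, not the paper's construction.

```python
# A small simulation with an arbitrary proper prior: P(holding the smaller envelope)
# conditional on the observed amount is not 1/2 in general.
import numpy as np

rng = np.random.default_rng(8)
n = 1_000_000
smaller = 2.0 ** rng.geometric(p=0.5, size=n)     # smaller amount is 2^K, K geometric
pair = np.stack([smaller, 2 * smaller], axis=1)   # the two envelopes: (x, 2x)
pick = rng.integers(0, 2, size=n)                 # you receive one of them at random
amount = pair[np.arange(n), pick]
holds_smaller = pick == 0

for a in (2.0, 8.0, 32.0):
    prob = holds_smaller[amount == a].mean()
    print(f"P(holding the smaller envelope | amount = {a:g}) ≈ {prob:.3f}")
```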

Volume 63(1), pages 201–218.
Citations: 1