
Australian & New Zealand Journal of Statistics: latest articles

Global implicit function theorems and the online expectation–maximisation algorithm
IF 1.1 | Zone 4 (Mathematics) | Q3 Mathematics | Pub Date: 2022-01-24 | DOI: 10.1111/anzs.12356
Hien Duy Nguyen, Florence Forbes
The expectation–maximisation (EM) algorithm framework is an important tool for statistical computation. Due to the changing nature of data, online and mini-batch variants of EM and EM-like algorithms have become increasingly popular. The consistency of the estimator sequences that are produced by these EM variants often relies on an assumption regarding the continuous differentiability of a parameter update function. In many cases, the parameter update function is not in closed form and may only be defined implicitly, which makes the verification of the continuous differentiability property difficult. We demonstrate how a global implicit function theorem can be used to verify such properties in the cases of finite mixtures of distributions in the exponential family and, more generally, when the component-specific distributions admit data augmentation schemes within the exponential family. We then illustrate the use of such a theorem in the cases of mixtures of beta distributions, gamma distributions, fully visible Boltzmann machines and Student distributions. Via numerical simulations, we provide empirical evidence for the consistency of the online EM algorithm parameter estimates in such cases.
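To make the role of the parameter update function concrete, the following is a minimal sketch of an online EM recursion, not the authors' implementation. It assumes a two-component Gaussian mixture with unit variances, chosen precisely because its update function is available in closed form; the cases treated in the paper (beta, gamma, Boltzmann machine and Student mixtures) have implicitly defined updates. The function names, step-size exponent, burn-in length and simulated data are all illustrative assumptions.

```python
import math
import random


def online_em_two_gaussians(ys, mu_init=(-1.0, 1.0), step_exp=0.6, burn_in=20):
    """Online EM for a two-component N(mu_k, 1) mixture (unit variances assumed)."""
    mu = list(mu_init)
    pi = [0.5, 0.5]
    s0 = pi[:]                               # running average of E[z_k | y]
    s1 = [pi[k] * mu[k] for k in range(2)]   # running average of E[z_k | y] * y
    for n, y in enumerate(ys, start=1):
        # E-step for a single observation: posterior responsibilities.
        logw = [math.log(pi[k]) - 0.5 * (y - mu[k]) ** 2 for k in range(2)]
        m = max(logw)
        w = [math.exp(v - m) for v in logw]
        total = sum(w)
        w = [v / total for v in w]
        # Stochastic-approximation update of the sufficient statistics.
        gamma = (n + 1) ** (-step_exp)
        for k in range(2):
            s0[k] += gamma * (w[k] - s0[k])
            s1[k] += gamma * (w[k] * y - s1[k])
        # Parameter update function: here an explicit map from statistics
        # to parameters, applied only after a short burn-in.
        if n > burn_in:
            pi = s0[:]
            mu = [s1[k] / s0[k] for k in range(2)]
    return pi, mu


def simulate(n, seed=0):
    """Draw n points from 0.3 * N(-2, 1) + 0.7 * N(2, 1) (hypothetical truth)."""
    rng = random.Random(seed)
    return [rng.gauss(-2.0, 1.0) if rng.random() < 0.3 else rng.gauss(2.0, 1.0)
            for _ in range(n)]


if __name__ == "__main__":
    weights, means = online_em_two_gaussians(simulate(20000, seed=1))
    print("weights:", weights)
    print("means:", means)
```

In this toy case the map from the running statistics `(s0, s1)` to `(pi, mu)` is visibly smooth wherever `s0[k] > 0`. When no such explicit formula exists, that smoothness is what a global implicit function theorem would be used to establish.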
Citations: 5
Sufficient dimension reduction for clustered data via finite mixture modelling
IF 1.1 | Zone 4 (Mathematics) | Q3 Mathematics | Pub Date: 2022-01-22 | DOI: 10.1111/anzs.12349
F.K.C. Hui, L.H. Nghiem

Sufficient dimension reduction (SDR) is an attractive approach to regression modelling. However, despite its rich literature and growing popularity in application, surprisingly little research has been done on how to perform SDR for clustered data, as commonly arises, for example, in longitudinal studies. Indeed, current popular SDR methods have been mostly based on a marginal estimating equation approach. In this article, we propose a new approach to SDR for clustered data based on a combination of finite mixture modelling and mixed effects regression. Finite mixture models offer a flexible means of estimating the fixed effects central subspace, based on slicing the space up and probabilistically clustering observations to each slice (mixture component). Dimension reduction is achieved by having the mixing proportions vary only through the sufficient fixed effect predictors. We then incorporate random effects as a natural means of accounting for correlations within clusters. We employ a Monte Carlo expectation–maximisation algorithm to estimate the model parameters and fixed effects central subspace, and discuss methods for associated uncertainty quantification and prediction. Simulation studies demonstrate that our approach performs strongly against both estimating equation methods for estimating the fixed effects central subspace and SDR methods which do not account for within-cluster correlation. Finally, we apply the proposed approach to a data set on air pollutant monitoring across 13 stations in the Eastern United States.

Citations: 2
Bayesian credible intervals for population attributable risk from case–control, cohort and cross-sectional studies
IF 1.1 | Zone 4 (Mathematics) | Q3 Mathematics | Pub Date: 2022-01-17 | DOI: 10.1111/anzs.12352
Sarah Pirikahu, Geoffrey Jones, Martin L. Hazelton

Population attributable risk (PAR) and population attributable fraction (PAF) are used in epidemiology to predict the impact of removing a risk factor from the population. Until recently, no standard approach for calculating confidence intervals or the variance for PAR in particular was available in the literature. Previously we outlined a fully Bayesian approach to provide credible intervals for the PAR and PAF from a cross-sectional study, where the data was presented in the form of a 2×2 table. However, extensions to cater for other frequently used study designs were not provided. In this paper we provide methodology to calculate credible intervals for the PAR and PAF for case–control and cohort studies. Additionally, we extend the cross-sectional example to allow for the incorporation of uncertainty that arises when an imperfect diagnostic test is used. In all these situations the model becomes over-parameterised, or non-identifiable, which can result in standard ‘off-the-shelf’ Markov Chain Monte Carlo (MCMC) updaters taking a long time to converge or even failing altogether. We adapt an importance sampling methodology to overcome this problem, and propose some novel MCMC samplers that take into consideration the shape of the posterior ridge to aid in the convergence of the Markov chain.
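As an illustration of the simplest setting mentioned above (a cross-sectional study summarised in a 2×2 table), the sketch below computes PAR and PAF and approximates credible intervals by sampling. It is not the authors' method: it assumes a multinomial likelihood with a flat Dirichlet(1, 1, 1, 1) prior, which is identifiable and needs no MCMC, and the cell counts and function names are hypothetical.

```python
import random


def par_paf(p11, p10, p01, p00):
    """PAR and PAF from cell probabilities.

    p11: exposed & diseased, p10: exposed & healthy,
    p01: unexposed & diseased, p00: unexposed & healthy.
    """
    p_disease = p11 + p01
    p_disease_unexposed = p01 / (p01 + p00)
    par = p_disease - p_disease_unexposed
    return par, par / p_disease


def posterior_intervals(counts, draws=20000, level=0.95, seed=0):
    """Equal-tailed credible intervals for PAR and PAF.

    Assumes a Dirichlet(1, 1, 1, 1) prior, so the posterior is
    Dirichlet(counts + 1); draws are built from normalised gamma variates.
    """
    rng = random.Random(seed)
    pars, pafs = [], []
    for _ in range(draws):
        g = [rng.gammavariate(c + 1.0, 1.0) for c in counts]
        total = sum(g)
        par, paf = par_paf(*(v / total for v in g))
        pars.append(par)
        pafs.append(paf)

    def interval(xs):
        xs = sorted(xs)
        lo = xs[int((1 - level) / 2 * len(xs))]
        hi = xs[int((1 + level) / 2 * len(xs)) - 1]
        return lo, hi

    return interval(pars), interval(pafs)


if __name__ == "__main__":
    counts = (40, 60, 20, 180)  # hypothetical 2x2 table, order as in par_paf
    n = sum(counts)
    print("point estimates:", par_paf(*(c / n for c in counts)))
    print("95% intervals (PAR, PAF):", posterior_intervals(counts))
```

The over-parameterised designs discussed in the abstract (case–control, cohort, imperfect diagnostic tests) are exactly the cases where this direct sampling no longer suffices.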

Citations: 0
Measuring the values of cricket players
IF 1.1 | Zone 4 (Mathematics) | Q3 Mathematics | Pub Date: 2022-01-15 | DOI: 10.1111/anzs.12353
Pranjal Chandrakar, Shubhabrata Das

Sports franchises that participate in team sports can make better decisions regarding their players’ financial compensation, renewal of the contracts, bidding strategies during the auction, etc., if they can adequately assess the value or worth of their players. Evaluating the value of a player in a team sport is difficult because various team members play different roles. In this study, we resolve this by measuring the value of a player in terms of how his inclusion in the team affects the team's probability of winning. With this notion of value, we develop a technique to measure the worth of a cricket player for his franchise. To illustrate this technique, we evaluate the values of cricket players who play in the Indian Premier League. We also study the relationship between players’ values and their salaries. We find that a few popular players earn disproportionately more than others. This disproportionality in the income of popular players cannot be justified by their performance alone, as adjudged by their values in this work. We attribute the disproportionality in the income to the factors not captured via conventional yardsticks, including leadership or brand value.

Citations: 0
Detection boundary for a sparse gamma scale mixture model
IF 1.1 | Zone 4 (Mathematics) | Q3 Mathematics | Pub Date: 2022-01-11 | DOI: 10.1111/anzs.12347
Michael I. Stewart

We derive the detection boundary for the one-sided version of the gamma scale mixture model where the contaminating component has a larger mean than the known reference distribution. We also derive an adaptive test which is able to almost uniformly attain the best possible performance in terms of detection of local alternatives.

Citations: 1
Odds-symmetry model for cumulative probabilities and decomposition of a conditional symmetry model in square contingency tables
IF 1.1 | Zone 4 (Mathematics) | Q3 Mathematics | Pub Date: 2021-12-06 | DOI: 10.1111/anzs.12346
Shuji Ando

For the analysis of square contingency tables, it is necessary to estimate an unknown distribution with high confidence from the observed data. For that purpose, we need a statistical model that fits the data well and is parsimonious. This study proposes asymmetry models based on cumulative probabilities for square contingency tables with the same row and column ordinal classifications. In the proposed models, the odds, for all i<j, that an observation will fall in row category i or below and column category j or above, instead of row category j or above and column category i or below, depend only on row category i or column category j. By contrast, under the conditional symmetry (CS) model these odds are constant and do not depend on the row and column categories. The proposed models always hold when the CS model holds. However, the converse is not necessarily true. This study also shows that, for the CS model to hold, the extended marginal homogeneity model must hold in addition to the proposed models. These decomposition theorems explain why the CS model does not hold. The proposed models provide a better fit when applied to a real-world data set of occupational data for father-and-son dyads.
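The cumulative odds described in this abstract can be computed directly from a table of counts. The sketch below is an illustration only: the function name and the example table are hypothetical, and the table is constructed so that every upper-triangle cell is exactly twice its mirror cell, which is the CS pattern with a constant ratio of 2.

```python
def cumulative_odds(table):
    """Empirical cumulative odds for every pair i < j (1-based categories).

    Numerator:   count in rows <= i and columns >= j.
    Denominator: count in rows >= j and columns <= i.
    """
    r = len(table)
    odds = {}
    for i in range(r - 1):
        for j in range(i + 1, r):
            upper = sum(table[s][t] for s in range(i + 1) for t in range(j, r))
            lower = sum(table[s][t] for s in range(j, r) for t in range(i + 1))
            odds[(i + 1, j + 1)] = upper / lower
    return odds


if __name__ == "__main__":
    # Hypothetical 3x3 table with n_ij = 2 * n_ji for all i < j.
    table = [
        [5, 6, 4],
        [3, 7, 8],
        [2, 4, 6],
    ]
    for pair, value in sorted(cumulative_odds(table).items()):
        print(pair, value)
```

Under conditional symmetry all of these odds coincide, whereas the proposed models let them vary with the row category i or the column category j.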

Citations: 1
Proportional inverse Gaussian distribution: A new tool for analysing continuous proportional data
IF 1.1 | Zone 4 (Mathematics) | Q3 Mathematics | Pub Date: 2021-11-23 | DOI: 10.1111/anzs.12345
Pengyi Liu, Guo-Liang Tian, Kam Chuen Yuen, Chi Zhang, Man-Lai Tang

Outcomes in the form of rates, fractions, proportions and percentages often appear in various fields. Existing beta and simplex distributions are frequently unable to fit such continuous data satisfactorily. This paper develops the normalised inverse Gaussian (N-IG) distribution proposed by Lijoi, Mena & Prünster (2005, Journal of the American Statistical Association, 100, 1278–1291) as a new tool for analysing continuous proportional data in (0,1) and renames it the proportional inverse Gaussian (PIG) distribution. Our main contributions include: (i) to overcome the difficulty of an integral in the PIG density function, we propose a novel minorisation–maximisation (MM) algorithm via the continuous version of Jensen's inequality to calculate the maximum likelihood estimates of the parameters in the PIG distribution; (ii) we also develop an MM algorithm aided by the gradient descent algorithm for the PIG regression model, which allows us to explore the relationship between a set of covariates and the mean parameter; (iii) both the comparative studies and the real data analyses show that the PIG distribution fits better than the beta and simplex distributions in terms of the AIC and the Cramér–von Mises and Kolmogorov–Smirnov tests. In addition, bootstrap confidence intervals and hypothesis tests on the symmetry of the PIG density are presented. Simulation studies are conducted and hospital stay data from Barcelona for 1988 and 1990 are analysed to illustrate the proposed methods.

Citations: 1
BNPdensity: Bayesian nonparametric mixture modelling in R
IF 1.1 | Zone 4 (Mathematics) | Q3 Mathematics | Pub Date: 2021-11-17 | DOI: 10.1111/anzs.12342
J. Arbel, G. Kon Kam King, A. Lijoi, L. Nieto-Barajas, I. Prünster

Robust statistical data modelling under potential model mis-specification often requires leaving the parametric world for the nonparametric. In the latter, parameters are infinite dimensional objects such as functions, probability distributions or infinite vectors. In the Bayesian nonparametric approach, prior distributions are designed for these parameters, which provide a handle to manage the complexity of nonparametric models in practice. However, most modern Bayesian nonparametric models often seem out of reach for practitioners, as inference algorithms need careful design to deal with the infinite number of parameters. The aim of this work is to facilitate the journey by providing computational tools for Bayesian nonparametric inference. The article describes a set of functions available in the R package BNPdensity for carrying out density estimation with an infinite mixture model, including all types of censored data. The package provides access to a large class of such models based on normalised random measures, which represent a generalisation of the popular Dirichlet process mixture. One striking advantage of this generalisation is that it offers much more robust priors on the number of clusters than the Dirichlet process. Another crucial advantage is the complete flexibility in specifying the prior for the scale and location parameters of the clusters, because conjugacy is not required. Inference is performed using a theoretically grounded approximate sampling methodology known as the Ferguson & Klass algorithm. The package also offers several goodness-of-fit diagnostics such as QQ plots, including a cross-validation criterion, the conditional predictive ordinate. The proposed methodology is illustrated on a classical ecological risk assessment method called the species sensitivity distribution problem, showcasing the benefits of the Bayesian nonparametric framework.

Citations: 1
Experimental design in practice: The importance of blocking and treatment structures
IF 1.1, Zone 4 (Mathematics), Q3 Mathematics, Pub Date: 2021-11-08, DOI: 10.1111/anzs.12343
E.R. Williams, C.G. Forde, J. Imaki, K. Oelkers

Experimental design and analysis has evolved substantially over the last 100 years, driven to a large extent by the power and availability of the computer. To demonstrate this development and encourage the use of experimental design in practice, three experiments from different research areas are presented. In these examples multiple blocking factors have been employed and they show how extraneous variation can be accommodated and interpreted. The examples are used to discuss the importance of blocking and treatment structures in the conduct of designed experiments.

Citations: 1
Accelerating adaptation in the adaptive Metropolis–Hastings random walk algorithm
IF 1.1, Zone 4 (Mathematics), Q3 Mathematics, Pub Date: 2021-11-03, DOI: 10.1111/anzs.12344
Simon E.F. Spencer

The Metropolis–Hastings random walk algorithm remains popular with practitioners due to the wide variety of situations in which it can be successfully applied and the extreme ease with which it can be implemented. Adaptive versions of the algorithm use information from the early iterations of the Markov chain to improve the efficiency of the proposal. The aim of this paper is to reduce the number of iterations needed to adapt the proposal to the target, which is particularly important when the likelihood is time-consuming to evaluate. First, the accelerated shaping algorithm is a generalisation of both the adaptive proposal and adaptive Metropolis algorithms. It is designed to remove, from the estimate of the covariance matrix of the target, misleading information from the start of the chain. Second, the accelerated scaling algorithm rapidly changes the scale of the proposal to achieve a target acceptance rate. The usefulness of these approaches is illustrated with a range of examples.

Open access PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1111/anzs.12344
Citations: 6
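
The idea of adapting the proposal scale towards a target acceptance rate, mentioned in the abstract above, can be illustrated with a minimal sketch. This is not the accelerated shaping or accelerated scaling algorithm from the paper; it is a generic one-dimensional random-walk Metropolis sampler with a Robbins–Monro style update of the log proposal scale. The function name `adaptive_rwm`, the target rate of 0.44 and the step-size exponent 0.6 are assumptions chosen for illustration only.

```python
import math
import random


def adaptive_rwm(log_target, x0, n_iter=5000, target_accept=0.44, seed=0):
    """1-D random-walk Metropolis whose proposal scale is tuned on the fly.

    Illustrative sketch only: a generic scale adaptation, not the
    accelerated algorithms described in the paper.
    """
    rng = random.Random(seed)
    x, log_p = x0, log_target(x0)
    log_scale = 0.0  # proposal standard deviation starts at exp(0) = 1
    samples, n_accept = [], 0
    for i in range(1, n_iter + 1):
        proposal = x + math.exp(log_scale) * rng.gauss(0.0, 1.0)
        log_p_prop = log_target(proposal)
        # Metropolis acceptance probability for a symmetric proposal
        accept_prob = math.exp(min(0.0, log_p_prop - log_p))
        if rng.random() < accept_prob:
            x, log_p = proposal, log_p_prop
            n_accept += 1
        # Diminishing adaptation: raise the log-scale when accepting too
        # often, lower it when accepting too rarely; the step shrinks
        # as i ** -0.6 so the adaptation fades out over time.
        log_scale += (accept_prob - target_accept) / i ** 0.6
        samples.append(x)
    return samples, n_accept / n_iter, math.exp(log_scale)


# Usage: sample from a standard normal target (log density up to a constant).
samples, accept_rate, final_scale = adaptive_rwm(lambda x: -0.5 * x * x, 0.0)
```

Adapting on the log scale keeps the proposal standard deviation positive without any clipping, and the shrinking step size is one common way to make the adaptation vanish as the chain runs.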