Accelerated Iterated Filtering
D. Nguyen. Austrian Journal of Statistics 52(4), 2023. doi: 10.17713/ajs.v52i4.1503

Simulation-based inference has attracted much attention in recent years, as the direct computation of the likelihood function in many real-world problems is difficult or even impossible. Iterated filtering (Ionides, Bretó, and King 2006; Ionides, Bhadra, Atchadé, and King 2011) enables maximization of the likelihood function via model perturbations and approximation of the gradient of the log-likelihood through sequential Monte Carlo filtering. By an application of Stein's identity, Doucet, Jacob, and Rubenthaler (2013) developed a second-order approximation of the gradient of the log-likelihood using sequential Monte Carlo smoothing. Based on these gradient approximations, we develop a new algorithm for maximizing the likelihood using the Nesterov accelerated gradient. We adapt the accelerated inexact gradient algorithm (Ghadimi and Lan 2016) to the iterated filtering framework, relaxing the requirement of an unbiased gradient approximation. We devise a perturbation policy for iterated filtering that allows the new algorithm to converge at an optimal rate for both concave and non-concave log-likelihood functions. The new algorithm is comparable to the recently developed Bayes map iterated filtering approach and outperforms the original iterated filtering approach.
A Generalization of LASSO Modeling via Bayesian Interpretation
Gayan Warahena-Liyanage, F. Famoye, Carl Lee. Austrian Journal of Statistics 52(4), 2023. doi: 10.17713/ajs.v52i4.1455

The aim of this paper is to introduce a generalized LASSO regression model derived from a generalized Laplace (GL) distribution. Five different GL distributions are obtained through the T-R{Y} framework with the quantile functions of the standard uniform, Weibull, log-logistic, logistic, and extreme value distributions. Properties of these GL distributions, including the quantile function, mode, and Shannon entropy, are derived. A particular case of the GL distributions, called the beta-Laplace distribution, is explored. Additional components to the constraint of the ordinary LASSO regression model are obtained through the Bayesian interpretation of LASSO with beta-Laplace priors, and the geometric interpretations of these additional components are presented. The effects of the beta-Laplace distribution's parameters on the generalized LASSO regression model are also discussed. Two real data sets are analyzed to illustrate the flexibility and usefulness of the generalized LASSO regression model for variable selection with better prediction performance. Consequently, this study demonstrates that more flexible statistical distributions can enhance LASSO in terms of flexibility in variable selection and shrinkage with better prediction.
Monitoring Convergence in the European Union with the convergEU Package for R
F. Stefanini, N. D. Nikiforova, Eleonora Peruffo, Martina Bisello, Chiara Litardi, Massimiliano Mascherini. Austrian Journal of Statistics 52(4), 2023. doi: 10.17713/ajs.v52i4.1468

Upward convergence, that is, an improvement in the performance of Member States' economic and social indicators, is a core policy target of the European Union. This concept is embodied in the European Pillar of Social Rights (EPSR) proclaimed by EU leaders in 2017. In 2018 Eurofound developed a methodology to measure convergence, which has been implemented in the convergEU R package (rel. 0.5.0) described in this work. The package extends the original STATA toolbox developed by Eurofound beyond the calculation of the four main measures of convergence, as it provides functions to download, filter, impute, and smooth indicators. Country and indicator fiches are automatically prepared and compiled in HTML format. Graphical output includes qualitative patterns of change over time to emphasize the key behaviour of an indicator relative to the average of an aggregation of Member States. Besides the EU Member States, the analysis of other collections of regions is supported for general indicators if context information is provided. A user-friendly, no-coding-required, Shiny-based web application provides policy makers with a tool to produce convergence reports on selected indicators and countries.
amIcompositional: Simple Tests for Compositional Behaviour of High Throughput Data with Common Transformations
G. Gloor. Austrian Journal of Statistics 52(4), 2023. doi: 10.17713/ajs.v52i4.1617

Compositional approaches are beginning to permeate the high-throughput biomedical sciences in the areas of microbiome, genomics, transcriptomics, and proteomics research, yet non-compositional approaches are still commonly observed. Non-compositional approaches are particularly problematic in network analysis based on correlation, in ordination and exploratory data analysis based on distance, and in differential abundance analysis based on normalization. Here we describe the aIc R package, a simple tool that answers the fundamental question: does the dataset or normalization exhibit compositional artefacts that will skew interpretations when analyzing high-throughput biomedical data? The package includes options for several of the most widely used normalization and filtering methods, along with tests for subcompositional dominance and coherence as well as perturbation and scale invariance. Exploratory analysis is facilitated by an R Shiny app that makes the process simple for those not wishing to use an R console. This simple approach will allow research groups to acknowledge and account for potential artefacts in data analysis, resulting in more robust and reliable inferences.
Multivariate Asymmetric Distributions of Copula Related Random Variables
A. Sheikhi, Freshteh Arad, R. Mesiar. Austrian Journal of Statistics 52(4), 2023. doi: 10.17713/ajs.v52i4.1446

It is well known that the normal distribution plays an important role in analysing symmetric data. However, the symmetry assumption may not hold in many real-world situations, and in such cases asymmetric distributions, including the skew-normal distribution, are known to be the best alternative. Asymmetric distributions are commonly constructed using the conditional/selection approach, in which several independent variables are conditioned on another set of variables; this approach does not work well when the independence between the variables is violated. In this work we construct an asymmetric distribution for dependent variables using a copula. Specifically, we consider random vectors X and Y connected by a copula function C_{X,Y} and study the selection distribution Z = (X | Y ∈ T). We present some special cases of our proposed distribution, among them the multivariate skew-normal distribution. Properties such as moments and the moment generating function are investigated. A numerical analysis, including a simulation study as well as a real data set analysis, is presented for illustration.
The Waves and Cycles of COVID-19 Pandemic: A Phase Synchronization Approach
Carmen Borrego-Salcido, R. Juárez-Del-Toro, Alejandro Steven Fonseca-Zendejas. Austrian Journal of Statistics 52(3), 2023. doi: 10.17713/ajs.v52i3.1450

This study aims to contribute to a better understanding of COVID-19 dynamics through the non-parametric technique of phase synchronization, comparing the fifteen countries most affected by the number of positive cases, plus China, where the first outbreak took place in December 2019. It was possible to establish the number of cycles and waves for each of the studied countries and to determine periods of synchronization between them. The results also show the average duration of the cycles and some agreement with Nason (2020), Bontempi (2021), Coccia (2021), and Rusiñol, Zammit, Itarte, Forés, Martínez-Puchol, Girones, Borrego, Corominas, and Bofill-Mas (2021). This study is limited by the reliability of the number of positive cases reported by national governments and health authorities, owing to an insufficient number of tests and a large number of asymptomatic persons. Nevertheless, it presents a legitimate alternative for predicting the evolution of the pandemic in one country from the forward-looking behavior of another; studies like this could therefore be useful for implementing containment measures and preparing health systems in advance.
Easily Changeable Kurtosis Distribution
P. Sulewski. Austrian Journal of Statistics 52(3), 2023. doi: 10.17713/ajs.v52i3.1434

The goal of this paper is to introduce the easily changeable kurtosis (ECK) distribution. The uniform distribution appears as a special case of the ECK distribution, and the new distribution tends to the normal distribution. Properties of the ECK distribution such as the PDF, CDF, modes, inflection points, quantiles, moments, moment generating function, Moors' measure, moments of order statistics, a random number generator, and the Fisher information matrix are derived. The unknown parameters of the ECK distribution are estimated by the maximum likelihood method. The Shannon, Rényi, and Tsallis entropies are calculated. Illustrative examples of the applicability and flexibility of the ECK distribution are given. The most important R codes are presented in the Appendix.
Power Modified Lindley Distribution: Properties, Classical and Bayesian Estimation and Regression Model with Applications
O. Kharazmi, D. Kumar, S. Dey. Austrian Journal of Statistics 52(3), 2023. doi: 10.17713/ajs.v52i3.1386

In this article, we explore a new probability density function, called the power modified Lindley distribution. Its main feature is to operate a simple trade-off among the generalized exponential, Weibull, and gamma distributions, offering an alternative to these three well-established distributions. The proposed model turns out to be quite flexible: its probability density function can be right-skewed, and its associated hazard rate function may be increasing, decreasing, unimodal, or constant. First, the model parameters of the proposed distribution are obtained by the maximum likelihood method. Next, Bayes estimators of the unknown parameters are obtained under different loss functions. In addition, bootstrap confidence intervals are provided for comparison with the Bayes credible intervals. Furthermore, a log power modified Lindley regression model for censored data is proposed. Two real data sets are analyzed to illustrate the flexibility and importance of the proposed model.
Taxicab Correspondence Analysis and Taxicab Logratio Analysis: A Comparison on Contingency Tables and Compositional Data
V. Choulakian, J. Allard, S. Mahdi. Austrian Journal of Statistics 52(3), 2023. doi: 10.17713/ajs.v52i3.1302

In this paper, we attempt to see further by relating theory with practice. First, we review the principles on which three interrelated, well-developed methods for the analysis and visualization of contingency tables and compositional data are built: correspondence analysis, based on Benzécri's principle of distributional equivalence; Goodman's RC association model, based on Yule's principle of scale invariance; and compositional data analysis, based on Aitchison's principle of subcompositional coherence. Second, we introduce a novel index, named the intrinsic measure of the quality of the signs of the residuals, for the choice of the method. The criterion is based on the taxicab singular value decomposition, on which the TaxicabCA package in R is developed. We present a minimal R script that can be executed to obtain the numerical results and the maps in this paper. Third, we introduce a flexible method based on the novel index for choosing the constant to be added to contingency tables with zero counts so that logratio methods can be applied.
Using the Discrete Lindley Distribution to Deal with Over-dispersion in Count Data
M. Nguyen, M. Nguyen, N. Le. Austrian Journal of Statistics 52(3), 2023. doi: 10.17713/ajs.v52i3.1465

Count data in environmental epidemiology or ecology often display substantial over-dispersion, and failing to account for it can result in biased estimates and underestimated standard errors. This study develops a new generalized linear model family for over-dispersed count data by assuming that the response variable follows the discrete Lindley distribution. An iteratively reweighted least squares procedure is developed to fit the model. Furthermore, asymptotic properties of the estimators and goodness-of-fit statistics are derived. Lastly, simulation studies and empirical data applications are carried out, in which the generalized discrete Lindley linear model shows better performance than the Poisson model.