Zero-inflated multivariate tobit regression modeling
Becky Tang, Henry A. Frye, John A. Silander Jr., Alan E. Gelfand
Journal of Statistical Planning and Inference, Volume 236, Article 106229. Published 2024-09-03. DOI: 10.1016/j.jspi.2024.106229
https://www.sciencedirect.com/science/article/pii/S0378375824000867
Citations: 0
Abstract
A frequent challenge in real-world applications is data with a high proportion of zeros. Focusing on ecological abundance data, much attention has been given to zero-inflated count data. Models for non-negative continuous abundance data with an excess of zeros are rarely discussed. Work presented here considers the creation of a point mass at zero through a left-censoring approach or through a hurdle approach. We incorporate both mechanisms to capture the analog of zero-inflation for count data. Additionally, primary attention has been given to univariate zero-inflated modeling (e.g., single species), whereas data often arise jointly (e.g., a collection of species). With multivariate abundance data, a key issue is to capture dependence among the species at a site, in terms of both positive abundance and absence. Therefore, our contribution is a model for multivariate zero-inflated continuous data that are non-negative. Working in a Bayesian framework, we discuss the issue of separating the two sources of zeros and offer model comparison metrics for multivariate zero-inflated data. In an application, we model the total biomass for five tree species obtained from plots established in the Forest Inventory and Analysis database in the Northeast region of the United States.
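To make the two sources of zeros concrete, the following is a minimal simulation sketch, not the authors' model. It assumes a latent multivariate normal abundance that is left-censored at zero (the tobit mechanism), plus an independent per-species absence indicator standing in for the hurdle mechanism. The function name `simulate_zi_tobit`, the parameter values, and the independence of absences across species are all illustrative assumptions; the paper itself models dependence in absence as well.

```python
import numpy as np

def simulate_zi_tobit(n, mean, cov, p_absent, seed=0):
    """Simulate non-negative continuous data with two sources of zeros.

    Hypothetical helper for illustration only; not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    mean = np.asarray(mean, dtype=float)
    # Latent abundances, correlated across species within a site.
    latent = rng.multivariate_normal(mean, cov, size=n)
    # Source 1 (hurdle-style): the species is absent whatever the latent value.
    # Assumption: absences are drawn independently for each species.
    absent = rng.random((n, mean.size)) < np.asarray(p_absent)
    # Source 2 (tobit): non-positive latent values are left-censored to zero.
    censored = (latent <= 0) & ~absent
    y = np.where(absent, 0.0, np.maximum(latent, 0.0))
    return y, absent, censored

# Two species with positively correlated latent abundance (made-up values).
y, absent, censored = simulate_zi_tobit(
    n=1000,
    mean=[1.0, -0.5],
    cov=[[1.0, 0.6], [0.6, 1.0]],
    p_absent=[0.2, 0.3],
)
print("share of zeros per species:   ", (y == 0).mean(axis=0))
print("zeros from absence (hurdle):  ", absent.mean(axis=0))
print("zeros from censoring (tobit): ", censored.mean(axis=0))
```

In the simulation the two masks are known, but in observed data only `y == 0` is visible. That is the difficulty the abstract refers to when it mentions separating the two sources of zeros.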
About the journal:
The Journal of Statistical Planning and Inference offers itself as a multifaceted and all-inclusive bridge between classical aspects of statistics and probability and the emerging interdisciplinary aspects that have the potential to revolutionize the subject. While we maintain our traditional strength in statistical inference, design, classical probability, and large sample methods, we also have a far more inclusive and broadened scope to keep up with the new problems that confront us as statisticians, mathematicians, and scientists.
We publish high-quality articles in all branches of statistics, probability, discrete mathematics, machine learning, and bioinformatics. We also especially welcome well-written, up-to-date review articles on fundamental themes of statistics, probability, machine learning, and general biostatistics. Thoughtful letters to the editors, interesting problems in need of a solution, and short notes carrying an element of elegance or beauty are equally welcome.