A comparison of zero-inflated and hurdle models for modeling zero-inflated count data.

Q2 Mathematics Journal of Statistical Distributions and Applications Pub Date : 2021-01-01 Epub Date: 2021-06-24 DOI:10.1186/s40488-021-00121-4
Cindy Xin Feng

Abstract

Count data with excess zeros are frequently encountered in practice. For example, the number of health service visits often includes many zeros, representing patients with no utilization during the follow-up period. A common feature of this type of data is that the count measure tends to have more zeros than a common count distribution, such as the Poisson or negative binomial, can accommodate. Zero-inflated (ZI) or hurdle models are often used to fit such data. Despite the increasing popularity of ZI and hurdle models, the fundamental differences between these two types of models have received little investigation. In this article, we reviewed the zero-inflated and hurdle models and highlighted their differences in terms of their data-generating processes. We also conducted simulation studies to evaluate the performance of both types of models. The final choice of regression model should be made after a careful assessment of goodness of fit and should be tailored to the particular data in question.
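As an illustration of the data-generating distinction mentioned in the abstract, the following is a minimal sketch, not the paper's own simulation code. It assumes a Poisson count component (the abstract also names the negative binomial), and the names `pi`, `lam`, `zip_draw`, `hurdle_draw`, and `zero_share` are hypothetical choices made for this example. In the standard zero-inflated formulation, zeros arise from two sources (a structural-zero process and the count distribution itself), whereas in the standard hurdle formulation all zeros come from a single binary process and positive counts come from a zero-truncated distribution.

```python
import math
import random


def poisson_draw(rng, lam):
    """Draw one Poisson(lam) variate using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


def zip_draw(rng, pi, lam):
    """Zero-inflated Poisson: a structural zero with probability pi,
    otherwise an ordinary Poisson draw, which can itself be zero."""
    if rng.random() < pi:
        return 0
    return poisson_draw(rng, lam)


def hurdle_draw(rng, pi, lam):
    """Hurdle Poisson: a zero with probability pi, otherwise a draw from
    the zero-truncated Poisson (simple rejection; slow if lam is tiny)."""
    if rng.random() < pi:
        return 0
    while True:
        y = poisson_draw(rng, lam)
        if y > 0:
            return y


def zero_probabilities(pi, lam):
    """Theoretical P(Y = 0) under each model (illustrative parameterization)."""
    zip_zero = pi + (1.0 - pi) * math.exp(-lam)  # structural + sampling zeros
    hurdle_zero = pi                             # zeros from the binary part only
    return zip_zero, hurdle_zero


def zero_share(draw, pi, lam, n=20000, seed=0):
    """Empirical share of zeros among n simulated counts."""
    rng = random.Random(seed)
    return sum(draw(rng, pi, lam) == 0 for _ in range(n)) / n
```

With the same `pi` and `lam`, the zero-inflated sketch yields a larger share of zeros than the hurdle sketch, because the Poisson component contributes additional sampling zeros. This means the parameter `pi` has a different interpretation in the two models, which is one reason goodness of fit should be assessed for the particular data at hand.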
