Saeed Shahsavari, Abbas Moghimbeigi, Rohollah Kalhor, Ali Moghadas Jafari, Mehrdad Bagherpour-Kalo, Mehdi Yaseri, Mostafa Hosseini
Zero-Inflated Count Regression Models in Solving Challenges Posed by Outlier-Prone Data; an Application to Length of Hospital Stay
Archives of Academic Emergency Medicine, volume 12, issue 1, e13. Published 2023-11-21. DOI: https://doi.org/10.22037/aaem.v12i1.2074
Citations: 0
Abstract
Introduction: Ignoring outliers in data may lead to misleading results. Length of stay (LOS) is often treated as a count variable with a high frequency of outliers. This study illustrates how robust methods can improve the accuracy and reliability of analyses of skewed, outlier-prone LOS count data.
Methods: The application of Zero-Inflated Poisson (ZIP) and robust Zero-Inflated Poisson (RZIP) models to the challenges posed by outlier-prone LOS data was evaluated. The ZIP model has two components: a zero-inflation component that accounts for excess zeros and a Poisson component that models the counts. The RZIP model uses the Robust Expectation-Solution (RES) algorithm to improve parameter estimation and reduce the impact of outliers on the model's performance.
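The two-component structure of the ZIP model described in the Methods can be sketched as follows. This is a minimal illustration only, not the authors' implementation: it assumes a constant zero-inflation probability `pi` and a constant Poisson mean `lam` (in a regression setting both would typically depend on covariates), and it does not attempt the robust RES estimation. The function names and the sample values are hypothetical.

```python
import math


def zip_pmf(y, pi, lam):
    """P(Y = y) for a zero-inflated Poisson.

    pi  : probability of a structural (excess) zero, 0 <= pi < 1
    lam : mean of the Poisson component, lam > 0
    """
    # Poisson probability computed on the log scale for numerical stability.
    poisson = math.exp(-lam + y * math.log(lam) - math.lgamma(y + 1))
    if y == 0:
        # A zero can come from the zero-inflation component or from the Poisson.
        return pi + (1.0 - pi) * poisson
    return (1.0 - pi) * poisson


def zip_mean(pi, lam):
    """Expected value of a zero-inflated Poisson variable."""
    return (1.0 - pi) * lam


def zip_log_likelihood(ys, pi, lam):
    """Log-likelihood of observed counts under constant pi and lam."""
    return sum(math.log(zip_pmf(y, pi, lam)) for y in ys)


# Hypothetical counts, chosen only to exercise the functions (not study data).
sample_counts = [0, 0, 0, 1, 2, 3, 5, 8]
print(zip_pmf(0, 0.3, 2.0))
print(zip_mean(0.3, 2.0))
print(zip_log_likelihood(sample_counts, 0.3, 2.0))
```

Because the plain Poisson likelihood gives every observation full weight, a few very large counts can pull `lam` upward; that sensitivity is the motivation for a robust variant such as RZIP.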
Results: Data from 254 intensive care unit patients were analyzed (62.2% male). Patients aged 65 or older accounted for 58.3% of the sample. Notably, 38.6% of patients had an LOS of zero. The overall mean LOS was 5.89 (± 9.81) days, and 9.45% of cases were outliers. Analysis using the RZIP model identified significant predictors of LOS, including age, underlying comorbidities (p<0.001), and insurance status (p=0.013). Model comparison favored the RZIP model over ZIP, as evidenced by lower Akaike information criterion (AIC) and Bayesian information criterion (BIC) values.
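The AIC and BIC comparison mentioned in the Results follows the standard definitions, sketched below. The sample size of 254 comes from the abstract; the log-likelihood values and parameter count are hypothetical placeholders, since the abstract does not report them, and the helper names are illustrative.

```python
import math


def aic(log_lik, k):
    """Akaike information criterion: 2k - 2 * log-likelihood."""
    return 2 * k - 2 * log_lik


def bic(log_lik, k, n):
    """Bayesian information criterion: k * ln(n) - 2 * log-likelihood."""
    return k * math.log(n) - 2 * log_lik


n_patients = 254  # sample size reported in the abstract

# Hypothetical fits: (maximized log-likelihood, number of parameters).
# These numbers are placeholders, not results from the study.
fits = {
    "ZIP": (-700.0, 8),
    "RZIP": (-650.0, 8),
}

for name, (log_lik, k) in fits.items():
    print(name, aic(log_lik, k), bic(log_lik, k, n_patients))

# Lower values indicate the preferred model under each criterion.
best_by_aic = min(fits, key=lambda m: aic(*fits[m]))
print("Preferred by AIC:", best_by_aic)
```

Both criteria penalize the number of parameters, with BIC penalizing more heavily once the sample size exceeds about seven observations, so agreement between them is somewhat stronger evidence than either alone.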
Conclusions: The application of the RZIP model allowed us to uncover meaningful insights into the factors influencing LOS, paving the way for more informed decision-making in hospital management.