Estimation and imputation of missing data in longitudinal models with Zero-Inflated Poisson response variable
D. S. Martinez-Lobo, O. O. Melo, N. A. Cruz
arXiv - STAT - Methodology, 2024-09-17 (arxiv-2409.11040)
Citations: 0
Abstract
This research addresses the estimation and imputation of missing data in
longitudinal models with a zero-inflated Poisson (ZIP) response variable. The
proposed methodology is based on maximum likelihood and assumes that the data
are missing at random and that the response variables are correlated. At each
time point, the expectation-maximization (EM) algorithm is applied: in the E
step, a weighted regression is fitted, conditioned on the previous time points,
which are taken as covariates; in the M step, the missing data are estimated
and imputed. A simulation study shows that the methodology performs well under
different data-loss scenarios when compared with two alternatives: fitting the
model to complete data only, and imputing missing values with the mode of each
individual's data. Finally, the algorithm is applied to real data from a study
of corn growth to illustrate its use in a practical setting.
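
For readers unfamiliar with the zero-inflated Poisson setting, the following is a minimal sketch of the textbook EM fit for a ZIP model with no covariates and a single time point. It is not the authors' longitudinal algorithm: the abstract does not give their weighted-regression or imputation formulas, so the function names (`zip_pmf`, `fit_zip_em`, `impute_mean`), the use of `None` to mark missing values, and the choice of the model mean as the fill-in value are all assumptions made for illustration.

```python
import math


def zip_pmf(y, pi, lam):
    """P(Y = y) for a ZIP variable: structural-zero probability pi, Poisson mean lam."""
    poisson = math.exp(-lam) * lam ** y / math.factorial(y)
    return pi * (y == 0) + (1.0 - pi) * poisson


def fit_zip_em(counts, n_iter=200, tol=1e-10):
    """Textbook EM for an intercept-only ZIP model (illustrative, not the paper's method).

    Entries equal to None are treated as missing and skipped during fitting.
    """
    obs = [y for y in counts if y is not None]
    if not obs:
        raise ValueError("no observed counts")
    n = len(obs)
    n_zero = sum(1 for y in obs if y == 0)
    total = sum(obs)
    # Crude starting values: half of the observed zeros are taken as structural.
    pi, lam = 0.5 * n_zero / n, max(total / n, 1e-8)
    for _ in range(n_iter):
        # E step: posterior probability that an observed zero is a structural zero.
        w = pi / (pi + (1.0 - pi) * math.exp(-lam))
        # M step: closed-form updates given the expected number of structural zeros.
        new_pi = n_zero * w / n
        new_lam = total / (n - n_zero * w)
        converged = abs(new_pi - pi) + abs(new_lam - lam) < tol
        pi, lam = new_pi, new_lam
        if converged:
            break
    return pi, lam


def impute_mean(pi, lam):
    """Mean of the fitted ZIP distribution, used here as a simple fill-in value."""
    return (1.0 - pi) * lam


if __name__ == "__main__":
    data = [0, 0, 0, 0, 0, 0, 1, 2, 3, 2, None, 4, 1, 0, None, 3]
    pi_hat, lam_hat = fit_zip_em(data)
    filled = [impute_mean(pi_hat, lam_hat) if y is None else y for y in data]
    print(pi_hat, lam_hat, filled)
```

In this simplified setting the fitted mean `(1 - pi) * lam` coincides with the sample mean of the observed counts, so the sketch mainly shows how the E and M steps separate structural zeros from Poisson zeros. Conditioning on earlier time points, as the abstract describes, would replace the constant `lam` with a regression on those earlier responses.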