Introduction: During the COVID-19 pandemic, some countries also faced an increase in the number of cases caused by the Monkeypox virus. The main aim of this research was to investigate whether count data regression models can be fitted to predict the daily incidence of confirmed Monkeypox cases. Methods: We used two traditional count regression models, the Poisson regression model and the negative binomial regression model, each with identity and logarithmic link functions. Because the data were overdispersed, the negative binomial regression model with a logarithmic link function fitted better than the other models. The parameters were estimated using SPSS, version 23.0. Results: The negative binomial regression model with a logarithmic link function fits the Monkeypox case data well. The model shows that most of the countries studied, including Brazil, Canada, France, Germany, Peru, Spain, the United Kingdom and the United States of America, had a significant decrease in the number of cases over time. The prediction line plotted from this model tracks the daily Monkeypox cases reported by the different countries well. Conclusion: We conclude that count data regression models can be used widely to predict the incidence of any disease. Canada and Brazil have the largest and smallest slope coefficients, which correspond to the maximum and minimum decrease in the expected number of cases confirmed each day, respectively.
{"title":"Count Data Regression Modelling: An Application to Monkeypox Confirmed Cases","authors":"Divya Vijithaswan Nair, Rujuta Hadaye","doi":"10.18502/jbe.v9i2.14626","DOIUrl":"https://doi.org/10.18502/jbe.v9i2.14626","url":null,"PeriodicalId":34310,"journal":{"name":"Journal of Biostatistics and Epidemiology","volume":"137 3","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-01-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139453288","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
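The Monkeypox abstract above chooses the negative binomial model because the counts were overdispersed and reads the slope under a logarithmic link. A minimal sketch of both ideas follows, assuming a simple variance-to-mean check and a method-of-moments size estimate; the function names and the sample counts are hypothetical and are not taken from the study, which estimated its models in SPSS.

```python
import math
from statistics import mean, variance


def dispersion_index(counts):
    """Sample variance divided by sample mean; close to 1 for Poisson-like data."""
    return variance(counts) / mean(counts)


def nb_size_moments(counts):
    """Method-of-moments estimate of the negative binomial size k,
    based on Var(Y) = mu + mu**2 / k. Returns None when the data
    are not overdispersed (variance <= mean)."""
    m, v = mean(counts), variance(counts)
    if v <= m:
        return None
    return m * m / (v - m)


def daily_percent_change(slope):
    """Under a log link, log(mu_t) = b0 + b1 * t, so one extra day
    multiplies the expected count by exp(b1)."""
    return (math.exp(slope) - 1.0) * 100.0


# Hypothetical daily case counts, made up for illustration only.
daily_cases = [2, 0, 15, 3, 40, 1, 22, 5, 0, 12]

if __name__ == "__main__":
    print("dispersion index:", dispersion_index(daily_cases))
    print("moment estimate of k:", nb_size_moments(daily_cases))
    print("percent change per day for slope -0.05:", daily_percent_change(-0.05))
```

A dispersion index well above 1 is the usual informal signal that a Poisson model understates the variance; a negative slope under the log link corresponds to a fixed percentage decline in expected cases per day.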
M. Akbarzadeh-Jahromi, Negar Taheri, Babak Dashtdar, Nasim Taheri, Fatemeh Abiri, Marjan Zare
Introduction: High-risk genotypes of Human Papilloma Virus (HPV) are responsible for up to 70% of invasive cervical cancers. This study aimed to determine the national and provincial prevalence of total HPV and of its high-risk genotypes, namely HPV genotype 16 (HPV16), HPV genotype 18 (HPV18), and HPV genotypes other than 16 and 18 (HPV other genotypes), among healthy Iranian women. Methods: Iran, with 28 provinces, is located at latitude 32° 00' N and longitude 53° 00' E. All Persian and English studies reporting HPV infection based on cervical specimens were selected by searching the PubMed, Magiran, Scopus, and Irandoc databases and the Google Scholar search engine. Sample sizes and event rates were used to compute the overall event rates and 95% confidence intervals (95% C.I); the fixed or random effects model and heterogeneity indices, including the Q-statistic (p-value) and the degree of heterogeneity (I2), were reported. The search was conducted up to February 29, 2022. Comprehensive Meta-analysis 2.2.064 and ArcGIS 10.8.2 were used, at a significance level of <0.05. Results: The meta-analysis included nineteen studies with 258839 participants. The national meta-analysis gave a total HPV prevalence of 0.025 (95% C.I 0.016, 0.039); the prevalences of HPV16, HPV18, and HPV other genotypes were 0.032 (95% C.I 0.019, 0.051), 0.028 (95% C.I 0.019, 0.040), and 0.048 (95% C.I 0.033, 0.069), respectively. The provincial meta-analysis showed that total HPV prevalence was highest in Zanjan and Kerman (0.323 and 0.240, respectively); HPV16 prevalence was highest in Boushehr and Khozestan (0.298 and 0.253, respectively); HPV18 prevalence was highest in Tehran (0.089); and the prevalence of HPV other genotypes was highest in Khozestan (0.542). Conclusion: These results can help policymakers and health managers prioritize further implementation of screening strategies and health services in needier areas such as Zanjan, Kerman, Khozestan, and Tehran.
{"title":"The Prevalence of Human Papilloma Virus Infection and Its High Risk Genotypes among Healthy Women in 28 Provinces in Iran; A Systematic Review and Meta-Analysis","authors":"M. Akbarzadeh-Jahromi, Negar Taheri, Babak Dashtdar, Nasim Taheri, Fatemeh Abiri, Marjan Zare","doi":"10.18502/jbe.v9i2.14625","DOIUrl":"https://doi.org/10.18502/jbe.v9i2.14625","url":null,"PeriodicalId":34310,"journal":{"name":"Journal of Biostatistics and Epidemiology","volume":"124 31","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-01-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139391512","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
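The HPV meta-analysis above pools event rates and reports Q and I2. The sketch below shows one common way such quantities can be computed, assuming inverse-variance fixed-effect pooling on the logit scale; the abstract does not state which transformation Comprehensive Meta-analysis applied, so this is an illustration rather than the authors' procedure, and `pool_logit_fixed` is a hypothetical name.

```python
import math


def pool_logit_fixed(events, sizes):
    """Inverse-variance fixed-effect pooling of proportions on the logit scale.

    Requires 0 < events[i] < sizes[i] for every study.
    Returns (pooled proportion, Cochran's Q, I^2 in percent).
    """
    logits, weights = [], []
    for x, n in zip(events, sizes):
        p = x / n
        logits.append(math.log(p / (1 - p)))
        # Var(logit p) is approximately 1/x + 1/(n - x); the weight is its inverse.
        weights.append(1.0 / (1.0 / x + 1.0 / (n - x)))

    pooled_logit = sum(w * y for w, y in zip(weights, logits)) / sum(weights)
    q = sum(w * (y - pooled_logit) ** 2 for w, y in zip(weights, logits))
    df = len(events) - 1
    # I^2 = max(0, (Q - df) / Q), expressed as a percentage.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    pooled = 1.0 / (1.0 + math.exp(-pooled_logit))
    return pooled, q, i2


if __name__ == "__main__":
    # Hypothetical study counts, for illustration only.
    print(pool_logit_fixed([5, 50], [100, 100]))
```

When I2 is large, as in the second hypothetical call, a random effects model is usually preferred over the fixed-effect pooling shown here.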
Parisa Rezanejad-Asl, Farid Zayeri, Abbas Hajifathali
Introduction: The mixed effects logistic regression model is commonly used to analyse correlated binary data such as longitudinal data. The between-subject and within-subject variances are typically assumed to be homogeneous, but longitudinal data often show heterogeneity in these variances. This study proposes a Bayesian mixed effects location scale model to accommodate heteroscedasticity in binary data analysis. Methods: This study was carried out in two stages: first, a simulation study was used to evaluate the accuracy of the proposed model under the Bayesian approach, and then the proposed model was applied to real data. In the simulation study, the data were generated from the mixed effects location scale model with different correlations between the random location effect and the random scale effect and with different sample sizes. To evaluate the accuracy of the estimates, the root mean square error, bias and coverage probability were calculated, and the deviance information criterion was used to select the appropriate model. Finally, we used this model to analyse uric acid (UA) levels of patients with haematological disorders. Results: The simulation results show that the model parameters, as well as the correlation between the random location and scale effects, are estimated accurately. They also show that if a random scale effect is present in the data, it should be accounted for in the model. Otherwise, the model is forced to assign the within-subject variation due to these subject random effects to the error term. The results from the real data are in line with this. The odds of having normal UA levels increase by 26% per week. Because the covariance parameter is positive, patients with a higher mean UA level show higher variation in UA levels. Furthermore, the significance of the covariates in the between-subject and within-subject variance models, as well as the significance of the random scale variance, indicates heterogeneity across subjects. 
Conclusion: The Bayesian mixed effects location scale model provides a useful tool for analysing correlated binary data with heteroscedasticity because it accounts for data correlation and models the mean and variance simultaneously. Furthermore, it improves the accuracy of statistical inference in longitudinal studies compared to classic mixed effects models.
{"title":"Addressing Heteroscedasticity in Correlated Binary Data: A Bayesian Mixed Effects Location Scale Approach","authors":"Parisa Rezanejad-Asl, Farid Zayeri, Abbas Hajifathali","doi":"10.18502/jbe.v9i2.14628","DOIUrl":"https://doi.org/10.18502/jbe.v9i2.14628","url":null,"PeriodicalId":34310,"journal":{"name":"Journal of Biostatistics and Epidemiology","volume":"11 3","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-01-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139452513","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
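As a worked reading of the 26% per week figure in the abstract above, assume that time in weeks enters the logit-link mean model linearly; the symbol \beta_{\text{week}} is introduced here for illustration and does not appear in the abstract.

```latex
\[
\frac{\operatorname{odds}(t+1)}{\operatorname{odds}(t)} = e^{\beta_{\text{week}}} = 1.26
\quad\Longrightarrow\quad
\beta_{\text{week}} = \ln(1.26) \approx 0.231 ,
\qquad
\frac{\operatorname{odds}(t+4)}{\operatorname{odds}(t)} = 1.26^{4} \approx 2.52 .
\]
```

Under that assumption, the odds of a normal UA level would be roughly two and a half times higher after four weeks, holding the other covariates and the random effects fixed.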
Gideon Addo, P. Ossei, Bismark Amponsah Yeboah, W. Ayibor, Raphael Doh-Nani, Seidu Mohammed, Michael Obuobi, Roselyn Assor Appau
Introduction: Hospital length of stay (LOS) remains a vital metric for assessing patient outcomes and healthcare resource utilization. Given the substantial financial impact of diagnosing and treating colorectal anomalies, coupled with an increased susceptibility to postoperative complications, it is crucial to understand the factors affecting LOS following colorectal surgery. Our primary objective was to investigate the preoperative, intraoperative, and postoperative risk factors that substantially influence LOS following a colorectal procedure. Methods: This study analyzed data from a retrospective study of adults who underwent various colorectal surgeries (colostomy, ileostomy, small bowel resection, etc.) at Cleveland Clinic Foundation (January 2005 - December 2014). Predictor variables were categorized into preoperative (patient demographics, medical history, comorbidities, lifestyle factors), intraoperative, and postoperative factors. LOS was grouped into short-term (SLOS) (≤ 7 days), medium-term (MLOS) (8-30 days), and long-term (LLOS) (> 30 days) stays. Multinomial logistic regression models assessed predictor effects on LOS. Results: Among the 7874 patients, 50.7% were female, and the minimum age was 20 years. SLOS was observed in 61.1%, MLOS in 37.6%, and LLOS in 1.3% of patients. Advanced age correlated with prolonged LOS, possibly due to age-related health challenges such as a weak immune system. Coagulopathy and fluid and electrolyte disorders raised MLOS and LLOS risk, likely due to complications such as significant bleeding and electrolyte imbalances. Surgery duration predicted longer LOS, elevating LLOS and MLOS by 52% and 42%, respectively. Postoperative infections were associated with extended stays, possibly due to subsequent interventions, monitoring and recovery delays. Conclusion: Our study revealed that key preoperative predictors of LOS included age, coagulopathy, fluid and electrolyte disorders, severe weight loss, and drug abuse. 
Notably, intraoperative factors such as surgical approach (open vs laparoscopic) and surgery duration, alongside postoperative complications including superficial and serious infections, significantly influenced LOS. By incorporating these insights into the preoperative planning, clinicians could potentially develop tailored interventions to mitigate risk factors and enhance postoperative recovery, thus potentially reducing LOS and improving patient outcomes.
{"title":"Determinants of Hospital Stay Duration Post-Colorectal Surgery","authors":"Gideon Addo, P. Ossei, Bismark Amponsah Yeboah, W. Ayibor, Raphael Doh-Nani, Seidu Mohammed, Michael Obuobi, Roselyn Assor Appau","doi":"10.18502/jbe.v9i2.14627","DOIUrl":"https://doi.org/10.18502/jbe.v9i2.14627","url":null,"PeriodicalId":34310,"journal":{"name":"Journal of Biostatistics and Epidemiology","volume":"102 19","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-01-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139391431","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Introduction: The HIV Sentinel Surveillance (HSS) conducted by the National AIDS Control Organization (NACO) is the predominant data source for HIV estimations in India. While the HSS targets the key populations at risk of HIV infection, the National Family Health Survey (NFHS) measures community-based HIV prevalence. Improved HIV estimates in India have been attributed to the HIV prevalence data obtained from the NACO-HSS and NFHS. Methods: Bayesian analysis was performed to determine the state-level prevalence of HIV among females in seven South Indian States. The analysis involved plotting the prior, likelihood, and posterior distributions to allow a visual assessment of the data. The HIV prevalence among females calculated from the NFHS (2015-16) survey data was used for the prior distributions. HIV prevalence among pregnant women obtained from the HIV Sentinel Surveillance 2019 was used for the likelihood. Bayesian analysis was performed in R (RStudio 2022.02.0). A posterior probability distribution was obtained from the prior distribution and the likelihood by applying Bayes' theorem. Graphical representation was achieved through R's plotting functions. Kerala and Pondicherry were not included in the analysis due to zero or very low prevalence reported in both NFHS and HSS. Results: The Bayesian estimates of HIV prevalence among females were 0.38% [95% CI: 0.29 - 0.47] in Andhra Pradesh, 0.28% [95% CI: 0.23 - 0.35] in Karnataka, 0.27% [95% CI: 0.20 - 0.34] in Odisha, 0.27% [95% CI: 0.19 - 0.36] in Telangana and 0.19% [95% CI: 0.15 - 0.24] in Tamil Nadu. Conclusion: Bayesian techniques offer a flexible and robust approach to modelling and analysing HIV-related data.
{"title":"Estimation of HIV Prevalence among the Female Population in South India: A Bayesian Approach","authors":"Elangovan Arumugum, Vasna Joshua","doi":"10.18502/jbe.v9i2.14624","DOIUrl":"https://doi.org/10.18502/jbe.v9i2.14624","url":null,"PeriodicalId":34310,"journal":{"name":"Journal of Biostatistics and Epidemiology","volume":"84 5","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-01-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139390502","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
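The HIV prevalence abstract above combines a prior from one survey with a likelihood from another through Bayes' theorem. A minimal sketch of that update follows, assuming a Beta prior and a binomial likelihood (a conjugate pair; the abstract does not name the distributions used) and a simulation-based credible interval. All counts and function names below are hypothetical; the study itself used R.

```python
import random


def beta_binomial_posterior(prior_a, prior_b, positives, tested):
    """Conjugate update: Beta(a, b) prior plus binomial data gives
    a Beta(a + positives, b + tested - positives) posterior."""
    return prior_a + positives, prior_b + (tested - positives)


def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)


def credible_interval(a, b, level=0.95, draws=20000, seed=1):
    """Equal-tailed credible interval approximated by Monte Carlo draws."""
    rng = random.Random(seed)
    samples = sorted(rng.betavariate(a, b) for _ in range(draws))
    lower = samples[int((1 - level) / 2 * draws)]
    upper = samples[int((1 + level) / 2 * draws) - 1]
    return lower, upper


if __name__ == "__main__":
    # Hypothetical numbers: a prior centred near 0.3% prevalence and
    # 20 positives among 8000 women tested in surveillance.
    a, b = beta_binomial_posterior(3, 997, positives=20, tested=8000)
    print("posterior:", (a, b))
    print("posterior mean (%):", 100 * beta_mean(a, b))
    print("95% interval (%):", tuple(100 * v for v in credible_interval(a, b)))
```

The posterior mean sits between the prior mean and the observed proportion, pulled toward whichever source carries more effective sample size.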
Introduction: Nutrition Clinical Trials (NCTs) are pivotal in establishing causal links between nutritional interventions and chronic diseases. This review comprehensively examines prevalent clinical trial designs, emphasizing their strengths and limitations. The goal is to provide insights into the selection and optimization of these designs for dietary intervention studies. Methods: Various study designs in NCTs are explored, including quasi-experimental designs, double-blind randomized placebo-controlled trials for nutrient/functional foods supplementation, community-based lifestyle interventions, pragmatic nutrition interventions, and field trial projects. The characteristics, advantages, and challenges of each design are discussed. Real examples are presented to illustrate how these designs can be tailored and optimized for dietary intervention studies. Results: Parallel randomized clinical trials are acknowledged as the gold standard, despite requiring substantial sample sizes and having inherent limitations. Cross-over NCTs emerge as valuable for assessing temporary treatment effects while mitigating potential confounders and interpatient variability. However, they may not be suitable for acute diseases and progressive disorders, and attrition rates can be higher. Multi-arm randomized designs offer increased study power with a smaller sample size but necessitate more intricate design, analysis, and result reporting. Conclusion: Each study design in NCTs comes with its own set of strengths and limitations. The selection of an appropriate design should consider determinants and common considerations to provide robust evidence for establishing cause-and-effect associations or assessing the safety and efficacy of food products in nutrition research. This comprehensive understanding aids researchers in making informed choices when planning and conducting nutrition clinical trials.
{"title":"Common Study Designs of Nutrition Clinical Trials: Review of the Basic Elements and the Pros and Cons","authors":"P. Mirmiran, H. Malmir, Z. Bahadoran","doi":"10.18502/jbe.v9i2.14623","DOIUrl":"https://doi.org/10.18502/jbe.v9i2.14623","url":null,"PeriodicalId":34310,"journal":{"name":"Journal of Biostatistics and Epidemiology","volume":"114 48","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-01-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139390883","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Ahmad Reza Baghestani, Farid Zayeri, Mojtaba Meshkat
Introduction: A cure rate survival model was developed under the assumption that the number of competing causes of the event of interest follows the Geometric distribution and the time to the event of interest follows the Generalized Birnbaum–Saunders distribution.
Methods: The Geometric Generalized Birnbaum–Saunders distribution was defined, and two useful representations of its density function were given, from which several mathematical properties were derived. Furthermore, the parameters of the cure rate model were estimated by the maximum likelihood method.
Results: Several simulations were performed for different sample sizes and censoring percentages, and a real data set from the medical area was analyzed. In the melanoma data set, the AIC and SBC selection criteria favored the Geometric Generalized Birnbaum–Saunders distribution model, which was selected as the appropriate model in the present study.
Conclusion: The Geometric Generalized Birnbaum–Saunders distribution is a highly flexible lifetime model that allows for different degrees of kurtosis and asymmetry. Given these advantages, the model can be used as an appropriate alternative to explain or predict survival time in data with long-term survivors.
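The cure rate construction described in this abstract can be sketched as a short derivation. The parametrization below (a geometric number of causes N starting at zero with parameter \theta, and a generic baseline survival function S(t) standing in for the Generalized Birnbaum–Saunders survival function) is an assumption made for illustration and may differ from the one used in the paper.

```latex
% Assumed setup: N competing causes with P(N = n) = (1 - \theta)\,\theta^{n},
% n = 0, 1, 2, \dots, 0 < \theta < 1, and latent times Z_1, \dots, Z_N i.i.d.
% with survival function S(t). The observed time is T = \min(Z_1, \dots, Z_N),
% with T = \infty when N = 0 (a cured individual).
\begin{align}
S_{\mathrm{pop}}(t) = P(T > t)
  &= \sum_{n=0}^{\infty} (1 - \theta)\,\theta^{n}\, S(t)^{n}
   = \frac{1 - \theta}{1 - \theta\, S(t)}, \\
p_0 = \lim_{t \to \infty} S_{\mathrm{pop}}(t) &= 1 - \theta .
\end{align}
```

Under this assumed parametrization the cure fraction p_0 is the probability that no competing cause is present, which is why the population survival curve levels off above zero.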
{"title":"The Geometric Generalized Birnbaum–Saunders model with long-Term Survivors","authors":"Ahmad Reza Baghestani, Farid Zayeri, Mojtaba Meshkat","doi":"10.18502/jbe.v9i1.13976","DOIUrl":"https://doi.org/10.18502/jbe.v9i1.13976","journal":{"name":"Journal of Biostatistics and Epidemiology"},"publicationDate":"2023-10-31","publicationTypes":"Journal Article"}
Zainab M. Al-Balushi, Amadou Sarr, M Mazharul Islam
Introduction: Little attention has been paid to modeling count data with the geometric distribution. There are many real-life phenomena with a constant probability of first success. However, in practice, the probability of the first success may vary, making simple geometric models unsuitable for modeling such data. One can assume one of many continuous distributions with parameter space [0, 1] for the probability of first success. In this respect, the Beta distribution defined on the standard unit interval [0, 1] is the most useful because it can accommodate a wide range of shapes. Thus, in this paper, by mixing the Beta and geometric distributions, we developed a Beta-geometric distribution for modeling count data and applied it to real-life count data on time to the first antenatal care (ANC) visit.
Methods: Estimation of the distribution parameters by the method of moments, maximum likelihood estimation (MLE), and a Bayesian approach is provided. Based on the Beta-geometric distribution, we developed a new Beta-geometric regression model for analyzing count data that follow the geometric distribution. The goodness of fit of the derived model was tested using real data on time to the first ANC visit.
Results: The Beta-geometric distribution has a simple form for its probability mass function (pmf) and is flexible in capturing both underdispersion and overdispersion that may be present in count data. The proposed Beta-geometric regression model fit the count data on the first ANC visit better than the simple geometric or Negative Binomial distributions.
Conclusion: Unlike the Poisson or Negative Binomial distribution, the Beta-geometric distribution does not need an additional parameter to accommodate underdispersion or overdispersion and thus could be a flexible choice for analyzing any count data. The goodness-of-fit test shows that the Beta-geometric model fits the real data on time to first ANC visit better than the geometric or Negative Binomial models.
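The mixing step described above has a closed form, which the following minimal sketch evaluates. It assumes the geometric variable counts trials up to and including the first success (support x = 1, 2, ...) and that the success probability has a Beta(alpha, beta) distribution; the function names are illustrative and are not taken from the paper.

```python
from math import exp, lgamma


def log_beta(a, b):
    """Natural log of the Beta function B(a, b), via log-gamma for stability."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def beta_geometric_pmf(x, alpha, beta):
    """P(X = x) for x = 1, 2, ... (assumed support).

    Integrating p * (1 - p)**(x - 1) against a Beta(alpha, beta) density
    gives B(alpha + 1, beta + x - 1) / B(alpha, beta).
    """
    if x < 1:
        return 0.0
    return exp(log_beta(alpha + 1, beta + x - 1) - log_beta(alpha, beta))


def beta_geometric_sf(x, alpha, beta):
    """P(X > x) = E[(1 - p)**x] = B(alpha, beta + x) / B(alpha, beta)."""
    return exp(log_beta(alpha, beta + x) - log_beta(alpha, beta))
```

With these definitions, P(X = 1) reduces to alpha / (alpha + beta), the mean of the mixing Beta distribution, which is a quick sanity check on any implementation.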
{"title":"Beta-Geometric Regression for Modeling Count Data on First Antenatal Care Visit (ANC) with Application","authors":"Zainab M. Al-Balushi, Amadou Sarr, M Mazharul Islam","doi":"10.18502/jbe.v9i1.13977","DOIUrl":"https://doi.org/10.18502/jbe.v9i1.13977","journal":{"name":"Journal of Biostatistics and Epidemiology"},"publicationDate":"2023-10-31","publicationTypes":"Journal Article"}
Mehrdad Bagherpour-kalo, Parvaneh Darabi, Ali Moghadas Jafari, Hamid Najafimehr, Kamal Azam, Mostafa Hosseini
Introduction: Restless legs syndrome (RLS) is a common sensorimotor sleep disorder, and rheumatoid arthritis (RA) is an inflammatory autoimmune disease that causes disability. Previous studies showed that the prevalence of RLS varies across RA populations (13.2% to 68.4%). This raises the need for a pooled meta-analysis to determine a more reliable estimate. Therefore, we aimed to perform a meta-analysis to assess the pooled prevalence of RLS in RA patients.
Methods: The meta-analysis was performed according to the PRISMA checklist. The Embase, MEDLINE, Ovid, Web-of-Science, and Scopus databases were used for the systematic search, and eligible studies were analyzed using R version 4.0.3. For further review, we performed sensitivity analyses to identify influential studies.
Results: Of a total of 763 studies, 11 studies (3 from Europe, 4 from North America, and 4 from Asia) were suitable for synthesis. A total of 931 RA patients were identified, 300 of whom had symptoms of RLS. The pooled prevalence of RLS among people with RA from the 11 studies was 34% (95% CI: 26-43%). The pooled prevalence of RLS in Europe, Asia, and North America was 48% (95% CI: 32-65%), 32% (95% CI: 18-45%), and 28% (95% CI: 15-42%), respectively. RLS prevalence was markedly higher in women with RA (32%; 95% CI: 23-41%) than in men with RA (3%; 95% CI: 2-5%).
Conclusion: This systematic review and meta-analysis indicates that the pooled prevalence of RLS in RA patients was 34% and that female patients with RA were more prone to having RLS than male patients.
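For orientation, the crude (unweighted) proportion implied by the totals in the Results, 300 of 931 patients, can be computed as in the sketch below. This is not the authors' method: the reported 34% is a pooled meta-analytic estimate that weights the individual studies, so the two figures need not agree. The function name and the Wald interval are illustrative choices.

```python
from math import sqrt


def crude_prevalence(cases, total, z=1.96):
    """Unweighted proportion with a simple Wald confidence interval.

    Ignores between-study heterogeneity, so it is only a rough
    cross-check against a pooled meta-analytic estimate.
    """
    p = cases / total
    se = sqrt(p * (1 - p) / total)
    # Clip the interval to the valid range for a proportion.
    return p, max(0.0, p - z * se), min(1.0, p + z * se)


# Totals taken from the Results paragraph: 300 RLS cases among 931 RA patients.
p, lower, upper = crude_prevalence(300, 931)
print(f"crude prevalence: {p:.1%} (Wald 95% CI: {lower:.1%} to {upper:.1%})")
```

The crude interval is narrower than the pooled one because it treats all patients as a single sample, whereas a meta-analytic model also accounts for variation between studies.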
{"title":"Prevalence of Restless Legs Syndrome in Rheumatoid Arthritis: A Systematic Review and Meta-Analysis","authors":"Mehrdad Bagherpour-kalo, Parvaneh Darabi, Ali Moghadas Jafari, Hamid Najafimehr, Kamal Azam, Mostafa Hosseini","doi":"10.18502/jbe.v9i1.13971","DOIUrl":"https://doi.org/10.18502/jbe.v9i1.13971","journal":{"name":"Journal of Biostatistics and Epidemiology"},"publicationDate":"2023-10-31","publicationTypes":"Journal Article"}
Introduction: Bagging (BG) and random forest (RF) are well-known supervised statistical learning methods based on classification and regression trees. BG and RF can handle different types of responses, such as categorical and continuous ones. In many statistical applications, the data are curves, time series, functional data, or observations that are related to each other through their domain. RF methods have been extended in the literature to some cases with functional data as covariates or responses. Among these extensions, random splitting summarizes functional data into multiple related summary statistics, such as the average.
Methods: This article extends the random-splitting method and introduces the mixed-data BG (MD-BG) and RF (MD-RF) algorithms for multiple functional and non-functional (mixed or hybrid) covariates, and it calculates the variable importance plot (VIP) for each covariate.
Results: The main difference between MD-BG and MD-RF lies in how covariates are chosen: in MD-BG all covariates remain in the model, whereas MD-RF uses a random sample of covariates. MD-RF helps to unmask the most important parts of functional covariates and the most important non-functional covariates.
Conclusion: We apply our methods to two datasets, DTI and Tecator, and compare their performance for continuous and categorical responses using the R package we developed (“RSRF”), available on GitHub.
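The random-splitting idea mentioned in the Introduction, summarizing a functional covariate by statistics computed over randomly placed intervals of its domain, can be illustrated with the sketch below. It is one plausible reading of the technique, not the algorithm implemented in the RSRF package; the function name, the use of the interval mean, and the uniform choice of cut points are all assumptions.

```python
import random
from statistics import mean


def random_split_means(curve, n_intervals, rng):
    """Summarize one discretized curve by its mean over random intervals.

    Returns a list of (start, stop, interval_mean) tuples that partition
    the index range [0, len(curve)).
    """
    if not 1 <= n_intervals <= len(curve):
        raise ValueError("n_intervals must be between 1 and len(curve)")
    # Draw distinct interior cut points so that every interval is non-empty.
    cuts = sorted(rng.sample(range(1, len(curve)), n_intervals - 1))
    bounds = [0] + cuts + [len(curve)]
    return [(lo, hi, mean(curve[lo:hi])) for lo, hi in zip(bounds, bounds[1:])]


# Hypothetical curve observed at 10 grid points, split into 4 random intervals.
curve = [float(i) for i in range(10)]
for start, stop, avg in random_split_means(curve, 4, random.Random(0)):
    print(f"grid points [{start}, {stop}): mean = {avg:.2f}")
```

Each interval mean can then serve as an ordinary scalar covariate in a tree-based learner, alongside any non-functional covariates, and repeating the random split across trees lets variable importance point to the regions of the curve that matter most.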
{"title":"Random-Splitting Random Forest with Multiple Mixed-Data Covariates","authors":"Mohammad Fayaz, Alireza Abadi, Soheila Khodakarim","doi":"10.18502/jbe.v9i1.13974","DOIUrl":"https://doi.org/10.18502/jbe.v9i1.13974","journal":{"name":"Journal of Biostatistics and Epidemiology"},"publicationDate":"2023-10-31","publicationTypes":"Journal Article"}