Inference on overlap index: with an application to cancer data.
Raju Dey, Arne C Bathke, Somesh Kumar
Pub Date: 2025-09-05 | eCollection Date: 2025-11-01 | DOI: 10.1515/ijb-2024-0106 | Pages: 357-383
The quantification of overlap between two distributions has applications in various fields of biological, medical, genetic, and ecological research. In this article, new overlap and containment indices are considered for quantifying the niche overlap between two species or populations. Some new properties of these indices are established, and the problem of estimation is studied when the two distributions are exponential with different scale parameters. We propose several estimators and compare their relative performance with respect to different loss functions. The asymptotic normality of the maximum likelihood estimators of these indices is proved under certain conditions. We also obtain confidence intervals for the indices based on three different approaches and compare their average lengths and coverage probabilities. The point and confidence interval procedures developed here are applied to a breast cancer data set to analyze the similarity between the survival times of patients undergoing two different types of surgery. The similarity between the relapse-free times of these two sets of patients is also studied.
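The abstract does not define the new indices, so as a hedged illustration, the Python sketch below computes the classical Weitzman overlap coefficient (the integral of the pointwise minimum of two exponential densities) together with its maximum likelihood plug-in estimate; the closed form, simulated data, and parameter values are assumptions rather than the paper's indices.

```python
import numpy as np

def overlap_exponential(lam1, lam2):
    """Weitzman overlap OVL = integral of min(f1, f2) for Exp(lam1), Exp(lam2)."""
    if np.isclose(lam1, lam2):
        return 1.0
    lo, hi = min(lam1, lam2), max(lam1, lam2)
    x_star = np.log(hi / lo) / (hi - lo)          # densities cross here
    # min is the low-rate density before the crossing, the high-rate one after
    return (1.0 - np.exp(-lo * x_star)) + np.exp(-hi * x_star)

rng = np.random.default_rng(42)
x = rng.exponential(scale=10.0, size=50)          # group 1, scale = 1/rate
y = rng.exponential(scale=14.0, size=50)          # group 2
lam1_hat, lam2_hat = 1.0 / x.mean(), 1.0 / y.mean()  # MLEs of the two rates
print(f"plug-in ML estimate of OVL: {overlap_exponential(lam1_hat, lam2_hat):.3f}")
```

The plug-in estimator inherits asymptotic normality from the rate MLEs via the delta method, presumably the route the abstract's normality result formalizes for the new indices.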
Forecasting mortality rates in hyponatremia: a statistical approach using Holt-Winters models.
Rawiyah Muneer Alraddadi, Mohamed Abd Allah El-Hadidy, Qin Shao, Qu Xianggui, Sadik Khuder
Pub Date: 2025-09-02 | eCollection Date: 2025-11-01 | DOI: 10.1515/ijb-2024-0075 | Pages: 463-471
Hyponatremia, characterized by a serum sodium concentration below 135 mEq/L, is a prevalent electrolyte imbalance associated with increased morbidity and mortality across various clinical conditions. This study employs the Holt-Winters seasonal method, a robust time series forecasting model, to predict mortality rates attributed to hyponatremia. Leveraging retrospective mortality data from a cohort of hospitals in the United States, our analysis aims to elucidate temporal patterns and trends in hyponatremia-related deaths. The findings underscore the critical role of statistical forecasting in healthcare, facilitating proactive resource allocation and targeted interventions to mitigate mortality risks associated with electrolyte imbalances. Integrating predictive analytics into clinical practice holds promise for enhancing patient care and optimizing health outcomes in populations vulnerable to hyponatremia-related complications.
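As a rough sketch of the forecasting step, the following fits an additive Holt-Winters model with statsmodels to an invented monthly mortality-rate series; the seasonal period, trend specification, and data are assumptions, not the study's.

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.holtwinters import ExponentialSmoothing

# Synthetic monthly hyponatremia mortality rate (per 100,000): trend + winter peak.
rng = np.random.default_rng(0)
idx = pd.date_range("2015-01-01", periods=96, freq="MS")
t = np.arange(96)
rates = 5 + 0.02 * t + 1.5 * np.cos(2 * np.pi * t / 12) + rng.normal(0, 0.3, 96)
series = pd.Series(rates, index=idx)

# Additive Holt-Winters: level + trend + 12-month seasonality.
model = ExponentialSmoothing(series, trend="add", seasonal="add",
                             seasonal_periods=12).fit()
print(model.forecast(12))   # next year's predicted monthly rates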
Regression analysis of interval-censored failure time data under semiparametric transformation models with missing covariates.
Yichen Lou, Mingyue Du
Pub Date: 2025-08-29 | eCollection Date: 2025-11-01 | DOI: 10.1515/ijb-2024-0016 | Pages: 321-337
This paper discusses regression analysis of interval-censored failure time data arising from semiparametric transformation models in the presence of covariates that are missing at random (MAR). We define a specific formulation of the MAR mechanism tailored to interval censoring, where the timing of observation adds complexity to handling missing covariates. To overcome the limitations and computational challenges of existing methods, we propose a multiple imputation procedure that can be easily implemented with standard software. The proposed method makes use of two predictive scores for each individual and the distance defined by these scores. Furthermore, it utilizes partial information from incomplete observations and thus yields more efficient estimators than the complete-case analysis and the inverse probability weighting approach. An extensive simulation study is conducted to assess the performance of the proposed method and indicates that it performs well in practical situations. Finally, we apply the proposed approach to an Alzheimer's disease study that motivated this work.
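The abstract leaves the two predictive scores unspecified; the sketch below shows one plausible reading, nearest-neighbour hot-deck multiple imputation in a two-dimensional score space, with simulated data, simplified (MCAR-style) missingness, and scores that are illustrative stand-ins for the paper's construction.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
z = rng.normal(size=n)                        # fully observed covariate
x = 0.5 * z + rng.normal(size=n)              # covariate subject to missingness
surrogate = np.column_stack([z, rng.normal(size=n)])  # stand-in for censoring info
miss = rng.random(n) < 0.3                    # simplified missingness indicator
x_obs = np.where(miss, np.nan, x)

complete = ~np.isnan(x_obs)
# Two illustrative predictive scores per subject: fitted values from simple
# regressions of the missing covariate and of a surrogate on observed data.
b1 = np.polyfit(z[complete], x_obs[complete], 1)
score1 = np.polyval(b1, z)
score2 = surrogate @ np.linalg.lstsq(surrogate[complete],
                                     x_obs[complete], rcond=None)[0]
scores = np.column_stack([score1, score2])

def impute_once(k=5, rng=rng):
    x_imp = x_obs.copy()
    donors = np.flatnonzero(complete)
    for i in np.flatnonzero(~complete):
        d = np.linalg.norm(scores[donors] - scores[i], axis=1)
        pool = donors[np.argsort(d)[:k]]       # k nearest complete cases
        x_imp[i] = x_obs[rng.choice(pool)]     # random draw from the donor pool
    return x_imp

imputations = [impute_once() for _ in range(10)]   # 10 completed data sets
```

Each completed data set would then be analyzed with the transformation model, with estimates combined by Rubin's rules.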
Penalized regression splines in Mixture Density Networks.
Quentin Edward Seifert, Anton Thielmann, Elisabeth Bergherr, Benjamin Säfken, Jakob Zierk, Manfred Rauh, Tobias Hepp
Pub Date: 2025-06-05 | eCollection Date: 2025-05-01 | DOI: 10.1515/ijb-2023-0134 | Pages: 239-253
Mixture Density Networks (MDNs) belong to a class of models that can be applied to data that cannot be sufficiently described by a single distribution because the data originate from different components of the main unit and therefore need to be described by a mixture of densities. In some situations, MDNs may have problems with the proper identification of the latent components. While these identification issues can to some extent be contained by using custom initialization strategies for the network weights, this solution is still less than ideal since it involves subjective choices. We therefore suggest replacing the hidden layers between the model input and the output parameter vector of MDNs with penalized cubic regression splines and estimating the respective distributional parameters with these splines. In experiments on simulated data from both Gaussian and Gamma mixture distributions, motivated by an application to indirect reference interval estimation, this approach drastically improved identification performance, with all splines reliably converging to their true parameter values.
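A toy version of the proposal, assuming a two-component Gaussian mixture in which only the component means are penalized cubic splines of the input while the mixing weight and scales are constant; the truncated-power basis, ridge penalty, and data are illustrative simplifications of the paper's setup.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(2)
n = 400
x = rng.uniform(0, 1, n)
comp = rng.random(n) < 0.4
y = np.where(comp, np.sin(2 * np.pi * x) + rng.normal(0, .1, n),
                   2 + x + rng.normal(0, .2, n))

# Cubic truncated-power spline basis; the knot terms carry the wiggliness.
knots = np.linspace(0.1, 0.9, 8)
def basis(x):
    cols = [np.ones_like(x), x, x**2, x**3]
    cols += [np.clip(x - k, 0, None)**3 for k in knots]
    return np.column_stack(cols)
B = basis(x)
p = B.shape[1]

def pen_negloglik(theta, lam=1.0):
    b1, b2 = theta[:p], theta[p:2 * p]          # spline coefs for the two means
    log_s1, log_s2, logit_pi = theta[2 * p:]
    pi = 1 / (1 + np.exp(-logit_pi))
    d1 = norm.pdf(y, B @ b1, np.exp(log_s1))
    d2 = norm.pdf(y, B @ b2, np.exp(log_s2))
    pen = lam * (np.sum(b1[4:]**2) + np.sum(b2[4:]**2))  # ridge on knot coefs
    return -np.sum(np.log(pi * d1 + (1 - pi) * d2 + 1e-300)) + pen

theta0 = np.concatenate([rng.normal(0, .1, 2 * p), [np.log(.5), np.log(.5), 0.]])
fit = minimize(pen_negloglik, theta0, method="L-BFGS-B")
print("converged:", fit.success)
```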
Early completion based on adjacent dose information for model-assisted designs to accelerate maximum tolerated dose finding.
Masahiro Kojima
Pub Date: 2025-06-03 | eCollection Date: 2025-11-01 | DOI: 10.1515/ijb-2023-0040 | Pages: 411-421
Phase I trials aim to identify the maximum tolerated dose (MTD) early and proceed quickly to an expansion cohort or a Phase II trial to assess the efficacy of the treatment. We present an early completion method based on multiple dosages (adjacent dose information) to accelerate the identification of the MTD in model-assisted designs. By using not only toxicity data for the current dose but also toxicity data for the next higher and lower doses, the MTD can be identified early without compromising accuracy. The early completion method is performed based on dose-assignment probabilities for multiple dosages, which are straightforward to calculate. We evaluated the early completion method using data from an actual clinical trial. In a simulation study, we evaluated the percentage of correct MTD selection and the impact of early completion on trial outcomes. The results indicate that our proposed early completion method maintains a high level of accuracy in MTD selection, with minimal reduction compared to the standard approach. In certain scenarios, the accuracy of MTD selection even improves under the early completion framework. We conclude that the use of this early completion method poses no issue when applied to model-assisted designs.
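The dose-assignment probabilities underlying such rules reduce to beta-binomial posterior computations; a minimal sketch, assuming a Beta(1,1) prior, a 0.30 toxicity target, an acceptable interval of (0.25, 0.35), and made-up counts at the current and adjacent doses.

```python
from scipy.stats import beta

target = 0.30                       # target toxicity probability (assumed)
lo, hi = 0.25, 0.35                 # acceptable toxicity interval (assumed)
n_tox = {2: 1, 3: 2, 4: 3}          # DLTs at doses below/at/above current (dose 3)
n_pat = {2: 6, 3: 9, 4: 6}          # patients treated at each of those doses

# Beta(1,1) prior + binomial likelihood -> Beta posterior at each dose;
# probability each dose's toxicity rate falls inside the target interval.
for d in sorted(n_pat):
    a, b = 1 + n_tox[d], 1 + n_pat[d] - n_tox[d]
    p_in = beta.cdf(hi, a, b) - beta.cdf(lo, a, b)
    print(f"dose {d}: P(toxicity in [{lo}, {hi}]) = {p_in:.3f}")
```

An early-completion rule of the kind described would stop the trial once these adjacent-dose posteriors jointly identify one dose as the MTD with sufficient probability.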
Efficiency for evaluation of disease etiologic heterogeneity in case-case and case-control studies.
Aya Kuchiba, Ran Gao, Molin Wang
Pub Date: 2025-05-30 | eCollection Date: 2025-11-01 | DOI: 10.1515/ijb-2023-0027 | Pages: 339-356 | Open access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12707193/pdf/
A disease of interest can often be classified into subtypes based on its various molecular or pathological characteristics. Recent epidemiological studies have increasingly provided evidence that some molecular subtypes of a disease may have distinct etiologies, by assessing whether the associations of a potential risk factor vary by disease subtype (i.e., etiologic heterogeneity). Case-control and case-case studies are popular study designs in molecular epidemiology, and both can be validly applied in studies of etiologic heterogeneity. This study compared the efficiency of etiologic heterogeneity parameter estimation between these two study designs through theoretical and numerical examination. In settings where the two designs include the same number of cases, the results showed that, compared with the case-case study, case-control studies always provide estimates of the heterogeneity parameters that are more efficient or at least equivalent in efficiency. In addition, we illustrated both approaches in a study evaluating the association between plasma free estradiol and breast cancer risk according to tumor estrogen and progesterone receptor status, whose results were originally obtained from case-control study data.
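In the unadjusted 2x2 setting the comparison can be made explicit: the shared-control terms cancel in the difference of subtype-specific log odds ratios, so the case-control heterogeneity estimate and its Woolf-type variance coincide with the case-case ones, which is the equivalent-efficiency boundary case. A sketch with invented counts:

```python
import numpy as np

# Exposed/unexposed counts among cases of two subtypes and shared controls
a1, b1 = 60, 40      # subtype 1 cases: exposed, unexposed
a2, b2 = 30, 70      # subtype 2 cases
c, d = 100, 100      # controls: exposed, unexposed

# Case-control: heterogeneity = difference of subtype-specific log ORs.
log_or1 = np.log(a1 * d / (b1 * c))
log_or2 = np.log(a2 * d / (b2 * c))
het_cc = log_or1 - log_or2
var_cc = 1/a1 + 1/b1 + 1/a2 + 1/b2   # control terms cancel in the difference

# Case-case: log OR of exposure comparing the two subtypes directly.
het_case = np.log(a1 * b2 / (b1 * a2))
var_case = 1/a1 + 1/b1 + 1/a2 + 1/b2

print(het_cc, het_case)   # identical point estimates
print(var_cc, var_case)   # identical variances in this unadjusted setting
```

The paper's efficiency gains for the case-control design arise beyond this saturated, unadjusted case.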
Weighted Euclidean balancing for a matrix exposure in estimating causal effect.
Juan Chen, Yingchun Zhou
Pub Date: 2025-05-23 | eCollection Date: 2025-05-01 | DOI: 10.1515/ijb-2024-0021 | Pages: 219-237
With the increasing complexity of data, researchers in various fields have become increasingly interested in estimating the causal effect of a matrix exposure, which involves complex multivariate treatments, on an outcome. Balancing covariates for the matrix exposure is essential to achieve this goal. While exact balancing and approximate balancing methods have been proposed for multiple balancing constraints, dealing with a matrix treatment introduces a large number of constraints, making it challenging to achieve exact balance or to select suitable threshold parameters for approximate balancing methods. To address this challenge, the weighted Euclidean balancing method is proposed, which offers an approximate balance of covariates from an overall perspective. In this study, both parametric and nonparametric methods for estimating the causal effect of a matrix treatment are proposed, along with theoretical properties of the two estimators. Extensive simulation results demonstrate that the proposed method outperforms alternative approaches across various scenarios. Finally, we apply the method to analyze the causal impact of omics variables on the drug sensitivity of Vandetanib. The results indicate that EGFR CNV has a significant positive causal effect on Vandetanib efficacy, whereas EGFR methylation exerts a significant negative causal effect.
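A simplified sketch of the balancing idea with a binary rather than matrix treatment: comparison-group weights minimize an overall Euclidean imbalance of covariate means instead of enforcing balance constraint by constraint. The ridge term, solver, and data are assumptions.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(3)
n, p = 150, 4
X = rng.normal(size=(n, p))                      # covariates
treated = rng.random(n) < 0.4
target = X[treated].mean(axis=0)                 # treated covariate means
Xc = X[~treated]
m = Xc.shape[0]

# Minimize the overall Euclidean imbalance of the weighted control means,
# with a small ridge term to keep the weights dispersed (approximate balance).
def objective(w, ridge=1e-2):
    imbalance = Xc.T @ w - target
    return imbalance @ imbalance + ridge * (w @ w)

cons = ({"type": "eq", "fun": lambda w: w.sum() - 1.0},)
res = minimize(objective, np.full(m, 1 / m), method="SLSQP",
               bounds=[(0, None)] * m, constraints=cons)
w = res.x
print("imbalance per covariate:", Xc.T @ w - target)
```

Trading exact balance for a single overall objective is what removes the need to tune one threshold per constraint, the difficulty the abstract highlights for matrix treatments.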
Guidance on individualized treatment rule estimation in high dimensions.
Philippe Boileau, Ning Leng, Sandrine Dudoit
Pub Date: 2025-05-22 | eCollection Date: 2025-05-01 | DOI: 10.1515/ijb-2024-0005 | Pages: 183-218
Individualized treatment rules, cornerstones of precision medicine, inform patient treatment decisions with the goal of optimizing patient outcomes. These rules are generally unknown functions of patients' pre-treatment covariates, meaning they must be estimated from clinical or observational study data. Myriad methods have been developed to learn these rules, and these procedures are demonstrably successful in traditional asymptotic settings with a moderate number of covariates. However, the finite-sample performance of these methods in high-dimensional covariate settings, which are increasingly the norm in modern clinical trials, has not been well characterized. We perform a comprehensive comparison of state-of-the-art individualized treatment rule estimators, assessing performance on the basis of the estimators' rule quality, interpretability, and computational efficiency. Sixteen data-generating processes with continuous outcomes and binary treatment assignments are considered, reflecting a diversity of randomized and observational studies. We summarize our findings and provide succinct advice to practitioners needing to estimate individualized treatment rules in high dimensions. Owing to individualized treatment rule estimators' poor interpretability, we propose a novel pre-treatment covariate filtering procedure based on recent work for uncovering treatment effect modifiers, and we show that it improves estimators' rule quality and interpretability. All code is made publicly available, facilitating modifications and extensions to our simulation study.
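One simple member of the estimator class being benchmarked can be sketched directly: an S-learner lasso with treatment-covariate interactions whose nonzero interaction coefficients double as a filter for treatment effect modifiers, in the spirit of, though not identical to, the paper's filtering procedure. All data below are simulated.

```python
import numpy as np
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(4)
n, p = 500, 200                                  # high-dimensional setting
X = rng.normal(size=(n, p))
A = rng.integers(0, 2, n)                        # randomized binary treatment
tau = 1.0 * X[:, 0] - 0.8 * X[:, 1]              # true effect modification
y = X[:, 2] + A * tau + rng.normal(size=n)

# S-learner with lasso: main effects plus treatment-covariate interactions;
# the fitted interaction coefficients estimate the CATE as a linear function.
design = np.hstack([X, A[:, None], A[:, None] * X])
fit = LassoCV(cv=5).fit(design, y)
gamma = fit.coef_[p + 1:]                        # interaction coefficients

cate = X @ gamma + fit.coef_[p]                  # estimated A=1 vs A=0 effect
rule = (cate > 0).astype(int)                    # treat when benefit is predicted

# Sparsity doubles as a filter for treatment effect modifiers.
modifiers = np.flatnonzero(gamma)
print("selected effect modifiers:", modifiers[:10])
```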
Risk estimation and boundary detection in Bayesian disease mapping.
Xueqing Yin, Craig Anderson, Duncan Lee, Gary Napier
Pub Date: 2025-05-22 | eCollection Date: 2025-05-01 | DOI: 10.1515/ijb-2023-0138 | Pages: 129-150
Bayesian hierarchical models with a spatially smooth conditional autoregressive prior distribution are commonly used to estimate the spatio-temporal pattern in disease risk from areal unit data. However, most modeling approaches do not take into consideration possible boundaries, that is, step changes in disease risk between geographically neighbouring areas, which may lead to oversmoothing of the risk surfaces, prevent the detection of high-risk areas, and yield biased estimates of disease risk. In this paper, we propose a two-stage method to jointly estimate the disease risk in small areas over time and detect the locations of boundaries that separate pairs of neighbouring areas exhibiting vastly different risks. In the first stage, we use a graph-based optimisation algorithm to construct a set of candidate neighbourhood matrices that represent a range of possible boundary structures for the disease data. In the second stage, a Bayesian hierarchical spatio-temporal model that takes the boundaries into account is fitted to the data. The performance of the methodology is demonstrated by simulation, and the method is then applied to a study of respiratory disease risk in Greater Glasgow, Scotland.
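In place of the paper's graph-based optimisation algorithm, a simple threshold rule illustrates the first stage in the sketch below: candidate neighbourhood matrices are produced by cutting adjacency edges across which the noisy log-risk gap is large. The grid, risk surface, and thresholds are invented.

```python
import numpy as np

rng = np.random.default_rng(5)
K = 25                                           # areas on a 5x5 grid
grid = np.arange(K).reshape(5, 5)
W = np.zeros((K, K), int)                        # rook adjacency matrix
for i in range(5):
    for j in range(5):
        for di, dj in [(0, 1), (1, 0)]:
            if i + di < 5 and j + dj < 5:
                a, b = grid[i, j], grid[i + di, j + dj]
                W[a, b] = W[b, a] = 1

risk = np.where(np.arange(K) % 5 < 2, 1.8, 1.0)  # step change in true risk
sir = risk * np.exp(rng.normal(0, 0.1, K))       # noisy standardized ratios

# Candidate neighbourhood matrices: cut edges whose log-risk gap exceeds a
# threshold, giving one candidate boundary structure per threshold value.
def candidate(threshold):
    Wc = W.copy()
    gaps = np.abs(np.subtract.outer(np.log(sir), np.log(sir)))
    Wc[(W == 1) & (gaps > threshold)] = 0
    return Wc

for thr in (0.2, 0.4, 0.6):
    removed = (W.sum() - candidate(thr).sum()) // 2
    print(f"threshold {thr}: {removed} boundary edges detected")
```

Each candidate matrix would then enter the second-stage spatio-temporal model, with the best-supported boundary structure chosen by model fit.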
An improved estimator of the logarithmic odds ratio for small sample sizes using a Bayesian approach.
Toru Ogura, Takemi Yanagimoto
Pub Date: 2025-04-30 | eCollection Date: 2025-05-01 | DOI: 10.1515/ijb-2024-0105 | Pages: 151-163
The logarithmic odds ratio is a well-known measure for comparing binary data between two independent groups. Although various methods have been proposed for estimating a logarithmic odds ratio, most estimate the two group proportions independently and then compute the logarithmic odds ratio from these estimates. When using a logarithmic odds ratio, researchers are more interested in the ratio itself than in the group proportions. Parameter estimation generally incurs random and systematic errors, and errors in the initially estimated parameters may propagate to parameters estimated later. We propose a Bayesian estimator that directly estimates the logarithmic odds ratio without estimating the proportions for each group. Many existing methods must estimate two parameters (the proportions in each group) to obtain the logarithmic odds ratio, whereas the proposed method estimates only one (the logarithmic odds ratio itself). Therefore, the proposed estimator can be closer to the population logarithmic odds ratio than existing estimators. The validity of the proposed estimator is verified through numerical calculations and applications.
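For contrast with the two-parameter route the abstract criticizes, the sketch below compares the Haldane-Anscombe corrected plug-in with a Monte Carlo posterior under independent Jeffreys priors on the two proportions; neither is the authors' one-parameter estimator, and the counts are made up.

```python
import numpy as np

rng = np.random.default_rng(6)
x1, n1 = 3, 12          # events / total, group 1 (small sample)
x2, n2 = 7, 11          # group 2

# Haldane-Anscombe corrected plug-in estimator for reference.
log_or_ha = np.log(((x1 + .5) * (n2 - x2 + .5)) /
                   ((n1 - x1 + .5) * (x2 + .5)))

# Monte Carlo posterior of the log odds ratio under independent
# Jeffreys Beta(0.5, 0.5) priors on the two proportions.
p1 = rng.beta(x1 + .5, n1 - x1 + .5, 100_000)
p2 = rng.beta(x2 + .5, n2 - x2 + .5, 100_000)
draws = np.log(p1 / (1 - p1)) - np.log(p2 / (1 - p2))

print(f"plug-in: {log_or_ha:.3f}, posterior mean: {draws.mean():.3f}")
print("95% credible interval:", np.percentile(draws, [2.5, 97.5]))
```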