International Journal of Biostatistics: Latest Articles

Leveraging external information by guided adaptive shrinkage to improve variable selection in high-dimensional regression settings.
IF 1.2; Zone 4 (Mathematics); Pub Date: 2025-09-08; eCollection Date: 2025-11-01; DOI: 10.1515/ijb-2024-0108; Pages: 271-283
Mark A van de Wiel, Wessel N van Wieringen

Variable selection is challenging for high-dimensional data, in particular when sample size is low. It is widely recognized that external information in the form of complementary data on the variables, 'co-data', may improve results. Examples are known variable groups or p-values from a related study. Such co-data are ubiquitous in genomics settings due to the availability of public repositories, and are likely equally relevant for other applications. Yet, the uptake of prediction methods that structurally use such co-data is limited. We review guided adaptive shrinkage methods: a class of regression-based learners that use co-data to adapt the shrinkage parameters, which are crucial for the performance of those learners. We discuss technical aspects, but also the applicability in terms of types of co-data that can be handled. This class of methods is contrasted with several others. In particular, group-adaptive shrinkage is compared with the better-known sparse group-lasso by evaluating variable selection. Moreover, we demonstrate the versatility of the guided shrinkage methodology by showing how to 'do-it-yourself': we integrate implementations of a co-data learner and the spike-and-slab prior for the purpose of improving variable selection in genetics studies. We conclude with a real data example.

Two-sample empirical likelihood method for right censored data.
IF 1.2; Zone 4 (Mathematics); Pub Date: 2025-09-05; eCollection Date: 2025-11-01; DOI: 10.1515/ijb-2024-0120; Pages: 299-319
Leonora Pahirko, Janis Valeinis, Deivids Jēkabsons

In this paper, a two-sample empirical likelihood method for right censored data is established. This method allows for comparisons between various functionals of survival distributions, such as mean lifetimes, survival probabilities at a fixed time, restricted mean survival times, and other parameters of interest. It is demonstrated that under some regularity conditions, the scaled empirical likelihood statistic converges to a chi-squared distributed random variable with one degree of freedom. A consistent estimator for the scaling constant is proposed, involving the jackknife estimator of the asymptotic variance of the Kaplan-Meier integral. A simulation study is carried out to investigate the coverage accuracy of confidence intervals. Finally, two real datasets are analyzed to illustrate the application of the proposed method.
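The functionals mentioned above (survival probability at a fixed time, restricted mean survival time) are typically computed from the Kaplan-Meier curve. The following is a minimal sketch of that building block only, not of the empirical likelihood procedure itself; the function names `kaplan_meier` and `restricted_mean` are illustrative choices, not taken from the paper.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier estimate as a list of (time, survival) pairs.

    events[i] is 1 for an observed event and 0 for right censoring.
    Only distinct event times produce a step in the curve.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all observations tied at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            surv *= 1.0 - deaths / n_at_risk
            curve.append((t, surv))
        n_at_risk -= removed
    return curve


def restricted_mean(curve, tau):
    """Area under the Kaplan-Meier step function on [0, tau]."""
    area = 0.0
    prev_t, prev_s = 0.0, 1.0
    for t, s in curve:
        if t >= tau:
            break
        area += prev_s * (t - prev_t)
        prev_t, prev_s = t, s
    return area + prev_s * (tau - prev_t)


if __name__ == "__main__":
    group_a = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
    group_b = kaplan_meier([2, 3, 5, 6], [1, 1, 0, 1])
    # A two-sample comparison targets a difference such as this one.
    print(restricted_mean(group_a, 4) - restricted_mean(group_b, 4))
```

A two-sample method would then build a confidence interval for such a difference; how the paper does this via the scaled empirical likelihood statistic is not reproduced here.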

Inference on overlap index: with an application to cancer data.
IF 1.2; Zone 4 (Mathematics); Pub Date: 2025-09-05; eCollection Date: 2025-11-01; DOI: 10.1515/ijb-2024-0106; Pages: 357-383
Raju Dey, Arne C Bathke, Somesh Kumar

The quantification of overlap between two distributions has applications in various fields of biological, medical, genetic, and ecological research. In this article, new overlap and containment indices are considered for quantifying the niche overlap between two species/populations. Some new properties of these indices are established and the problem of estimation is studied when the two distributions are exponential with different scale parameters. We propose several estimators and compare their relative performance with respect to different loss functions. The asymptotic normality of the maximum likelihood estimators of these indices is proved under certain conditions. We also obtain confidence intervals of the indices based on three different approaches and compare their average lengths and coverage probabilities. The point and confidence interval procedures developed here are applied to a breast cancer data set to analyze the similarity between the survival times of patients undergoing two different types of surgery. Additionally, the similarity between the relapse-free times of these two sets of patients is also studied.
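To make the idea of overlap between two exponential distributions concrete, here is a small sketch that computes the classical overlap coefficient, the integral of min(f1, f2). This is an assumption chosen for illustration: the abstract does not define the paper's new overlap and containment indices, and they may differ from this quantity. The function name `exponential_overlap` is hypothetical.

```python
import math


def exponential_overlap(scale1, scale2):
    """Classical overlap coefficient of two exponential densities.

    Each density is f(x) = (1/scale) * exp(-x/scale) for x >= 0.
    The two densities cross once, at x* = log(hi/lo) / (hi - lo),
    where lo < hi are the two rates (rate = 1/scale).
    """
    if scale1 <= 0 or scale2 <= 0:
        raise ValueError("scale parameters must be positive")
    if scale1 == scale2:
        return 1.0
    lo, hi = sorted((1.0 / scale1, 1.0 / scale2))
    x_star = math.log(hi / lo) / (hi - lo)
    # Below x* the low-rate density is the smaller one; above x* the
    # high-rate density is. Integrate each piece in closed form.
    return (1.0 - math.exp(-lo * x_star)) + math.exp(-hi * x_star)


if __name__ == "__main__":
    print(exponential_overlap(1.0, 0.5))
```

A plug-in estimate would replace the scales with sample means of the two groups, which are the maximum likelihood estimates for uncensored exponential data.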

Forecasting mortality rates in hyponatremia: a statistical approach using Holt-Winters models.
IF 1.2; Zone 4 (Mathematics); Pub Date: 2025-09-02; eCollection Date: 2025-11-01; DOI: 10.1515/ijb-2024-0075; Pages: 463-471
Rawiyah Muneer Alraddadi, Mohamed Abd Allah El-Hadidy, Qin Shao, Qu Xianggui, Sadik Khuder

Hyponatremia, characterized by a serum sodium concentration below 135 mEq/L, is a prevalent electrolyte imbalance associated with increased morbidity and mortality across various clinical conditions. This study employs the Holt-Winters seasonal method, a robust time series forecasting model, to predict mortality rates attributed to hyponatremia. Leveraging retrospective mortality data from a cohort of hospitals in the United States, our analysis aims to elucidate temporal patterns and trends in hyponatremia-related deaths. The findings underscore the critical role of statistical forecasting in healthcare, facilitating proactive resource allocation and targeted interventions to mitigate mortality risks associated with electrolyte imbalances. Integrating predictive analytics into clinical practice holds promise for enhancing patient care and optimizing health outcomes in populations vulnerable to hyponatremia-related complications.
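As a rough illustration of the Holt-Winters seasonal method named above, the sketch below implements the additive variant with fixed smoothing parameters. The additive form, the simple initialization from the first two seasons, and the name `holt_winters_additive` are assumptions made for this example; the abstract does not say which variant or parameter-fitting procedure the study used.

```python
def holt_winters_additive(series, period, alpha, beta, gamma, horizon):
    """Additive Holt-Winters forecast with fixed smoothing parameters.

    alpha, beta, gamma smooth the level, trend, and seasonal terms.
    Returns `horizon` forecasts following the end of `series`.
    """
    if len(series) < 2 * period:
        raise ValueError("need at least two full seasons of data")
    first = series[:period]
    second = series[period:2 * period]
    # Simple initialization: level from season one, trend from the
    # average change between seasons one and two.
    level = sum(first) / period
    trend = (sum(second) - sum(first)) / period ** 2
    seasonals = [x - level for x in first]
    # One smoothing pass over the whole series, including the
    # observations that were used for initialization.
    for t, y in enumerate(series):
        s = seasonals[t % period]
        prev_level = level
        level = alpha * (y - s) + (1 - alpha) * (level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
        seasonals[t % period] = gamma * (y - level) + (1 - gamma) * s
    n = len(series)
    return [
        level + h * trend + seasonals[(n - 1 + h) % period]
        for h in range(1, horizon + 1)
    ]


if __name__ == "__main__":
    monthly_counts = [10, 20, 30, 10, 20, 30, 10, 20, 30]  # toy data
    print(holt_winters_additive(monthly_counts, 3, 0.5, 0.5, 0.5, 3))
```

In practice the smoothing parameters are usually chosen by minimizing one-step-ahead forecast error rather than fixed by hand.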

Regression analysis of interval-censored failure time data under semiparametric transformation models with missing covariates.
IF 1.2; Zone 4 (Mathematics); Pub Date: 2025-08-29; eCollection Date: 2025-11-01; DOI: 10.1515/ijb-2024-0016; Pages: 321-337
Yichen Lou, Mingyue Du

This paper discusses regression analysis of interval-censored failure time data arising from semiparametric transformation models in the presence of covariates that are missing at random (MAR). We define a specific formulation of the MAR mechanism tailored to interval censoring, where the timing of observation adds complexity to handling missing covariates. To overcome the limitations and computational challenges present in the existing methods, we propose a multiple imputation procedure that can be easily implemented with standard software. The proposed method makes use of two predictive scores for each individual and the distance defined by these scores. Furthermore, it utilizes partial information from incomplete observations and thus yields more efficient estimators than the complete-case analysis and the inverse probability weighting approach. An extensive simulation study is conducted to assess the performance of the proposed method and indicates that it performs well in practical situations. Finally, we apply the proposed approach to an Alzheimer's Disease study that motivated this work.
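Multiple imputation produces one estimate per imputed data set, and these are conventionally combined with Rubin's rules. The sketch below shows that pooling step only. It is an assumption that the paper pools this way, since the abstract does not say, and the two-score imputation procedure itself is not reproduced. The name `pool_rubin` is illustrative.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Combine m imputed-data analyses with Rubin's rules.

    estimates: point estimate from each imputed data set.
    variances: squared standard error from each imputed data set.
    Returns (pooled estimate, total variance).
    """
    m = len(estimates)
    if m < 2 or len(variances) != m:
        raise ValueError("need matching lists from at least two imputations")
    pooled = mean(estimates)
    within = mean(variances)        # average within-imputation variance
    between = variance(estimates)   # sample variance across imputations
    total = within + (1 + 1 / m) * between
    return pooled, total


if __name__ == "__main__":
    est, var = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
    print(est, var ** 0.5)  # pooled estimate and its standard error
```

The between-imputation term is what carries the extra uncertainty due to the missing covariates; dropping it would understate the standard error.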

Penalized regression splines in Mixture Density Networks.
IF 1.2; Zone 4 (Mathematics); Pub Date: 2025-06-05; eCollection Date: 2025-05-01; DOI: 10.1515/ijb-2023-0134; Pages: 239-253
Quentin Edward Seifert, Anton Thielmann, Elisabeth Bergherr, Benjamin Säfken, Jakob Zierk, Manfred Rauh, Tobias Hepp

Mixture Density Networks (MDN) belong to a class of models that can be applied to data which cannot be sufficiently described by a single distribution since it originates from different components of the main unit and therefore needs to be described by a mixture of densities. In some situations, MDNs may have problems with the proper identification of the latent components. While these identification issues can to some extent be contained by using custom initialization strategies for the network weights, this solution is still less than ideal since it involves subjective opinions. We therefore suggest replacing the hidden layers between the model input and the output parameter vector of MDNs and estimating the respective distributional parameters with penalized cubic regression splines. On simulated data from both Gaussian and Gamma mixture distributions, motivated by an application to indirect reference interval estimation, this approach drastically improved identification performance, with all splines reliably converging to their true parameter values.
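Whatever produces the mixture parameters (hidden layers or, as proposed above, regression splines), a mixture density model is fitted by minimizing the negative log-likelihood of the mixture. The sketch below evaluates that loss for a single observation under a Gaussian mixture; it is a generic illustration, and the function name `gaussian_mixture_nll` is an assumption rather than anything from the paper.

```python
import math


def gaussian_mixture_nll(y, weights, means, sds):
    """Negative log-likelihood of y under a Gaussian mixture.

    weights must be positive and sum to one; sds must be positive.
    Uses the log-sum-exp trick for numerical stability.
    """
    log_terms = [
        math.log(w)
        - math.log(s)
        - 0.5 * math.log(2 * math.pi)
        - 0.5 * ((y - m) / s) ** 2
        for w, m, s in zip(weights, means, sds)
    ]
    top = max(log_terms)
    return -(top + math.log(sum(math.exp(t - top) for t in log_terms)))


if __name__ == "__main__":
    # In a mixture density model the three parameter lists would be
    # functions of the covariates; here they are fixed toy values.
    print(gaussian_mixture_nll(0.3, [0.4, 0.6], [0.0, 2.0], [1.0, 0.5]))
```

Summing this loss over observations gives the training objective; swapping the labels of two components leaves it unchanged, which is one source of the identification problems the abstract mentions.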

Early completion based on adjacent dose information for model-assisted designs to accelerate maximum tolerated dose finding.
IF 1.2; Zone 4 (Mathematics); Pub Date: 2025-06-03; eCollection Date: 2025-11-01; DOI: 10.1515/ijb-2023-0040; Pages: 411-421
Masahiro Kojima

Phase I trials aim to identify the maximum tolerated dose (MTD) early and proceed quickly to an expansion cohort or a Phase II trial to assess the efficacy of the treatment. We present an early completion method based on multiple dosages (adjacent dose information) to accelerate the identification of the MTD in model-assisted designs. By using not only toxicity data for the current dose but also toxicity data for the next higher and lower doses, the MTD can be identified early without compromising accuracy. The early completion method is performed based on dose-assignment probabilities for multiple dosages. These probabilities are straightforward to calculate. We evaluated the early completion method using data from an actual clinical trial. In a simulation study, we evaluated the percentage of correct MTD selection and the impact of early completion on trial outcomes. The results indicate that our proposed early completion method maintains a high level of accuracy in MTD selection, with minimal reduction compared to the standard approach. In certain scenarios, the accuracy of MTD selection even improves under the early completion framework. We conclude that the use of this early completion method poses no issue when applied to model-assisted designs.
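For readers unfamiliar with model-assisted designs, the sketch below shows the dose-assignment rule of one well-known example, the Bayesian optimal interval (BOIN) design, using its default boundary formulas. Choosing BOIN here is an assumption made for illustration: the abstract does not name a specific design, and the early completion rule based on adjacent doses is not reproduced. Function names are hypothetical.

```python
import math


def boin_boundaries(target, low_ratio=0.6, high_ratio=1.4):
    """Escalation and de-escalation boundaries of the BOIN design.

    target is the target toxicity rate; low_ratio and high_ratio
    define the sub-therapeutic and overly toxic rates (BOIN defaults).
    """
    p1 = low_ratio * target
    p2 = high_ratio * target
    lam_e = math.log((1 - p1) / (1 - target)) / math.log(
        target * (1 - p1) / (p1 * (1 - target))
    )
    lam_d = math.log((1 - target) / (1 - p2)) / math.log(
        p2 * (1 - target) / (target * (1 - p2))
    )
    return lam_e, lam_d


def next_dose_decision(n_treated, n_toxic, target):
    """Decide how to dose the next cohort from current-dose data only."""
    lam_e, lam_d = boin_boundaries(target)
    observed_rate = n_toxic / n_treated
    if observed_rate <= lam_e:
        return "escalate"
    if observed_rate >= lam_d:
        return "de-escalate"
    return "stay"


if __name__ == "__main__":
    print(boin_boundaries(0.3))
    print(next_dose_decision(6, 1, 0.3))
```

The rule shown uses data at the current dose only; the abstract's contribution is to also bring in toxicity data from the adjacent higher and lower doses when deciding whether the trial can stop early.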

Efficiency for evaluation of disease etiologic heterogeneity in case-case and case-control studies.
IF 1.2; Zone 4 (Mathematics); Pub Date: 2025-05-30; eCollection Date: 2025-11-01; DOI: 10.1515/ijb-2023-0027; Pages: 339-356
Aya Kuchiba, Ran Gao, Molin Wang

A disease of interest can often be classified into subtypes based on its various molecular or pathological characteristics. Recent epidemiological studies have increasingly provided evidence that some molecular subtypes in a disease may have distinct etiologies, by assessing whether the associations of a potential risk factor vary by disease subtypes (i.e., etiologic heterogeneity). Case-control and case-case studies are popular study designs in molecular epidemiology, and both can be validly applied in studies of etiologic heterogeneity. This study compared the efficiency of the etiologic heterogeneity parameter estimation between these two study designs by theoretical and numerical examinations. In settings where the two study designs have the same number of cases, the results showed that, compared with the case-case study, case-control studies always provided more efficient estimates or estimates with at least equivalent efficiency for heterogeneity parameters. In addition, we illustrated both approaches in a study aiming to evaluate the association between plasma free estradiol and breast cancer risk according to the status of tumor estrogen and progesterone receptors, the results of which were originally provided through case-control study data.

Weighted Euclidean balancing for a matrix exposure in estimating causal effect.
IF 1.2; Zone 4 (Mathematics); Pub Date: 2025-05-23; eCollection Date: 2025-05-01; DOI: 10.1515/ijb-2024-0021; Pages: 219-237
Juan Chen, Yingchun Zhou

With the increasing complexity of data, researchers in various fields have become increasingly interested in estimating the causal effect of a matrix exposure, which involves complex multivariate treatments, on an outcome. Balancing covariates for the matrix exposure is essential to achieve this goal. While exact balancing and approximate balancing methods have been proposed for multiple balancing constraints, dealing with a matrix treatment introduces a large number of constraints, making it challenging to achieve exact balance or select suitable threshold parameters for approximate balancing methods. To address this challenge, the weighted Euclidean balancing method is proposed, which offers an approximate balance of covariates from an overall perspective. In this study, both parametric and nonparametric methods for estimating the causal effect of a matrix treatment are proposed, along with theoretical properties of the two estimators. Extensive simulation results demonstrate that the proposed method outperforms alternative approaches across various scenarios. Finally, we apply the method to analyze the causal impact of the omics variables on the drug sensitivity of Vandetanib. The results indicate that EGFR CNV has a significant positive causal effect on Vandetanib efficacy, whereas EGFR methylation exerts a significant negative causal effect.

随着数据的日益复杂,各个领域的研究人员对估计矩阵暴露的因果效应越来越感兴趣,这涉及到复杂的多变量处理。平衡矩阵暴露的协变量对于实现这一目标至关重要。虽然已经提出了针对多个平衡约束的精确平衡和近似平衡方法,但处理矩阵处理引入了大量约束,使得实现精确平衡或为近似平衡方法选择合适的阈值参数具有挑战性。为了解决这一挑战,提出了加权欧几里得平衡方法,该方法从整体角度提供了协变量的近似平衡。在本研究中,提出了估计矩阵处理因果效应的参数和非参数方法,并提供了这两种估计的理论性质。为了验证我们方法的有效性,大量的仿真结果表明,所提出的方法在各种情况下优于其他方法。最后,我们应用该方法分析了组学变量对Vandetanib药物敏感性的因果影响。结果表明,EGFR CNV对Vandetanib疗效有显著的正向因果效应,而EGFR甲基化对Vandetanib疗效有显著的负向因果效应。
International Journal of Biostatistics, pp. 219-237. Publication date: 2025-05-23. DOI: 10.1515/ijb-2024-0021
Citations: 0
Guidance on individualized treatment rule estimation in high dimensions.
IF 1.2 Zone 4 (Mathematics) Pub Date: 2025-05-22 eCollection Date: 2025-05-01 DOI: 10.1515/ijb-2024-0005
Philippe Boileau, Ning Leng, Sandrine Dudoit

Individualized treatment rules, cornerstones of precision medicine, inform patient treatment decisions with the goal of optimizing patient outcomes. These rules are generally unknown functions of patients' pre-treatment covariates, meaning they must be estimated from clinical or observational study data. Myriad methods have been developed to learn these rules, and these procedures are demonstrably successful in traditional asymptotic settings with a moderate number of covariates. However, the finite-sample performance of these methods in high-dimensional covariate settings, which are increasingly the norm in modern clinical trials, has not been well characterized. We perform a comprehensive comparison of state-of-the-art individualized treatment rule estimators, assessing performance on the basis of the estimators' rule quality, interpretability, and computational efficiency. Sixteen data-generating processes with continuous outcomes and binary treatment assignments are considered, reflecting a diversity of randomized and observational studies. We summarize our findings and provide succinct advice to practitioners needing to estimate individualized treatment rules in high dimensions. Owing to individualized treatment rule estimators' poor interpretability, we propose a novel pre-treatment covariate filtering procedure based on recent work for uncovering treatment effect modifiers. We show that it improves estimators' rule quality and interpretability. All code is made publicly available, facilitating modifications and extensions to our simulation study.

International Journal of Biostatistics, pp. 183-218. Publication date: 2025-05-22.
Citations: 0