
Biometrical Journal: Latest Articles

A Principled Approach to Adjust for Unmeasured Time-Stable Confounding of Supervised Treatment
IF 1.3 Zone 3 (Biology) Q4 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date: 2024-12-16 DOI: 10.1002/bimj.70026
Jeppe Ekstrand Halkjær Madsen, Thomas Delvin, Thomas Scheike, Christian Pipper

We propose a novel method to adjust for unmeasured time-stable confounding when the time between consecutive treatment administrations is fixed. We achieve this by focusing on a new-user cohort. Furthermore, we envisage that all time-stable confounding goes through the potential time on treatment as dictated by the disease condition at the initiation of treatment. Following this logic, we may eliminate all unmeasured time-stable confounding by adjusting for the potential time on treatment. A challenge with this approach is that right censoring of the potential time on treatment occurs when treatment is terminated at the time of the event of interest, for example, if the event of interest is death. We show how this challenge may be solved by means of the expectation-maximization algorithm without imposing any further assumptions on the distribution of the potential time on treatment. The usefulness of the methodology is illustrated in a simulation study. We also apply the methodology to investigate the effect of depression/anxiety drugs on subsequent poisoning by other medications in the Danish population by means of national registries. We find a protective effect of treatment with selective serotonin reuptake inhibitors on the risk of poisoning by various medications (1-year risk difference of approximately -3%), whereas a standard Cox model analysis shows a harmful effect (1-year risk difference of approximately 2%), which is consistent with what we would expect due to confounding by indication. Unmeasured time-stable confounding can be entirely adjusted for when the time between consecutive treatment administrations is fixed.

Citations: 0
Assessing Balance of Baseline Time-Dependent Covariates via the Fréchet Distance
IF 1.3 Zone 3 (Biology) Q4 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date: 2024-12-16 DOI: 10.1002/bimj.70024
Mireya Díaz

Assessment of covariate balance is a key step when performing comparisons between groups, particularly in real-world data. We generally evaluate it on baseline covariates, but rarely on longitudinal ones prior to a management decision. We could use pointwise standardized mean differences, standardized differences of slopes, or weights from the model for this purpose. Pointwise differences could be cumbersome for longitudinal markers that are densely sampled and/or measured at different time points. Slopes are suitable for linear or transformable models but not for more complex curves. Weights do not identify the specific covariate(s) responsible for imbalances. This work presents the Fréchet distance as a viable alternative to assess balance of time-dependent covariates. A set of linear and nonlinear curves whose standardized differences or differences in functional parameters were within 10% was used to identify the Fréchet distance equivalent to this threshold. This threshold depends on the level of noise present, and thus within-group heterogeneity and error variance are needed for its interpretation. Application to a set of real curves representing the monthly trajectory of hemoglobin A1c in diabetic patients showed that the curves in the two groups were not balanced at the 10% mark. A Beta distribution represents the Fréchet distance distribution reasonably well in most scenarios. This assessment of covariate balance provides the following advantages: it can handle curves of different lengths, shapes, and arbitrary time points. Future work includes examining the utility of this measure under within-series missingness and within-group heterogeneity, its comparison with other approaches, and asymptotics.
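To make the distance in this abstract concrete, the following is a minimal sketch of the discrete Fréchet distance between two sampled curves, computed with the standard dynamic-programming recursion. It is an illustration only: the abstract does not say whether the author uses the discrete or continuous variant or how curves are standardized, and the function name `discrete_frechet` and the toy curves are assumptions made here, not data from the paper.

```python
import math

def discrete_frechet(p, q):
    """Discrete Fréchet distance between two curves given as lists of (time, value) points."""
    n, m = len(p), len(q)
    # ca[i][j] holds the coupling distance between p[:i+1] and q[:j+1]
    ca = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = math.dist(p[i], q[j])  # Euclidean distance between the two points
            if i == 0 and j == 0:
                ca[i][j] = d
            elif i == 0:
                ca[i][j] = max(ca[0][j - 1], d)
            elif j == 0:
                ca[i][j] = max(ca[i - 1][0], d)
            else:
                best_previous = min(ca[i - 1][j], ca[i - 1][j - 1], ca[i][j - 1])
                ca[i][j] = max(best_previous, d)
    return ca[n - 1][m - 1]

# Hypothetical toy curves with different lengths and time points (not real data)
curve_a = [(0.0, 0.0), (1.0, 0.0)]
curve_b = [(0.0, 1.0), (0.5, 1.0), (1.0, 1.0)]
distance = discrete_frechet(curve_a, curve_b)
```

Because the recursion only compares point sequences, the two curves may have different numbers of observations and different sampling times, which matches the advantage named in the abstract.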

Citations: 0
Oncology Clinical Trial Design Planning Based on a Multistate Model That Jointly Models Progression-Free and Overall Survival Endpoints
IF 1.3 Zone 3 (Biology) Q4 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date: 2024-12-16 DOI: 10.1002/bimj.70017
Alexandra Erdmann, Jan Beyersmann, Kaspar Rufibach

When planning an oncology clinical trial, the usual approach is to assume proportional hazards and even an exponential distribution for time-to-event endpoints. Often, besides the gold-standard endpoint overall survival (OS), progression-free survival (PFS) is considered as a second confirmatory endpoint. We use a survival multistate model to jointly model these two endpoints and find that neither exponential distribution nor proportional hazards will typically hold for both endpoints simultaneously. The multistate model provides a stochastic process approach to model the dependency of such endpoints, requiring neither latent failure times nor explicit dependency modeling such as copulae. We use the multistate model framework to simulate clinical trials with endpoints OS and PFS and show how design planning questions can be answered using this approach. In particular, nonproportional hazards for at least one of the endpoints are a consequence of OS and PFS being dependent and are naturally modeled to improve planning. We then illustrate how clinical trial design can be based on simulations from a multistate model. Key applications are coprimary endpoints and group-sequential designs. Simulations for these applications show that the standard simplifying approach may very well lead to underpowered or overpowered clinical trials. Our approach is quite general and can be extended to more complex trial designs, further endpoints, and other therapeutic areas. An R package is available on CRAN.
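As a rough illustration of the simulation idea in this abstract, the sketch below draws PFS and OS jointly from an illness-death model with three states (initial, progression, death) and constant transition hazards. It is a toy under stated assumptions, not the authors' R package: the function name `simulate_patient` and the hazard values are hypothetical, and censoring, accrual, and treatment arms are left out.

```python
import random

def simulate_patient(h01, h02, h12, rng):
    """Return (pfs, os_time) for one patient in an illness-death model with constant hazards.

    h01: hazard initial -> progression, h02: hazard initial -> death,
    h12: hazard progression -> death (all hypothetical inputs).
    """
    pfs = rng.expovariate(h01 + h02)           # time until the patient leaves the initial state
    if rng.random() < h01 / (h01 + h02):       # the exit was a progression
        os_time = pfs + rng.expovariate(h12)   # extra time from progression to death
    else:                                      # the exit was death without prior progression
        os_time = pfs
    return pfs, os_time

rng = random.Random(1)
sample = [simulate_patient(0.10, 0.02, 0.08, rng) for _ in range(10000)]
mean_pfs = sum(p for p, _ in sample) / len(sample)
mean_os = sum(o for _, o in sample) / len(sample)
```

In this toy model PFS is exponential with rate h01 + h02, while the OS hazard changes over time whenever h12 differs from h02, even though every transition hazard is constant. That is one way to see why an exponential assumption will typically not hold for both endpoints at once.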

Citations: 0
Test Statistics and Statistical Inference for Data With Informative Cluster Sizes
IF 1.3 Zone 3 (Biology) Q4 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date: 2024-12-16 DOI: 10.1002/bimj.70021
Soyoung Kim, Michael J. Martens, Kwang Woo Ahn

In biomedical studies, investigators often encounter clustered data. The cluster sizes are said to be informative if the outcome depends on the cluster size. Ignoring informative cluster sizes in the analysis leads to biased parameter estimation in marginal and mixed-effect regression models. Several methods to analyze data with informative cluster sizes have been proposed; however, methods to test the informativeness of the cluster sizes are limited, particularly for the marginal model. In this paper, we propose a score test and a Wald test to examine the informativeness of the cluster sizes for a generalized linear model, a Cox model, and a proportional subdistribution hazards model. Statistical inference can be conducted through weighted estimating equations. The simulation results show that both tests control Type I error rates well, but the score test has higher power than the Wald test for right-censored data while the power of the Wald test is generally higher than the score test for the binary outcome. We apply the Wald and score tests to hematopoietic cell transplant data and compare regression analysis results with/without adjusting for informative cluster sizes.

Citations: 0
Best Subset Solution Path for Linear Dimension Reduction Models Using Continuous Optimization
IF 1.3 Zone 3 (Biology) Q4 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date: 2024-12-16 DOI: 10.1002/bimj.70015
Benoit Liquet, Sarat Moka, Samuel Muller

The selection of best variables is a challenging problem in supervised and unsupervised learning, especially in high-dimensional contexts where the number of variables is usually much larger than the number of observations. In this paper, we focus on two multivariate statistical methods: principal component analysis and partial least squares. Both approaches are popular linear dimension-reduction methods with numerous applications in several fields, including genomics, biology, environmental science, and engineering. In particular, these approaches build principal components, new variables that are combinations of all the original variables. A main drawback of principal components is the difficulty of interpreting them when the number of variables is large. To define principal components from the most relevant variables, we propose to cast the best subset solution path method into the principal component analysis and partial least squares frameworks. We offer a new alternative by exploiting a continuous optimization algorithm for the best subset solution path. Empirical studies show the efficacy of our approach for providing the best subset solution path. The usage of our algorithm is further illustrated through the analysis of two real data sets. The first data set is analyzed using principal component analysis, while the analysis of the second data set is based on the partial least squares framework.

Citations: 0
Goodness-of-Fit Testing for a Regression Model With a Doubly Truncated Response
IF 1.3 Zone 3 (Biology) Q4 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date: 2024-12-16 DOI: 10.1002/bimj.70022
Jacobo de Uña-Álvarez

In survival analysis and epidemiology, among other fields, interval sampling is often employed. With interval sampling, the individuals undergoing the event of interest within a calendar time interval are recruited. This results in doubly truncated event times. Double truncation, which may appear with other sampling designs too, induces a selection bias, so ordinary statistical methods are generally inconsistent. In this paper, we introduce goodness-of-fit procedures for a regression model when the response variable is doubly truncated. With this purpose, a marked empirical process based on weighted residuals is constructed and its weak convergence is established. Kolmogorov–Smirnov– and Cramér–von Mises–type tests are consequently derived from such core process, and a bootstrap approximation for their practical implementation is given. The performance of the proposed tests is investigated through simulations. An application to model selection for AIDS incubation time as depending on age at infection is provided.

Citations: 0
Adjusted Inference for Multiple Testing Procedure in Group-Sequential Designs
IF 1.3 Zone 3 (Biology) Q4 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date: 2024-12-16 DOI: 10.1002/bimj.70020
Yujie Zhao, Qi Liu, Linda Z. Sun, Keaven M. Anderson

Adjustment of statistical significance levels for repeated analysis in group-sequential trials has been understood for some time. Adjustment accounting for testing multiple hypotheses is also well understood. There is limited research on simultaneously adjusting for both multiple hypothesis testing and repeated analyses of one or more hypotheses. We address this gap by proposing adjusted-sequential p-values that reject when they are less than or equal to the family-wise Type I error rate (FWER). We also propose sequential p-values for intersection hypotheses to compute adjusted-sequential p-values for elementary hypotheses. We demonstrate the application using weighted Bonferroni tests and weighted parametric tests for inference on each elementary hypothesis tested.
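The multiplicity half of this problem can be sketched on its own. The example below computes adjusted p-values for elementary hypotheses at a single analysis by closed testing, using a weighted Bonferroni test for every intersection hypothesis. It deliberately ignores the group-sequential dimension, so it does not reproduce the adjusted-sequential p-values proposed in the paper. The function name, the rule of renormalizing weights within each intersection, and the input p-values are assumptions chosen for illustration.

```python
from itertools import combinations

def weighted_bonferroni_adjusted(pvals, weights):
    """Closed-testing adjusted p-values with weighted Bonferroni intersection tests.

    Assumes every weight is strictly positive. Within each intersection the
    weights are renormalized to sum to one (an illustrative choice).
    """
    m = len(pvals)
    adjusted = [0.0] * m
    for size in range(1, m + 1):
        for subset in combinations(range(m), size):
            total = sum(weights[j] for j in subset)
            # p-value of the intersection hypothesis: min over j of p_j / w_j(subset)
            p_intersection = min(pvals[j] * total / weights[j] for j in subset)
            p_intersection = min(1.0, p_intersection)
            # an elementary hypothesis is rejected only if every intersection containing it is
            for j in subset:
                adjusted[j] = max(adjusted[j], p_intersection)
    return adjusted

# Hypothetical unadjusted p-values for three hypotheses with equal weights
adj = weighted_bonferroni_adjusted([0.01, 0.04, 0.03], [1 / 3, 1 / 3, 1 / 3])
```

With equal weights this reduces to the Holm procedure. Each elementary hypothesis is rejected when its adjusted p-value is at most the FWER, mirroring the decision rule stated in the abstract.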

Citations: 0
Issue Information: Biometrical Journal 1'25
IF 1.3 Zone 3 (Biology) Q4 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date: 2024-12-15 DOI: 10.1002/bimj.70027
Citations: 0
Detecting Interactions in High-Dimensional Data Using Cross Leverage Scores
IF 1.3 Zone 3 (Biology) Q4 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date: 2024-11-29 DOI: 10.1002/bimj.70014
Sven Teschke, Katja Ickstadt, Alexander Munteanu

We develop a variable selection method for interactions in regression models on large data in the context of genetics. The method is intended for investigating the influence of single-nucleotide polymorphisms (SNPs) and their interactions on health outcomes, which is a p ≫ n problem. We introduce cross leverage scores (CLSs) to detect interactions of variables while maintaining interpretability. Using this method, it is not necessary to consider every possible interaction between variables individually, which would be very time-consuming even for a moderate number of variables. Instead, we calculate the CLS for each variable and obtain a measure of importance for this variable. Calculating the scores remains time-consuming for large data sets. The key idea for scaling to large data is to divide the data into smaller random batches or consecutive windows of variables. This avoids complex and time-consuming computations on high-dimensional matrices by performing the computations only for small subsets of the data, which is less costly. We compare these methods to provable approximations of CLS based on sketching, which aims at summarizing data succinctly. In a simulation study, we show that the CLSs are directly linked to the importance of a variable in the sense of an interaction effect. We further show that the approximation approaches are appropriate for performing the calculations efficiently on arbitrarily large data while preserving the interaction detection effect of the CLS. This underlines their scalability to genome-wide data. In addition, we evaluate the methods on real data from the HapMap project.
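For orientation, the sketch below computes leverage and cross leverage scores under the textbook definition: they are the diagonal and off-diagonal entries of the projection (hat) matrix onto the column space of a matrix. This is an assumption about terminology, not a reproduction of the paper. The abstract does not specify how the matrix is built from SNP data, how a per-variable score is aggregated, or how batching and sketching are done, and the helper names and the toy matrix are hypothetical.

```python
import math

def orthonormal_columns(a):
    """Orthonormal basis of the column space of a (list of rows), via modified Gram-Schmidt.

    Assumes the columns of a are linearly independent.
    """
    n, k = len(a), len(a[0])
    basis = []
    for j in range(k):
        v = [a[i][j] for i in range(n)]
        for q in basis:
            dot = sum(x * y for x, y in zip(q, v))
            v = [x - dot * y for x, y in zip(v, q)]  # remove the component along q
        norm = math.sqrt(sum(x * x for x in v))
        basis.append([x / norm for x in v])
    return basis

def hat_matrix(a):
    """Projection matrix H = Q Q^T; H[i][i] are leverage scores, H[i][j] cross leverage scores."""
    basis = orthonormal_columns(a)
    n = len(a)
    return [[sum(q[i] * q[j] for q in basis) for j in range(n)] for i in range(n)]

# Hypothetical 4 x 2 toy matrix; each row is one item to be scored
toy = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
h = hat_matrix(toy)
leverage = [h[i][i] for i in range(len(toy))]
cross_leverage_first_last = h[0][3]
```

Forming the full matrix costs memory quadratic in the number of scored rows, which is consistent with the abstract's remark that exact computation is time-consuming on large data and motivates working on batches or windows.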

Citations: 0
Model Selection for Ordinary Differential Equations: A Statistical Testing Approach
IF 1.3 Zone 3 Biology Q4 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date: 2024-11-28 DOI: 10.1002/bimj.70013
Itai Dattner, Shota Gugushvili, Oleksandr Laskorunskyi

Ordinary differential equations (ODEs) are foundational tools in modeling intricate dynamics across a gamut of scientific disciplines. Yet, a possibility to represent a single phenomenon through multiple ODE models, driven by different understandings of nuances in internal mechanisms or abstraction levels, presents a model selection challenge. This study introduces a testing-based approach for ODE model selection amidst statistical noise. Rooted in the model misspecification framework, we adapt classical statistical paradigms (Vuong and Hotelling) to the ODE context, allowing for the comparison and ranking of diverse causal explanations without the constraints of nested models. Our simulation studies numerically investigate the statistical properties of the test, demonstrating its attainment of the nominal size and power across various settings. Real-world data examples further underscore the algorithm's applicability in practice. To foster accessibility and encourage real-world applications, we provide a user-friendly Python implementation of our model selection algorithm, bridging theoretical advancements with hands-on tools for the scientific community.
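
For orientation, the classical Vuong statistic for comparing two non-nested models can be sketched as follows. The sketch assumes independent Gaussian noise with a known standard deviation and takes precomputed residuals from two already fitted models. The ODE fitting step is omitted, and the function name, the `sigma` parameter, and the toy residuals are hypothetical. This shows the classical test only, not the adapted procedure or the Python implementation the authors provide.

```python
import math
from statistics import NormalDist, mean, pstdev

def vuong_statistic(resid_a, resid_b, sigma=1.0):
    """Classical Vuong z statistic from pointwise Gaussian log-likelihoods.

    Positive z favours model A, negative z favours model B. Under the null
    hypothesis that both models are equally close to the truth, z is
    approximately standard normal for large samples.
    """
    def loglik(r):
        return -0.5 * math.log(2 * math.pi * sigma**2) - r**2 / (2 * sigma**2)

    # Pointwise log-likelihood ratios between the two models.
    diffs = [loglik(a) - loglik(b) for a, b in zip(resid_a, resid_b)]
    n = len(diffs)
    # pstdev uses the 1/n variance; it is zero (and z undefined) if all diffs are equal.
    z = math.sqrt(n) * mean(diffs) / pstdev(diffs)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # two-sided p-value
    return z, p_value

# Hypothetical residuals: model A fits the observations more closely than model B.
resid_a = [0.1, -0.2, 0.05, 0.15, -0.1]
resid_b = [0.8, -1.1, 0.9, 1.2, -0.7]
z, p_value = vuong_statistic(resid_a, resid_b)
```

With only five observations the normal approximation is not reliable; the toy data serve only to show the direction of the statistic.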

Citations: 0