Biometrical Journal: Latest Articles

Meta-Analysis of Diagnostic Accuracy Studies With Multiple Thresholds: Comparison of Approaches in a Simulation Study
IF 1.3; Zone 3 (Biology); Q4 MATHEMATICAL & COMPUTATIONAL BIOLOGY; Pub Date: 2024-09-27; DOI: 10.1002/bimj.202300101
Antonia Zapf, Cornelia Frömke, Juliane Hardt, Gerta Rücker, Dina Voeltz, Annika Hoyer

The development of methods for the meta-analysis of diagnostic test accuracy (DTA) studies is still an active area of research. While methods for the standard case where each study reports a single pair of sensitivity and specificity are nearly routinely applied nowadays, methods to meta-analyze receiver operating characteristic (ROC) curves are not widely used. This situation is more complex, as each primary DTA study may report on several pairs of sensitivity and specificity, each corresponding to a different threshold. In a case study published earlier, we applied a number of methods for meta-analyzing DTA studies with multiple thresholds to a real-world data example (Zapf et al., Biometrical Journal. 2021; 63(4): 699–711). To date, no simulation study exists that systematically compares different approaches with respect to their performance in various scenarios when the truth is known. In this article, we aim to fill this gap and present the results of a simulation study that compares three frequentist approaches for the meta-analysis of ROC curves. We performed a systematic simulation study, motivated by an example from medical research. In the simulations, all three approaches worked partially well. The approach by Hoyer and colleagues was slightly superior in most scenarios and is recommended in practice.
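
To make the data structure concrete, the following minimal sketch shows how one primary study can contribute several sensitivity/specificity pairs, one per threshold. The marker values, the thresholds, and the function name `roc_points` are hypothetical, and the "positive if value >= threshold" convention is an assumption; this is not one of the three approaches compared in the article.

```python
def roc_points(diseased, healthy, thresholds):
    """Return (threshold, sensitivity, specificity) for each threshold.

    Assumed convention: a subject tests positive when the marker
    value is >= threshold. Some tests use the opposite direction.
    """
    points = []
    for t in thresholds:
        true_pos = sum(1 for x in diseased if x >= t)
        true_neg = sum(1 for x in healthy if x < t)
        points.append((t, true_pos / len(diseased), true_neg / len(healthy)))
    return points


# Hypothetical marker values from a single primary study.
diseased = [5.1, 6.3, 7.8, 4.2, 8.0]
healthy = [2.0, 3.5, 4.8, 1.9, 5.5]

study_points = roc_points(diseased, healthy, thresholds=[3.0, 5.0, 7.0])
for t, sens, spec in study_points:
    print(f"threshold={t}: sensitivity={sens:.2f}, specificity={spec:.2f}")
```

A meta-analysis of ROC curves has to pool such sets of pairs across studies, where each study may report different thresholds and a different number of them.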

Citations: 0
Post-Estimation Shrinkage in Full and Selected Linear Regression Models in Low-Dimensional Data Revisited
IF 1.3; Zone 3 (Biology); Q4 MATHEMATICAL & COMPUTATIONAL BIOLOGY; Pub Date: 2024-09-27; DOI: 10.1002/bimj.202300368
Edwin Kipruto, Willi Sauerbrei

The fit of a regression model to new data is often worse due to overfitting. Analysts use variable selection techniques to develop parsimonious regression models, which may introduce bias into regression estimates. Shrinkage methods have been proposed to mitigate overfitting and reduce bias in estimates. Post-estimation shrinkage is an alternative to penalized methods. This study evaluates effectiveness of post-estimation shrinkage in improving prediction performance of full and selected models. Through a simulation study, results were compared with ordinary least squares (OLS) and ridge in full models, and best subset selection (BSS) and lasso in selected models. We focused on prediction errors and the number of selected variables. Additionally, we proposed a modified version of the parameter-wise shrinkage (PWS) approach named non-negative PWS (NPWS) to address weaknesses of PWS. Results showed that no method was superior in all scenarios. In full models, NPWS outperformed global shrinkage, whereas PWS was inferior to OLS. In low correlation with moderate-to-high signal-to-noise ratio (SNR), NPWS outperformed ridge, but ridge performed best in small sample sizes, high correlation, and low SNR. In selected models, all post-estimation shrinkage performed similarly, with global shrinkage slightly inferior. Lasso outperformed BSS and post-estimation shrinkage in small sample sizes, low SNR, and high correlation but was inferior when the opposite was true. Our study suggests that, with sufficient information, NPWS is more effective than global shrinkage in improving prediction accuracy of models. However, in high correlation, small sample sizes, and low SNR, penalized methods generally outperform post-estimation shrinkage methods.
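
As a rough illustration of the idea behind global post-estimation shrinkage, the sketch below fits OLS with a single predictor, estimates one shrinkage factor as the calibration slope of the outcome on leave-one-out predictions, and then rescales the slope. The data and the function names `ols` and `global_shrinkage` are hypothetical, and the cross-validated calibration slope is one common way to obtain a global factor; the article's exact procedure (and its PWS and NPWS variants) may differ.

```python
def ols(xs, ys):
    """Simple least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def global_shrinkage(xs, ys):
    """Return (factor, shrunken intercept, shrunken slope)."""
    _, slope = ols(xs, ys)
    # Leave-one-out predictions: each point is predicted by a model
    # fitted without it.
    loo_pred = []
    for i in range(len(xs)):
        a_i, b_i = ols(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        loo_pred.append(a_i + b_i * xs[i])
    # Calibration slope of y on the leave-one-out predictions,
    # used here as the global shrinkage factor (an assumption).
    _, factor = ols(loo_pred, ys)
    shrunk_slope = factor * slope
    # Re-estimate the intercept so the mean prediction equals the mean outcome.
    n = len(xs)
    shrunk_intercept = sum(ys) / n - shrunk_slope * sum(xs) / n
    return factor, shrunk_intercept, shrunk_slope


# Hypothetical data.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1.2, 1.9, 3.4, 3.8, 5.3, 5.9]
factor, a_shrunk, b_shrunk = global_shrinkage(xs, ys)
print(factor, a_shrunk, b_shrunk)
```

With several predictors, a global factor multiplies every coefficient by the same value, whereas parameter-wise shrinkage estimates a separate factor for each coefficient.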

Citations: 0
Functional Data Analysis: An Introduction and Recent Developments
IF 1.3; Zone 3 (Biology); Q4 MATHEMATICAL & COMPUTATIONAL BIOLOGY; Pub Date: 2024-09-27; DOI: 10.1002/bimj.202300363
Jan Gertheiss, David Rügamer, Bernard X. W. Liew, Sonja Greven

Functional data analysis (FDA) is a statistical framework that allows for the analysis of curves, images, or functions on higher dimensional domains. The goals of FDA, such as descriptive analyses, classification, and regression, are generally the same as for statistical analyses of scalar-valued or multivariate data, but FDA brings additional challenges due to the high- and infinite dimensionality of observations and parameters, respectively. This paper provides an introduction to FDA, including a description of the most common statistical analysis techniques, their respective software implementations, and some recent developments in the field. The paper covers fundamental concepts such as descriptives and outliers, smoothing, amplitude and phase variation, and functional principal component analysis. It also discusses functional regression, statistical inference with functional data, functional classification and clustering, and machine learning approaches for functional data analysis. The methods discussed in this paper are widely applicable in fields such as medicine, biophysics, neuroscience, and chemistry and are increasingly relevant due to the widespread use of technologies that allow for the collection of functional data. Sparse functional data methods are also relevant for longitudinal data analysis. All presented methods are demonstrated using available software in R by analyzing a dataset on human motion and motor control. To facilitate the understanding of the methods, their implementation, and hands-on application, the code for these practical examples is made available through a code and data supplement and on GitHub.

Citations: 0
Stakeholders' Perspectives on Current Issues in Data Monitoring Committees
IF 1.3; Zone 3 (Biology); Q4 MATHEMATICAL & COMPUTATIONAL BIOLOGY; Pub Date: 2024-09-22; DOI: 10.1002/bimj.202300384
Michael J. Cartwright, Tim Friede, David Lawrence, Emma May, Tobias Mütze, Kit Roes

Data Monitoring Committees (DMCs) are groups of experts that review accumulating data from one or more ongoing clinical studies and advise the Sponsor regarding the continuing safety of study subjects along with the continuing validity and scientific merit of the study. Although DMCs are widely used, considerable variability exists in their conduct. This paper offers recommendations, derived from sessions given at the 2023 Central European Network International Biometric and Statisticians in the Pharmaceutical Industry Conferences' and the authors' experiences. We focus on four topics that are part of the DMC process and where there is unclarity and inconsistency in current practices: (1) Communication with the DMC—We reflect on the importance of effective, proper communication channels between the DMC and relevant stakeholders to foster collaboration and exchange of critical information while retaining study integrity throughout. (2) Open sessions—We discuss the benefits of incorporating open sessions in DMC meetings to enhance transparency, inclusivity, and the consideration of diverse perspectives, as well as pitfalls of open sessions. (3) Access to efficacy data—We highlight the need for appropriate access to efficacy data by DMCs and discuss how to implement this in practice and how to address potential concerns regarding multiplicity. (4) Interactive data displays—We outline the utilization of interactive data displays to facilitate a more intuitive understanding of study results by the DMC. By addressing these topics, we aim to provide comprehensive practical recommendations that bridge the gap between current practices and optimal DMC functionality.

Citations: 0
A Network-Constrain Weibull AFT Model for Biomarkers Discovery
IF 1.3; Zone 3 (Biology); Q4 MATHEMATICAL & COMPUTATIONAL BIOLOGY; Pub Date: 2024-09-22; DOI: 10.1002/bimj.202300272
Claudia Angelini, Daniela De Canditiis, Italia De Feis, Antonella Iuliano

We propose AFTNet, a novel network-constraint survival analysis method based on the Weibull accelerated failure time (AFT) model solved by a penalized likelihood approach for variable selection and estimation. When using the log-linear representation, the inference problem becomes a structured sparse regression problem for which we explicitly incorporate the correlation patterns among predictors using a double penalty that promotes both sparsity and grouping effect. Moreover, we establish the theoretical consistency for the AFTNet estimator and present an efficient iterative computational algorithm based on the proximal gradient descent method. Finally, we evaluate AFTNet performance both on synthetic and real data examples.
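
The combination of a sparsity penalty, a network penalty, and proximal gradient descent can be sketched in a simplified setting. The example below is not AFTNet: it swaps the Weibull AFT likelihood for a plain squared-error loss, and it assumes the network penalty is a quadratic form in a graph Laplacian `L`. The function names, data, penalty weights, and step size are all hypothetical.

```python
def soft_threshold(value, threshold):
    """Proximal operator of the L1 penalty."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def prox_grad_network(X, y, L, lam1, lam2, step, n_iter=500):
    """Minimise (1/2n)||y - X b||^2 + lam1 ||b||_1 + (lam2/2) b' L b.

    Squared-error loss stands in for the AFT likelihood (an assumption).
    `step` should not exceed 1 / (Lipschitz constant of the smooth part).
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        resid = [y[i] - sum(X[i][j] * beta[j] for j in range(p)) for i in range(n)]
        # Gradient of the smooth part: loss term plus Laplacian term.
        grad = [
            -sum(X[i][j] * resid[i] for i in range(n)) / n
            + lam2 * sum(L[j][k] * beta[k] for k in range(p))
            for j in range(p)
        ]
        # Gradient step followed by the L1 proximal step.
        beta = [
            soft_threshold(beta[j] - step * grad[j], step * lam1) for j in range(p)
        ]
    return beta


# Hypothetical data: two predictors joined by one network edge.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0]]
y = [2.0, 0.0, 2.0, 2.0]
L = [[1.0, -1.0], [-1.0, 1.0]]  # Laplacian of a single-edge graph

beta_hat = prox_grad_network(X, y, L, lam1=0.3, lam2=0.1, step=1.0)
print(beta_hat)
```

The L1 term drives coefficients to exactly zero, while the Laplacian term pulls coefficients of connected predictors toward each other, which is the grouping effect mentioned in the abstract.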

Citations: 0
Multivariate Scalar on Multidimensional Distribution Regression With Application to Modeling the Association Between Physical Activity and Cognitive Functions
IF 1.3; Zone 3 (Biology); Q4 MATHEMATICAL & COMPUTATIONAL BIOLOGY; Pub Date: 2024-09-22; DOI: 10.1002/bimj.202400042
Rahul Ghosal, Marcos Matabuena

We develop a new method for multivariate scalar on multidimensional distribution regression. Traditional approaches typically analyze isolated univariate scalar outcomes or consider unidimensional distributional representations as predictors. However, these approaches are suboptimal because (i) they fail to utilize the dependence between the distributional predictors and (ii) neglect the correlation structure of the response. To overcome these limitations, we propose a multivariate distributional analysis framework that harnesses the power of multivariate density functions and multitask learning. We develop a computationally efficient semiparametric estimation method for modeling the effect of the latent joint density on the multivariate response of interest. Additionally, we introduce a new conformal prediction algorithm for quantifying the uncertainty of our multivariate predictions based on subject characteristics and individualized distributional predictors, providing valuable insights into the conditional distribution of the response. We validate the effectiveness of our proposed method through comprehensive numerical simulations, clearly demonstrating its superior performance compared to traditional methods. The application of the proposed method is demonstrated on triaxial accelerometer data from the National Health and Nutrition Examination Survey 2011–2014 for modeling the association between cognitive scores across various domains and distributional representation of physical activity among the older adult population. Our results highlight the advantages of the proposed approach, emphasizing the significance of incorporating multidimensional distributional information in the triaxial accelerometer data.

Citations: 0
Issue Information: Biometrical Journal 7'24
IF 1.3; Zone 3 (Biology); Q4 MATHEMATICAL & COMPUTATIONAL BIOLOGY; Pub Date: 2024-09-17; DOI: 10.1002/bimj.202470007
Citations: 0
Investigating the Heterogeneity of “Study Twins”
IF 1.3; Zone 3 (Biology); Q4 MATHEMATICAL & COMPUTATIONAL BIOLOGY; Pub Date: 2024-09-02; DOI: 10.1002/bimj.202300387
Christian Röver, Tim Friede

Meta-analyses are commonly performed based on random-effects models, while in certain cases one might also argue in favor of a common-effect model. One such case may be given by the example of two “study twins” that are performed according to a common (or at least very similar) protocol. Here we investigate the particular case of meta-analysis of a pair of studies, for example, summarizing the results of two confirmatory clinical trials in phase III of a clinical development program. Thereby, we focus on the question of to what extent homogeneity or heterogeneity may be discernible and include an empirical investigation of published (“twin”) pairs of studies. A pair of estimates from two studies only provide very little evidence of homogeneity or heterogeneity of effects, and ad hoc decision criteria may often be misleading.
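
A small worked sketch may help show why two estimates carry so little information about heterogeneity. It computes the inverse-variance pooled estimate, Cochran's Q (which has a single degree of freedom when there are two studies), and the DerSimonian-Laird moment estimate of the between-study variance. The effect estimates, standard errors, and the function name `pair_meta` are hypothetical, and these standard textbook formulas are not necessarily the analysis used in the article.

```python
def pair_meta(y1, se1, y2, se2):
    """Common-effect pooling and heterogeneity statistics for two studies."""
    w1, w2 = 1.0 / se1 ** 2, 1.0 / se2 ** 2
    pooled = (w1 * y1 + w2 * y2) / (w1 + w2)
    # Cochran's Q; for k = 2 it equals (y1 - y2)^2 / (se1^2 + se2^2).
    q = w1 * (y1 - pooled) ** 2 + w2 * (y2 - pooled) ** 2
    # DerSimonian-Laird estimate with k - 1 = 1 degree of freedom,
    # truncated at zero.
    scale = (w1 + w2) - (w1 ** 2 + w2 ** 2) / (w1 + w2)
    tau2 = max(0.0, (q - 1.0) / scale)
    return pooled, q, tau2


# Hypothetical "twin" trials with similar estimates.
print(pair_meta(0.5, 0.2, 0.3, 0.2))
# Hypothetical "twin" trials with clearly different estimates.
print(pair_meta(1.0, 0.2, 0.0, 0.2))
```

In the first call the variance estimate is truncated to zero, yet a single squared difference cannot rule out substantial heterogeneity, which is consistent with the abstract's caution about ad hoc decision criteria.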

Citations: 0
False Discovery Rate Control for Lesion-Symptom Mapping With Heterogeneous Data via Weighted p-Values
IF 1.3; Zone 3 (Biology); Q4 MATHEMATICAL & COMPUTATIONAL BIOLOGY; Pub Date: 2024-08-20; DOI: 10.1002/bimj.202300198
Siyu Zheng, Alexander C. McLain, Joshua Habiger, Christopher Rorden, Julius Fridriksson

Lesion-symptom mapping studies provide insight into what areas of the brain are involved in different aspects of cognition. This is commonly done via behavioral testing in patients with a naturally occurring brain injury or lesions (e.g., strokes or brain tumors). This results in high-dimensional observational data where lesion status (present/absent) is nonuniformly distributed, with some voxels having lesions in very few (or no) subjects. In this situation, mass univariate hypothesis tests have severe power heterogeneity where many tests are known a priori to have little to no power. Recent advancements in multiple testing methodologies allow researchers to weigh hypotheses according to side information (e.g., information on power heterogeneity). In this paper, we propose the use of p-value weighting for voxel-based lesion-symptom mapping studies. The weights are created using the distribution of lesion status and spatial information to estimate different non-null prior probabilities for each hypothesis test through some common approaches. We provide a monotone minimum weight criterion, which requires minimum a priori power information. Our methods are demonstrated on dependent simulated data and an aphasia study investigating which regions of the brain are associated with the severity of language impairment among stroke survivors. The results demonstrate that the proposed methods have robust error control and can increase power. Further, we showcase how weights can be used to identify regions that are inconclusive due to lack of power.
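As a rough illustration of the general idea of p-value weighting, the sketch below implements a standard weighted Benjamini-Hochberg procedure: each p-value is divided by a nonnegative weight (weights rescaled to average 1), and the usual step-up rule is applied to the weighted values. This is a generic textbook-style procedure, not the authors' monotone minimum weight criterion; the function name `weighted_bh` and the example numbers are assumptions.

```python
def weighted_bh(pvals, weights, alpha=0.05):
    """Weighted Benjamini-Hochberg step-up procedure (illustrative sketch).

    pvals   : list of p-values, one per hypothesis (e.g., per voxel)
    weights : nonnegative weights; rescaled here so that they average 1
    Returns a list of booleans, True where the hypothesis is rejected.
    """
    m = len(pvals)
    mean_w = sum(weights) / m
    w = [x / mean_w for x in weights]
    # Weighted p-values; a zero weight means the test can never be rejected.
    q = [min(1.0, p / wi) if wi > 0 else 1.0 for p, wi in zip(pvals, w)]
    order = sorted(range(m), key=lambda i: q[i])
    # Find the largest rank k with q_(k) <= alpha * k / m.
    k = 0
    for rank, i in enumerate(order, start=1):
        if q[i] <= alpha * rank / m:
            k = rank
    rejected = [False] * m
    for i in order[:k]:
        rejected[i] = True
    return rejected


# Made-up example: the third test is up-weighted (more prior power),
# the fourth is given zero weight (known a priori to have no power).
pvals = [0.001, 0.02, 0.04, 0.5]
print(weighted_bh(pvals, [1, 1, 1, 1]))  # unweighted BH
print(weighted_bh(pvals, [1, 1, 2, 0]))  # weighted BH
```

In this toy example, the unweighted procedure rejects the first two hypotheses, while shifting weight from the powerless fourth test to the third lets the third be rejected as well.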

Citations: 0
Random Survival Forests With Competing Events: A Subdistribution-Based Imputation Approach
IF 1.3 Zone 3 (Biology) Q4 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date: 2024-08-20 DOI: 10.1002/bimj.202400014
Charlotte Behning, Alexander Bigerl, Marvin N. Wright, Peggy Sekula, Moritz Berger, Matthias Schmid

Random survival forests (RSF) can be applied to many time-to-event research questions and are particularly useful in situations where the relationship between the independent variables and the event of interest is rather complex. However, in many clinical settings, the occurrence of the event of interest is affected by competing events, which means that a patient can experience an outcome other than the event of interest. Neglecting the competing event (i.e., regarding competing events as censoring) will typically result in biased estimates of the cumulative incidence function (CIF). A popular approach for competing events is Fine and Gray's subdistribution hazard model, which directly estimates the CIF by fitting a single-event model defined on a subdistribution timescale. Here, we integrate concepts from the subdistribution hazard modeling approach into the RSF. We develop several imputation strategies that use weights as in a discrete-time subdistribution hazard model to impute censoring times in cases where a competing event is observed. Our simulations show that the CIF is well estimated if the imputation already takes place outside the forest on the overall dataset. Especially in settings with a low rate of the event of interest or a high censoring rate, competing events must not be neglected, that is, treated as censoring. When applied to a real-world epidemiological dataset on chronic kidney disease, the imputation approach resulted in highly plausible predictor–response relationships and CIF estimates of renal events.
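The bias from treating competing events as censoring can be seen in a small, self-contained sketch. The code below compares the nonparametric Aalen-Johansen estimate of the CIF with the naive "one minus Kaplan-Meier" estimate obtained when competing events are recoded as censored. It does not reproduce the authors' imputation strategies or any random forest; the function name `cif_estimates`, the status coding (0 = censored, 1 = event of interest, 2 = competing event), and the data are assumptions for illustration.

```python
def cif_estimates(times, status):
    """Aalen-Johansen CIF vs. naive 1 - KM for the event of interest.

    times  : observed times
    status : 0 = censored, 1 = event of interest, 2 = competing event
    Returns a list of (time, aalen_johansen_cif, naive_one_minus_km).
    """
    n_at_risk = len(times)
    surv_all = 1.0   # probability of being free of any event just before t
    km_naive = 1.0   # Kaplan-Meier treating competing events as censored
    cif = 0.0
    out = []
    for t in sorted(set(times)):
        at_t = [s for ti, s in zip(times, status) if ti == t]
        d1 = at_t.count(1)  # events of interest at time t
        d2 = at_t.count(2)  # competing events at time t
        if n_at_risk > 0:
            cif += surv_all * d1 / n_at_risk
            surv_all *= 1.0 - (d1 + d2) / n_at_risk
            km_naive *= 1.0 - d1 / n_at_risk
        n_at_risk -= len(at_t)  # everyone observed at t leaves the risk set
        out.append((t, cif, 1.0 - km_naive))
    return out


# Made-up data: four subjects, no censoring, two experience the event
# of interest and two experience the competing event.
for t, aj, naive in cif_estimates([1, 2, 3, 4], [2, 1, 2, 1]):
    print(f"t = {t}: Aalen-Johansen = {aj:.3f}, naive 1 - KM = {naive:.3f}")
```

In this toy dataset, half of the subjects experience the event of interest, and the Aalen-Johansen estimate ends at 0.5. The naive estimate climbs to 1.0, because removing subjects with a competing event from the risk set implicitly assumes they could still experience the event of interest later.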

Citations: 0