Biometrical Journal: Latest Articles

Conditional Variable Screening for Ultra-High Dimensional Longitudinal Data With Time Interactions
IF 1.3, Zone 3 (Biology), Q4 MATHEMATICAL & COMPUTATIONAL BIOLOGY, Pub Date: 2024-11-23, DOI: 10.1002/bimj.70005
Andrea Bratsberg, Abhik Ghosh, Magne Thoresen

In recent years, we have been able to gather large amounts of genomic data at a fast rate, creating situations where the number of variables greatly exceeds the number of observations. In these situations, most models that can handle a moderately high dimension will now become computationally infeasible or unstable. Hence, there is a need for a prescreening of variables to reduce the dimension efficiently and accurately to a more moderate scale. There has been much work to develop such screening procedures for independent outcomes. However, much less work has been done for high-dimensional longitudinal data in which the observations can no longer be assumed to be independent. In addition, it is of interest to capture possible interactions between the genomic variable and time in many of these longitudinal studies. In this work, we propose a novel conditional screening procedure that ranks variables according to the likelihood value at the maximum likelihood estimates in a marginal linear mixed model, where the genomic variable and its interaction with time are included in the model. This is to our knowledge the first conditional screening approach for clustered data. We prove that this approach enjoys the sure screening property, and assess the finite sample performance of the method through simulations.
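The ranking step described in the abstract can be written out as follows. This is a sketch in illustrative notation: every symbol below (the conditioning design matrix X_i, the random-effects design Z_i, the retained model size d) is an assumption introduced here for readability and is not taken from the paper.

```latex
% Illustrative notation only; all symbols are assumptions, not the paper's.
% Marginal linear mixed model for candidate variable j in subject i,
% with visit times t_i and conditioning covariates X_i:
\[
  y_i = X_i \alpha + g_{ij} \mathbf{1} \beta_j + g_{ij}\, t_i\, \gamma_j + Z_i b_i + \varepsilon_i,
  \qquad b_i \sim N(0, D), \quad \varepsilon_i \sim N(0, \sigma^2 I).
\]
% Screening statistic: the log-likelihood evaluated at the maximum likelihood estimates.
\[
  \hat{\ell}_j = \max_{\alpha,\, \beta_j,\, \gamma_j,\, D,\, \sigma^2}
  \ell_j(\alpha, \beta_j, \gamma_j, D, \sigma^2), \qquad j = 1, \dots, p.
\]
% Keep the d variables with the largest likelihood values.
\[
  \widehat{\mathcal{M}} = \bigl\{\, j : \hat{\ell}_j \text{ is among the } d \text{ largest of }
  \hat{\ell}_1, \dots, \hat{\ell}_p \,\bigr\}.
\]
```

Here the term in \beta_j is the main effect of the genomic variable and the term in \gamma_j is its interaction with time, so a variable can be retained because of either effect.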

Biometrical Journal, Volume 66, Issue 8. Journal Article. PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1002/bimj.70005
Citations: 0
Incompletely Observed Nonparametric Factorial Designs With Repeated Measurements: A Wild Bootstrap Approach
IF 1.3, Zone 3 (Biology), Q4 MATHEMATICAL & COMPUTATIONAL BIOLOGY, Pub Date: 2024-11-23, DOI: 10.1002/bimj.70008
Lubna Amro, Frank Konietschke, Markus Pauly

In many life science experiments or medical studies, subjects are repeatedly observed and measurements are collected in factorial designs with multivariate data. The analysis of such multivariate data is typically based on multivariate analysis of variance (MANOVA) or mixed models, which require complete data and certain assumptions on the underlying parametric distribution, such as continuity or a specific covariance structure (for example, compound symmetry). However, these methods are usually not applicable when discrete data or even ordered categorical data are present. In such cases, nonparametric rank-based methods that do not require stringent distributional assumptions are the preferred choice. However, in the multivariate case, most rank-based approaches have only been developed for complete observations. The aim of this work is to develop asymptotically correct procedures that can handle missing values, allow for singular covariance matrices, and are applicable to ordinal or ordered categorical data. This is achieved by applying a wild bootstrap procedure in combination with quadratic form-type test statistics. In addition to a proof of their asymptotic correctness, extensive simulation studies validate their applicability for small samples. Finally, two real data examples are analyzed.
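To make the wild bootstrap idea concrete, the following minimal sketch tests whether a sample has mean zero by using Rademacher (random sign) multipliers and a one-dimensional quadratic-form statistic. It is a deliberately simplified stand-in: it is not rank-based, it does not handle missing values or factorial designs, and the function name `wild_bootstrap_p_value` is hypothetical rather than taken from the paper.

```python
import random


def wild_bootstrap_p_value(data, n_boot=2000, seed=0):
    """Wild-bootstrap p-value for H0: mean == 0 (simplified illustration)."""
    rng = random.Random(seed)
    n = len(data)
    mean = sum(data) / n
    observed = n * mean ** 2                 # one-dimensional quadratic-form statistic
    centered = [x - mean for x in data]      # centering imposes H0 in the bootstrap world
    exceed = 0
    for _ in range(n_boot):
        # Rademacher multipliers: each centered value keeps or flips its sign.
        signs = [rng.choice((-1.0, 1.0)) for _ in range(n)]
        boot_mean = sum(s * c for s, c in zip(signs, centered)) / n
        if n * boot_mean ** 2 >= observed:
            exceed += 1
    # The +1 correction keeps the p-value strictly positive.
    return (exceed + 1) / (n_boot + 1)


shifted = [5.0 + 0.1 * i for i in range(10)]   # clearly nonzero mean
p_shifted = wild_bootstrap_p_value(shifted, n_boot=200)
```

Because the multipliers act on the centered observations, the resampled statistics mimic the null distribution even when the data themselves are far from it, which is why the shifted sample above yields the smallest attainable p-value.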

Biometrical Journal, Volume 66, Issue 8. Journal Article. PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1002/bimj.70008
Citations: 0
Addressing Class Imbalance in Bayesian Classification Through Posterior Probability Adjustment
IF 1.3, Zone 3 (Biology), Q4 MATHEMATICAL & COMPUTATIONAL BIOLOGY, Pub Date: 2024-11-18, DOI: 10.1002/bimj.70004
Vahid Nassiri, Fetene Tekle, Kanaka Tatikola, Helena Geys

Class imbalance is a known issue in classification tasks that can lead to predictive bias toward dominant classes. This paper introduces a novel, straightforward Bayesian framework that adjusts posterior probabilities to counteract the bias introduced by imbalanced data sets. Instead of relying on the mean posterior distribution of class probabilities, we propose a method that scales the posterior probability of each class according to its representation in the training data.
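One simple way to scale class posteriors by training-set representation is to divide each posterior by the class's share of the training data and renormalize. The sketch below illustrates that idea only. The abstract does not state the exact scaling rule, so the inverse-frequency rule and the function name `adjust_posteriors` are assumptions.

```python
def adjust_posteriors(posteriors, class_counts):
    """Rescale class posteriors by inverse training-set frequency, then renormalize.

    Assumed rule for illustration; the paper's actual adjustment may differ.
    """
    total = sum(class_counts.values())
    # Divide each posterior by the class's share of the training data.
    scaled = {c: p / (class_counts[c] / total) for c, p in posteriors.items()}
    norm = sum(scaled.values())
    return {c: s / norm for c, s in scaled.items()}


posteriors = {"majority": 0.7, "minority": 0.3}
counts = {"majority": 900, "minority": 100}
adjusted = adjust_posteriors(posteriors, counts)
```

In this example the unadjusted classifier favors the majority class, while the adjusted probabilities favor the minority class, because a posterior of 0.3 is large relative to a class that makes up only 10% of the training data.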

Biometrical Journal, Volume 66, Issue 8. Journal Article.
Citations: 0
Inverse-Weighted Quantile Regression With Partially Interval-Censored Data
IF 1.3, Zone 3 (Biology), Q4 MATHEMATICAL & COMPUTATIONAL BIOLOGY, Pub Date: 2024-11-14, DOI: 10.1002/bimj.70001
Yeji Kim, Taehwa Choi, Seohyeon Park, Sangbum Choi, Dipankar Bandyopadhyay

This paper introduces a novel approach to estimating censored quantile regression using inverse probability of censoring weighted (IPCW) methodology, specifically tailored for data sets featuring partially interval-censored data. Such data sets, often encountered in HIV/AIDS and cancer biomedical research, may include doubly censored (DC) and partly interval-censored (PIC) endpoints. DC responses involve either left-censoring or right-censoring alongside some exact failure time observations, while PIC responses are subject to interval-censoring. Despite the existence of complex estimating techniques for interval-censored quantile regression, we propose a simple and intuitive IPCW-based method, easily implementable by assigning suitable inverse-probability weights to subjects with exact failure time observations. The resulting estimator exhibits asymptotic properties, such as uniform consistency and weak convergence, and we explore an augmented-IPCW (AIPCW) approach to enhance efficiency. In addition, our method can be adapted for multivariate partially interval-censored data. Simulation studies demonstrate the new procedure's strong finite-sample performance. We illustrate the practical application of our approach through an analysis of progression-free survival endpoints in a phase III clinical trial focusing on metastatic colorectal cancer.
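The weighting idea can be illustrated in the simplest setting: right-censored data and a marginal quantile with no covariates. In the sketch below, subjects with an exact event time receive weight 1/G(T-), where G is a Kaplan-Meier estimate of the censoring survival function. This is a simplification, not the paper's estimator, which targets doubly and partly interval-censored responses in a regression model. The helper names are hypothetical.

```python
def censoring_survival(times, events):
    """Return g(t) = Kaplan-Meier estimate of P(C >= t), treating censorings as events."""
    def g_before(t):
        surv = 1.0
        for u in sorted(set(times)):
            if u >= t:
                break
            at_risk = sum(1 for s in times if s >= u)
            censored = sum(1 for s, d in zip(times, events) if s == u and d == 0)
            surv *= 1.0 - censored / at_risk
        return surv
    return g_before


def ipcw_quantile(times, events, tau):
    """Weighted tau-quantile using only exactly observed event times."""
    g = censoring_survival(times, events)
    # Only subjects with an exact event time (d == 1) get a nonzero weight.
    weighted = sorted((t, 1.0 / g(t)) for t, d in zip(times, events) if d == 1)
    total = len(times)
    cumulative = 0.0
    for t, w in weighted:
        cumulative += w / total          # estimated P(T <= t)
        if cumulative >= tau:
            return t
    return None                          # quantile not identifiable from the data


times = [1, 2, 3, 4, 5]
events = [1, 0, 1, 1, 0]                 # 1 = exact event time, 0 = right-censored
median = ipcw_quantile(times, events, 0.5)
```

The cumulative weighted mass reproduces the Kaplan-Meier estimate of the event-time distribution in this toy data set, which is the sense in which the inverse-probability weights compensate for the censored subjects.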

Biometrical Journal, Volume 66, Issue 8. Journal Article. PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1002/bimj.70001
Citations: 0
Mixture Cure Semiparametric Accelerated Failure Time Models With Partly Interval-Censored Data
IF 1.3, Zone 3 (Biology), Q4 MATHEMATICAL & COMPUTATIONAL BIOLOGY, Pub Date: 2024-11-07, DOI: 10.1002/bimj.202300203
Isabel Li, Jun Ma, Benoit Liquet

In practical survival analysis, a patient may experience no event even after a long waiting period, which means a portion of the population may never experience the event of interest. Under this circumstance, one remedy is to adopt a mixture cure Cox model to analyze the survival data. However, if the survival times clearly exhibit an acceleration (or deceleration) factor, then an accelerated failure time (AFT) model is preferred, leading to a mixture cure AFT model. In this paper, we consider a penalized likelihood method to estimate the mixture cure semiparametric AFT models, where the unknown baseline hazard is approximated using Gaussian basis functions. We allow partly interval-censored survival data, which can include event times and left-, right-, and interval-censoring times. The penalty function helps to achieve a smooth estimate of the baseline hazard function. We also provide asymptotic properties of the estimates so that inferences can be made on regression parameters and hazard-related quantities. Simulation studies are conducted to evaluate the model performance, including a comparative study with an existing method from the smcure R package. The results show that our proposed penalized likelihood method has acceptable performance in general and produces less bias than smcure when faced with the identifiability issue. To illustrate the application of our method, a real case study involving melanoma recurrence is conducted and reported. Our model is implemented in our R package aftQnp which is available from https://github.com/Isabellee4555/aftQnP.
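The structure of a mixture cure AFT model can be summarized in a few formulas. The notation is illustrative and several choices are assumptions not confirmed by the abstract: the logistic form for the susceptibility probability, the symbols x and z for the two covariate vectors, and the squared-second-derivative roughness penalty.

```latex
% Illustrative notation; the logistic link and the penalty form are assumptions.
% Population survival: a fraction 1 - \pi(z) is cured and never experiences the event.
\[
  S_{\mathrm{pop}}(t \mid x, z) = 1 - \pi(z) + \pi(z)\, S_u(t \mid x),
  \qquad \pi(z) = \frac{\exp(z^\top \gamma)}{1 + \exp(z^\top \gamma)}.
\]
% AFT structure for the susceptible subjects: covariates rescale time.
\[
  S_u(t \mid x) = S_0\bigl(t\, e^{-x^\top \beta}\bigr),
  \qquad h_u(t \mid x) = h_0\bigl(t\, e^{-x^\top \beta}\bigr)\, e^{-x^\top \beta}.
\]
% Baseline hazard approximated by nonnegative combinations of Gaussian basis functions.
\[
  h_0(s) \approx \sum_{k=1}^{K} \theta_k\, \phi_k(s), \qquad \theta_k \ge 0.
\]
% Penalized likelihood: \lambda trades off fit against smoothness of h_0.
\[
  (\hat{\beta}, \hat{\gamma}, \hat{\theta}) =
  \arg\max_{\beta, \gamma, \theta}
  \Bigl\{ \ell(\beta, \gamma, \theta) - \lambda \int h_0''(s)^2 \, ds \Bigr\}.
\]
```

As t grows, S_u tends to zero and the population survival levels off at 1 - \pi(z), the cured fraction. This plateau is what distinguishes a cure model from a standard AFT model.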

Biometrical Journal, Volume 66, Issue 8. Journal Article.
Citations: 0
The Replication of Equivalence Studies
IF 1.3, Zone 3 (Biology), Q4 MATHEMATICAL & COMPUTATIONAL BIOLOGY, Pub Date: 2024-10-29, DOI: 10.1002/bimj.202300232
Charlotte Micheloud, Leonhard Held

Replication studies are increasingly conducted to assess the credibility of scientific findings. Most of these replication attempts target studies with a superiority design, but there is a lack of methodology regarding the analysis of replication studies with alternative types of designs, such as equivalence. In order to fill this gap, we propose two approaches, the two-trials rule and the sceptical two one-sided tests (TOST) procedure, adapted from methods used in superiority settings. Both methods have the same overall Type-I error rate, but the sceptical TOST procedure allows replication success even for nonsignificant original or replication studies. This leads to a larger project power and other differences in relevant operating characteristics. Both methods can be used for sample size calculation of the replication study, based on the results from the original one. The two methods are applied to data from the Reproducibility Project: Cancer Biology.
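As a rough illustration of the simpler of the two approaches, the sketch below computes a TOST p-value for each study under a normal approximation and then applies a two-trials rule that requires both studies to be significant. The equivalence margin, the significance level, the numerical inputs, and the function names are all assumptions chosen for the example. The sceptical TOST procedure is not shown.

```python
from statistics import NormalDist


def tost_p_value(estimate, se, margin):
    """TOST p-value for H0: |effect| >= margin (normal approximation)."""
    z = NormalDist()
    p_lower = 1.0 - z.cdf((estimate + margin) / se)   # H0: effect <= -margin
    p_upper = z.cdf((estimate - margin) / se)         # H0: effect >= +margin
    # Equivalence is claimed only if both one-sided tests reject.
    return max(p_lower, p_upper)


def two_trials_rule(p_original, p_replication, alpha=0.05):
    """Replication success requires significance in both studies (alpha is an assumed level)."""
    return p_original <= alpha and p_replication <= alpha


# Hypothetical effect estimates and standard errors, margin of 0.3.
p_orig = tost_p_value(estimate=0.05, se=0.10, margin=0.3)
p_rep = tost_p_value(estimate=0.10, se=0.15, margin=0.3)
success = two_trials_rule(p_orig, p_rep)
```

In this example the original study establishes equivalence but the less precise replication does not, so the two-trials rule declares no replication success. Cases like this are where, according to the abstract, the sceptical TOST procedure can behave differently.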

Biometrical Journal, Volume 66, Issue 8. Journal Article. PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1002/bimj.202300232
Citations: 0
Group Integrative Dynamic Factor Models With Application to Multiple Subject Brain Connectivity
IF 1.3, Zone 3 (Biology), Q4 MATHEMATICAL & COMPUTATIONAL BIOLOGY, Pub Date: 2024-10-29, DOI: 10.1002/bimj.202300370
Younghoon Kim, Zachary F. Fisher, Vladas Pipiras

This work introduces a novel framework for dynamic factor model-based group-level analysis of multi-subject time-series data, called GRoup Integrative DYnamic factor (GRIDY) models. The framework identifies and characterizes intersubject similarities and differences between two predetermined groups by considering a combination of group spatial information and individual temporal dynamics. Furthermore, it enables the identification of intrasubject similarities and differences over time by employing different model configurations for each subject. Methodologically, the framework combines a novel principal angle-based rank selection algorithm and a noniterative integrative analysis framework. Inspired by simultaneous component analysis, this approach also reconstructs identifiable latent factor series with flexible covariance structures. The performance of the GRIDY models is evaluated through simulations conducted under various scenarios. An application is also presented to compare resting-state functional MRI data collected from multiple subjects in autism spectrum disorder and control groups.

Biometrical Journal, Volume 66, Issue 8. Journal Article.
Citations: 0
Cross-Cohort Mixture Analysis: A Data Integration Approach With Applications on Gestational Age and DNA-Methylation-Derived Gestational Age Acceleration Metrics
IF 1.3, Zone 3 (Biology), Q4 MATHEMATICAL & COMPUTATIONAL BIOLOGY, Pub Date: 2024-10-29, DOI: 10.1002/bimj.202300270
Elena Colicino, Roberto Ascari, Hachem Saddiki, Francheska Merced-Nieves, Nicolò Foppa Pedretti, Kathi Huddleston, Robert O Wright, Rosalind J Wright, Program Collaborators for Environmental Influences on Child Health Outcomes

Data integration of multiple studies can provide enhanced exposure contrast and statistical power to examine associations between environmental exposure mixtures and health outcomes. Extant research has combined populations and identified an overall mixture–outcome association, without accounting for differences across studies. We extended the Bayesian Weighted Quantile Sum (BWQS) regression to a hierarchical framework to analyze mixtures across cohorts. The hierarchical BWQS (HBWQS) approach aggregates sample size of multiple cohorts to calculate an overall mixture index, thereby identifying the most harmful exposure(s) across cohorts; and provides cohort-specific associations between the overall mixture index and the outcome. We showed results from 10 simulated scenarios including four mixture components in three, eight, and ten populations, and two real-case examples on the association between prenatal metal mixture exposure—comprising arsenic, cadmium, and lead—and both gestational age and epigenetic-derived gestational age acceleration metrics. Simulated scenarios showed good empirical coverage and little bias for all HBWQS-estimated parameters. The Watanabe–Akaike information criterion showed a better average performance for the HBWQS regression than the BWQS across scenarios. HBWQS results incorporating cohorts within the national Environmental influences on Child Health Outcomes (ECHO) program from three different sites showed that the environmental mixture was negatively associated with gestational age in a single site. The HBWQS approach facilitates the combination of multiple cohorts and accounts for individual cohort differences in mixture analyses. HBWQS findings can be used to develop regulations, policies, and interventions regarding multiple co-occurring environmental exposures and it will maximize the use of extant publicly available data.
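The mixture index at the core of weighted quantile sum regression can be sketched as follows: each exposure is converted to a quantile score, and the scores are combined with nonnegative weights that sum to one. The rank-based scoring, the fixed weights, and the exposure values below are assumptions for illustration. In the Bayesian versions described above the weights are estimated rather than supplied, and the hierarchical extension additionally allows the index-outcome slope to differ by cohort.

```python
def quantile_scores(values, n_quantiles=4):
    """Map each value to a score 0..n_quantiles-1 from its rank (ties broken by order)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    scores = [0] * len(values)
    for rank, i in enumerate(order):
        scores[i] = rank * n_quantiles // len(values)
    return scores


def mixture_index(exposures, weights, n_quantiles=4):
    """Weighted sum of quantile-scored exposures; weights must sum to 1."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    scored = {k: quantile_scores(v, n_quantiles) for k, v in exposures.items()}
    n = len(next(iter(scored.values())))
    return [sum(weights[k] * scored[k][i] for k in scored) for i in range(n)]


# Made-up exposure values for four subjects and assumed (not estimated) weights.
exposures = {
    "arsenic": [0.2, 1.5, 0.7, 3.1],
    "cadmium": [0.05, 0.01, 0.09, 0.03],
    "lead": [1.1, 0.4, 2.2, 0.9],
}
weights = {"arsenic": 0.5, "cadmium": 0.2, "lead": 0.3}
index = mixture_index(exposures, weights)
```

The component with the largest weight contributes most to the index, which is how the approach singles out the most harmful exposure once the weights are estimated from data.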

Biometrical Journal, Volume 66, Issue 8. Journal Article.
Citations: 0
High-Dimensional Bayesian Semiparametric Models for Small Samples: A Principled Approach to the Analysis of Cytokine Expression Data
IF 1.3, Zone 3 (Biology), Q4 MATHEMATICAL & COMPUTATIONAL BIOLOGY, Pub Date: 2024-10-29, DOI: 10.1002/bimj.70000
Giovanni Poli, Raffaele Argiento, Amedeo Amedei, Francesco C. Stingo

In laboratory medicine, because samples and resources are scarce, measurements of many quantities of interest are commonly collected on only a few samples, making statistical inference particularly challenging. In this context, several hypotheses can be tested, and studies are often not powered accordingly. We present a semiparametric Bayesian approach for testing multiple hypotheses and apply it to an experiment that aims to identify cytokines involved in Crohn's disease (CD) infection that may be ongoing in multiple tissues. We assume that the positive correlation commonly observed between cytokines is caused by latent groups of effects, which in turn result from a common cause. These clusters are modeled through a Dirichlet process (DP), one of the most popular nonparametric priors in Bayesian statistics and a proven tool for model-based clustering. We use a spike–slab distribution as the base measure of the DP. The nonparametric part is included in an additive model whose parametric component is a Bayesian hierarchical model. We include simulations that empirically demonstrate the effectiveness of the proposed testing procedure in settings that mimic our application's sample size and data structure. Our CD data analysis shows strong evidence of a cytokine gradient in the external intestinal tissue.
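To make the prior structure named in the abstract more concrete, the following is a minimal sketch of drawing effects from a Dirichlet process whose base measure is a spike-and-slab distribution, using the Pólya urn representation. It is not the authors' model or code: the function names, the concentration parameter `alpha`, the spike probability, and the normal slab are all illustrative assumptions.

```python
import random


def draw_base(rng, spike_prob=0.5, slab_sd=1.0):
    """One draw from an assumed spike-and-slab base measure:
    exactly 0 with probability spike_prob, otherwise N(0, slab_sd^2)."""
    if rng.random() < spike_prob:
        return 0.0
    return rng.gauss(0.0, slab_sd)


def polya_urn_effects(n_effects, alpha=1.0, spike_prob=0.5, slab_sd=1.0, seed=0):
    """Draw n_effects values from a DP prior via the Polya urn scheme.

    Effect i (0-based) is a fresh draw from the base measure with
    probability alpha / (alpha + i); otherwise it copies one of the
    earlier effects chosen uniformly, which produces ties (clusters).
    """
    rng = random.Random(seed)
    effects = []
    for i in range(n_effects):
        if rng.random() < alpha / (alpha + i):
            effects.append(draw_base(rng, spike_prob, slab_sd))
        else:
            effects.append(rng.choice(effects))
    return effects


if __name__ == "__main__":
    sample = polya_urn_effects(12, alpha=1.0, seed=1)
    print("effects:", [round(e, 3) for e in sample])
    print("distinct values:", len(set(sample)))
```

Effects that share a value belong to the same latent group, and effects equal to zero correspond to the spike, that is, to null hypotheses. A small `alpha` yields few large clusters; a large `alpha` yields many small ones.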

Biometrical Journal, volume 66, issue 8. Open-access PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1002/bimj.70000
Citations: 0
Estimating the Sampling Distribution of Posterior Decision Summaries in Bayesian Clinical Trials
IF 1.3, Zone 3 (Biology), Q4 MATHEMATICAL & COMPUTATIONAL BIOLOGY, Pub Date: 2024-10-29, DOI: 10.1002/bimj.70002
Shirin Golchi, James J. Willard

Bayesian inference and the use of posterior or posterior predictive probabilities for decision making have become increasingly popular in clinical trials. Current practice in Bayesian clinical trials relies on a hybrid Bayesian-frequentist approach in which the design and decision criteria are assessed with respect to frequentist operating characteristics, such as power and type I error rate, conditional on a given set of parameters. These operating characteristics are commonly obtained via simulation studies. The utility of Bayesian measures, such as “assurance,” that incorporate uncertainty about model parameters in estimating the probabilities of various decisions in trials has been demonstrated. However, the computational burden remains an obstacle to wider use of such criteria. In this article, we propose a methodology that uses large-sample theory of the posterior distribution to define parametric models for the sampling distribution of the posterior summaries used for decision making. The parameters of these models are estimated using a small number of simulation scenarios, thereby refining the models to capture the sampling distribution for small to moderate sample sizes. The proposed approach to the assessment of conditional and marginal operating characteristics and to sample size determination can be considered simulation-assisted rather than simulation-based. It enables formal incorporation of uncertainty about the trial assumptions via a design prior and significantly reduces the computational burden for the design of Bayesian trials in general.
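For orientation, here is a minimal sketch of the conventional simulation-based assessment that the abstract describes as current practice, not the simulation-assisted method the authors propose. It assumes a toy one-arm trial with a normal outcome of known standard deviation, a normal prior on the mean effect, and a decision rule that declares success when the posterior probability of a positive effect exceeds a threshold. All names, priors, and default values are hypothetical.

```python
import random
from statistics import NormalDist


def posterior_prob_positive(ybar, n, sigma=1.0, prior_mean=0.0, prior_sd=10.0):
    """P(theta > 0 | data) for a normal likelihood with known sigma
    and a conjugate normal prior on theta."""
    prior_prec = 1.0 / prior_sd ** 2
    data_prec = n / sigma ** 2
    post_var = 1.0 / (prior_prec + data_prec)
    post_mean = post_var * (prior_prec * prior_mean + data_prec * ybar)
    return 1.0 - NormalDist(post_mean, post_var ** 0.5).cdf(0.0)


def prob_success(theta, n, threshold=0.975, n_sims=5000, sigma=1.0, seed=0):
    """Monte Carlo estimate of the probability that the trial declares
    success when the true effect is theta (power if theta > 0,
    type I error rate if theta == 0)."""
    rng = random.Random(seed)
    se = sigma / n ** 0.5
    hits = 0
    for _ in range(n_sims):
        ybar = rng.gauss(theta, se)  # simulate the sample mean directly
        if posterior_prob_positive(ybar, n, sigma) > threshold:
            hits += 1
    return hits / n_sims


if __name__ == "__main__":
    print("type I error (theta = 0):", prob_success(0.0, n=50))
    print("power (theta = 0.5):", prob_success(0.5, n=50))
```

Each operating characteristic here costs thousands of simulated trials per parameter value and sample size, and averaging over a design prior to obtain assurance multiplies that cost again. This is the computational burden the abstract refers to.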

Biometrical Journal, volume 66, issue 8. Open-access PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1002/bimj.70002
Citations: 0