
Latest Publications in Biometrical Journal

Years of Life Lost to COVID-19 and Related Mortality Indicators: An Illustration in 30 Countries
IF 1.3 · CAS Tier 3 (Biology) · Q4 Mathematical & Computational Biology · Pub Date: 2024-07-13 · DOI: 10.1002/bimj.202300386
Valentin Rousson, Isabella Locatelli

The concept of (potential) years of life lost is a measure of premature mortality that can be used to compare the impacts of different specific causes of death. However, interpreting a given number of years of life lost at face value is problematic because of the lack of a sensible reference value. In this paper, we propose three denominators by which to divide the excess years of life lost, thus obtaining three indicators, called average life lost, increase of life lost, and proportion of life lost, which should facilitate interpretation and comparisons. We study the links between these three indicators and classical mortality indicators, such as life expectancy and the standardized mortality rate, introduce the concept of a weighted standardized mortality rate, and calculate them in 30 countries to assess the impact of COVID-19 on mortality in the year 2020. Using any of the three indicators, a significant excess loss is found for both genders in 18 of the 30 countries.
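The three ratio indicators differ only in the denominator applied to the excess years of life lost. A minimal sketch of that arithmetic, with hypothetical denominators (excess deaths, population size, and expected years of life lost; the paper's exact definitions may differ):

```python
# Hypothetical sketch: turning an excess years-of-life-lost (YLL) figure into
# ratio indicators by choosing a denominator. The denominators below are
# illustrative assumptions, not the paper's exact definitions.

def excess_yll(observed_yll, expected_yll):
    """Excess years of life lost relative to a reference level."""
    return observed_yll - expected_yll

def average_life_lost(excess, n_excess_deaths):
    """Excess YLL per excess death (years per death)."""
    return excess / n_excess_deaths

def increase_of_life_lost(excess, population):
    """Excess YLL per person in the population (years per capita)."""
    return excess / population

def proportion_of_life_lost(excess, total_expected_yll):
    """Excess YLL as a fraction of the expected YLL."""
    return excess / total_expected_yll
```

Each indicator answers a different "compared to what?" question, which is what makes the raw YLL figure interpretable.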

Citations: 0
Robust Regression Techniques for Multiple Method Comparison and Transformation
IF 1.3 · CAS Tier 3 (Biology) · Q4 Mathematical & Computational Biology · Pub Date: 2024-07-13 · DOI: 10.1002/bimj.202400027
Florian Dufey

A generalization of Passing–Bablok regression is proposed for comparing multiple measurement methods simultaneously. Possible applications include assay migration studies or interlaboratory trials. When comparing only two methods, the method boils down to the usual Passing–Bablok estimator. It is close in spirit to reduced major axis regression, which is, however, not robust. To obtain a robust estimator, the major axis is replaced by the (hyper-)spherical median axis. This technique has been applied to compare SARS-CoV-2 serological tests, bilirubin in neonates, and an in vitro diagnostic test using different instruments, sample preparations, and reagent lots. In addition, plots similar to the well-known Bland–Altman plots have been developed to represent the variance structure.
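The classical two-method Passing–Bablok slope, which the proposal generalizes, is a (shifted) median of all pairwise slopes. A simplified sketch using a plain median and ignoring the offset K that the full procedure applies for ties and negative slopes:

```python
from itertools import combinations
from statistics import median

def passing_bablok(x, y):
    """Simplified Passing-Bablok sketch for two measurement methods:
    slope = median of all pairwise slopes (the full procedure shifts
    this median by an offset K for ties/negative slopes, omitted here);
    intercept = median of the residual offsets y - slope*x."""
    slopes = []
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        if x1 != x2:  # vertical pairs contribute no slope
            slopes.append((y2 - y1) / (x2 - x1))
    b = median(slopes)
    a = median(yi - b * xi for xi, yi in zip(x, y))
    return a, b
```

Because medians rather than least-squares sums are used, single outlying pairs barely move the fit, which is the robustness property the paper builds on.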

Citations: 0
A Shared-Frailty Spatial Scan Statistic Model for Time-to-Event Data
IF 1.3 · CAS Tier 3 (Biology) · Q4 Mathematical & Computational Biology · Pub Date: 2024-07-11 · DOI: 10.1002/bimj.202300200
Camille Frévent, Mohamed-Salem Ahmed, Sophie Dabo-Niang, Michaël Genin

Spatial scan statistics are well-known methods widely used to detect spatial clusters of events. Furthermore, several spatial scan statistics models have been applied to the spatial analysis of time-to-event data. However, these models do not take account of potential correlations between the observations of individuals within the same spatial unit or potential spatial dependence between spatial units. To overcome this problem, we have developed a scan statistic based on a Cox model with shared frailty that takes account of the spatial dependence between spatial units. In simulation studies, we found that (i) conventional models of spatial scan statistics for time-to-event data fail to maintain the type I error in the presence of a correlation between the observations of individuals within the same spatial unit and (ii) our model performed well in the presence of such correlation and spatial dependence. We have applied our method to epidemiological data and the detection of spatial clusters of mortality in patients with end-stage renal disease in northern France.
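The shared-frailty Cox model itself is not reproduced here, but the scan machinery it extends can be illustrated with the classical Kulldorff Poisson scan statistic: each candidate window is scored by a log-likelihood ratio comparing observed with expected case counts, and the window with the largest score is the detected cluster.

```python
from math import log

def poisson_llr(c, e, C):
    """Kulldorff log-likelihood ratio for a candidate cluster with
    c observed and e expected cases, out of C total cases.
    Expected counts are assumed positive; only excess-risk clusters
    (c > e) receive a positive score."""
    if c <= e:
        return 0.0
    if c == C:  # all cases inside the window
        return c * log(c / e)
    return c * log(c / e) + (C - c) * log((C - c) / (C - e))

def scan(observed, expected, windows):
    """Return the window (tuple of unit indices) with the largest LLR."""
    C = sum(observed)
    best, best_llr = None, 0.0
    for w in windows:
        c = sum(observed[i] for i in w)
        e = sum(expected[i] for i in w)
        llr = poisson_llr(c, e, C)
        if llr > best_llr:
            best, best_llr = w, llr
    return best, best_llr
```

In practice the significance of the best window is assessed by Monte Carlo replication under the null; the paper's contribution is replacing this Poisson likelihood with a frailty Cox likelihood for censored survival times.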

Citations: 0
Sample Size Calculation for an Individual Stepped-Wedge Randomized Trial
IF 1.3 · CAS Tier 3 (Biology) · Q4 Mathematical & Computational Biology · Pub Date: 2024-07-11 · DOI: 10.1002/bimj.202300167
Aude Allemang-Trivalle, Annabel Maruani, Bruno Giraudeau

In the individual stepped-wedge randomized trial (ISW-RT), subjects are allocated to sequences, each sequence being defined by a control period followed by an experimental period. The total follow-up time is the same for all sequences, but the duration of the control and experimental periods varies among sequences. To our knowledge, there is no validated sample size calculation formula for ISW-RTs unlike stepped-wedge cluster randomized trials (SW-CRTs). The objective of this study was to adapt the formula used for SW-CRTs to the case of individual randomization and to validate this adaptation using a Monte Carlo simulation study. The proposed sample size calculation formula for an ISW-RT design yielded satisfactory empirical power for most scenarios except scenarios with operating characteristic values near the boundary (i.e., smallest possible number of periods, very high or very low autocorrelation coefficient). Overall, the results provide useful insights into the sample size calculation for ISW-RTs.
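The validation strategy described, estimating empirical power by Monte Carlo simulation, can be sketched generically: simulate many trials under the alternative, analyze each, and report the rejection rate. A one-sided z-test on a mean stands in for the ISW-RT analysis model (an assumption for illustration only):

```python
import random
from math import sqrt

def empirical_power(n, effect, sd=1.0, n_sims=2000, seed=1):
    """Monte Carlo empirical power of a one-sided z-test at the 5% level
    (critical value 1.6449). The simple z-test is a stand-in for the
    trial's actual analysis model."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sims):
        xs = [rng.gauss(effect, sd) for _ in range(n)]
        z = (sum(xs) / n) / (sd / sqrt(n))
        if z > 1.6449:
            rejections += 1
    return rejections / n_sims
```

A sample-size formula is judged adequate when the empirical power at the formula's recommended n is close to the nominal target (e.g., 80% or 90%); setting `effect=0` instead checks type I error control.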

Citations: 0
Biostatistical Aspects of Whole Genome Sequencing Studies: Preprocessing and Quality Control
IF 1.3 · CAS Tier 3 (Biology) · Q4 Mathematical & Computational Biology · Pub Date: 2024-07-11 · DOI: 10.1002/bimj.202300278
Raphael O. Betschart, Cristian Riccio, Domingo Aguilera-Garcia, Stefan Blankenberg, Linlin Guo, Holger Moch, Dagmar Seidl, Hugo Solleder, Felix Thalén, Alexandre Thiéry, Raphael Twerenbold, Tanja Zeller, Martin Zoche, Andreas Ziegler

Rapid advances in high-throughput DNA sequencing technologies have enabled large-scale whole genome sequencing (WGS) studies. Before performing association analysis between phenotypes and genotypes, preprocessing and quality control (QC) of the raw sequence data need to be performed. Because many biostatisticians have not been working with WGS data so far, we first sketch Illumina's short-read sequencing technology. Second, we explain the general preprocessing pipeline for WGS studies. Third, we provide an overview of important QC metrics, which are applied to WGS data: on the raw data, after mapping and alignment, after variant calling, and after multisample variant calling. Fourth, we illustrate the QC with the data from the GENEtic SequencIng Study Hamburg–Davos (GENESIS-HD), a study involving more than 9000 human whole genomes. All samples were sequenced on an Illumina NovaSeq 6000 with an average coverage of 35× using a PCR-free protocol. For QC, one Genome in a Bottle (GIAB) trio was sequenced in four replicates, and one GIAB sample was successfully sequenced 70 times in different runs. Fifth, we provide empirical data on the compression of raw data using the DRAGEN original read archive (ORA). The most important quality metrics in the application were genetic similarity, sample cross-contamination, deviations from the expected Het/Hom ratio, relatedness, and coverage. The compression ratio of the raw files using DRAGEN ORA was 5.6:1, and compression time was linear by genome coverage. In summary, the preprocessing, joint calling, and QC of large WGS studies are feasible within a reasonable time, and efficient QC procedures are readily available.
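Two of the QC quantities named above, the Het/Hom ratio and the compression ratio, are simple to compute once genotype calls and file sizes are available. A sketch assuming a simplified VCF-style genotype encoding (the real pipeline would read these from VCF/BCF output):

```python
def het_hom_ratio(genotypes):
    """Het/Hom ratio from a list of diploid genotype strings such as
    '0/1' (heterozygous) or '1/1' (homozygous alternate). Assumes a
    simplified VCF-style encoding; unphased '/' and phased '|' calls
    are both accepted."""
    het = sum(1 for g in genotypes if g in ("0/1", "1/0", "0|1", "1|0"))
    hom_alt = sum(1 for g in genotypes if g in ("1/1", "1|1"))
    return het / hom_alt if hom_alt else float("inf")

def compression_ratio(raw_bytes, compressed_bytes):
    """Ratio reported as raw:compressed, e.g. 5.6 for a 5.6:1 ratio."""
    return raw_bytes / compressed_bytes
```

Large deviations of the Het/Hom ratio from its expected value flag contamination or calling problems, which is why it appears among the key metrics above.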

Citations: 0
Functional Multivariable Logistic Regression With an Application to HIV Viral Suppression Prediction
IF 1.3 · CAS Tier 3 (Biology) · Q4 Mathematical & Computational Biology · Pub Date: 2024-07-05 · DOI: 10.1002/bimj.202300081
Siyuan Guo, Jiajia Zhang, Yichao Wu, Alexander C. McLain, James W. Hardin, Bankole Olatosi, Xiaoming Li

Motivated by improving the prediction of the human immunodeficiency virus (HIV) suppression status using electronic health records (EHR) data, we propose a functional multivariable logistic regression model, which accounts for the longitudinal binary process and continuous process simultaneously. Specifically, the longitudinal measurements for either binary or continuous variables are modeled by functional principal components analysis, and their corresponding functional principal component scores are used to build a logistic regression model for prediction. The longitudinal binary data are linked to underlying Gaussian processes. The estimation is done using penalized splines for the longitudinal continuous and binary data. Group-lasso is used to select longitudinal processes, and a multivariate functional principal components analysis is proposed to revise the functional principal component scores to account for their correlation. The method is evaluated via comprehensive simulation studies and then applied to predict viral suppression using EHR data for people living with HIV in South Carolina.
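The prediction step, a logistic regression on functional principal component scores, can be sketched with plain gradient descent; the FPCA stage that would produce the scores from the longitudinal curves is assumed to have been run already, so the code below starts from precomputed score vectors.

```python
from math import exp

def sigmoid(z):
    """Numerically guarded logistic function."""
    if z < -500:
        return 0.0
    if z > 500:
        return 1.0
    return 1.0 / (1.0 + exp(-z))

def fit_logistic(scores, labels, lr=0.5, n_iter=2000):
    """Gradient-descent logistic regression on (precomputed) functional
    principal component score vectors. Returns weights and intercept."""
    p = len(scores[0])
    w, b = [0.0] * p, 0.0
    n = len(scores)
    for _ in range(n_iter):
        gw, gb = [0.0] * p, 0.0
        for x, y in zip(scores, labels):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, x))) - y
            for j in range(p):
                gw[j] += err * x[j]
            gb += err
        w = [wj - lr * gj / n for wj, gj in zip(w, gw)]
        b -= lr * gb / n
    return w, b

def predict(w, b, x):
    """Predicted probability of the outcome (e.g., viral suppression)."""
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))
```

The dimension reduction is what makes this feasible: instead of entire irregular longitudinal trajectories, each subject contributes only a handful of FPC scores as predictors.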

Citations: 0
Combining Partial True Discovery Guarantee Procedures
IF 1.3 · CAS Tier 3 (Biology) · Q4 Mathematical & Computational Biology · Pub Date: 2024-07-02 · DOI: 10.1002/bimj.202300075
Ningning Xu, Aldo Solari, Jelle J. Goeman

Closed testing has recently been shown to be optimal for simultaneous true discovery proportion control. It is, however, challenging to construct true discovery guarantee procedures in such a way that they focus power on feature sets chosen by users based on their specific interest or expertise. We propose a procedure that allows users to target power on prespecified feature sets, that is, “focus sets.” Still, the method also allows inference for feature sets chosen post hoc, that is, “nonfocus sets,” for which we deduce a true discovery lower confidence bound by interpolation. Our procedure is built from partial true discovery guarantee procedures combined with Holm's procedure and is a conservative shortcut to the closed testing procedure. A simulation study confirms that the statistical power of our method is relatively high for focus sets, at the cost of power for nonfocus sets, as desired. In addition, we investigate its power property for sets with specific structures, for example, trees and directed acyclic graphs. We also compare our method with AdaFilter in the context of replicability analysis. The application of our method is illustrated with a gene ontology analysis in gene expression data.
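The proposal builds on Holm's procedure. As background, a standard sketch of Holm step-down adjusted p-values (this is the classical building block, not the authors' combination procedure itself):

```python
def holm_adjust(pvalues):
    """Holm step-down adjusted p-values. Controls the family-wise error
    rate under arbitrary dependence: reject hypothesis i at level alpha
    whenever adjusted[i] <= alpha."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_max = 0.0  # enforce monotonicity of the step-down sequence
    for rank, i in enumerate(order):
        adj = min(1.0, (m - rank) * pvalues[i])
        running_max = max(running_max, adj)
        adjusted[i] = running_max
    return adjusted
```

Holm multiplies the smallest p-value by m, the next by m-1, and so on, which is uniformly at least as powerful as a flat Bonferroni correction.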

Citations: 0
Simultaneous Inference of Multiple Binary Endpoints in Biomedical Research: Small Sample Properties of Multiple Marginal Models and a Resampling Approach
IF 1.3 · CAS Tier 3 (Biology) · Q4 Mathematical & Computational Biology · Pub Date: 2024-07-02 · DOI: 10.1002/bimj.202300197
Sören Budig, Klaus Jung, Mario Hasler, Frank Schaarschmidt

In biomedical research, the simultaneous inference of multiple binary endpoints may be of interest. In such cases, an appropriate multiplicity adjustment is required that controls the family-wise error rate, which represents the probability of making incorrect test decisions. In this paper, we investigate two approaches that perform single-step p-value adjustments that also take into account the possible correlation between endpoints. A rather novel and flexible approach known as multiple marginal models is considered, which is based on stacking of the parameter estimates of the marginal models and deriving their joint asymptotic distribution. We also investigate a nonparametric vector-based resampling approach, and we compare both approaches with the Bonferroni method by examining the family-wise error rate and power for different parameter settings, including low proportions and small sample sizes. The results show that the resampling-based approach consistently outperforms the other methods in terms of power, while still controlling the family-wise error rate. The multiple marginal models approach, on the other hand, shows a more conservative behavior. However, it offers more versatility in application, allowing for more complex models or straightforward computation of simultaneous confidence intervals. The practical application of the methods is demonstrated using a toxicological dataset from the National Toxicology Program.
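The baseline Bonferroni adjustment and a Westfall–Young-style single-step min-p resampling adjustment can be sketched side by side. The null p-value matrix for the resampling method is assumed to be supplied by the caller, for example from permutations of the data, so that the correlation between endpoints is preserved:

```python
def bonferroni(pvalues):
    """Single-step Bonferroni adjustment: multiply each p-value by the
    number of endpoints, capped at 1."""
    m = len(pvalues)
    return [min(1.0, p * m) for p in pvalues]

def minp_adjust(pvalues, null_pvalue_matrix):
    """Westfall-Young-style single-step min-p adjustment. The adjusted
    p-value is the fraction of resampled null datasets whose smallest
    p-value is <= the observed one. null_pvalue_matrix holds one row of
    m p-values per resample, generated under the joint null (assumed to
    be provided by the caller, e.g. from permutations)."""
    b = len(null_pvalue_matrix)
    min_ps = [min(row) for row in null_pvalue_matrix]
    return [sum(mp <= p for mp in min_ps) / b for p in pvalues]
```

Because the resampled minima inherit the endpoints' correlation, min-p is less conservative than Bonferroni when endpoints are correlated, matching the power advantage reported above.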

Citations: 0
Penalized Regression Methods With Modified Cross-Validation and Bootstrap Tuning Produce Better Prediction Models
IF 1.3 · CAS Tier 3 (Biology) · Q4 Mathematical & Computational Biology · Pub Date: 2024-06-24 · DOI: 10.1002/bimj.202300245
Menelaos Pavlou, Rumana Z. Omar, Gareth Ambler

Risk prediction models fitted using maximum likelihood estimation (MLE) are often overfitted, resulting in predictions that are too extreme and a calibration slope (CS) less than 1. Penalized methods, such as Ridge and Lasso, have been suggested as a solution to this problem as they tend to shrink regression coefficients toward zero, resulting in predictions closer to the average. The amount of shrinkage is regulated by a tuning parameter, λ, commonly selected via cross-validation (“standard tuning”). Though penalized methods have been found to improve calibration on average, they often over-shrink and exhibit large variability in the selected λ and hence the CS. This is a problem, particularly for small sample sizes, but also when using sample sizes recommended to control overfitting. We consider whether these problems are partly due to selecting λ using cross-validation with “training” datasets of reduced size compared to the original development sample, resulting in an over-estimation of λ and, hence, excessive shrinkage. We propose a modified cross-validation tuning method (“modified tuning”), which estimates λ from a pseudo-development dataset obtained via bootstrapping from the original dataset, albeit of larger size, such that the resulting cross-validation training datasets are of the same size as the original dataset. Modified tuning can be easily implemented in standard software and is closely related to bootstrap selection of the tuning parameter (“bootstrap tuning”). We evaluated modified and bootstrap tuning for Ridge and Lasso in simulated and real data using recommended sample sizes, and sizes slightly lower and higher. They substantially improved the selection of λ, resulting in improved CS compared to the standard tuning method. They also improved predictions compared to MLE.
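The key device is a bootstrap pseudo-development dataset sized so that k-fold cross-validation training sets match the original sample size n, which implies m = round(n·k/(k-1)) bootstrap draws. A sketch of that construction (the paper's exact implementation may differ in detail):

```python
import random

def pseudo_development_sample(data, k, seed=0):
    """Modified-tuning sketch: bootstrap a pseudo-development dataset of
    size m = round(n * k / (k - 1)) from the original n observations, so
    that each k-fold cross-validation training set (which drops one fold
    of size m/k) has approximately the original size n."""
    n = len(data)
    m = round(n * k / (k - 1))
    rng = random.Random(seed)
    return [rng.choice(data) for _ in range(m)]
```

For example, with n = 100 and k = 5 the pseudo-dataset has 125 observations, so each CV training set has 100, the size at which the final model will actually be fitted; standard tuning would instead select λ on training sets of size 80.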

Citations: 0
Issue Information: Biometrical Journal 5'24
IF 1.3, CAS Tier 3 (Biology), Q2 Mathematics Pub Date : 2024-06-21 DOI: 10.1002/bimj.202470005
Citations: 0