Biometrical Journal: Latest Articles

Estimating the proportion of true null hypotheses and adaptive false discovery rate control in discrete paradigm
IF 1.7 | Zone 3, Biology | Q2 Mathematics | Pub Date: 2024-02-14 | DOI: 10.1002/bimj.202200204
Aniket Biswas, Gaurangadeb Chattopadhyay

Storey's estimator for the proportion of true null hypotheses, originally proposed under the continuous framework, has been modified in this work under the discrete framework. The modification results in improved estimation of the parameter of interest. The proposed estimator is used to formulate an adaptive version of the Benjamini–Hochberg procedure. Control over the false discovery rate by the proposed adaptive procedure has been proved analytically. The proposed estimate is also used to formulate an adaptive version of the Benjamini–Hochberg–Heyse procedure. Simulation experiments establish the conservative nature of this new adaptive procedure. A substantial gain in power is observed for the new adaptive procedures over the standard procedures. For demonstration of the proposed method, two important real-life gene expression data sets, one related to the study of HIV and the other related to a methylation study, are used.
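
For orientation, the sketch below shows the classical continuous-framework version of Storey's estimator and a Benjamini–Hochberg step-up run at level alpha divided by the estimated null proportion. It is not the discrete modification proposed in the article; the function names, the tuning value `lam=0.5`, and the `+1` conservative correction are assumptions chosen for illustration.

```python
import numpy as np

def storey_pi0(pvals, lam=0.5):
    """Storey-type estimate of the proportion of true nulls (continuous p-values)."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    # p-values above lam are assumed to come mostly from true nulls;
    # the +1 is a conservative correction (an illustrative choice here).
    return min(1.0, (np.sum(p > lam) + 1.0) / (m * (1.0 - lam)))

def adaptive_bh(pvals, alpha=0.05, lam=0.5):
    """Benjamini-Hochberg step-up with thresholds scaled by 1 / pi0_hat."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    pi0 = storey_pi0(p, lam)
    order = np.argsort(p)
    # i-th smallest p-value is compared with alpha * i / (m * pi0_hat)
    thresholds = alpha * np.arange(1, m + 1) / (m * pi0)
    below = p[order] <= thresholds
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = np.max(np.nonzero(below)[0])  # largest rank meeting its threshold
        reject[order[: k + 1]] = True
    return reject, pi0
```

When the estimate is below 1, the thresholds are larger than the standard Benjamini–Hochberg ones, which is where an adaptive procedure can gain power.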

Citations: 0
A review on statistical and machine learning competing risks methods
IF 1.7 | Zone 3, Biology | Q2 Mathematics | Pub Date: 2024-02-13 | DOI: 10.1002/bimj.202300060
Karla Monterrubio-Gómez, Nathan Constantine-Cooke, Catalina A. Vallejos

When modeling competing risks (CR) survival data, several techniques have been proposed in both the statistical and machine learning literature. State-of-the-art methods have extended classical approaches with more flexible assumptions that can improve predictive performance, allow high-dimensional data and missing values, among others. Despite this, modern approaches have not been widely employed in applied settings. This article aims to aid the uptake of such methods by providing a condensed compendium of CR survival methods with a unified notation and interpretation across approaches. We highlight available software and, when possible, demonstrate their usage via reproducible R vignettes. Moreover, we discuss two major concerns that can affect benchmark studies in this context: the choice of performance metrics and reproducibility.

Citations: 0
Editorial Board: Biometrical Journal 2'24
IF 1.7 | Zone 3, Biology | Q2 Mathematics | Pub Date: 2024-02-02 | DOI: 10.1002/bimj.202470002
Citations: 0
A Bayesian hierarchical approach to account for evidence and uncertainty in the modeling of infectious diseases: An application to COVID-19
IF 1.7 | Zone 3, Biology | Q2 Mathematics | Pub Date: 2024-01-28 | DOI: 10.1002/bimj.202200341
Raphael Rehms, Nicole Ellenbach, Eva Rehfuess, Jacob Burns, Ulrich Mansmann, Sabine Hoffmann

Infectious disease models can serve as critical tools to predict the development of cases and associated healthcare demand and to determine the set of nonpharmaceutical interventions (NPIs) that is most effective in slowing the spread of an infectious agent. Current approaches to estimate NPI effects typically focus on relatively short time periods and either on the number of reported cases, deaths, intensive care occupancy, or hospital occupancy as a single indicator of disease transmission. In this work, we propose a Bayesian hierarchical model that integrates multiple outcomes and complementary sources of information in the estimation of the true and unknown number of infections while accounting for time-varying underreporting and weekday-specific delays in reported cases and deaths, allowing us to estimate the number of infections on a daily basis rather than having to smooth the data. To address dynamic changes occurring over long periods of time, we account for the spread of new variants, seasonality, and time-varying differences in host susceptibility. We implement a Markov chain Monte Carlo algorithm to conduct Bayesian inference and illustrate the proposed approach with data on COVID-19 from 20 European countries. The approach shows good performance on simulated data and produces posterior predictions that show a good fit to reported cases, deaths, hospital, and intensive care occupancy.

Citations: 0
A comparison of strategies for selecting auxiliary variables for multiple imputation
IF 1.7 | Zone 3, Biology | Q2 Mathematics | Pub Date: 2024-01-23 | DOI: 10.1002/bimj.202200291
Rheanna M. Mainzer, Cattram D. Nguyen, John B. Carlin, Margarita Moreno-Betancur, Ian R. White, Katherine J. Lee

Multiple imputation (MI) is a popular method for handling missing data. Auxiliary variables can be added to the imputation model(s) to improve MI estimates. However, the choice of which auxiliary variables to include is not always straightforward. Several data-driven auxiliary variable selection strategies have been proposed, but there has been limited evaluation of their performance. Using a simulation study we evaluated the performance of eight auxiliary variable selection strategies: (1, 2) two versions of selection based on correlations in the observed data; (3) selection using hypothesis tests of the “missing completely at random” assumption; (4) replacing auxiliary variables with their principal components; (5, 6) forward and forward stepwise selection; (7) forward selection based on the estimated fraction of missing information; and (8) selection via the least absolute shrinkage and selection operator (LASSO). A complete case analysis and an MI analysis using all auxiliary variables (the “full model”) were included for comparison. We also applied all strategies to a motivating case study. The full model outperformed all auxiliary variable selection strategies in the simulation study, with the LASSO strategy the best performing auxiliary variable selection strategy overall. All MI analysis strategies that we were able to apply to the case study led to similar estimates, although computational time was substantially reduced when variable selection was employed. This study provides further support for adopting an inclusive auxiliary variable strategy where possible. Auxiliary variable selection using the LASSO may be a promising alternative when the full model fails or is too burdensome.

Citations: 0
Parametric modal regression with error in covariates
IF 1.7 | Zone 3, Biology | Q2 Mathematics | Pub Date: 2024-01-19 | DOI: 10.1002/bimj.202200348
Qingyang Liu, Xianzheng Huang

An inference procedure is proposed to provide consistent estimators of parameters in a modal regression model with a covariate prone to measurement error. A score-based diagnostic tool exploiting parametric bootstrap is developed to assess adequacy of parametric assumptions imposed on the regression model. The proposed estimation method and diagnostic tool are applied to synthetic data generated from simulation experiments and data from real-world applications to demonstrate their implementation and performance. These empirical examples illustrate the importance of adequately accounting for measurement error in the error-prone covariate when inferring the association between a response and covariates based on a modal regression model that is especially suitable for skewed and heavy-tailed response data.

Citations: 0
Finite mixtures in capture–recapture surveys for modeling residency patterns in marine wildlife populations
IF 1.7 | Zone 3, Biology | Q2 Mathematics | Pub Date: 2024-01-16 | DOI: 10.1002/bimj.202200350
Gianmarco Caruso, Pierfrancesco Alaimo Di Loro, Marco Mingione, Luca Tardella, Daniela Silvia Pace, Giovanna Jona Lasinio

This work aims to show how prior knowledge about the structure of a heterogeneous animal population can be leveraged to improve the abundance estimation from capture–recapture survey data. We combine the Open Jolly-Seber model with finite mixtures and propose a parsimonious specification tailored to the residency patterns of the common bottlenose dolphin. We employ a Bayesian framework for our inference, discussing the appropriate choice of priors to mitigate label-switching and nonidentifiability issues, commonly associated with finite mixture models. We conduct a series of simulation experiments to illustrate the competitive advantage of our proposal over less specific alternatives. The proposed approach is applied to data collected on the common bottlenose dolphin population inhabiting the Tiber River estuary (Mediterranean Sea). Our results provide novel insights into this population's size and structure, shedding light on some of the ecological processes governing its dynamics.

Citations: 0
Neutralise: An open science initiative for neutral comparison of two-sample tests
IF 1.7 | Zone 3, Biology | Q2 Mathematics | Pub Date: 2024-01-12 | DOI: 10.1002/bimj.202200237
Leyla Kodalci, Olivier Thas

The two-sample problem is one of the earliest problems in statistics: given two samples, the question is whether or not the observations were sampled from the same distribution. Many statistical tests have been developed for this problem, and many tests have been evaluated in simulation studies, but hardly any study has tried to set up a neutral comparison study. In this paper, we introduce an open science initiative that potentially allows for neutral comparisons of two-sample tests. It is designed as an open-source R package, a repository, and an online R Shiny app. This paper describes the principles, the design of the system and illustrates the use of the system.
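
As a concrete instance of the two-sample problem described above, the following sketch implements a simple permutation test based on the absolute difference in sample means. It is only one of many possible two-sample tests and has no connection to the Neutralise package or its interface; the function name `permutation_test`, the number of permutations, and the seed are illustrative assumptions.

```python
import numpy as np

def permutation_test(x, y, n_perm=2000, seed=0):
    """Permutation p-value for H0: both samples come from the same distribution."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    pooled = np.concatenate([x, y])
    observed = abs(x.mean() - y.mean())
    count = 0
    for _ in range(n_perm):
        # Under H0 the group labels are exchangeable, so reshuffle them.
        perm = rng.permutation(pooled)
        stat = abs(perm[: x.size].mean() - perm[x.size:].mean())
        if stat >= observed:
            count += 1
    # Add-one correction keeps the p-value strictly positive.
    return (count + 1) / (n_perm + 1)
```

Because the statistic is a difference in means, this particular test is sensitive mainly to location shifts; other statistics target other departures from equality of distributions, which is one reason comparisons across tests are of interest.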

Citations: 0
Estimation of odds ratio from group testing data with misclassified exposure
IF 1.7 | Zone 3, Biology | Q2 Mathematics | Pub Date: 2024-01-12 | DOI: 10.1002/bimj.202200254
Surupa Roy, Sumanta Adhya, Subrata Rana

For a low-prevalence disease, we consider estimation of the odds ratio for two specified groups of individuals using group testing data. Broadly, the two groups may be classified as “the exposed” and “the unexposed.” Often in observational studies, the exposure status is not correctly recorded. In addition, diagnostic tests are rarely completely accurate. The proposed model accounts for imperfect sensitivity and specificity of diagnostic tests along with the misclassification in the exposure status. For model identifiability, we make use of internal validation data, where a subsample of reasonably small size is selected from the original sample by simple random sampling without replacement. A pseudo-maximum likelihood method is employed for the estimation of the model parameters. The performance of the group testing methodology is compared with individual testing for different parametric configurations. A limited data study related to COVID-19 prevalence is performed to illustrate the methodology.
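
To make the role of sensitivity and specificity in group testing concrete, the sketch below uses the standard relation between individual prevalence and the probability that a pool of size `k` tests positive, then inverts it. It assumes independent individuals, a test whose accuracy does not depend on pool composition, and correctly recorded exposure, so it is much simpler than the article's model, which also handles exposure misclassification and uses pseudo-maximum likelihood. All function and parameter names are hypothetical.

```python
def pool_positive_prob(p, k, se=1.0, sp=1.0):
    """P(pool of k tests positive) for prevalence p, sensitivity se, specificity sp."""
    clean = (1.0 - p) ** k  # probability that nobody in the pool is diseased
    return se * (1.0 - clean) + (1.0 - sp) * clean

def prevalence_from_pool_rate(q, k, se=1.0, sp=1.0):
    """Invert pool_positive_prob: recover p from the pool-positive rate q."""
    # q = se - (se + sp - 1) * clean  =>  clean = (se - q) / (se + sp - 1)
    clean = (se - q) / (se + sp - 1.0)
    clean = min(1.0, max(0.0, clean))  # keep the estimate inside [0, 1]
    return 1.0 - clean ** (1.0 / k)

def odds_ratio(p_exposed, p_unexposed):
    """Odds ratio comparing prevalence in the exposed and unexposed groups."""
    return (p_exposed / (1.0 - p_exposed)) / (p_unexposed / (1.0 - p_unexposed))
```

Applying `prevalence_from_pool_rate` separately to pools from each group and passing the two results to `odds_ratio` gives a naive plug-in estimate under the stated assumptions.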

Citations: 0
Mediation analysis with case–control sampling: Identification and estimation in the presence of a binary mediator
IF 1.7 | Zone 3, Biology | Q2 Mathematics | Pub Date: 2024-01-12 | DOI: 10.1002/bimj.202300089
Marco Doretti, Minna Genbäck, Elena Stanghellini

With reference to a stratified case–control (CC) procedure based on a binary variable of primary interest, we derive the expression of the distortion induced by the sampling design on the parameters of the logistic model of a secondary variable. This is particularly relevant when performing mediation analysis (possibly in a causal framework) with stratified case–control (SCC) data in settings where both the outcome and the mediator are binary. Despite being designed for parametric identification, our strategy is general and can be used also in a nonparametric context. With reference to parametric estimation, we derive the maximum likelihood (ML) estimator and the M-estimator of the joint outcome–mediator parameter vector. We then conduct a simulation study focusing on the main causal mediation quantities (i.e., natural effects) and comparing M- and ML estimation to existing methods, based on weighting. As an illustrative example, we reanalyze a German CC data set in order to investigate whether the effect of reduced immunocompetency on listeriosis onset is mediated by the intake of gastric acid suppressors.

Citations: 0