Biometrics: Latest Articles

Distance weighted directional regression for Fréchet sufficient dimension reduction.
IF 1.4 Zone 4 (Mathematics) Q3 BIOLOGY Pub Date: 2025-04-02 DOI: 10.1093/biomtc/ujaf051
Chao Ying, Zhou Yu, Xin Zhang

Analysis of non-Euclidean data accumulated from human longevity studies, brain functional network studies, and many other areas has become an important issue in modern statistics. Fréchet sufficient dimension reduction aims to identify dependencies between non-Euclidean object-valued responses and multivariate predictors while simultaneously reducing the dimensionality of the predictors. We introduce the distance weighted directional regression method for both linear and nonlinear Fréchet sufficient dimension reduction. We propose a new formulation of the classical directional regression method in sufficient dimension reduction. The new formulation is based on distance weighting, thus providing a unified approach for sufficient dimension reduction with Euclidean and non-Euclidean responses, and is further extended to nonlinear Fréchet sufficient dimension reduction. We derive the asymptotic normality of the linear Fréchet directional regression estimator and the convergence rate of the nonlinear estimator. Simulation studies are presented to demonstrate the empirical performance of the proposed methods and to support our theoretical findings. Applications to human mortality modeling and diabetes prevalence analysis show that our proposal can improve interpretation and out-of-sample prediction.

Citations: 0
PDC-MAKES: a conditional screening method for controlling false discoveries in high-dimensional multi-response setting.
IF 1.4 Zone 4 (Mathematics) Q3 BIOLOGY Pub Date: 2025-04-02 DOI: 10.1093/biomtc/ujaf042
Wei Xiong, Han Pan, Tong Shen

The coexistence of high dimensionality and strong correlation in both responses and predictors poses unprecedented challenges in identifying important predictors. In this paper, we propose a model-free conditional feature screening method with false discovery rate (FDR) control for ultrahigh-dimensional multi-response settings. The proposed method is built upon partial distance correlation, which measures the dependence between two random vectors while controlling for the effect of a multivariate random vector. This screening approach is robust against heavy-tailed data and can select predictors in instances of high correlation among predictors. Additionally, it can identify predictors that are marginally unrelated but conditionally related to the response. Leveraging the advantageous properties of partial distance correlation, our method allows for high-dimensional variables to be conditioned upon, distinguishing it from current research in this field. To further achieve FDR control, we apply derandomized knockoff-e-values to establish the threshold for feature screening more stably. The proposed FDR control method is shown to enjoy the sure screening property while maintaining FDR control as well as achieving higher power under mild conditions. The superior performance of these methods is demonstrated through simulation examples and a real data application.
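The abstract builds on partial distance correlation. As a point of orientation only, the sketch below computes the ordinary (unconditional) sample distance correlation between two sets of vectors, which is the building block that partial distance correlation extends. It is not the PDC-MAKES procedure: there is no conditioning step and no knockoff e-value thresholding, and the function names are hypothetical.

```python
import math


def _centered_distances(points):
    """Double-centered Euclidean distance matrix for a list of equal-length tuples."""
    n = len(points)
    d = [[math.dist(p, q) for q in points] for p in points]
    row_mean = [sum(r) / n for r in d]          # the matrix is symmetric, so these are also column means
    grand_mean = sum(row_mean) / n
    return [
        [d[i][j] - row_mean[i] - row_mean[j] + grand_mean for j in range(n)]
        for i in range(n)
    ]


def distance_correlation(xs, ys):
    """Sample distance correlation (V-statistic form) between paired samples xs and ys."""
    n = len(xs)
    a = _centered_distances(xs)
    b = _centered_distances(ys)

    def mean_product(u, v):
        return sum(u[i][j] * v[i][j] for i in range(n) for j in range(n)) / (n * n)

    dcov2 = mean_product(a, b)                  # squared sample distance covariance
    denom = math.sqrt(mean_product(a, a) * mean_product(b, b))
    if denom == 0.0:
        return 0.0                              # a constant sample carries no dependence information
    return math.sqrt(max(dcov2, 0.0) / denom)


# Illustrative data: y = x^2 has zero Pearson correlation with x on a symmetric grid,
# yet the distance correlation is clearly positive.
x = [(-2.0,), (-1.0,), (0.0,), (1.0,), (2.0,)]
y_quadratic = [(v[0] ** 2,) for v in x]
print(distance_correlation(x, y_quadratic))
```

Unlike Pearson correlation, this statistic picks up nonlinear dependence, which is one reason distance-based measures are attractive for model-free screening.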

Citations: 0
Discrete-time competing-risks regression with or without penalization.
IF 1.4 Zone 4 (Mathematics) Q3 BIOLOGY Pub Date: 2025-04-02 DOI: 10.1093/biomtc/ujaf040
Tomer Meir, Malka Gorfine

Many studies employ the analysis of time-to-event data that incorporates competing risks and right censoring. Most methods and software packages are geared towards analyzing data that come from a continuous failure time distribution. However, failure-time data may sometimes be discrete, either because time is inherently discrete or because of imprecise measurement. This paper introduces a new estimation procedure for discrete-time survival analysis with competing events. The proposed approach offers a key advantage over existing procedures: it allows for straightforward integration and application of widely used regularized regression and feature-screening methods. We illustrate the benefits of our proposed approach by a comprehensive simulation study. Additionally, we showcase the utility of the proposed procedure by estimating a survival model for the length of stay of patients hospitalized in the intensive care unit, considering 3 competing events: discharge to home, transfer to another medical facility, and in-hospital death. A Python package, PyDTS, is available for applying the proposed method with additional features.
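To make the discrete-time competing-risks setting concrete, here is a minimal covariate-free sketch of the quantities such models target: the empirical cause-specific hazard at each time point and the resulting cumulative incidence function. It does not use PyDTS and does not reproduce the paper's regression estimator. The function names are hypothetical, and the sketch assumes that times are positive integers, that event code 0 means right-censored, and that a subject censored at time t is still counted as at risk at t.

```python
def cause_specific_hazards(times, events, n_causes):
    """Empirical discrete hazards: hazards[t][j-1] = P(T = t, J = j | T >= t)."""
    hazards = {}
    for t in range(1, max(times) + 1):
        at_risk = sum(1 for s in times if s >= t)
        hazards[t] = [
            sum(1 for s, e in zip(times, events) if s == t and e == j) / at_risk
            if at_risk else 0.0
            for j in range(1, n_causes + 1)
        ]
    return hazards


def cumulative_incidence(hazards, cause):
    """Cumulative incidence of one cause: sum over t of S(t - 1) * hazard_cause(t)."""
    survival = 1.0   # probability of no event of any type before the current time
    cif = 0.0
    out = {}
    for t in sorted(hazards):
        cif += survival * hazards[t][cause - 1]
        survival *= 1.0 - sum(hazards[t])
        out[t] = cif
    return out


# Illustrative data: six subjects, two competing event types, 0 = censored.
times = [1, 1, 2, 2, 3, 3]
events = [1, 0, 2, 1, 0, 2]
h = cause_specific_hazards(times, events, n_causes=2)
print(cumulative_incidence(h, cause=1))
print(cumulative_incidence(h, cause=2))
```

In a regression version, each hazard would be modeled as a function of covariates, which is where the regularization and screening mentioned in the abstract would enter.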

Citations: 0
Probabilistic exponential family inverse regression and its applications.
IF 1.4 Zone 4 (Mathematics) Q3 BIOLOGY Pub Date: 2025-04-02 DOI: 10.1093/biomtc/ujaf065
Daolin Pang, Ruoqing Zhu, Hongyu Zhao, Tao Wang

Rapid advances in high-throughput sequencing technologies have led to the fast accumulation of high-dimensional data, which is harnessed for understanding the implications of various factors on human disease and health. While dimension reduction plays an essential role in high-dimensional regression and classification, existing methods often require the predictors to be continuous, making them unsuitable for discrete data, such as presence-absence records of species in community ecology and sequencing reads in single-cell studies. To identify and estimate sufficient reductions in regressions with discrete predictors, we introduce probabilistic exponential family inverse regression (PrEFIR), assuming that, given the response and a set of latent factors, the predictors follow one-parameter exponential families. We show that the low-dimensional reductions result not only from the response variable but also from the latent factors. We further extend the latent factor modeling framework to the double exponential family by including an additional parameter to account for the dispersion. This versatile framework encompasses regressions with all categorical or a mixture of categorical and continuous predictors. We propose the method of maximum hierarchical likelihood for estimation, and develop a highly parallelizable algorithm for its computation. The effectiveness of PrEFIR is demonstrated through simulation studies and real data examples.

Citations: 0
Statistical inference on the relative risk following covariate-adaptive randomization.
IF 1.4 Zone 4 (Mathematics) Q3 BIOLOGY Pub Date: 2025-04-02 DOI: 10.1093/biomtc/ujaf036
Fengyu Zhao, Yang Liu, Feifang Hu

Covariate-adaptive randomization (CAR) is widely adopted in clinical trials to ensure balanced treatment allocations across key baseline covariates. Although much research has focused on analyzing average treatment effects, the inference of relative risk under CAR experiments has been less thoroughly explored. In this study, we examine a covariate-adjusted estimate of relative risk and investigate the properties of its associated hypothesis tests under CAR. We first derive the theoretical properties of the covariate-adjusted relative risk for a broad class of CAR procedures. Our findings indicate that conventional tests for relative risk tend to be conservative, leading to reduced type I error rates. To mitigate this issue, we introduce model-based and model-robust methods that enhance the estimation of standard errors. We demonstrate the validity and usage of model-robust and model-based adjusted tests. Extensive numerical studies have been conducted to demonstrate our theoretical findings and the favorable properties of the proposed adjustment methods.
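For reference, the kind of conventional analysis the abstract describes as conservative under CAR can be sketched as the textbook unadjusted relative risk with a delta-method (Wald) interval on the log scale. This sketch ignores both covariates and the randomization scheme; it is not the covariate-adjusted estimator or either of the corrected tests proposed in the paper, and the function name is hypothetical.

```python
import math


def relative_risk_wald(events_trt, n_trt, events_ctl, n_ctl, z=1.959963984540054):
    """Unadjusted relative risk with a Wald interval computed on the log scale.

    z defaults to the 97.5% standard normal quantile (two-sided 95% interval).
    Both event counts must be positive for the log and the standard error to exist.
    """
    p_trt = events_trt / n_trt
    p_ctl = events_ctl / n_ctl
    rr = p_trt / p_ctl
    # Delta-method standard error of log(RR) under independent binomial sampling.
    se_log_rr = math.sqrt((1 - p_trt) / (n_trt * p_trt) + (1 - p_ctl) / (n_ctl * p_ctl))
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, se_log_rr, (lower, upper)


# Illustrative counts: 30/100 events under treatment versus 20/100 under control.
rr, se, ci = relative_risk_wald(30, 100, 20, 100)
print(rr, se, ci)
```

The standard error above assumes simple independent sampling in each arm; the abstract's point is that when allocation is balanced on baseline covariates, this variance formula can be too large, so corrected standard errors are needed.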

Citations: 0
Optimal dynamic treatment regime estimation in the presence of nonadherence.
IF 1.4 Zone 4 (Mathematics) Q3 BIOLOGY Pub Date: 2025-04-02 DOI: 10.1093/biomtc/ujaf041
Dylan Spicker, Michael P Wallace, Grace Y Yi

Dynamic treatment regimes (DTRs) are sequences of functions that formalize the process of precision medicine. DTRs take patient information as input and output treatment recommendations. A major focus of the DTR literature has been the estimation of optimal DTRs: the sequences of decision rules that would yield the best expected outcome if applied across the entire population. While there is a rich literature on optimal DTR estimation, to date, there has been minimal consideration of the impacts of nonadherence on these estimation procedures. Nonadherence refers to any process through which an individual's prescribed treatment does not match their true treatment. We explore the impacts of nonadherence and demonstrate that, generally, when nonadherence is ignored, suboptimal regimes will be estimated. In light of these findings, we propose a method for estimating optimal DTRs in the presence of nonadherence. The resulting estimators are consistent and asymptotically normal, with a double robustness property. Using simulations, we demonstrate the reliability of these results, and illustrate comparable performance between the proposed estimation procedure adjusting for the impacts of nonadherence and estimators that are computed on data without nonadherence.

Citations: 0
Conformal predictive intervals in survival analysis: a resampling approach.
IF 1.4 Zone 4 (Mathematics) Q3 BIOLOGY Pub Date: 2025-04-02 DOI: 10.1093/biomtc/ujaf063 Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12104816/pdf/
Jing Qin, Jin Piao, Jing Ning, Yu Shen

The distribution-free method of conformal prediction has gained considerable attention in computer science, machine learning, and statistics. Candès et al. extended this method to right-censored survival data, addressing right-censoring complexity by creating a covariate shift setting, extracting a subcohort of subjects with censoring times exceeding a fixed threshold. Their approach only estimates the lower prediction bound for type I censoring, where all subjects have available censoring times regardless of their failure status. In medical applications, we often encounter more general right-censored data, observing only the minimum of failure time and censoring time. Subjects with observed failure times have unavailable censoring times. To address this, we propose a bootstrap method to construct 1- as well as 2-sided conformal predictive intervals for general right-censored survival data under different working regression models. Through simulations, our method demonstrates excellent average coverage for the lower bound and good coverage for the 2-sided predictive interval, regardless of whether the working model is correctly specified, particularly under moderate censoring. We further extend the proposed method in several directions in medical applications. We apply this method to predict breast cancer patients' future survival times based on tumor characteristics and treatment.
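The basic idea behind a one-sided conformal bound can be sketched in the simplest possible setting: fully observed outcomes with no censoring. The sketch below is standard split conformal prediction for a lower bound. It deliberately omits the censoring adjustments of Candès et al. and the bootstrap construction proposed in the abstract, and the function name and inputs are hypothetical.

```python
import math


def conformal_lower_bound(cal_predictions, cal_outcomes, new_prediction, alpha=0.1):
    """Split-conformal lower prediction bound with marginal coverage of at least 1 - alpha.

    Assumes the calibration pairs and the new subject are exchangeable and that
    every calibration outcome is fully observed (no censoring).
    """
    # Score = how far the model's prediction overshoots the observed outcome.
    scores = sorted(p - y for p, y in zip(cal_predictions, cal_outcomes))
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))   # rank of the calibration quantile
    if k > n:
        return -math.inf                    # too few calibration points for this alpha
    return new_prediction - scores[k - 1]


# Illustrative calibration set: predicted versus observed survival times.
predicted = [10, 10, 10, 10, 10, 10, 10, 10, 10]
observed = [1, 2, 3, 4, 5, 6, 7, 8, 9]
print(conformal_lower_bound(predicted, observed, new_prediction=12, alpha=0.5))
```

With censoring, the scores for censored subjects are not directly computable, which is the difficulty the resampling approach in the abstract is designed to handle.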

Citations: 0
Estimating optimally tailored active surveillance strategy under interval censoring.
IF 1.4 Zone 4 (Mathematics) Q3 BIOLOGY Pub Date: 2025-04-02 DOI: 10.1093/biomtc/ujaf067 Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12123698/pdf/
Muxuan Liang, Yingqi Zhao, Daniel W Lin, Matthew Cooperberg, Yingye Zheng

Active surveillance (AS) using repeated biopsies to monitor disease progression has been a popular alternative to immediate surgical intervention in cancer care. However, a biopsy procedure is invasive and sometimes leads to severe side effects such as infection and bleeding. To reduce the burden of repeated surveillance biopsies, biomarker-assisted decision rules are sought to replace the fix-for-all regimen with tailored biopsy intensity for individual patients. Constructing or evaluating such decision rules is challenging. The key AS outcome is often ascertained subject to interval censoring. Furthermore, patients will discontinue participation in the AS study once they receive a positive surveillance biopsy. Thus, patient dropout is affected by the outcomes of these biopsies. This work proposes a non-parametric kernel-based method to estimate a tailored AS strategy's true positive rates (TPRs) and true negative rates (TNRs), accounting for interval censoring and immediate dropouts. We develop a weighted classification framework based on these estimates to estimate the optimally tailored AS strategy and further incorporate the cost-benefit ratio for cost-effectiveness in medical decision-making. Theoretically, we provide a uniform generalization error bound of the derived AS strategy, accommodating all possible trade-offs between TPRs and TNRs. Simulation and application to a prostate cancer surveillance study show the superiority of the proposed method.

Citations: 0
Towards efficient and interpretable assumption-lean generalized linear modeling of continuous exposure effects.
IF 1.4 Zone 4 Mathematics Q3 BIOLOGY Pub Date: 2025-04-02 DOI: 10.1093/biomtc/ujaf071
Stijn Vansteelandt

Advances in causal inference have largely ignored continuous exposures, apart from model-based approaches, which face criticism due to potential model misspecification. Model-free approaches based on modified treatment policies, such as uniformly shifting each subject's observed exposure, have emerged as promising alternatives. However, because such interventions are impractical, it is necessary to evaluate a range of possible shifts to generate actionable insights. To address this, we introduce models that parameterize the effects of shift interventions across varying magnitudes, coupled with assumption-lean estimation strategies. To ensure validity and interpretability under model misspecification, we tailor these to minimize (squared) bias in estimating the effects of realistic shifts. We employ debiased machine learning procedures for this but observe them to exhibit erratic behavior under certain data-generating mechanisms, prompting two key innovations. First, we propose a broadly applicable debiasing procedure that yields estimators with significantly improved finite-sample properties and is of independent methodological interest. Second, we develop debiased machine learning estimators for estimands with a more favorable efficiency bound, but more nuanced interpretation when models are misspecified. Unlike existing projection estimators, our methods avoid inverse exposure density weighting and do not demand tailored shift interventions to address positivity violations. Extensive simulations and a re-analysis of the Bangladesh WASH Benefits study demonstrate the effectiveness, stability, and utility of our approach. This work advances assumption-lean methods that balance validity, interpretability, and efficiency.
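
To make the idea of a shift intervention concrete, the sketch below estimates the effect of moving every subject's exposure from A to A + delta over a range of delta values. It is a naive plug-in estimator built on a linear outcome regression, not the debiased machine learning estimator the abstract describes. The function name `shift_effects`, the simulated data, and the linear model are all assumptions made for illustration.

```python
import numpy as np

def shift_effects(a, l, y, deltas):
    """Plug-in estimate of E[Y(A + delta)] - E[Y(A)] for each delta.

    Assumes (for illustration only) a linear outcome model
    E[Y | A, L] = b0 + b1 * A + b2 * L fitted by least squares.
    """
    X = np.column_stack([np.ones_like(a), a, l])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    baseline = X @ beta  # fitted outcome at the observed exposure
    effects = []
    for d in deltas:
        # Same design matrix, but with the exposure column shifted by d.
        X_shift = np.column_stack([np.ones_like(a), a + d, l])
        effects.append(float(np.mean(X_shift @ beta - baseline)))
    return np.array(effects)

# Simulated data (hypothetical): confounder L affects both exposure and outcome.
rng = np.random.default_rng(0)
n = 5000
l = rng.normal(size=n)
a = 0.5 * l + rng.normal(size=n)
y = 1.0 + 2.0 * a + l + rng.normal(size=n)

deltas = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])
effects = shift_effects(a, l, y, deltas)
```

Because the simulated outcome is linear in the exposure with slope 2, the estimated effects should lie close to 2 * delta. Under a misspecified outcome model this plug-in approach can be biased, which is the concern that motivates assumption-lean alternatives.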

Biometrics, vol. 81, no. 2. Journal Article.
Citations: 0
Bayesian covariate-dependent graph learning with a dual group spike-and-slab prior.
IF 1.4 Zone 4 Mathematics Q3 BIOLOGY Pub Date: 2025-04-02 DOI: 10.1093/biomtc/ujaf053
Zijian Zeng, Meng Li, Marina Vannucci

Covariate-dependent graph learning has gained increasing interest in the graphical modeling literature for the analysis of heterogeneous data. This task, however, poses challenges to modeling, computational efficiency, and interpretability. The parameter of interest can be naturally represented as a 3-dimensional array with elements that can be grouped according to 2 directions, corresponding to node level and covariate level, respectively. In this article, we propose a novel dual group spike-and-slab prior that enables multi-level selection at covariate-level and node-level, as well as individual (local) level sparsity. We introduce a nested strategy with specific choices to address distinct challenges posed by the various grouping directions. For posterior inference, we develop a full Gibbs sampler for all parameters, which mitigates the difficulties of parameter tuning often encountered in high-dimensional graphical models and facilitates routine implementation. Through simulation studies, we demonstrate that the proposed model outperforms existing methods in its accuracy of graph recovery. We show the practical utility of our model via an application to microbiome data where we seek to better understand the interactions among microbes as well as how these are affected by relevant covariates.
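
The three levels of sparsity mentioned above can be pictured with a small sampler. The sketch below draws a 3-dimensional coefficient array in which an entry is nonzero only if its covariate-level, node-level, and local indicators are all switched on. This is a generic nested spike-and-slab construction, not the specific prior of the article. The indexing convention `beta[j, i, k]` (node `j`, neighbor `i`, covariate `k`), the inclusion probabilities, and the Gaussian slab are assumptions.

```python
import numpy as np

def sample_dual_group_prior(p, q, pi_cov=0.5, pi_node=0.5, pi_local=0.5,
                            slab_sd=1.0, seed=0):
    """Draw a (p, p, q) array with covariate-, node-, and local-level sparsity.

    Hypothetical convention: beta[j, i, k] is the effect of covariate k on
    the dependence of node j on node i.
    """
    rng = np.random.default_rng(seed)
    cov_on = rng.random(q) < pi_cov                # covariate k is active at all
    node_on = rng.random((p, q)) < pi_node         # node j responds to covariate k
    local_on = rng.random((p, p, q)) < pi_local    # individual entry is active
    slab = rng.normal(0.0, slab_sd, size=(p, p, q))

    # An entry survives only if every level that contains it is active.
    active = local_on & node_on[:, None, :] & cov_on[None, None, :]
    beta = np.where(active, slab, 0.0)

    # No self-edges: zero out beta[j, j, k] for all j and k.
    idx = np.arange(p)
    beta[idx, idx, :] = 0.0
    return beta, cov_on, node_on

beta, cov_on, node_on = sample_dual_group_prior(p=5, q=4, seed=1)
```

Switching off a covariate-level indicator zeroes an entire slice `beta[:, :, k]`, while switching off a node-level indicator zeroes one row `beta[j, :, k]`, which illustrates why the two grouping directions overlap and need to be handled jointly.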

Biometrics, vol. 81, no. 2. Journal Article.
Citations: 0