
Statistics in Medicine: Latest Articles

A Commentary on Chatterjee et al. (2018): A Corrected Framework for Group Sparsity in Zero-Inflated Negative Binomial Models.
IF 1.8 | Zone 4, Medicine | Q3 MATHEMATICAL & COMPUTATIONAL BIOLOGY | Pub Date: 2025-12-01 | DOI: 10.1002/sim.70356
Adam Iqbal, Himel Mallick, Emmanuel O Ogundimu

We reexamine GOOOGLE, the group-regularized zero-inflated negative binomial (ZINB) approach of Chatterjee et al. We show that in the released implementation, the tuning parameter is selected using a Bayesian information criterion (BIC) computed on a Gaussian surrogate. Because the unpenalized model fits this surrogate exactly (with zero residual sum of squares), the BIC always favors essentially unpenalized solutions and fails to induce group sparsity. This results in zero group specificity in simulations mirroring the original paper's design. We demonstrate that this issue is resolved by selecting the tuning parameter using the true ZINB log-likelihood. Furthermore, we propose the fully iterative group broken adaptive ridge (grBAR) estimator as a more robust alternative. Our open-source R package, group regularization algorithms for zero-inflated models (GRAZIMs), provides these tools to enable reliable group selection in ZINB models.
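
The degeneracy described above can be illustrated with a minimal sketch. It assumes the common Gaussian form BIC = n log(RSS/n) + df log(n); the exact formula in the released implementation is not given here, and the tuning path below (labels, RSS values, degrees of freedom) is entirely hypothetical.

```python
import math


def gaussian_bic(rss: float, n: int, df: int) -> float:
    """BIC of a Gaussian working model: n * log(RSS / n) + df * log(n)."""
    return n * math.log(rss / n) + df * math.log(n)


n = 200
# Hypothetical tuning path: (label, RSS on the surrogate, degrees of freedom).
path = [
    ("heavy penalty", 50.0, 3),
    ("moderate penalty", 5.0, 10),
    ("near-unpenalized", 1e-12, 40),  # surrogate fitted almost exactly
]

scores = {label: gaussian_bic(rss, n, df) for label, rss, df in path}
best = min(scores, key=scores.get)  # BIC picks the smallest score
```

As RSS approaches zero, the n log(RSS/n) term diverges to minus infinity, so no finite complexity penalty df log(n) can compensate and the least-penalized fit is always selected.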

Statistics in Medicine, 44(28-30), e70356. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12703072/pdf/
Citations: 0
A Random-Effects Approach to Generalized Linear Mixed Model Analysis of Incomplete Longitudinal Data.
IF 1.8 | Zone 4, Medicine | Q3 MATHEMATICAL & COMPUTATIONAL BIOLOGY | Pub Date: 2025-12-01 | DOI: 10.1002/sim.70343
Thuan Nguyen, Jiangshan Zhang, Jiming Jiang

We propose a random-effects approach to missing values for generalized linear mixed model (GLMM) analysis of longitudinal data. The method converts a GLMM with missing covariates to another GLMM without missing covariates. The standard GLMM analysis tools then apply. The method applies, in particular, to the cases of linear mixed models (LMMs) and logistic regression. Performance of the method is evaluated empirically, and compared with alternative approaches, including the popular MICE procedure of multiple imputation (MI). Theoretical justification of the method is given, and explained, for the patterns observed in the simulation studies. Two real-data examples from healthcare studies are discussed.

Statistics in Medicine, 44(28-30), e70343.
Citations: 0
Clustering-Informed Shared-Structure Variational Autoencoder for Missing Data Imputation in Large-Scale Healthcare Data.
IF 1.8 | Zone 4, Medicine | Q3 MATHEMATICAL & COMPUTATIONAL BIOLOGY | Pub Date: 2025-12-01 | DOI: 10.1002/sim.70335
Yasin Khadem Charvadeh, Kenneth Seier, Katherine S Panageas, Danielle Vaithilingam, Mithat Gönen, Yuan Chen

Despite advancements in healthcare data management, missing data in electronic health records (EHR) and patient-reported outcomes remain a persistent challenge, limiting their usability in healthcare analytics. Conventional imputation methods often struggle to capture complex nonlinear relationships, require extensive computation time, and are limited in addressing various types of missing data mechanisms. To overcome these challenges, we propose the clustering-informed shared-structure variational autoencoder (CISS-VAE), which utilizes the strengths of Bayesian neural networks. This model can effectively capture complex associations and accommodate various missing data mechanisms, including missing not at random (MNAR). We also develop iterative learning algorithms that further enhance missing data imputation accuracy while preventing overfitting. Comprehensive simulations demonstrate the superior accuracy of our model compared to traditional and contemporary methods. We apply our method to EHR data from early-stage breast cancer patients at Memorial Sloan Kettering Cancer Center, aiming to mitigate the impact of missing data and enhance health monitoring and analyses.

Statistics in Medicine, 44(28-30), e70335.
Citations: 0
Likelihood-Based Non-Parametric Receiver Operating Characteristic Curve Analysis in the Presence of Imperfect Reference Standard.
IF 1.8 | Zone 4, Medicine | Q3 MATHEMATICAL & COMPUTATIONAL BIOLOGY | Pub Date: 2025-12-01 | DOI: 10.1002/sim.70327
Yifan Sun, Peijun Sang, Qinglong Tian, Pengfei Li

In diagnostic studies, researchers frequently encounter imperfect reference standards with some misclassified labels. Treating these as gold standards can bias receiver operating characteristic (ROC) curve analysis. To address this issue, we propose a novel likelihood-based method under a non-parametric density ratio model. This approach enables the reliable estimation of the ROC curve, area under the curve (AUC), partial AUC, and Youden's index with favorable statistical properties. To implement the method, we develop an efficient expectation-maximization (EM) algorithm. Extensive simulations evaluate its finite-sample performance, showing smaller mean squared errors in estimating the ROC curve, partial AUC, and Youden's index compared to existing methods. We apply the proposed approach to a malaria study.
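
For orientation, the quantities named above can be written down in their standard empirical form. The sketch below is not the density ratio/EM estimator of the paper: it assumes a perfect reference standard, and the function names and toy scores are hypothetical.

```python
def empirical_auc(neg, pos):
    """Mann-Whitney estimate of P(score_pos > score_neg); ties count 1/2."""
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def youden_index(neg, pos):
    """Maximize sensitivity(c) + specificity(c) - 1 over observed cutoffs c.

    A subject is called positive when its score is strictly greater than c.
    Returns (J, cutoff); the first maximizing cutoff is kept.
    """
    best_j, best_c = 0.0, None
    for c in sorted(set(neg) | set(pos)):
        sens = sum(p > c for p in pos) / len(pos)
        spec = sum(q <= c for q in neg) / len(neg)
        if sens + spec - 1 > best_j:
            best_j, best_c = sens + spec - 1, c
    return best_j, best_c


# Toy biomarker values for truly non-diseased and truly diseased subjects.
neg = [0.1, 0.2, 0.3, 0.4]
pos = [0.35, 0.5, 0.6, 0.7]

auc = empirical_auc(neg, pos)
j, cutoff = youden_index(neg, pos)
```

When some labels in `neg` and `pos` are misclassified, these plug-in estimates are biased, which is the situation the likelihood-based method is designed to handle.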

Statistics in Medicine, 44(28-30), e70327. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12680895/pdf/
Citations: 0
A Tutorial for Propensity Score Weighting Methods Under Violations of the Positivity Assumption.
IF 1.8 | Zone 4, Medicine | Q3 MATHEMATICAL & COMPUTATIONAL BIOLOGY | Pub Date: 2025-12-01 | DOI: 10.1002/sim.70329
Yi Liu, Yuan Wang, Ying Gao, Tonia Poteat, Roland A Matsouaka

Violations of the positivity assumption can render conventional causal estimands unidentifiable, including the average treatment effect (ATE), the average treatment effect on the treated (ATT), and the average treatment effect on the controls (ATC). Shifting the inferential focus to their alternative counterparts, namely the weighted ATE (WATE), the weighted ATT (WATT), and the weighted ATC (WATC), offers valuable insights into treatment effects while preserving internal validity. In this tutorial, we provide a comprehensive review of recent advances in propensity score (PS) weighting methods, along with practical guidance on how to select a primary target estimand (while other estimands serve as supplementary analyses), implement the corresponding PS-weighted estimators, and conduct post-weighting diagnostic assessments. The tutorial is accompanied by a user-friendly R package, ChiPS. We demonstrate the pertinence of various estimators through extensive simulation studies. We illustrate the flow of the tutorial on two real-world case studies: (i) Effect of smoking on blood lead level using data from the 2007-2008 National Health and Nutrition Examination Survey (NHANES); and (ii) Impact of history of sex work on HIV status among transgender women in South Africa.
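
A minimal sketch of how weighted estimands of this kind are commonly computed, assuming the balancing-weights formulation in which a tilting function h(e) of the propensity score e defines the target population. It does not use or reproduce the ChiPS package; the function names and toy data are hypothetical, and the propensity scores are treated as known.

```python
def balancing_weight(e: float, z: int, target: str = "overlap") -> float:
    """Weight h(e)/e for treated (z=1) and h(e)/(1-e) for controls (z=0)."""
    tilt = {
        "ate": 1.0,                # h(e) = 1: inverse probability weights
        "att": e,                  # h(e) = e
        "atc": 1.0 - e,            # h(e) = 1 - e
        "overlap": e * (1.0 - e),  # h(e) = e(1 - e): overlap weights
    }[target]
    return tilt / e if z == 1 else tilt / (1.0 - e)


def weighted_effect(y, z, e, target="overlap"):
    """Hajek-type estimator: difference of weighted arm means."""
    w = [balancing_weight(ei, zi, target) for ei, zi in zip(e, z)]

    def arm_mean(arm):
        num = sum(wi * yi for wi, yi, zi in zip(w, y, z) if zi == arm)
        den = sum(wi for wi, zi in zip(w, z) if zi == arm)
        return num / den

    return arm_mean(1) - arm_mean(0)


# Toy data with extreme propensity scores and a constant effect of 2.
e = [0.01, 0.30, 0.50, 0.70, 0.99, 0.60]
z = [1, 0, 1, 0, 0, 1]
y = [1.0 + 2.0 * zi for zi in z]
```

With e = 0.01, a treated unit receives an ATE weight of 100 but an overlap weight of 0.99, which illustrates why weighted estimands are more stable when positivity is nearly violated.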

Statistics in Medicine, 44(28-30), e70329. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12659692/pdf/
Citations: 0
Using Individualized Treatment Effects to Assess Treatment Effect Heterogeneity.
IF 1.8 | Zone 4, Medicine | Q3 MATHEMATICAL & COMPUTATIONAL BIOLOGY | Pub Date: 2025-12-01 | DOI: 10.1002/sim.70324
Konstantinos Sechidis, Cong Zhang, Sophie Sun, Yao Chen, Asher Spector, Björn Bornkamp

Assessing treatment effect heterogeneity (TEH) in clinical trials is crucial, as it provides insights into the variability of treatment responses among patients, influencing key decisions related to drug development. Furthermore, it can lead to personalized medicine by tailoring treatments to individual patient characteristics. This paper introduces novel methodologies for assessing treatment effects using the individualized treatment effect as a basis. To estimate this effect, we use a doubly robust (DR) learner to infer a pseudo-outcome that reflects the causal contrast. This pseudo-outcome is then used to perform three objectives: (1) a global test for heterogeneity, (2) ranking covariates based on their influence on effect modification, and (3) providing estimates of the individualized treatment effect. We compare the DR-learner with various alternatives and competing methods in a simulation study, and also use it to assess heterogeneity in a pooled analysis of five Phase III trials in psoriatic arthritis (PsA). By integrating these methods with the recently proposed Workflow to Assess Treatment Effect Heterogeneity in Drug Development for Clinical Trial Sponsors (WATCH) workflow, we provide a robust framework for analyzing TEH, offering insights that enable more informed decision-making in this challenging area.
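
The pseudo-outcome mentioned above is usually taken to be the augmented inverse probability weighted (AIPW) score. The sketch below assumes that standard form and that the nuisance estimates (outcome predictions `mu0`, `mu1` and treatment probability `pi`) are already available; how the paper fits them is not shown, and all names and numbers are illustrative.

```python
def dr_pseudo_outcome(y, a, pi, mu0, mu1):
    """Doubly robust pseudo-outcome for one patient.

    y   : observed outcome
    a   : treatment indicator (1 = treated, 0 = control)
    pi  : probability of treatment given covariates
    mu0 : predicted outcome under control
    mu1 : predicted outcome under treatment
    """
    return (mu1 - mu0) + a * (y - mu1) / pi - (1 - a) * (y - mu0) / (1.0 - pi)


# Toy 1:1 randomized trial (pi = 0.5) with hypothetical outcome predictions.
patients = [
    # (y, a, pi, mu0, mu1)
    (3.0, 1, 0.5, 1.0, 2.0),
    (0.0, 0, 0.5, 1.0, 2.0),
    (2.0, 1, 0.5, 1.0, 2.0),
]
pseudo = [dr_pseudo_outcome(*p) for p in patients]
average_effect = sum(pseudo) / len(pseudo)
```

Regressing such pseudo-outcomes on baseline covariates gives individualized effect estimates, and their mean estimates the average treatment effect; when the outcome prediction is exact, the correction terms vanish and the pseudo-outcome reduces to mu1 - mu0.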

Statistics in Medicine, 44(28-30), e70324.
Citations: 0
Semiparametric Partial Functional Regression Model for Estimating Optimal Individualized Treatment Regime.
IF 1.8 | Zone 4, Medicine | Q3 MATHEMATICAL & COMPUTATIONAL BIOLOGY | Pub Date: 2025-12-01 | DOI: 10.1002/sim.70355
Kaidi Kong, Li Guan, Zhongzhan Zhang

Estimating the optimal individualized treatment regimes based on patients' characteristic information has become an increasingly important topic in personalized medicine study. These characteristic data can range from simple scalar values to complex functional data such as curves or images, which might be taken into account to recommend more beneficial treatment decisions for patients. In this paper, we propose a novel semiparametric partial functional regression model for estimating the optimal individualized treatment regimes with scalar and functional covariates. One advantage of this model is that it involves a fully nonparametric main effect of the covariates and a flexible interaction effect between the covariates and the treatment, and greatly reduces the risk of model misspecification. The form of single index interaction effect with a monotone link function ensures the estimated optimal individualized treatment regime preserving good interpretability. We estimate this model via B-spline and establish the convergence rate of the estimated optimal individualized treatment regime. Sufficient simulation studies and a real data analysis are conducted to assess the performance of the proposed method.

Statistics in Medicine, 44(28-30), e70355.
Citations: 0
A New Streaming K-Nearest Neighbor Algorithm for Status Prediction in Block-Sparse, Autocorrelated, Irregular Longitudinal Data.
IF 1.8 | Zone 4, Medicine | Q3 MATHEMATICAL & COMPUTATIONAL BIOLOGY | Pub Date: 2025-12-01 | DOI: 10.1002/sim.70332
Xin Zhao, Xiaokai Nie, Yu Zhao, Kaida Cai

In streaming longitudinal data, status prediction becomes challenging when input variables are block-sparse, autocorrelated, and irregular in both dimension and distribution. General methods cannot model such data directly, especially when the classes are extremely imbalanced. This research proposes a K-Nearest Neighbor (KNN) algorithm where distance is measured by Kullback-Leibler (KL) divergence. The algorithm uses features extracted from metric conditional density, both with and without first-order lag. The developed streaming KNN algorithm is further applied to simulation data. Results show that when differences originate from the location hyperparameters of the Gaussian distribution or both the shape and scale hyperparameters of the inverse gamma distribution, the method performs quite well, as expected, with an AUC close to 1. Additionally, a numerical method is proposed for general distributions that lack an analytical expression in real data. This method is applied to a big medical streaming dataset with similar properties. Results indicate that the AUC value gradually increases to 0.913, with a sensitivity of 0.851 and a specificity of 0.816.
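
To make the idea of a KL-based neighbor search concrete, here is a minimal sketch under a strong simplifying assumption: each subject is summarized by a univariate Gaussian (mean, standard deviation), for which the KL divergence has a closed form. The paper's conditional-density features, lag handling, and streaming updates are not reproduced, and the data are made up.

```python
import math
from collections import Counter


def kl_gaussian(m1, s1, m2, s2):
    """KL( N(m1, s1^2) || N(m2, s2^2) ) in closed form."""
    return math.log(s2 / s1) + (s1**2 + (m1 - m2) ** 2) / (2 * s2**2) - 0.5


def knn_predict(query, train, k=3):
    """Majority vote among the k training items closest to `query` in KL."""
    ranked = sorted(train, key=lambda item: kl_gaussian(*query, *item[0]))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical training set: ((mean, sd), status label).
train = [
    ((0.0, 1.0), 0),
    ((0.2, 1.1), 0),
    ((-0.1, 0.9), 0),
    ((3.0, 1.0), 1),
    ((3.2, 1.2), 1),
]

predicted = knn_predict((2.9, 1.0), train, k=3)
```

KL divergence is not symmetric, so the direction (query relative to training item, as here, or the reverse) is a design choice that should be fixed and reported.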

Statistics in Medicine, 44(28-30), e70332.
Citations: 0
Joint Bayesian Hidden Markov Model With Subject-Specific Transitions for Wearable Sensor Data.
IF 1.8 | Zone 4, Medicine | Q3 MATHEMATICAL & COMPUTATIONAL BIOLOGY | Pub Date: 2025-12-01 | DOI: 10.1002/sim.70334
Wenbo Fei, Zhen Miao, Tianchen Xu, Yuanjia Wang

With the rapid advancements in wearable device technologies, there is a growing interest in learning useful digital biomarkers from wearable device data as objective, low-cost, real-time alternatives to use in healthcare settings. They have the potential to facilitate disease progression monitoring, medication tailoring, and supplementing clinical trial endpoints. For example, triaxial accelerometer sensor data is promising for monitoring symptoms of movement-related diseases, such as tremors in Parkinson's disease (PD). However, existing methods for accelerometer studies based on hidden Markov models (HMM) often analyze each individual's activity data separately, leading to inefficiency and limited generalizability. This paper proposes a joint nonparametric Bayesian method that extends the hierarchical Dirichlet process autoregressive HMM (HDP-AR-HMM) to incorporate subject-specific transition parameters. This approach allows for simultaneous estimation across multiple subjects and repeated measurements, accounts for between-subject variability, and provides consistent hidden state estimation without pre-specifying the number of states. We validate our method on simulated data and show that it can achieve higher accuracy in detecting the true hidden states compared to alternative methods. We apply the method to a free-living study, the Biomarker & Endpoint Assessment to Track Parkinson's disease (BEAT-PD) DREAM Challenge CIS-PD study, to demonstrate its utility in monitoring disease symptoms in PD patients.

With rapid advances in wearable device technologies, there is growing interest in learning useful digital biomarkers from wearable device data as objective, low-cost, real-time alternatives for use in healthcare settings. Such biomarkers have the potential to facilitate disease progression monitoring and medication tailoring and to supplement clinical trial endpoints. For example, triaxial accelerometer sensor data are promising for monitoring symptoms of movement-related diseases, such as tremors in Parkinson's disease (PD). However, existing methods for accelerometer studies based on hidden Markov models (HMMs) often analyze each individual's activity data separately, leading to inefficiency and limited generalizability. This paper proposes a joint nonparametric Bayesian method that extends the hierarchical Dirichlet process autoregressive HMM (HDP-AR-HMM) to incorporate subject-specific transition parameters. This approach allows simultaneous estimation across multiple subjects and repeated measurements, accounts for between-subject variability, and provides consistent hidden state estimation without pre-specifying the number of states. We validate our method on simulated data and show that it can achieve higher accuracy in detecting the true hidden states than alternative methods. We apply the method to a free-living study, the Biomarker & Endpoint Assessment to Track Parkinson's disease (BEAT-PD) DREAM Challenge CIS-PD study, to demonstrate its utility in monitoring disease symptoms in PD patients.
Citations: 0
On "Confirmatory" Methodological Research in Statistics and Related Fields.
IF 1.8 Zone 4 Medicine Q3 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date: 2025-11-01 DOI: 10.1002/sim.70303
F J D Lange, Juliane C Wilcke, Sabine Hoffmann, Moritz Herrmann, Anne-Laure Boulesteix

Empirical substantive research, such as in the life or social sciences, is commonly categorized into two modes, exploratory and confirmatory, both of which are essential to scientific progress. The former is also referred to as hypothesis-generating or data-contingent research, while the latter is also called hypothesis-testing research. In the context of empirical methodological research in statistics, however, the exploratory-confirmatory distinction has received very little attention so far. Our paper aims to fill this gap. First, we revisit the concept of empirical methodological research through the lens of the exploratory-confirmatory distinction. Second, we examine current practice with respect to this distinction through a literature survey including 115 articles from the field of biostatistics. Third, we provide practical recommendations toward a more appropriate design, interpretation, and reporting of empirical methodological research in light of this distinction. In particular, we argue that both modes of research are crucial to methodological progress, but that most published studies, even if sometimes disguised as confirmatory, are essentially exploratory in nature. We emphasize that it may be adequate to consider empirical methodological research as a continuum between "pure" exploration and "strict" confirmation, recommend transparently reporting the mode of conducted research within the spectrum between exploratory and confirmatory, and stress the importance of study protocols written before conducting the study, especially in confirmatory methodological research.

Citations: 0