
Biometrics: Latest Articles

Entrywise splitting cross-validation in generalized factor models: from sample splitting to entrywise splitting.
IF 1.7; Zone 4 (Mathematics); Q3 BIOLOGY; Pub Date: 2025-10-08; DOI: 10.1093/biomtc/ujaf153
Zhijing Wang

The generalized factor models have been widely employed for dimension reduction across various types of multivariate data, including binary choices, counts, and continuous observations. While determining the number of factors in such models has received significant scholarly attention, it remains an open challenge in the field. In this paper, we propose a cross-validation (CV) method based on entrywise splitting (ES), rather than sample splitting, to address this problem. Similar to traditional cross-validation, this approach primarily prevents the underestimation of the number of factors. We then introduce a penalized entrywise splitting cross-validation criterion, which integrates the original CV with information theoretic criteria by adding a penalty term. Its consistency is established under mild conditions in a high-dimensional setting, where both the sample size and the number of features grow to infinity. Furthermore, we extend our methodology to random missing data with different probability scenarios. We evaluate the performance of the proposed method through comprehensive simulations and apply it to a mouse brain single-cell RNA sequencing dataset.
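
The idea of splitting entries rather than samples can be sketched in a simplified setting. The Python example below assumes a Gaussian factor model with squared-error loss and uses iterative SVD imputation to fit on the retained entries; the function names and these modeling choices are illustrative assumptions, not the estimator or the penalized criterion of the paper.

```python
import numpy as np

def fit_low_rank(X, observed, rank, n_iter=50):
    """Rank-`rank` approximation fitted only to entries where `observed` is True.

    Uses iterative SVD imputation: held-out entries are repeatedly replaced
    by the current low-rank fit, so their true values never inform the fit.
    """
    filled = np.where(observed, X, X[observed].mean())
    approx = filled
    for _ in range(n_iter):
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        approx = (U[:, :rank] * s[:rank]) @ Vt[:rank]
        filled = np.where(observed, X, approx)
    return approx

def entrywise_cv_errors(X, ranks, n_folds=5, seed=0):
    """Mean squared held-out error for each candidate number of factors."""
    rng = np.random.default_rng(seed)
    # Assign each matrix entry (not each row) to a fold.
    fold_of_entry = rng.integers(n_folds, size=X.shape)
    errors = {}
    for r in ranks:
        sq_err = 0.0
        for k in range(n_folds):
            held_out = fold_of_entry == k
            approx = fit_low_rank(X, ~held_out, r)
            sq_err += np.sum((X[held_out] - approx[held_out]) ** 2)
        errors[r] = sq_err / X.size
    return errors
```

Choosing the rank with the smallest held-out error gives the unpenalized rule; a penalty on the number of factors, as the abstract describes, would be added to these errors before taking the minimum.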

Citations: 0
Flexible Bayesian quantile regression for counts via generative modeling.
IF 1.7; Zone 4 (Mathematics); Q3 BIOLOGY; Pub Date: 2025-10-08; DOI: 10.1093/biomtc/ujaf152
Yuta Yamauchi, Genya Kobayashi, Shonosuke Sugasawa

Count data frequently arise in biomedical applications, such as the length of hospital stay. However, their discrete nature poses significant challenges for appropriately modeling conditional quantiles, which are crucial for understanding heterogeneous effects and variability in outcomes. To address this practical difficulty, we propose a novel general Bayesian framework for quantile regression tailored to count data. We seek the regression parameter on the conditional quantile by minimizing the expected loss with respect to the distribution of the conditional quantile of the latent continuous variable associated with the observed count response variable. By modeling the unknown conditional distribution through a Bayesian nonparametric kernel mixture for the joint distribution of the count response and covariates, we obtain the posterior distribution of the regression parameter via a simple optimization. We numerically demonstrate that the proposed method reduces bias and improves estimation accuracy relative to the existing crude approaches to count quantile regression. Furthermore, we analyze the length of hospital stay for acute myocardial infarction and demonstrate that the proposed method gives more interpretable and flexible results than the existing ones.
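
The loss being minimized in quantile regression is the check (pinball) loss. The Python sketch below shows only that generic loss and a grid search for the best constant quantile; the function names are hypothetical, and the latent-variable and Bayesian nonparametric components of the paper are not represented.

```python
import numpy as np

def check_loss(residual, tau):
    """Quantile (pinball) loss: tau * u for u >= 0 and (tau - 1) * u for u < 0."""
    residual = np.asarray(residual, dtype=float)
    return residual * (tau - (residual < 0))

def best_constant_quantile(y, tau, grid):
    """Grid value that minimizes the average check loss over the sample."""
    losses = [check_loss(y - q, tau).mean() for q in grid]
    return grid[int(np.argmin(losses))]

# Small count sample; the median of these seven values is 2.
counts = np.array([0, 1, 1, 2, 3, 5, 8])
median_estimate = best_constant_quantile(counts, 0.5, np.arange(9))
```

For count responses the average check loss is piecewise linear with kinks at the observed integers, so a minimizer can always be found at an observed count. This is one way to see why working with an associated continuous latent variable can be attractive.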

Citations: 0
Federated double machine learning for high-dimensional semiparametric models.
IF 1.7; Zone 4 (Mathematics); Q3 BIOLOGY; Pub Date: 2025-10-08; DOI: 10.1093/biomtc/ujaf150
Kai Kang, Zhihao Wu, Xinjie Qian, Xinyuan Song, Hongtu Zhu

Federated learning enables the training of a global model while keeping data localized; however, current methods face challenges with high-dimensional semiparametric models that involve complex nuisance parameters. This paper proposes a federated double machine learning framework designed to address high-dimensional nuisance parameters of semiparametric models in multicenter studies. Our approach leverages double machine learning (Chernozhukov et al., 2018a) to estimate center-specific parameters, extends the surrogate efficient score method within a Neyman-orthogonal framework, and applies density ratio tilting to create a federated estimator that combines local individual-level data with summary statistics from other centers. This methodology mitigates regularization bias and overfitting in high-dimensional nuisance parameter estimation. We establish the estimator's limiting distribution under minimal assumptions, validate its performance through extensive simulations, and demonstrate its effectiveness in analyzing multiphase data from the Alzheimer's Disease Neuroimaging Initiative study.
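
The single-center building block, double machine learning with cross-fitting, can be sketched for a partially linear model y = theta * d + g(X) + noise. In the Python example below the nuisance learners are plain least squares purely to keep the sketch self-contained; in practice flexible machine learning methods would take their place. The function names are hypothetical, and the federated steps (surrogate efficient score, density ratio tilting) are not shown.

```python
import numpy as np

def ols_predict(X_train, y_train, X_test):
    """Stand-in nuisance learner: ordinary least squares with an intercept."""
    A = np.column_stack([np.ones(len(X_train)), X_train])
    coef, *_ = np.linalg.lstsq(A, y_train, rcond=None)
    return np.column_stack([np.ones(len(X_test)), X_test]) @ coef

def dml_partially_linear(y, d, X, n_folds=2, seed=0):
    """Cross-fitted estimate of theta in y = theta * d + g(X) + noise."""
    n = len(y)
    folds = np.random.default_rng(seed).permutation(n) % n_folds
    y_res = np.empty(n)
    d_res = np.empty(n)
    for k in range(n_folds):
        test, train = folds == k, folds != k
        # Nuisance functions are fitted on the other folds only.
        y_res[test] = y[test] - ols_predict(X[train], y[train], X[test])
        d_res[test] = d[test] - ols_predict(X[train], d[train], X[test])
    # Neyman-orthogonal (partialling-out) score: regress residual on residual.
    return float(d_res @ y_res / (d_res @ d_res))
```

Because both the outcome and the treatment are residualized on held-out folds, first-order errors in the nuisance fits do not carry over to the estimate of theta, which is the property the abstract refers to as mitigating regularization bias and overfitting.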

Citations: 0
Bridging the gap between design and analysis: randomization inference and sensitivity analysis for matched observational studies with treatment doses.
IF 1.7; Zone 4 (Mathematics); Q3 BIOLOGY; Pub Date: 2025-10-08; DOI: 10.1093/biomtc/ujaf156
Jeffrey Zhang, Siyu Heng

Matching is a commonly used causal inference study design in observational studies. Through matching on measured confounders between different treatment groups, valid randomization inferences can be conducted under the no unmeasured confounding assumption, and sensitivity analysis can be further performed to assess robustness of results to potential unmeasured confounding. However, for many common matched designs, there is still a lack of valid downstream randomization inference and sensitivity analysis methods. Specifically, in matched observational studies with treatment doses (eg, continuous or ordinal treatments), with the exception of some special cases such as pair matching, there is no existing randomization inference or sensitivity analysis method for studying analogs of the sample average treatment effect (ie, Neyman-type weak nulls), and no existing valid sensitivity analysis approach for testing the sharp null of no treatment effect for any subject (ie, Fisher's sharp null) when the outcome is nonbinary. To fill these important gaps, we propose new methods for randomization inference and sensitivity analysis that can work for general matched designs with treatment doses, applicable to general types of outcome variables (eg, binary, ordinal, or continuous), and cover both Fisher's sharp null and Neyman-type weak nulls. We illustrate our methods via comprehensive simulation studies and a real data application. All the proposed methods have been incorporated into the R package doseSens.
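
Randomization inference for Fisher's sharp null in a matched design with doses can be sketched as a within-set permutation test. The Python example below assumes no unmeasured confounding, so that every assignment of the observed doses within a matched set is equally likely, and uses the sum of dose times outcome as the test statistic. Both choices and the function name are illustrative assumptions; this is not the interface of the doseSens package and it does not perform a sensitivity analysis.

```python
import numpy as np

def matched_dose_permutation_test(dose, outcome, matched_set, n_perm=2000, seed=0):
    """One-sided Monte Carlo p-value for the sharp null of no effect for any subject."""
    rng = np.random.default_rng(seed)
    dose = np.asarray(dose, dtype=float)
    outcome = np.asarray(outcome, dtype=float)
    matched_set = np.asarray(matched_set)
    # Index arrays of the subjects belonging to each matched set.
    sets = [np.flatnonzero(matched_set == s) for s in np.unique(matched_set)]
    observed = dose @ outcome
    n_at_least = 0
    for _ in range(n_perm):
        permuted = dose.copy()
        for idx in sets:
            # Under the sharp null, outcomes are fixed; shuffle doses within the set.
            permuted[idx] = rng.permutation(dose[idx])
        n_at_least += (permuted @ outcome) >= observed
    return (1 + n_at_least) / (1 + n_perm)
```

A small p-value indicates that larger doses go with larger outcomes more strongly than within-set reshuffling would produce by chance.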

Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12665973/pdf/
Citations: 0
Super learner for survival prediction in case-cohort and generalized case-cohort studies.
IF 1.7; Zone 4 (Mathematics); Q3 BIOLOGY; Pub Date: 2025-10-08; DOI: 10.1093/biomtc/ujaf155
Haolin Li, Haibo Zhou, David Couper, Jianwen Cai

The case-cohort study design is often used in modern epidemiological studies of rare diseases, as it can achieve efficiency similar to that of a much larger cohort study at a fraction of the cost. Previous work focused on parameter estimation for case-cohort studies based on a particular statistical model, but few studies have discussed the survival prediction problem under this type of design. In this article, we propose a super learner algorithm for survival prediction in case-cohort studies. We further extend our proposed algorithm to generalized case-cohort studies. The proposed super learner algorithm is shown to have asymptotic model selection consistency as well as uniform consistency. We also demonstrate that our algorithm has satisfactory finite-sample performance. Simulation studies suggest that the proposed super learners trained on data from case-cohort and generalized case-cohort studies have better prediction accuracy than those trained on data from a simple random sampling design with the same sample sizes. Finally, we apply the proposed method to analyze a generalized case-cohort study conducted as part of the Atherosclerosis Risk in Communities Study.

Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12665972/pdf/
Citations: 0
A semiparametric method for addressing underdiagnosis using electronic health record data.
IF 1.7; Zone 4 (Mathematics); Q3 BIOLOGY; Pub Date: 2025-10-08; DOI: 10.1093/biomtc/ujaf157
Weidong Ma, Jordana B Cohen, Jinbo Chen

Effective treatment of medical conditions begins with an accurate diagnosis. However, many conditions are often underdiagnosed, either being overlooked or diagnosed after significant delays. Electronic health records (EHRs) contain extensive patient health information, offering an opportunity to probabilistically identify underdiagnosed individuals. The rationale is that both diagnosed and underdiagnosed patients may display similar health profiles in EHR data, distinguishing them from condition-free patients. Thus, EHR data can be leveraged to develop models that assess an individual's risk of having a condition. To date, this opportunity has largely remained unexploited, partly due to the lack of suitable statistical methods. The key challenge is the positive-unlabeled EHR data structure, which consists of data for diagnosed ("positive") patients and the remaining ("unlabeled") that include underdiagnosed patients and many condition-free patients. Therefore, data for patients who are unambiguously condition-free, essential for developing risk assessment models, are unavailable. To overcome this challenge, we propose ascertaining condition statuses for a small subset of unlabeled patients. We develop a novel statistical method for building accurate models using this supplemented EHR data to estimate the probability that a patient has the condition of interest. We study the asymptotic properties of our method and assess its finite-sample performance through simulation studies. Finally, we apply our method to develop a preliminary model for identifying potentially underdiagnosed non-alcoholic steatohepatitis patients using data from Penn Medicine EHRs.

Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12665971/pdf/
Citations: 0
Correction to: Covariate-Adjusted Response-Adaptive Randomization for Multi-Arm Clinical Trials Using a Modified Forward Looking Gittins Index Rule.
IF 1.7; Zone 4 (Mathematics); Q3 BIOLOGY; Pub Date: 2025-10-08; DOI: 10.1093/biomtc/ujaf139
Citations: 0
Rejoinder to Letter to the Editors "Comments on 'Statistical inference on change points in generalized semiparametric segmented models' by Yang et al. (2025)" by Vito M.R. Muggeo.
IF 1.7; Zone 4 (Mathematics); Q3 BIOLOGY; Pub Date: 2025-10-08; DOI: 10.1093/biomtc/ujaf148
Guangyu Yang, Min Zhang
Citations: 0
Large row-constrained supersaturated designs for high-throughput screening.
IF 1.7; Zone 4 (Mathematics); Q3 BIOLOGY; Pub Date: 2025-10-08; DOI: 10.1093/biomtc/ujaf160
Byran J Smucker, Stephen E Wright, Isaac Williams, Richard C Page, Andor J Kiss, Surendra Bikram Silwal, Maria Weese, David J Edwards

High-throughput screening, in which large numbers of compounds are traditionally studied one-at-a-time in multiwell plates against specific targets, is widely used across many areas of the biological sciences, including drug discovery. To improve the effectiveness of these screens, we propose a new class of supersaturated designs that guide the construction of pools of compounds in each well. Because the size of the pools is typically limited by the particular application, the new designs accommodate this constraint and are part of a larger procedure that we call Constrained Row Screening or CRowS. We develop an efficient computational procedure to construct the CRowS designs, provide some initial lower bounds on the average squared off-diagonal values of their main-effects information matrix, and study the impact of the constraint on design quality. We also show via simulation that CRowS is statistically superior to the traditional one-compound-one-well approach as well as an existing pooling method, and demonstrate the use of the new methodology on a Verona Integron-encoded Metallo-β-lactamase-2 assay.
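
The row constraint and the design-quality measure mentioned above can be made concrete with a small sketch. The Python example below assumes a 0/1 coding (1 if a compound is in a well), takes the information matrix to be X'X without an intercept, and draws pools at random rather than optimizing them; all three choices and the function names are assumptions for illustration, not the CRowS construction.

```python
import numpy as np

def random_row_constrained_design(n_wells, n_compounds, pool_size, seed=0):
    """0/1 design matrix in which every well (row) pools exactly `pool_size` compounds."""
    rng = np.random.default_rng(seed)
    X = np.zeros((n_wells, n_compounds), dtype=int)
    for row in X:  # each `row` is a view, so assignment writes into X
        row[rng.choice(n_compounds, size=pool_size, replace=False)] = 1
    return X

def avg_squared_off_diagonal(X):
    """Average of the squared off-diagonal entries of X'X."""
    M = X.T @ X
    off = M - np.diag(np.diag(M))
    m = M.shape[0]
    return float((off ** 2).sum() / (m * (m - 1)))

# Supersaturated case: more compounds (50) than wells (20), at most 5 per well.
design = random_row_constrained_design(n_wells=20, n_compounds=50, pool_size=5)
quality = avg_squared_off_diagonal(design)
```

Smaller values of this average mean that compound columns overlap less across wells, so a construction procedure would search for designs that lower it while respecting the pool-size limit.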

Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12696866/pdf/
Citations: 0
A meta-learning method for estimation of causal excursion effects to assess time-varying moderation.
IF 1.7; Zone 4 (Mathematics); Q3 BIOLOGY; Pub Date: 2025-10-08; DOI: 10.1093/biomtc/ujaf129
Jieru Shi, Walter Dempsey

Advances in wearable technologies and health interventions delivered by smartphones have greatly increased the accessibility of mobile health (mHealth) interventions. Micro-randomized trials (MRTs) are designed to assess the effectiveness of the mHealth intervention and introduce a novel class of causal estimands called "causal excursion effects." These estimands enable the evaluation of how intervention effects change over time and are influenced by individual characteristics or context. Existing methods for analyzing causal excursion effects assume known randomization probabilities, complete observations, and a linear nuisance function with prespecified features of the high-dimensional observed history. However, in complex mobile systems, these assumptions often fall short: randomization probabilities can be uncertain, observations may be incomplete, and the granularity of mHealth data makes linear modeling difficult. To address this issue, we propose a flexible and doubly robust inferential procedure, called "DR-WCLS," for estimating causal excursion effects from a meta-learner perspective. We present the bidirectional asymptotic properties of the proposed estimators and compare them with existing methods both theoretically and through extensive simulations. The results show a consistent and more efficient estimate, even with missing observations or uncertain treatment randomization probabilities. Finally, the practical utility of the proposed methods is demonstrated by analyzing data from a multi-institution cohort of first-year medical residents in the United States.

Citations: 0