
Biometrical Journal: Latest Articles

Issue Information: Biometrical Journal 1'25
IF 1.3 Zone 3 (Biology) Q4 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date: 2024-12-15 DOI: 10.1002/bimj.70027
Citations: 0
Detecting Interactions in High-Dimensional Data Using Cross Leverage Scores
IF 1.3 Zone 3 (Biology) Q4 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date: 2024-11-29 DOI: 10.1002/bimj.70014
Sven Teschke, Katja Ickstadt, Alexander Munteanu

We develop a variable selection method for interactions in regression models on large data in the context of genetics. The method is intended for investigating the influence of single-nucleotide polymorphisms (SNPs) and their interactions on health outcomes, which is a $p \gg n$ problem. We introduce cross leverage scores (CLSs) to detect interactions of variables while maintaining interpretability. Using this method, it is not necessary to consider every possible interaction between variables individually, which would be very time-consuming even for a moderate number of variables. Instead, we calculate the CLS for each variable and obtain a measure of importance for this variable. Calculating the scores remains time-consuming for large data sets. The key idea for scaling to large data is to divide the data into smaller random batches or consecutive windows of variables. This avoids complex and time-consuming computations on high-dimensional matrices by performing the computations only for small subsets of the data, which is less costly. We compare these methods to provable approximations of CLS based on sketching, which aims at summarizing data succinctly. In a simulation study, we show that the CLSs are directly linked to the importance of a variable in the sense of an interaction effect. We further show that the approximation approaches are appropriate for performing the calculations efficiently on arbitrarily large data while preserving the interaction detection effect of the CLS. This underlines their scalability to genome-wide data. In addition, we evaluate the methods on real data from the HapMap project.
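As a rough illustration of the quantity involved, the sketch below computes leverage and cross leverage scores in the standard linear-algebra sense: the diagonal and off-diagonal entries of the orthogonal projection onto the column space of a matrix. The abstract does not say which matrix the paper applies this to (for scoring variables it may be the transposed design matrix), nor how scores are aggregated per variable, so the function name `cross_leverage_scores` and the toy data are assumptions.

```python
import numpy as np

def cross_leverage_scores(A):
    """Return H = Q Q^T for a full-column-rank matrix A (rows = items being scored).

    H[i, i] is the leverage score of row i; H[i, j] with i != j is the
    cross leverage score of rows i and j.
    """
    Q, _ = np.linalg.qr(A, mode="reduced")  # orthonormal basis of the column space of A
    return Q @ Q.T

rng = np.random.default_rng(0)
A = rng.normal(size=(50, 5))    # toy data (assumption): 50 rows, 5 columns
H = cross_leverage_scores(A)
leverage = np.diag(H)           # diagonal entries
cross = H - np.diag(leverage)   # off-diagonal entries only
```

Forming the full matrix `H` costs memory quadratic in the number of rows, which is one way to see why batching or sketching becomes attractive on large data.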

Citations: 0
Model Selection for Ordinary Differential Equations: A Statistical Testing Approach
IF 1.3 Zone 3 (Biology) Q4 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date: 2024-11-28 DOI: 10.1002/bimj.70013
Itai Dattner, Shota Gugushvili, Oleksandr Laskorunskyi

Ordinary differential equations (ODEs) are foundational tools in modeling intricate dynamics across a gamut of scientific disciplines. Yet, a possibility to represent a single phenomenon through multiple ODE models, driven by different understandings of nuances in internal mechanisms or abstraction levels, presents a model selection challenge. This study introduces a testing-based approach for ODE model selection amidst statistical noise. Rooted in the model misspecification framework, we adapt classical statistical paradigms (Vuong and Hotelling) to the ODE context, allowing for the comparison and ranking of diverse causal explanations without the constraints of nested models. Our simulation studies numerically investigate the statistical properties of the test, demonstrating its attainment of the nominal size and power across various settings. Real-world data examples further underscore the algorithm's applicability in practice. To foster accessibility and encourage real-world applications, we provide a user-friendly Python implementation of our model selection algorithm, bridging theoretical advancements with hands-on tools for the scientific community.
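To make the testing idea concrete, here is a minimal sketch of the classical Vuong statistic for two non-nested models, computed from pointwise log-likelihoods. It is not the authors' Python implementation: the candidate models (logistic versus exponential growth, written via their closed-form ODE solutions), the fixed parameter values, the Gaussian noise with known `sigma`, and all function names are assumptions made for illustration, and no parameters are estimated.

```python
import math
import random

def vuong_z(loglik_a, loglik_b):
    """Classical Vuong statistic from pointwise log-likelihoods of two models."""
    d = [a - b for a, b in zip(loglik_a, loglik_b)]
    n = len(d)
    mean_d = sum(d) / n
    var_d = sum((x - mean_d) ** 2 for x in d) / n
    return math.sqrt(n) * mean_d / math.sqrt(var_d)

def logistic_curve(t, x0=1.0, r=1.0, K=10.0):
    # closed-form solution of x' = r x (1 - x / K)
    return K / (1.0 + (K - x0) / x0 * math.exp(-r * t))

def exponential_curve(t, x0=1.0, r=0.3):
    # closed-form solution of x' = r x
    return x0 * math.exp(r * t)

def gaussian_loglik(y, mean, sigma):
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (y - mean) ** 2 / (2 * sigma ** 2)

random.seed(0)
sigma = 0.5
times = [0.2 * i for i in range(41)]  # t = 0, 0.2, ..., 8
y = [logistic_curve(t) + random.gauss(0.0, sigma) for t in times]  # toy data

ll_logistic = [gaussian_loglik(yi, logistic_curve(t), sigma) for yi, t in zip(y, times)]
ll_exponential = [gaussian_loglik(yi, exponential_curve(t), sigma) for yi, t in zip(y, times)]
z = vuong_z(ll_logistic, ll_exponential)  # positive values favor the logistic model
```

Under the null hypothesis that both models are equally close to the truth, the classical statistic is compared with standard normal quantiles. How the paper handles estimated ODE parameters and the Hotelling-type extension is not covered by this sketch.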

Citations: 0
$\tau$-Inflated Beta Regression Model for Estimating $\tau$-Restricted Means and Event-Free Probabilities for Censored Time-to-Event Data
IF 1.3 Zone 3 (Biology) Q4 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date: 2024-11-28 DOI: 10.1002/bimj.70009
Yizhuo Wang, Susan Murray
In this research, we propose analysis of $\tau$-restricted censored time-to-event data via a $\tau$-inflated beta regression ($\tau$-IBR) model. The outcome of interest is $\min(\tau, T)$, where $T$ and $\tau$ are the time-to-event and follow-up duration, respectively. Our analysis goals include estimation and inference related to $\tau$-restricted mean survival time ($\tau$-RMST) values and event-free probabilities at $\tau$ that address the censored nature of the data. In this setting, it is common to observe many individuals with $\min(\tau, T) = \tau$, a point mass that is typically overlooked in $\tau$-restricted event-time analyses. Our proposed $\tau$-IBR model is based on a decomposition of $\min(\tau, T)$ into $\tau [I(T \ge \tau) + (T/\tau) I(T < \tau)]$. We model the mean of the latter expression using joint logistic and beta regression models that are fit using an expectation-maximization algorithm. An alternative multiple imputation (MI) algorithm for fitting the $\tau$-IBR model has the additional advantage of producing uncensored data sets for analysis. Simulations indicate excellent performance of the $\tau$-IBR model and the corresponding $\tau$-RMST estimates in both independent and dependent censoring settings. We apply our method to the Azithromycin for Prevention of COPD Exacerbations Trial. In addition to $\tau$-IBR model results that provide a nuanced understanding of the treatment effect, we present intuitive heatmaps of the $\tau$-restricted event times based on our MI data sets, a visualization not typically available for censored time-to-event data.
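The decomposition of $\min(\tau, T)$ stated in the abstract can be checked numerically. The sketch below uses an arbitrary illustrative value of `tau` and simulated exponential event times (both assumptions, not the paper's settings) and ignores censoring entirely. It verifies the identity and separates the point mass at $\tau$ from the scaled part, which lies in $[0, 1)$.

```python
import random

tau = 2.0  # illustrative follow-up duration (assumption)
random.seed(1)
T = [random.expovariate(0.5) for _ in range(1000)]  # toy event times, no censoring

def decompose(t, tau):
    """Split min(tau, t) / tau into an indicator part and a scaled part."""
    at_tau = 1.0 if t >= tau else 0.0        # I(T >= tau)
    scaled = (t / tau) if t < tau else 0.0   # (T / tau) I(T < tau), lies in [0, 1)
    return at_tau, scaled

max_err = 0.0
for t in T:
    at_tau, scaled = decompose(t, tau)
    max_err = max(max_err, abs(min(tau, t) - tau * (at_tau + scaled)))

share_at_tau = sum(t >= tau for t in T) / len(T)  # size of the point mass at tau
```

The indicator part is the natural target of a logistic model and the scaled part of a beta model, which matches the joint logistic and beta regression described in the abstract. The fitting itself (EM or MI under censoring) is beyond this sketch.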
Citations: 0
Risk-Based Decision Making: Estimands for Sequential Prediction Under Interventions
IF 1.3 Zone 3 (Biology) Q4 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date: 2024-11-28 DOI: 10.1002/bimj.70011
Kim Luijken, Paweł Morzywołek, Wouter van Amsterdam, Giovanni Cinà, Jeroen Hoogland, Ruth Keogh, Jesse H. Krijthe, Sara Magliacane, Thijs van Ommen, Niels Peek, Hein Putter, Maarten van Smeden, Matthew Sperrin, Junfeng Wang, Daniala L. Weir, Vanessa Didelez, Nan van Geloven

Prediction models are used, among other purposes, to inform medical decisions on interventions. Typically, individuals with high risks of adverse outcomes are advised to undergo an intervention while those at low risk are advised to refrain from it. Standard prediction models do not always provide risks that are relevant to inform such decisions: for example, an individual may be estimated to be at low risk because similar individuals in the past received an intervention which lowered their risk. Therefore, prediction models supporting decisions should target risks belonging to defined intervention strategies. Previous works on prediction under interventions assumed that the prediction model was used only at one time point to make an intervention decision. In clinical practice, intervention decisions are rarely made only once: they might be repeated, deferred, and reevaluated. This requires estimated risks under interventions that can be reconsidered at several potential decision moments. In the current work, we highlight key considerations for formulating estimands in sequential prediction under interventions that can inform such intervention decisions. We illustrate these considerations by giving examples of estimands for a case study about choosing between vaginal delivery and cesarean section for women giving birth. Our formalization of prediction tasks in a sequential, causal, and estimand context provides guidance for future studies to ensure that the right question is answered and appropriate causal estimation approaches are chosen to develop sequential prediction models that can inform intervention decisions.

Citations: 0
A Matched Design for Causal Inference With Survey Data: Evaluation of Medical Marijuana Legalization in Kentucky and Tennessee
IF 1.3 Zone 3 (Biology) Q4 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date: 2024-11-28 DOI: 10.1002/bimj.70012
Marco H. Benedetti, Bo Lu, Motao Zhu

A concern surrounding marijuana legalization is that driving after marijuana use may become more prevalent. Survey data are valuable for estimating policy effects; however, their observational nature and unequal sampling probabilities create challenges for causal inference. To estimate population-level effects using survey data, we propose a matched design and implement sensitivity analyses to quantify how robust conclusions are to unmeasured confounding. Both theoretical justification and simulation studies are presented. We found no evidence that marijuana legalization increased tolerant behaviors and attitudes toward driving after marijuana use, and these conclusions seem moderately robust to unmeasured confounding.

Citations: 0
Domain Selection for Gaussian Process Data: An Application to Electrocardiogram Signals
IF 1.3 Zone 3 (Biology) Q4 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date: 2024-11-28 DOI: 10.1002/bimj.70018
Nicolás Hernández, Gabriel Martos

Gaussian processes and the Kullback–Leibler divergence have been deeply studied in statistics and machine learning. This paper marries these two concepts and introduces the local Kullback–Leibler divergence to learn about intervals where two Gaussian processes differ the most. We also address subtleties entailed in the estimation of local divergences and of the corresponding interval of local maximum divergence. The estimation performance and the numerical efficiency of the proposed method are showcased via a Monte Carlo simulation study. In a medical research context, we assess the potential of the devised tools in the analysis of electrocardiogram signals.
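One simple way to picture a "local" divergence is to discretize both processes on a grid and evaluate the closed-form Kullback–Leibler divergence between the two finite-dimensional Gaussians restricted to each sliding window. The sketch below does this with known means and covariances. It is an illustration of the idea only, not the paper's definition or estimator, and the kernel, the mean bump, the window width, and the function name `gaussian_kl` are all assumptions.

```python
import numpy as np

def gaussian_kl(m0, S0, m1, S1):
    """KL( N(m0, S0) || N(m1, S1) ) for nondegenerate multivariate Gaussians."""
    k = len(m0)
    d = m1 - m0
    trace_term = np.trace(np.linalg.solve(S1, S0))
    maha_term = d @ np.linalg.solve(S1, d)
    logdet_term = np.linalg.slogdet(S1)[1] - np.linalg.slogdet(S0)[1]
    return 0.5 * (trace_term + maha_term - k + logdet_term)

t = np.linspace(0.0, 1.0, 101)
K = np.exp(-np.abs(t[:, None] - t[None, :]) / 0.2)  # exponential covariance kernel
m0, S0 = np.zeros_like(t), K
m1, S1 = np.exp(-((t - 0.5) / 0.05) ** 2), 1.2 * K  # mean bump near t = 0.5

width = 10  # grid points per window (assumption)
starts = np.arange(len(t) - width + 1)
local_kl = np.array([
    gaussian_kl(m0[s:s + width], S0[s:s + width, s:s + width],
                m1[s:s + width], S1[s:s + width, s:s + width])
    for s in starts
])
best = starts[np.argmax(local_kl)]
best_interval = (t[best], t[best + width - 1])  # window with the largest divergence
```

In practice the means and covariances would have to be estimated from sampled curves, which is where the estimation subtleties mentioned in the abstract arise.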

Citations: 0
A Semiparametric Two-Sample Density Ratio Model With a Change Point
IF 1.3 Zone 3 (Biology) Q4 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date: 2024-11-25 DOI: 10.1002/bimj.202300214
Jiahui Feng, Kin Yau Wong, Chun Yin Lee

The logistic regression model for a binary outcome with a continuous covariate can be expressed equivalently as a two-sample density ratio model for the covariate. Utilizing this equivalence, we study a change-point logistic regression model within the corresponding density ratio modeling framework. We investigate estimation and inference methods for the density ratio model and develop maximal score-type tests to detect the presence of a change point. In contrast to existing work, the density ratio modeling framework facilitates the development of a natural Kolmogorov–Smirnov type test to assess the validity of the logistic model assumptions. A simulation study is conducted to evaluate the finite-sample performance of the proposed tests and estimation methods. We illustrate the proposed approach using a mother-to-child HIV-1 transmission data set and an oral cancer data set.
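The equivalence in the first sentence follows from Bayes' rule. A short derivation for the model without a change point is given here. The notation is an assumption, since the abstract fixes none: $f_y$ denotes the density of the covariate $X$ given $Y = y$, and $\pi_y = P(Y = y)$.

```latex
\begin{align*}
P(Y = 1 \mid X = x) &= \frac{\exp(\alpha + \beta x)}{1 + \exp(\alpha + \beta x)}, \\
\frac{f_1(x)}{f_0(x)}
  &= \frac{P(Y = 1 \mid X = x)}{P(Y = 0 \mid X = x)} \cdot \frac{P(Y = 0)}{P(Y = 1)}
   = \exp(\alpha^{*} + \beta x),
\qquad \alpha^{*} = \alpha + \log\frac{\pi_0}{\pi_1}.
\end{align*}
```

The slope $\beta$ is the same in both formulations and only the intercept shifts, so inference about the covariate effect can be carried out in either framework.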

Citations: 0
Smoothed Estimation on Optimal Treatment Regime Under Semisupervised Setting in Randomized Trials
IF 1.3 Zone 3 (Biology) Q4 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date: 2024-11-23 DOI: 10.1002/bimj.70006
Xiaoqi Jiao, Mengjiao Peng, Yong Zhou

A treatment regime refers to the process of assigning the most suitable treatment to a patient based on their observed information. However, prevailing research on treatment regimes predominantly relies on labeled data, which may lead to the omission of valuable information contained within unlabeled data, such as historical records and healthcare databases. Current semisupervised works for deriving optimal treatment regimes either rely on model assumptions or struggle with high computational burdens for even moderate-dimensional covariates. To address this concern, we propose a semisupervised framework that operates within a model-free context to estimate the optimal treatment regime by leveraging the abundant unlabeled data. Our proposed approach encompasses three key steps. First, we employ a single-index model to achieve dimension reduction, followed by kernel regression to impute the missing outcomes in the unlabeled data. Second, we propose various forms of semisupervised value functions based on the imputed values, incorporating both labeled and unlabeled data components. Lastly, the optimal treatment regimes are derived by maximizing the semisupervised value functions. We establish the consistency and asymptotic normality of the estimators proposed in our framework. Furthermore, we introduce a perturbation resampling procedure to estimate the asymptotic variance. Simulations confirm the advantageous properties of incorporating unlabeled data in the estimation for optimal treatment regimes. A practical data example is also provided to illustrate the application of our methodology. This work is rooted in the framework of randomized trials, with additional discussions extending to observational studies.
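The first step (dimension reduction through a single index, then kernel regression to impute outcomes for unlabeled covariates) can be sketched with a Nadaraya-Watson estimator. This is a simplified illustration under strong assumptions: the index direction `beta` is treated as known rather than estimated, treatment assignment is ignored, and the bandwidth, sample sizes, and function name `nw_impute` are hypothetical choices, not the authors' procedure.

```python
import numpy as np

def nw_impute(u_labeled, y_labeled, u_unlabeled, bandwidth):
    """Nadaraya-Watson estimate of E[Y | index = u] with a Gaussian kernel."""
    diff = (u_unlabeled[:, None] - u_labeled[None, :]) / bandwidth
    weights = np.exp(-0.5 * diff ** 2)  # kernel weights, one row per unlabeled point
    return (weights @ y_labeled) / weights.sum(axis=1)

rng = np.random.default_rng(0)
beta = np.array([0.6, 0.8, 0.0])   # index direction, treated as known here
X_lab = rng.normal(size=(200, 3))  # labeled covariates (toy data)
X_unl = rng.normal(size=(1000, 3)) # unlabeled covariates (toy data)
y_lab = np.sin(X_lab @ beta) + 0.1 * rng.normal(size=200)

y_imputed = nw_impute(X_lab @ beta, y_lab, X_unl @ beta, bandwidth=0.3)
```

Because the smoothing is done over the one-dimensional index rather than over all covariates, the kernel step avoids the curse of dimensionality that the abstract attributes to existing semisupervised approaches.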

Citations: 0
Simulating Data From Marginal Structural Models for a Survival Time Outcome
IF 1.3 Zone 3 Biology Q4 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date: 2024-11-23 DOI: 10.1002/bimj.70010
Shaun R. Seaman, Ruth H. Keogh

Marginal structural models (MSMs) are often used to estimate causal effects of treatments on survival time outcomes from observational data when time-dependent confounding may be present. They can be fitted using, for example, inverse probability of treatment weighting (IPTW). It is important to evaluate the performance of statistical methods in different scenarios, and simulation studies are a key tool for such evaluations. In such simulation studies, it is common to generate data in such a way that the model of interest is correctly specified, but this is not always straightforward when the model of interest is for potential outcomes, as is an MSM. Methods have been proposed for simulating from MSMs for a survival outcome, but these methods impose restrictions on the data-generating mechanism. Here, we propose a method that overcomes these restrictions. The MSM can be, for example, a marginal structural logistic model for a discrete survival time or a Cox or additive hazards MSM for a continuous survival time. The hazard of the potential survival time can be conditional on baseline covariates, and the treatment variable can be discrete or continuous. We illustrate the use of the proposed simulation algorithm by carrying out a brief simulation study. This study compares the coverage of confidence intervals calculated in two different ways for causal effect estimates obtained by fitting an MSM via IPTW.
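For readers unfamiliar with IPTW, the weights used to fit an MSM can be sketched as a running product of probability ratios over visits. This is a generic textbook-style illustration, not the simulation algorithm proposed in the paper; the function name and the assumption that the per-visit probabilities have already been estimated are hypothetical.

```python
def stabilized_weights(numer_probs, denom_probs):
    """Cumulative stabilized IPTW weights for one subject.

    numer_probs[k]: probability of the treatment actually received at visit k,
                    given past treatment only (assumed already estimated).
    denom_probs[k]: the same probability, additionally conditioning on the
                    time-dependent confounder history (assumed estimated).
    Returns the weight at each visit: the running product of the ratios.
    Passing all ones as numer_probs gives unstabilized weights.
    """
    if len(numer_probs) != len(denom_probs):
        raise ValueError("numerator and denominator must have equal length")
    weights = []
    running = 1.0
    for p_num, p_den in zip(numer_probs, denom_probs):
        if not 0.0 < p_den <= 1.0:
            raise ValueError("denominator probabilities must lie in (0, 1]")
        running *= p_num / p_den
        weights.append(running)
    return weights
```

The resulting per-visit weights would then be supplied to a weighted fit of the hazard model, so that in the weighted sample treatment is no longer associated with the measured time-dependent confounders.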

Citations: 0