A dependent circular-linear model for multivariate biomechanical data: Ilizarov ring fixator study.
Pub Date: 2024-09-01 | Epub Date: 2024-08-06 | DOI: 10.1177/09622802241268654
Priyanka Nagar, Andriette Bekker, Mohammad Arashi, Cor-Jacques Kat, Annette-Christi Barnard
Biomechanical and orthopaedic studies frequently encounter complex datasets that encompass both circular and linear variables. In most cases (i) the circular and linear variables are considered in isolation, with dependency between variables neglected, and (ii) the cyclicity of the circular variables is disregarded, resulting in erroneous decision-making. Given the inherent characteristics of circular variables, it is imperative to adopt methods that integrate directional statistics to achieve precise modelling. This paper is motivated by the modelling of biomechanical data, namely the fracture displacements used as a measure in external fixator comparisons. We focus on a dataset, based on an Ilizarov ring fixator, comprising six variables. A modelling framework applicable to the six-dimensional joint distribution of circular-linear data based on vine copulas is proposed. The pair-copula decomposition concept of vine copulas represents the dependence structure as a combination of circular-linear, circular-circular and linear-linear pairs, each modelled by its respective copula. This framework allows us to assess the dependencies in the joint distribution as well as account for the cyclicity of the circular variables. Thus, a new approach for accurate modelling of mechanical behaviour for Ilizarov ring fixators and other data of this nature is imparted.
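As a hedged illustration of the pair-copula building blocks (not the authors' model, whose copula families and vine structure are specified in the paper), the Python sketch below rank-transforms one simulated circular and one linear variable to pseudo-observations and fits a Gaussian pair-copula by inverting Kendall's tau; all variable names and parameters are hypothetical.

```python
# Illustrative sketch (not the authors' exact model): build pseudo-observations
# for one circular and one linear variable and fit a Gaussian pair-copula by
# inverting Kendall's tau. All names and parameters are hypothetical.
import numpy as np
from scipy.stats import kendalltau, rankdata

rng = np.random.default_rng(42)
theta = rng.vonmises(mu=0.0, kappa=2.0, size=500)            # circular variable (radians)
x = 0.5 * np.sin(theta) + rng.normal(scale=0.3, size=500)    # dependent linear variable

# Probability-integral-transform both margins to (0,1) by ranks; for the
# circular variable the rank transform implicitly fixes an origin, which is
# one of the issues circular-linear copula constructions must address.
u = rankdata(theta) / (len(theta) + 1)
v = rankdata(x) / (len(x) + 1)

# Gaussian pair-copula parameter via the Kendall's tau inversion rho = sin(pi*tau/2).
tau, _ = kendalltau(u, v)
rho = np.sin(np.pi * tau / 2)
print(f"Kendall's tau = {tau:.3f}, implied Gaussian copula rho = {rho:.3f}")
```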
{"title":"A dependent circular-linear model for multivariate biomechanical data: Ilizarov ring fixator study.","authors":"Priyanka Nagar, Andriette Bekker, Mohammad Arashi, Cor-Jacques Kat, Annette-Christi Barnard","doi":"10.1177/09622802241268654","DOIUrl":"10.1177/09622802241268654","url":null,"abstract":"<p><p>Biomechanical and orthopaedic studies frequently encounter complex datasets that encompass both circular and linear variables. In most cases (i) the circular and linear variables are considered in isolation with dependency between variables neglected and (ii) the cyclicity of the circular variables is disregarded resulting in erroneous decision making. Given the inherent characteristics of circular variables, it is imperative to adopt methods that integrate directional statistics to achieve precise modelling. This paper is motivated by the modelling of biomechanical data, that is, the fracture displacements, that is used as a measure in external fixator comparisons. We focus on a dataset, based on an Ilizarov ring fixator, comprising of six variables. A modelling framework applicable to the six-dimensional joint distribution of circular-linear data based on vine copulas is proposed. The pair-copula decomposition concept of vine copulas represents the dependence structure as a combination of circular-linear, circular-circular and linear-linear pairs modelled by their respective copulas. This framework allows us to assess the dependencies in the joint distribution as well as account for the cyclicity of the circular variables. Thus, a new approach for accurate modelling of mechanical behaviour for Ilizarov ring fixators and other data of this nature is imparted.</p>","PeriodicalId":22038,"journal":{"name":"Statistical Methods in Medical Research","volume":" ","pages":"1660-1672"},"PeriodicalIF":1.6,"publicationDate":"2024-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11497752/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141894363","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Estimation and inference on the partial volume under the receiver operating characteristic surface.
Pub Date: 2024-09-01 | Epub Date: 2024-08-08 | DOI: 10.1177/09622802241267356
Kate J Young, Leonidas E Bantis
Measures of biomarker accuracy that employ the receiver operating characteristic surface have been proposed for biomarkers that classify patients into one of three groups: healthy, benign disease, or aggressive disease. The volume under the receiver operating characteristic surface summarizes the overall discriminatory ability of a biomarker in such configurations, but includes cutoffs associated with clinically irrelevant true classification rates. Due to the lethal nature of pancreatic cancer, cutoffs associated with a low true classification rate for identifying patients with pancreatic cancer may be undesirable and not appropriate for use in a clinical setting. In this project, we study the properties of a more focused criterion, the partial volume under the receiver operating characteristic surface, which summarizes the diagnostic accuracy of a marker in the three-class setting over only those regions of clinical interest. We propose methods for estimation and inference on the partial volume under the receiver operating characteristic surface under parametric and non-parametric frameworks and apply these methods to the evaluation of potential biomarkers for the diagnosis of pancreatic cancer.
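To make the restriction concrete, here is a small numerical sketch of the volume and a partial volume under a three-class ROC surface for trinormal class distributions: the middle-class correct-classification rate is integrated over the (TCR1, TCR3) unit square for the full volume and over a restricted subregion for the partial volume. The distribution parameters and the 80% threshold are illustrative assumptions, not values from the paper.

```python
import numpy as np
from scipy.stats import norm
from scipy.integrate import dblquad

# Class-conditional marker distributions (hypothetical parameters):
# healthy < benign disease < aggressive disease on average.
f1, f2, f3 = norm(0, 1), norm(1, 1), norm(2.5, 1)

def tcr2(t1, t3):
    """Correct-classification rate for the middle class at the cutoffs implied
    by true classification rates t1 (class 1) and t3 (class 3)."""
    c1 = f1.ppf(t1)          # lower cutoff giving TCR1 = t1
    c2 = f3.ppf(1.0 - t3)    # upper cutoff giving TCR3 = t3
    return max(0.0, f2.cdf(c2) - f2.cdf(c1))

# Full volume under the ROC surface.
vus, _ = dblquad(tcr2, 0, 1, 0, 1)

# Partial volume restricted to a clinically relevant region, e.g. requiring
# at least 80% true classification of aggressive disease (an illustrative
# threshold, not taken from the paper). dblquad passes (t1, t3) = (y, x),
# so the outer limits below restrict t3.
p0 = 0.8
pvus, _ = dblquad(tcr2, p0, 1, 0, 1)
print(f"VUS = {vus:.4f}, partial VUS (TCR3 >= {p0}) = {pvus:.4f}")
```

A quick sanity check of the representation: with three identical class distributions, the integrand reduces to max(0, 1 - t1 - t3) and the integral is 1/6, the well-known VUS of an uninformative three-class marker.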
{"title":"Estimation and inference on the partial volume under the receiver operating characteristic surface.","authors":"Kate J Young, Leonidas E Bantis","doi":"10.1177/09622802241267356","DOIUrl":"10.1177/09622802241267356","url":null,"abstract":"<p><p>measures of biomarker accuracy that employ the receiver operating characteristic surface have been proposed for biomarkers that classify patients into one of three groups: healthy, benign, or aggressive disease. The volume under the receiver operating characteristic surface summarizes the overall discriminatory ability of a biomarker in such configurations, but includes cutoffs associated with clinically irrelevant true classification rates. Due to the lethal nature of pancreatic cancer, cutoffs associated with a low true classification rate for identifying patients with pancreatic cancer may be undesirable and not appropriate for use in a clinical setting. In this project, we study the properties of a more focused criterion, the partial volume under the receiver operating characteristic surface, that summarizes the diagnostic accuracy of a marker in the three-class setting for regions restricted to only those of clinical interest. We propose methods for estimation and inference on the partial volume under the receiver operating characteristic surface under parametric and non-parametric frameworks and apply these methods to the evaluation of potential biomarkers for the diagnosis of pancreatic cancer.</p>","PeriodicalId":22038,"journal":{"name":"Statistical Methods in Medical Research","volume":" ","pages":"1577-1594"},"PeriodicalIF":1.6,"publicationDate":"2024-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141907756","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Multivariate Poisson cokriging: A geostatistical model for health count data.
Pub Date: 2024-09-01 | Epub Date: 2024-08-14 | DOI: 10.1177/09622802241268488
David Payares-Garcia, Frank Osei, Jorge Mateu, Alfred Stein
Multivariate disease mapping is important for public health research, as it provides insights into spatial patterns of health outcomes. Geostatistical methods that are widely used for mapping spatially correlated health data encounter challenges when dealing with spatial count data, including heterogeneity, zero-inflated distributions and unreliable estimation; these lead to difficulties in estimating spatial dependence and to poor predictions. Variability in population sizes further complicates risk estimation from the counts. This study introduces multivariate Poisson cokriging for predicting and filtering out disease risk. Pairwise correlations between the target variable and multiple ancillary variables are included. By means of a simulation experiment and an application to human immunodeficiency virus incidence and sexually transmitted disease data in Pennsylvania, we demonstrate accurate disease risk estimation that captures fine-scale variation. This method is compared with ordinary Poisson kriging in prediction and smoothing. Results of the simulation study show a reduction in the mean square prediction error when utilizing auxiliary correlated variables, with mean square prediction error values decreasing by up to 50%. This gain is further evident in the real data analysis, where Poisson cokriging yields a 74% drop in mean square prediction error relative to Poisson kriging, underscoring the value of incorporating secondary information. The findings of this work stress the potential of Poisson cokriging in disease mapping and surveillance, offering richer risk predictions, better representation of spatial interdependencies, and identification of high-risk and low-risk areas.
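The following is a bare-bones sketch of an ordinary cokriging weight system for one target and one auxiliary variable, solved with numpy. It is didactic only: the paper's Poisson cokriging additionally adjusts the covariance structure for Poisson sampling noise and heterogeneous population sizes, which is omitted here, and all covariance parameters are hypothetical.

```python
# Didactic ordinary cokriging weight solve (not the paper's Poisson version).
import numpy as np

def exp_cov(h, sill, rang):
    """Isotropic exponential covariance model."""
    return sill * np.exp(-h / rang)

rng = np.random.default_rng(1)
s_t = rng.uniform(0, 10, (8, 2))    # locations with target observations
s_a = rng.uniform(0, 10, (12, 2))   # locations with auxiliary observations
s0 = np.array([5.0, 5.0])           # prediction location

locs = np.vstack([s_t, s_a])
n_t, n = len(s_t), len(locs)
d = np.linalg.norm(locs[:, None, :] - locs[None, :, :], axis=2)

# Direct sills 1.0, cross-sill 0.6 with a common range: a valid linear model
# of coregionalization (the 2x2 coregionalization matrix is positive definite).
t_ind = (np.arange(n) < n_t).astype(int)
sills = np.where(np.add.outer(t_ind, t_ind) == 1, 0.6, 1.0)
C = exp_cov(d, sills, rang=3.0)

# Ordinary cokriging system with one unbiasedness constraint per variable:
# target weights sum to one, auxiliary weights sum to zero.
A = np.zeros((n + 2, n + 2))
A[:n, :n] = C
A[:n_t, n], A[n_t:n, n + 1] = 1.0, 1.0
A[n, :n_t], A[n + 1, n_t:n] = 1.0, 1.0
b = np.zeros(n + 2)
d0 = np.linalg.norm(locs - s0, axis=1)
b[:n] = exp_cov(d0, np.where(np.arange(n) < n_t, 1.0, 0.6), rang=3.0)
b[n] = 1.0

w = np.linalg.solve(A, b)[:n]
print("cokriging weights:", np.round(w, 3))
```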
{"title":"Multivariate Poisson cokriging: A geostatistical model for health count data.","authors":"David Payares-Garcia, Frank Osei, Jorge Mateu, Alfred Stein","doi":"10.1177/09622802241268488","DOIUrl":"10.1177/09622802241268488","url":null,"abstract":"<p><p>Multivariate disease mapping is important for public health research, as it provides insights into spatial patterns of health outcomes. Geostatistical methods that are widely used for mapping spatially correlated health data encounter challenges when dealing with spatial count data. These include heterogeneity, zero-inflated distributions and unreliable estimation, and lead to difficulties when estimating spatial dependence and poor predictions. Variability in population sizes further complicates risk estimation from the counts. This study introduces multivariate Poisson cokriging for predicting and filtering out disease risk. Pairwise correlations between the target variable and multiple ancillary variables are included. By means of a simulation experiment and an application to human immunodeficiency virus incidence and sexually transmitted diseases data in Pennsylvania, we demonstrate accurate disease risk estimation that captures fine-scale variation. This method is compared with ordinary Poisson kriging in prediction and smoothing. Results of the simulation study show a reduction in the mean square prediction error when utilizing auxiliary correlated variables, with mean square prediction error values decreasing by up to 50%. This gain is further evident in the real data analysis, where Poisson cokriging yields a 74% drop in mean square prediction error relative to Poisson kriging, underscoring the value of incorporating secondary information. The findings of this work stress on the potential of Poisson cokriging in disease mapping and surveillance, offering richer risk predictions, better representation of spatial interdependencies, and identification of high-risk and low-risk areas.</p>","PeriodicalId":22038,"journal":{"name":"Statistical Methods in Medical Research","volume":" ","pages":"1637-1659"},"PeriodicalIF":1.6,"publicationDate":"2024-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11500483/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141976656","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A general consonance principle for closure tests based on p-values.
Pub Date: 2024-09-01 | DOI: 10.1177/09622802241269624
Sonja Zehetmayer, Franz Koenig, Martin Posch
The closure principle is a powerful approach to constructing efficient testing procedures that control the familywise error rate in the strong sense. For small numbers of hypotheses and the setting of independent elementary p-values, we consider closed tests where each intersection hypothesis is tested with a p-value combination test. Examples of such combination tests are the Fisher combination test, the Stouffer test, the Omnibus test, the truncated test, and the Wilson test. Some of these tests, such as the Fisher combination, the Stouffer, or the Omnibus test, are not consonant: rejection of the global null hypothesis does not always lead to rejection of at least one elementary null hypothesis. We develop a general principle to uniformly improve closed tests based on p-value combination tests by modifying the rejection regions such that the new procedure becomes consonant. For the Fisher combination test and the Stouffer test, we show by simulations that this improvement can lead to a substantial increase in power.
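A minimal sketch of the closure principle with the Fisher combination test for three hypotheses follows; the p-values are illustrative, and the consonant improvement proposed in the paper is not implemented here.

```python
# Closed testing with the Fisher combination test: elementary H_i is rejected
# only if every intersection hypothesis containing i is rejected at level alpha.
from itertools import combinations
import numpy as np
from scipy.stats import chi2

def fisher_combination_reject(pvals, alpha=0.05):
    stat = -2 * np.sum(np.log(pvals))
    return stat > chi2.ppf(1 - alpha, df=2 * len(pvals))

def closed_test(p, alpha=0.05):
    m = len(p)
    rejected = []
    for i in range(m):
        ok = all(
            fisher_combination_reject([p[j] for j in subset], alpha)
            for k in range(1, m + 1)
            for subset in combinations(range(m), k)
            if i in subset
        )
        if ok:
            rejected.append(i)
    return rejected

p = [0.001, 0.04, 0.60]  # illustrative elementary p-values
print("rejected elementary hypotheses:", closed_test(p))
# Non-consonance: the global Fisher test can reject the overall intersection
# while no elementary hypothesis is rejected; the paper modifies the
# rejection regions so that such outcomes cannot occur.
```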
Improving estimation efficiency of case-cohort studies with interval-censored failure time data.
Pub Date: 2024-09-01 | Epub Date: 2024-08-06 | DOI: 10.1177/09622802241268601
Qingning Zhou, Kin Yau Wong
The case-cohort design is a commonly used cost-effective sampling strategy for large cohort studies where some covariates are expensive to measure or obtain. In this paper, we consider regression analysis under a case-cohort study with interval-censored failure time data, where the failure time is only known to fall within an interval instead of being exactly observed. A common approach to analyzing data from a case-cohort study is inverse probability weighting, where only subjects in the case-cohort sample are used in estimation and are weighted by the inverse of the probability of inclusion into the case-cohort sample. This approach, though consistent, is generally inefficient as it does not incorporate information outside the case-cohort sample. To improve efficiency, we first develop a sieve maximum weighted likelihood estimator under the Cox model based on the case-cohort sample and then propose a procedure to update this estimator using information in the full cohort. We show that the updated estimator is consistent, asymptotically normal, and at least as efficient as the original estimator. The proposed method can flexibly incorporate auxiliary variables to improve estimation efficiency. A weighted bootstrap procedure is employed for variance estimation. Simulation results indicate that the proposed method works well in practical situations. An application to a Phase 3 HIV vaccine efficacy trial is provided for illustration.
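A minimal sketch of the inverse probability weighting step described above, assuming a simple random subcohort: column names and the sampling fraction are hypothetical, and the paper's sieve estimation and full-cohort update step are not shown.

```python
# Case-cohort IPW sketch: all cases get weight 1; non-case subcohort members
# are up-weighted by the inverse of the subcohort sampling fraction.
import numpy as np
import pandas as pd

rng = np.random.default_rng(7)
n = 5000
cohort = pd.DataFrame({
    "event": rng.binomial(1, 0.05, n),                      # failure indicator
    "in_subcohort": rng.binomial(1, 0.10, n).astype(bool),  # random ~10% subcohort
})

sampling_fraction = cohort["in_subcohort"].mean()
cc = cohort[(cohort["event"] == 1) | cohort["in_subcohort"]].copy()
cc["ipw"] = np.where(cc["event"] == 1, 1.0, 1.0 / sampling_fraction)

# These weights would enter a weighted (pseudo-)likelihood such as the paper's
# sieve maximum weighted likelihood estimator; the paper's second stage then
# updates that estimator with information from the full cohort.
print(cc["ipw"].value_counts())
```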
Sample size calculation for mixture cure model with restricted mean survival time as a primary endpoint.
Pub Date: 2024-09-01 | Epub Date: 2024-08-06 | DOI: 10.1177/09622802241265501
Zhaojin Li, Xiang Geng, Yawen Hou, Zheng Chen
It is not uncommon for a substantial proportion of patients to be cured (or survive long-term) in clinical trials with time-to-event endpoints, such as endometrial cancer trials. When designing such a trial, a mixture cure model should be used to fully account for the cure fraction. Previous sample size calculations for mixture cure models were based on the proportional hazards assumption for the latency distributions between groups, and the log-rank test was used to derive sample size formulas. In real studies, the latency distributions of the two groups often do not satisfy the proportional hazards assumption. This article derives a sample size calculation formula for a mixture cure model with restricted mean survival time as the primary endpoint and reports simulation and example studies. The restricted mean survival time test is not subject to proportional hazards assumptions, and the difference in treatment effect obtained can be quantified as the number of years (or months) gained or lost in survival time, making it very convenient for patient-physician communication. The simulation results showed that the sample sizes estimated by the restricted mean survival time test for the mixture cure model were accurate regardless of whether the proportional hazards assumption was satisfied and, in most scenarios in which the assumption was violated, were smaller than the sample sizes estimated by the log-rank test.
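To fix ideas, the sketch below computes the RMST under a mixture cure model by numerical integration and plugs an assumed treatment effect into a generic normal-approximation sample size formula. All parameters (cure fractions, Weibull latency, the variance placeholder) are hypothetical, and the paper derives the variance of the RMST test analytically rather than using this crude plug-in.

```python
# RMST under a mixture cure model S(t) = pi + (1 - pi) * S_u(t), plus a
# generic two-sample normal-approximation sample size (not the paper's formula).
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

def rmst_mixture_cure(pi_cure, scale, shape, tau):
    """RMST at horizon tau with Weibull latency S_u(t) = exp(-(t/scale)^shape)."""
    surv = lambda t: pi_cure + (1 - pi_cure) * np.exp(-((t / scale) ** shape))
    return quad(surv, 0, tau)[0]

tau = 5.0
rmst_ctrl = rmst_mixture_cure(0.20, scale=2.0, shape=1.2, tau=tau)
rmst_trt = rmst_mixture_cure(0.35, scale=2.5, shape=1.2, tau=tau)
delta = rmst_trt - rmst_ctrl

# n per arm for two-sided alpha and power 1-beta; sigma2 is a placeholder for
# the per-arm variance of the RMST estimator times n.
alpha, power, sigma2 = 0.05, 0.80, 2.0
z = norm.ppf(1 - alpha / 2) + norm.ppf(power)
n_per_arm = int(np.ceil(2 * sigma2 * (z / delta) ** 2))
print(f"RMST difference at tau={tau}: {delta:.3f} years; n per arm ~ {n_per_arm}")
```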
{"title":"Sample size calculation for mixture cure model with restricted mean survival time as a primary endpoint.","authors":"Zhaojin Li, Xiang Geng, Yawen Hou, Zheng Chen","doi":"10.1177/09622802241265501","DOIUrl":"10.1177/09622802241265501","url":null,"abstract":"<p><p>It is not uncommon for a substantial proportion of patients to be cured (or survive long-term) in clinical trials with time-to-event endpoints, such as the endometrial cancer trial. When designing a clinical trial, a mixture cure model should be used to fully consider the cure fraction. Previously, mixture cure model sample size calculations were based on the proportional hazards assumption of latency distribution between groups, and the log-rank test was used for deriving sample size formulas. In real studies, the latency distributions of the two groups often do not satisfy the proportional hazards assumptions. This article has derived a sample size calculation formula for a mixture cure model with restricted mean survival time as the primary endpoint, and did simulation and example studies. The restricted mean survival time test is not subject to proportional hazards assumptions, and the difference in treatment effect obtained can be quantified as the number of years (or months) increased or decreased in survival time, making it very convenient for clinical patient-physician communication. The simulation results showed that the sample sizes estimated by the restricted mean survival time test for the mixture cure model were accurate regardless of whether the proportional hazards assumptions were satisfied and were smaller than the sample sizes estimated by the log-rank test in most cases for the scenarios in which the proportional hazards assumptions were violated.</p>","PeriodicalId":22038,"journal":{"name":"Statistical Methods in Medical Research","volume":" ","pages":"1546-1558"},"PeriodicalIF":1.6,"publicationDate":"2024-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141898272","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Accounting for regression to the mean under the bivariate t-distribution.
Pub Date: 2024-09-01 | Epub Date: 2024-08-08 | DOI: 10.1177/09622802241267808
Muhammad Umair, Manzoor Khan, Jake Olivier
Regression to the mean occurs when an unusual observation is followed by a more typical outcome closer to the population mean. In pre- and post-intervention studies, treatment is administered to subjects with initial measurements located in the tail of a distribution, and a paired sample t-test can be utilized to assess the effectiveness of the intervention. The observed change in the pre-post means is the sum of regression to the mean and treatment effects, and ignoring regression to the mean could lead to erroneous conclusions about the effectiveness of the treatment. In this study, formulae for regression to the mean are derived, and maximum likelihood estimation is employed to numerically estimate the regression to the mean effect when the test statistic follows the bivariate t-distribution, based on a baseline criterion or a cut-off point. The pre and post degrees of freedom may be equal or unequal, such as when data are missing. Additionally, we illustrate how regression to the mean is influenced by cut-off points, mixing angles (which are related to correlation), and degrees of freedom. A simulation study is conducted to assess the statistical properties of unbiasedness, consistency, and asymptotic normality of the regression to the mean estimator. Moreover, the proposed methods are compared with an existing one assuming bivariate normality. The p-values obtained when regression to the mean is either ignored or accounted for are compared to gauge the statistical significance of the paired t-test. The proposed method is applied to real data concerning schizophrenia patients, and the observed conditional mean difference, called the total effect, is decomposed into the regression to the mean and treatment effects.
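For intuition, here is a sketch of the classical regression-to-the-mean formula under bivariate normality, the special case that the paper generalizes to the bivariate t-distribution; the numbers are illustrative.

```python
# Classical RTM effect under bivariate normality with identical pre/post
# margins N(mu, sigma^2) and correlation rho, for subjects selected because
# their baseline exceeds a cutoff (Davis-type formula).
import numpy as np
from scipy.stats import norm

def rtm_normal(mu, sigma, rho, cutoff):
    """Expected RTM effect E[X1 - X2 | X1 > cutoff] in the absence of treatment."""
    z = (cutoff - mu) / sigma
    mills = norm.pdf(z) / norm.sf(z)   # inverse Mills ratio
    return sigma * (1 - rho) * mills

# Example: subjects selected because their baseline exceeds mu + 1.5*sigma.
effect = rtm_normal(mu=100, sigma=15, rho=0.7, cutoff=122.5)
print(f"RTM effect: {effect:.2f} units of apparent 'improvement' with no treatment")
# The observed pre-post change is this RTM component plus the true treatment
# effect, so ignoring RTM overstates the treatment benefit; the heavier tails
# of the bivariate t make the RTM component larger for the same cutoff.
```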
Inference for restricted mean survival time as a function of restriction time under length-biased sampling.
Pub Date: 2024-09-01 | Epub Date: 2024-08-07 | DOI: 10.1177/09622802241267812
Fangfang Bai, Xiaoran Yang, Xuerong Chen, Xiaofei Wang
The restricted mean survival time (RMST) is often of direct interest in clinical studies involving censored survival outcomes. It describes the area under the survival curve from time zero to a specified time point. When data are subject to length-biased sampling, as is frequently encountered in observational cohort studies, existing methods cannot estimate the RMST for various restriction times through a single model. In this article, we model the RMST as a continuous function of the restriction time under the setting of length-biased sampling. Two approaches based on estimating equations are proposed to estimate the time-varying effects of covariates. Finally, we establish the asymptotic properties of the proposed estimators. Simulation studies are performed to demonstrate the finite-sample performance. Two real-data examples are analyzed using our procedures.
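As background for the functional being modelled, the sketch below computes the RMST curve tau -> RMST(tau) from a Kaplan-Meier fit for ordinary right-censored data. This plain KM approach is exactly what becomes invalid under length-biased sampling, which the paper's estimating-equation methods are designed to handle; the simulated data are illustrative.

```python
# RMST(tau) = integral of the survival curve from 0 to tau, here computed
# from a Kaplan-Meier step function (assumes no tied event times).
import numpy as np

def km_rmst_curve(time, event, taus):
    order = np.argsort(time)
    t, d = time[order], event[order]
    surv, s = [], 1.0
    at_risk = len(t)
    for di in d:
        if di:
            s *= 1 - 1 / at_risk
        at_risk -= 1
        surv.append(s)
    surv = np.array(surv)

    def rmst(tau):
        # Area under the step function: S = 1 on [0, t_(1)), surv[i] afterwards.
        grid = np.concatenate([[0.0], t[t <= tau], [tau]])
        s_left = np.concatenate([[1.0], surv[t <= tau]])
        return np.sum(np.diff(grid) * s_left)

    return np.array([rmst(tau) for tau in taus])

rng = np.random.default_rng(3)
time = rng.exponential(2.0, 300)
cens = rng.exponential(4.0, 300)
obs, event = np.minimum(time, cens), (time <= cens).astype(int)
taus = np.linspace(0.5, 5.0, 10)
print(np.round(km_rmst_curve(obs, event, taus), 3))
```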
{"title":"Inference for restricted mean survival time as a function of restriction time under length-biased sampling.","authors":"Fangfang Bai, Xiaoran Yang, Xuerong Chen, Xiaofei Wang","doi":"10.1177/09622802241267812","DOIUrl":"10.1177/09622802241267812","url":null,"abstract":"<p><p>The restricted mean survival time (RMST) is often of direct interest in clinical studies involving censored survival outcomes. It describes the area under the survival curve from time zero to a specified time point. When data are subject to length-biased sampling, as is frequently encountered in observational cohort studies, existing methods cannot estimate the RMST for various restriction times through a single model. In this article, we model the RMST as a continuous function of the restriction time under the setting of length-biased sampling. Two approaches based on estimating equations are proposed to estimate the time-varying effects of covariates. Finally, we establish the asymptotic properties for the proposed estimators. Simulation studies are performed to demonstrate the finite sample performance. Two real-data examples are analyzed by our procedures.</p>","PeriodicalId":22038,"journal":{"name":"Statistical Methods in Medical Research","volume":" ","pages":"1610-1623"},"PeriodicalIF":1.6,"publicationDate":"2024-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141898271","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cause-specific hazard Cox models with partly interval censoring - Penalized likelihood estimation using Gaussian quadrature.
Pub Date: 2024-09-01 | Epub Date: 2024-07-25 | DOI: 10.1177/09622802241262526
Joseph Descallar, Jun Ma, Houying Zhu, Stephane Heritier, Rory Wolfe
The cause-specific hazard Cox model is widely used in analyzing competing risks survival data, and the partial likelihood method is a standard approach when survival times contain only right censoring. In practice, however, interval-censored survival times often arise, which means the partial likelihood method is not directly applicable. Two common remedies in practice are (i) to replace each censoring interval with a single value, such as its mid-point; or (ii) to redefine the event of interest, for example using the time to diagnosis instead of the time to recurrence of a disease. However, the mid-point approach can cause biased parameter estimates. In this article, we develop a penalized likelihood approach to fit semi-parametric cause-specific hazard Cox models; this method is general enough to allow left, right, and interval censoring times. Penalty functions are used to regularize the baseline hazard estimates and to make these estimates less affected by the number and location of the knots used. We provide asymptotic properties for the estimated parameters. A simulation study is designed to compare our method with the mid-point partial likelihood approach. We apply our method to the Aspirin in Reducing Events in the Elderly (ASPREE) study, illustrating an application of our proposed method.
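The bias of mid-point imputation is easy to demonstrate in a fully parametric toy version of the problem: for an interval-censored observation (L, R] the correct likelihood contribution is S(L) - S(R). The Weibull sketch below contrasts the interval-censored MLE with a mid-point fit; it is an illustration only, as the paper instead fits semi-parametric cause-specific Cox models with penalized spline baselines and Gaussian quadrature, and the inspection scheme here is hypothetical.

```python
# Weibull MLE under the correct interval-censored likelihood versus a
# mid-point fit that pretends the event times are exactly observed.
import numpy as np
from scipy.optimize import minimize
from scipy.stats import weibull_min

rng = np.random.default_rng(11)
t_true = weibull_min.rvs(c=1.5, scale=3.0, size=400, random_state=rng)
width = rng.uniform(0.5, 2.0, size=400)      # inspection-interval widths
left = np.floor(t_true / width) * width      # last inspection before the event
right = left + width                         # first inspection after the event

def negloglik_interval(par):
    c, scale = np.exp(par)                   # log-parameterized for positivity
    S = lambda t: weibull_min.sf(t, c=c, scale=scale)
    return -np.sum(np.log(S(left) - S(right) + 1e-300))

fit_int = minimize(negloglik_interval, x0=np.log([1.0, 1.0]), method="Nelder-Mead")
mid = (left + right) / 2
fit_mid = weibull_min.fit(mid, floc=0)       # returns (shape, loc, scale)
print("interval-censored MLE (shape, scale):", np.round(np.exp(fit_int.x), 3))
print("mid-point 'exact' MLE (shape, scale):", (round(fit_mid[0], 3), round(fit_mid[2], 3)))
```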
Maintaining the validity of inference from linear mixed models in stepped-wedge cluster randomized trials under misspecified random-effects structures.
Pub Date: 2024-09-01 | Epub Date: 2024-05-29 | DOI: 10.1177/09622802241248382
Yongdong Ouyang, Monica Taljaard, Andrew B Forbes, Fan Li
Linear mixed models are commonly used in analyzing stepped-wedge cluster randomized trials. A key consideration in analyzing such a trial is accounting for the potentially complex correlation structure, which can be achieved by specifying random effects. The simplest random-effects structure is the random intercept, but more complex structures such as random cluster-by-period, discrete-time decay, and, more recently, the random intervention structure have been proposed. Specifying appropriate random effects in practice can be challenging: assuming more complex correlation structures may be reasonable, but they are vulnerable to computational difficulties. To circumvent these difficulties, robust variance estimators may be applied to linear mixed models to provide consistent estimators of the standard errors of fixed-effect parameters in the presence of random-effects misspecification. However, there has been no empirical investigation of robust variance estimators for stepped-wedge cluster randomized trials. In this article, we review six robust variance estimators (both standard and small-sample bias-corrected) that are available for linear mixed models in R, and then describe a comprehensive simulation study examining the performance of these estimators for stepped-wedge cluster randomized trials with a continuous outcome under different data generators. For each data generator, we investigate whether the use of a robust variance estimator with either the random intercept model or the random cluster-by-period model is sufficient to provide valid statistical inference for fixed-effect parameters when these working models are subject to random-effects misspecification. Our results indicate that the random intercept and random cluster-by-period models with robust variance estimators performed adequately. The CR3 (approximate jackknife) robust variance estimator, coupled with a degrees-of-freedom correction equal to the number of clusters minus two, consistently gave the best coverage results, but could be slightly conservative when the number of clusters was below 16. We summarize the implications of our results for the linear mixed model analysis of stepped-wedge cluster randomized trials and offer some practical recommendations on the choice of the analytic model.
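For readers unfamiliar with the estimators being compared, here is a sketch of the simplest member of the family, a CR0 cluster-robust sandwich estimator, written for ordinary least squares. The paper applies CR0-CR3-type estimators to linear mixed models, so this plain-OLS version only conveys the sandwich idea; the data-generating values are hypothetical.

```python
# CR0 cluster-robust ("sandwich") variance for OLS: bread = (X'X)^{-1},
# meat = sum over clusters of X_g' e_g e_g' X_g.
import numpy as np

def ols_cr0(X, y, cluster):
    """OLS coefficients with CR0 cluster-robust covariance."""
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    meat = np.zeros((X.shape[1], X.shape[1]))
    for g in np.unique(cluster):
        Xg, eg = X[cluster == g], resid[cluster == g]
        sg = Xg.T @ eg
        meat += np.outer(sg, sg)
    vcov = XtX_inv @ meat @ XtX_inv
    return beta, vcov

rng = np.random.default_rng(5)
n_clusters, m = 12, 20
cluster = np.repeat(np.arange(n_clusters), m)
u = rng.normal(0, 1, n_clusters)[cluster]          # random cluster effect
trt = (cluster % 2 == 0).astype(float)             # cluster-level treatment
X = np.column_stack([np.ones(n_clusters * m), trt])
y = 0.5 * trt + u + rng.normal(0, 1, n_clusters * m)

beta, vcov = ols_cr0(X, y, cluster)
se = np.sqrt(np.diag(vcov))
# In the spirit of the paper's recommendation, the Wald statistic would be
# compared to a t-distribution with (number of clusters - 2) degrees of freedom.
print(f"treatment effect = {beta[1]:.3f}, CR0 robust SE = {se[1]:.3f}")
```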