In multivariate extreme value analysis, estimating the dependence structure of extremes is a challenging task, especially for high-dimensional data. A common approach is therefore to reduce the model dimension by considering only the directions in which extreme values occur. In this paper, we use the concept of sparse regular variation recently introduced by Meyer and Wintenberger (2021) to derive information criteria for the number of directions in which extreme events occur, namely a Bayesian information criterion (BIC), a mean-squared-error-based information criterion (MSEIC), and a quasi-Akaike information criterion (QAIC) based on the Gaussian likelihood function. As is typical in extreme value analysis, a challenging task is the choice of the number $k_n$ of observations used for the estimation. Therefore, for all information criteria, we present a two-step procedure that estimates both the number of directions of extremes and an optimal choice of $k_n$. We prove that the AIC of Meyer and Wintenberger (2023) and the MSEIC are inconsistent information criteria for the number of extreme directions, whereas the BIC and the QAIC are consistent. Finally, the performance of the different information criteria is compared in a simulation study, and the criteria are applied to wind speed data.
{"title":"Information criteria for the number of directions of extremes in high-dimensional data","authors":"Lucas Butsch, Vicky Fasen-Hartmann","doi":"arxiv-2409.10174","DOIUrl":"https://doi.org/arxiv-2409.10174","url":null,"abstract":"In multivariate extreme value analysis, the estimation of the dependence\u0000structure in extremes is a challenging task, especially in the context of\u0000high-dimensional data. Therefore, a common approach is to reduce the model\u0000dimension by considering only the directions in which extreme values occur. In\u0000this paper, we use the concept of sparse regular variation recently introduced\u0000by Meyer and Wintenberger (2021) to derive information criteria for the number\u0000of directions in which extreme events occur, such as a Bayesian information\u0000criterion (BIC), a mean-squared error-based information criterion (MSEIC), and\u0000a quasi-Akaike information criterion (QAIC) based on the Gaussian likelihood\u0000function. As is typical in extreme value analysis, a challenging task is the\u0000choice of the number $k_n$ of observations used for the estimation. Therefore,\u0000for all information criteria, we present a two-step procedure to estimate both\u0000the number of directions of extremes and an optimal choice of $k_n$. We prove\u0000that the AIC of Meyer and Wintenberger (2023) and the MSEIC are inconsistent\u0000information criteria for the number of extreme directions whereas the BIC and\u0000the QAIC are consistent information criteria. Finally, the performance of the\u0000different information criteria is compared in a simulation study and applied on\u0000wind speed data.","PeriodicalId":501425,"journal":{"name":"arXiv - STAT - Methodology","volume":"100 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142256386","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Basket trials examine the efficacy of an intervention in multiple patient subgroups simultaneously. The division into subgroups, called baskets, is based on matching medical characteristics, which may result in small and typically unequal sample sizes across baskets. Sparse data complicate statistical inference. Several Bayesian methods have been proposed in the literature that allow information sharing between baskets to increase statistical power. In this work, we provide a systematic comparison of five different Bayesian basket trial designs when sample sizes differ between baskets. We consider the power prior approach with both established and new weighting methods, a design by Fujikawa et al., as well as models based on Bayesian hierarchical modeling and Bayesian model averaging. The results of our simulation study show a high sensitivity to changing sample sizes for Fujikawa's design and the power prior approach. Limiting the amount of shared information was found to be decisive for robustness to varying basket sizes. In combination with the power prior approach, this resulted in the best performance and the most reliable detection of both a treatment effect and its absence.
{"title":"Systematic comparison of Bayesian basket trial designs with unequal sample sizes and proposal of a new method based on power priors","authors":"Sabrina Schmitt, Lukas Baumann","doi":"arxiv-2409.10318","DOIUrl":"https://doi.org/arxiv-2409.10318","url":null,"abstract":"Basket trials examine the efficacy of an intervention in multiple patient\u0000subgroups simultaneously. The division into subgroups, called baskets, is based\u0000on matching medical characteristics, which may result in small sample sizes\u0000within baskets that are also likely to differ. Sparse data complicate\u0000statistical inference. Several Bayesian methods have been proposed in the\u0000literature that allow information sharing between baskets to increase\u0000statistical power. In this work, we provide a systematic comparison of five\u0000different Bayesian basket trial designs when sample sizes differ between\u0000baskets. We consider the power prior approach with both known and new weighting\u0000methods, a design by Fujikawa et al., as well as models based on Bayesian\u0000hierarchical modeling and Bayesian model averaging. The results of our\u0000simulation study show a high sensitivity to changing sample sizes for\u0000Fujikawa's design and the power prior approach. Limiting the amount of shared\u0000information was found to be decisive for the robustness to varying basket\u0000sizes. In combination with the power prior approach, this resulted in the best\u0000performance and the most reliable detection of an effect of the treatment under\u0000investigation and its absence.","PeriodicalId":501425,"journal":{"name":"arXiv - STAT - Methodology","volume":"2 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142256387","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A fundamental challenge of data-driven analysis in neuroscience is modeling causal interactions and exploring the connectivity of nodes in a brain network. Various statistical methods, relying on different perspectives and employing different data modalities, are being developed to examine and understand the causal structures underlying brain dynamics. This study introduces a novel statistical approach, TAR4C, to dissect causal interactions in multichannel EEG recordings. TAR4C uses the threshold autoregressive (TAR) model to describe the causal interaction between nodes or clusters of nodes in a brain network. The perspective involves testing whether one node, which may represent a brain region, can control the dynamics of another. The node that has such an impact is called the threshold variable and can be regarded as causative, because it acts as an instantaneous switching mechanism that regulates the time-varying autoregressive structure of the other node. This statistical concept is commonly referred to as threshold non-linearity. Once threshold non-linearity has been verified between a pair of nodes, the next essential facet of TAR modeling is to assess the ability of the causal node to predict the current activity of the other and to represent the causal interactions in autoregressive terms. This predictive ability is what underlies Granger causality. The TAR4C approach can discover non-linear and time-dependent causal interactions without negating the G-causality perspective. The efficacy of the proposed approach is exemplified by analyzing EEG signals recorded during a motor movement/imagery experiment. The similarities and differences between the causal interactions manifesting during the execution and the imagery of a given motor movement are demonstrated by analyzing EEG recordings from multiple subjects.
{"title":"Nonlinear Causality in Brain Networks: With Application to Motor Imagery vs Execution","authors":"Sipan Aslan, Hernando Ombao","doi":"arxiv-2409.10374","DOIUrl":"https://doi.org/arxiv-2409.10374","url":null,"abstract":"One fundamental challenge of data-driven analysis in neuroscience is modeling\u0000causal interactions and exploring the connectivity of nodes in a brain network.\u0000Various statistical methods, relying on various perspectives and employing\u0000different data modalities, are being developed to examine and comprehend the\u0000underlying causal structures inherent to brain dynamics. This study introduces\u0000a novel statistical approach, TAR4C, to dissect causal interactions in\u0000multichannel EEG recordings. TAR4C uses the threshold autoregressive model to\u0000describe the causal interaction between nodes or clusters of nodes in a brain\u0000network. The perspective involves testing whether one node, which may represent\u0000a brain region, can control the dynamics of the other. The node that has such\u0000an impact on the other is called a threshold variable and can be classified as\u0000a causative because its functionality is the leading source operating as an\u0000instantaneous switching mechanism that regulates the time-varying\u0000autoregressive structure of the other. This statistical concept is commonly\u0000referred to as threshold non-linearity. Once threshold non-linearity has been\u0000verified between a pair of nodes, the subsequent essential facet of TAR\u0000modeling is to assess the predictive ability of the causal node for the current\u0000activity on the other and represent causal interactions in autoregressive\u0000terms. This predictive ability is what underlies Granger causality. The TAR4C\u0000approach can discover non-linear and time-dependent causal interactions without\u0000negating the G-causality perspective. The efficacy of the proposed approach is\u0000exemplified by analyzing the EEG signals recorded during the motor\u0000movement/imagery experiment. The similarities and differences between the\u0000causal interactions manifesting during the execution and the imagery of a given\u0000motor movement are demonstrated by analyzing EEG recordings from multiple\u0000subjects.","PeriodicalId":501425,"journal":{"name":"arXiv - STAT - Methodology","volume":"1 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142256415","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Recent years have seen increased interest in combining drug agents and/or schedules. Several methods for Phase I combination-escalation trials have been proposed, among which the partial ordering continual reassessment method (POCRM) has gained considerable attention for its simplicity and good operational characteristics. However, the one-parameter nature of the POCRM makes it restrictive in more complicated settings, such as the inclusion of a control group. This paper proposes a Bayesian partial ordering logistic model (POBLRM), which combines partial ordering with the more flexible (than the CRM) two-parameter logistic model. Simulation studies show that the POBLRM performs similarly to the POCRM in non-randomised settings. When patients are randomised between the experimental dose-combinations and a control, performance is substantially improved. Most designs require specifying hyper-parameters, often chosen from statistical considerations (operational prior). The conventional "grid search" calibration approach requires large simulations, which are computationally costly. A novel "cyclic calibration" is proposed to reduce the computation from multiplicative to additive. Furthermore, calibration processes should consider wide ranges of scenarios of true toxicity probabilities to avoid bias. A method to reduce the number of scenarios based on scenario complexities is suggested. This can reduce the computation by more than 500-fold while maintaining operational characteristics similar to those of the grid search.
{"title":"Partial Ordering Bayesian Logistic Regression Model for Phase I Combination Trials and Computationally Efficient Approach to Operational Prior Specification","authors":"Weishi Chen, Pavel Mozgunov","doi":"arxiv-2409.10352","DOIUrl":"https://doi.org/arxiv-2409.10352","url":null,"abstract":"Recent years have seen increased interest in combining drug agents and/or\u0000schedules. Several methods for Phase I combination-escalation trials are\u0000proposed, among which, the partial ordering continual reassessment method\u0000(POCRM) gained great attention for its simplicity and good operational\u0000characteristics. However, the one-parameter nature of the POCRM makes it\u0000restrictive in more complicated settings such as the inclusion of a control\u0000group. This paper proposes a Bayesian partial ordering logistic model (POBLRM),\u0000which combines partial ordering and the more flexible (than CRM) two-parameter\u0000logistic model. Simulation studies show that the POBLRM performs similarly as\u0000the POCRM in non-randomised settings. When patients are randomised between the\u0000experimental dose-combinations and a control, performance is drastically\u0000improved. Most designs require specifying hyper-parameters, often chosen from\u0000statistical considerations (operational prior). The conventional \"grid search''\u0000calibration approach requires large simulations, which are computationally\u0000costly. A novel \"cyclic calibration\" has been proposed to reduce the\u0000computation from multiplicative to additive. Furthermore, calibration processes\u0000should consider wide ranges of scenarios of true toxicity probabilities to\u0000avoid bias. A method to reduce scenarios based on scenario-complexities is\u0000suggested. This can reduce the computation by more than 500 folds while\u0000remaining operational characteristics similar to the grid search.","PeriodicalId":501425,"journal":{"name":"arXiv - STAT - Methodology","volume":"28 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142256426","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Bayesian predictive synthesis is useful for coherently synthesizing multiple predictive distributions. However, a proof of the fundamental equation for the synthesized predictive density has been missing. In this technical report, we review the line of research on predictive synthesis and then fill the gap between the known results and the equation used in modern applications. We provide two proofs and clarify the structure of predictive synthesis.
{"title":"On the Proofs of the Predictive Synthesis Formula","authors":"Riku Masuda, Kaoru Irie","doi":"arxiv-2409.09660","DOIUrl":"https://doi.org/arxiv-2409.09660","url":null,"abstract":"Bayesian predictive synthesis is useful in synthesizing multiple predictive\u0000distributions coherently. However, the proof for the fundamental equation of\u0000the synthesized predictive density has been missing. In this technical report,\u0000we review the series of research on predictive synthesis, then fill the gap\u0000between the known results and the equation used in modern applications. We\u0000provide two proofs and clarify the structure of predictive synthesis.","PeriodicalId":501425,"journal":{"name":"arXiv - STAT - Methodology","volume":"17 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142256412","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A multistate cure model is a statistical framework used to analyze and represent the transitions that individuals undergo between different states over time, taking into account the possibility of being cured by the initial treatment. This model is particularly useful in pediatric oncology, where a fraction of the patient population achieves cure through treatment and will therefore never experience certain events. Our study develops a generalized algorithm based on the extended long data format, an extension of the long data format in which a transition can be split into two rows, each with an assigned weight reflecting the posterior probability of its cure status. The multistate cure model is fitted within the existing frameworks of the multistate model and the mixture cure model. The proposed algorithm makes use of the Expectation-Maximization (EM) algorithm and a weighted likelihood representation, so that it is easy to implement with standard software packages. As an example, the proposed algorithm is applied to data from the European Society for Blood and Marrow Transplantation (EBMT). Standard errors of the estimated parameters are obtained via a non-parametric bootstrap procedure, and a method based on the second-derivative matrix of the observed log-likelihood is also presented.
{"title":"A general approach to fitting multistate cure models based on an extended-long-format data structure","authors":"Yilin Jiang, Harm van Tinteren, Marta Fiocco","doi":"arxiv-2409.09865","DOIUrl":"https://doi.org/arxiv-2409.09865","url":null,"abstract":"A multistate cure model is a statistical framework used to analyze and\u0000represent the transitions that individuals undergo between different states\u0000over time, taking into account the possibility of being cured by initial\u0000treatment. This model is particularly useful in pediatric oncology where a\u0000fraction of the patient population achieves cure through treatment and\u0000therefore they will never experience some events. Our study develops a\u0000generalized algorithm based on the extended long data format, an extension of\u0000long data format where a transition can be split up to two rows each with a\u0000weight assigned reflecting the posterior probability of its cure status. The\u0000multistate cure model is fit on top of the current framework of multistate\u0000model and mixture cure model. The proposed algorithm makes use of the\u0000Expectation-Maximization (EM) algorithm and weighted likelihood representation\u0000such that it is easy to implement with standard package. As an example, the\u0000proposed algorithm is applied on data from the European Society for Blood and\u0000Marrow Transplantation (EBMT). Standard errors of the estimated parameters are\u0000obtained via a non-parametric bootstrap procedure, while the method involving\u0000the calculation of the second-derivative matrix of the observed log-likelihood\u0000is also presented.","PeriodicalId":501425,"journal":{"name":"arXiv - STAT - Methodology","volume":"27 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142256410","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Previous work on fantasy basketball quantifies player value for category leagues without taking draft circumstances into account. Quantifying value in this way is convenient, but inherently limited as a strategy, because it precludes the possibility of dynamic adaptation. This work introduces a framework for dynamic algorithms, dubbed "H-scoring", and describes an implementation of the framework for head-to-head formats, dubbed $H_0$. $H_0$ models many of the main aspects of category league strategy including category weighting, positional assignments, and format-specific objectives. Head-to-head simulations provide evidence that $H_0$ outperforms static ranking lists. Category-level results from the simulations reveal that one component of $H_0$'s strategy is punting a subset of categories, which it learns to do implicitly.
{"title":"Dynamic quantification of player value for fantasy basketball","authors":"Zach Rosenof","doi":"arxiv-2409.09884","DOIUrl":"https://doi.org/arxiv-2409.09884","url":null,"abstract":"Previous work on fantasy basketball quantifies player value for category\u0000leagues without taking draft circumstances into account. Quantifying value in\u0000this way is convenient, but inherently limited as a strategy, because it\u0000precludes the possibility of dynamic adaptation. This work introduces a\u0000framework for dynamic algorithms, dubbed \"H-scoring\", and describes an\u0000implementation of the framework for head-to-head formats, dubbed $H_0$. $H_0$\u0000models many of the main aspects of category league strategy including category\u0000weighting, positional assignments, and format-specific objectives. Head-to-head\u0000simulations provide evidence that $H_0$ outperforms static ranking lists.\u0000Category-level results from the simulations reveal that one component of\u0000$H_0$'s strategy is punting a subset of categories, which it learns to do\u0000implicitly.","PeriodicalId":501425,"journal":{"name":"arXiv - STAT - Methodology","volume":"74 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142256413","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
While the classic off-policy evaluation (OPE) literature commonly assumes decision time points to be evenly spaced for simplicity, in many real-world scenarios, such as those involving user-initiated visits, decisions are made at irregularly spaced and potentially outcome-dependent time points. For a more principled evaluation of dynamic policies, this paper constructs a novel OPE framework that concerns not only the state-action process but also an observation process dictating the time points at which decisions are made. The framework is closely connected to the Markov decision process in computer science and to the renewal process in the statistical literature. Within the framework, two distinct value functions, derived from the cumulative reward and the integrated reward respectively, are considered, and statistical inference for each value function is developed under revised Markov and time-homogeneity assumptions. The validity of the proposed method is further supported by theoretical results, simulation studies, and a real-world application to electronic health records (EHR) evaluating periodontal disease treatments.
{"title":"Off-Policy Evaluation with Irregularly-Spaced, Outcome-Dependent Observation Times","authors":"Xin Chen, Wenbin Lu, Shu Yang, Dipankar Bandyopadhyay","doi":"arxiv-2409.09236","DOIUrl":"https://doi.org/arxiv-2409.09236","url":null,"abstract":"While the classic off-policy evaluation (OPE) literature commonly assumes\u0000decision time points to be evenly spaced for simplicity, in many real-world\u0000scenarios, such as those involving user-initiated visits, decisions are made at\u0000irregularly-spaced and potentially outcome-dependent time points. For a more\u0000principled evaluation of the dynamic policies, this paper constructs a novel\u0000OPE framework, which concerns not only the state-action process but also an\u0000observation process dictating the time points at which decisions are made. The\u0000framework is closely connected to the Markov decision process in computer\u0000science and with the renewal process in the statistical literature. Within the\u0000framework, two distinct value functions, derived from cumulative reward and\u0000integrated reward respectively, are considered, and statistical inference for\u0000each value function is developed under revised Markov and time-homogeneous\u0000assumptions. The validity of the proposed method is further supported by\u0000theoretical results, simulation studies, and a real-world application from\u0000electronic health records (EHR) evaluating periodontal disease treatments.","PeriodicalId":501425,"journal":{"name":"arXiv - STAT - Methodology","volume":"10 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142256414","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The variable selection problem is to discover which of a large set of predictors is associated with an outcome of interest, conditionally on the other predictors. This problem has been widely studied, but existing approaches lack power against complex alternatives, robustness to model misspecification, computational efficiency, or quantification of evidence against individual hypotheses. We present tower PCM (tPCM), a statistically and computationally efficient solution to the variable selection problem that does not suffer from these shortcomings. tPCM adapts the best aspects of two existing procedures that are based on similar functionals: the holdout randomization test (HRT) and the projected covariance measure (PCM). The former is a model-X test that uses many resamples and few machine learning fits, while the latter is an asymptotic, doubly-robust-style test for a single hypothesis that requires no resamples and many machine learning fits. Theoretically, we demonstrate the validity of tPCM and, perhaps surprisingly, the asymptotic equivalence of the HRT, PCM, and tPCM. In doing so, we clarify the relationship between two methods from two separate literatures. An extensive simulation study verifies that tPCM can offer significant computational savings compared to the HRT and PCM while maintaining nearly identical power.
{"title":"Doubly robust and computationally efficient high-dimensional variable selection","authors":"Abhinav Chakraborty, Jeffrey Zhang, Eugene Katsevich","doi":"arxiv-2409.09512","DOIUrl":"https://doi.org/arxiv-2409.09512","url":null,"abstract":"The variable selection problem is to discover which of a large set of\u0000predictors is associated with an outcome of interest, conditionally on the\u0000other predictors. This problem has been widely studied, but existing approaches\u0000lack either power against complex alternatives, robustness to model\u0000misspecification, computational efficiency, or quantification of evidence\u0000against individual hypotheses. We present tower PCM (tPCM), a statistically and\u0000computationally efficient solution to the variable selection problem that does\u0000not suffer from these shortcomings. tPCM adapts the best aspects of two\u0000existing procedures that are based on similar functionals: the holdout\u0000randomization test (HRT) and the projected covariance measure (PCM). The former\u0000is a model-X test that utilizes many resamples and few machine learning fits,\u0000while the latter is an asymptotic doubly-robust style test for a single\u0000hypothesis that requires no resamples and many machine learning fits.\u0000Theoretically, we demonstrate the validity of tPCM, and perhaps surprisingly,\u0000the asymptotic equivalence of HRT, PCM, and tPCM. In so doing, we clarify the\u0000relationship between two methods from two separate literatures. An extensive\u0000simulation study verifies that tPCM can have significant computational savings\u0000compared to HRT and PCM, while maintaining nearly identical power.","PeriodicalId":501425,"journal":{"name":"arXiv - STAT - Methodology","volume":"21 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142269419","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The identification of surrogate markers is motivated by their potential to enable earlier decisions about a treatment effect. However, few methods have been developed to actually use a surrogate marker to test for a treatment effect in a future study. Most existing methods combine surrogate marker and primary outcome information to test for a treatment effect, rely on fully parametric methods in which strict parametric assumptions are made about the relationship between the surrogate and the outcome, and/or assume the surrogate marker is measured at only a single time point. Recent work has proposed a nonparametric test for a treatment effect using only surrogate marker information measured at a single time point, borrowing information learned from a prior study in which both the surrogate and the primary outcome were measured. In this paper, we utilize this nonparametric test and propose group sequential procedures that allow for early stopping of treatment effect testing in a setting where the surrogate marker is measured repeatedly over time. We derive the properties of the correlated surrogate-based nonparametric test statistics at multiple time points and compute stopping boundaries that allow for early stopping for a significant treatment effect or for futility. We examine the performance of our testing procedure in a simulation study and illustrate the method using data from two distinct AIDS clinical trials.
{"title":"Group Sequential Testing of a Treatment Effect Using a Surrogate Marker","authors":"Layla Parast, Jay Bartroff","doi":"arxiv-2409.09440","DOIUrl":"https://doi.org/arxiv-2409.09440","url":null,"abstract":"The identification of surrogate markers is motivated by their potential to\u0000make decisions sooner about a treatment effect. However, few methods have been\u0000developed to actually use a surrogate marker to test for a treatment effect in\u0000a future study. Most existing methods consider combining surrogate marker and\u0000primary outcome information to test for a treatment effect, rely on fully\u0000parametric methods where strict parametric assumptions are made about the\u0000relationship between the surrogate and the outcome, and/or assume the surrogate\u0000marker is measured at only a single time point. Recent work has proposed a\u0000nonparametric test for a treatment effect using only surrogate marker\u0000information measured at a single time point by borrowing information learned\u0000from a prior study where both the surrogate and primary outcome were measured.\u0000In this paper, we utilize this nonparametric test and propose group sequential\u0000procedures that allow for early stopping of treatment effect testing in a\u0000setting where the surrogate marker is measured repeatedly over time. We derive\u0000the properties of the correlated surrogate-based nonparametric test statistics\u0000at multiple time points and compute stopping boundaries that allow for early\u0000stopping for a significant treatment effect, or for futility. We examine the\u0000performance of our testing procedure using a simulation study and illustrate\u0000the method using data from two distinct AIDS clinical trials.","PeriodicalId":501425,"journal":{"name":"arXiv - STAT - Methodology","volume":"50 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142269420","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}