
arXiv - STAT - Methodology: Latest Publications

Information criteria for the number of directions of extremes in high-dimensional data
Pub Date : 2024-09-16 DOI: arxiv-2409.10174
Lucas Butsch, Vicky Fasen-Hartmann
In multivariate extreme value analysis, the estimation of the dependence structure in extremes is a challenging task, especially in the context of high-dimensional data. Therefore, a common approach is to reduce the model dimension by considering only the directions in which extreme values occur. In this paper, we use the concept of sparse regular variation recently introduced by Meyer and Wintenberger (2021) to derive information criteria for the number of directions in which extreme events occur, such as a Bayesian information criterion (BIC), a mean-squared error-based information criterion (MSEIC), and a quasi-Akaike information criterion (QAIC) based on the Gaussian likelihood function. As is typical in extreme value analysis, a challenging task is the choice of the number $k_n$ of observations used for the estimation. Therefore, for all information criteria, we present a two-step procedure to estimate both the number of directions of extremes and an optimal choice of $k_n$. We prove that the AIC of Meyer and Wintenberger (2023) and the MSEIC are inconsistent information criteria for the number of extreme directions, whereas the BIC and the QAIC are consistent information criteria. Finally, the performance of the different information criteria is compared in a simulation study and applied to wind speed data.
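The two-step idea can be sketched generically. In the sketch below, `criterion` is a placeholder for any of the paper's criteria (BIC, MSEIC, QAIC), and both the norm-based selection of the $k_n$ most extreme observations and the rule for comparing candidate values of $k_n$ are illustrative assumptions, not the authors' procedure.

```python
import numpy as np

def two_step_selection(criterion, data, k_grid, s_grid):
    """Hypothetical two-step search: for each candidate k_n, keep the k_n largest
    observations (by Euclidean norm), score every candidate number of directions s
    with the supplied criterion, and return the overall best (k_n, s) pair."""
    best = None
    norms = np.linalg.norm(data, axis=1)
    for k in k_grid:
        extremes = data[np.argsort(norms)[-k:]]            # k most extreme observations
        scores = {s: criterion(extremes, s) for s in s_grid}
        s_hat = min(scores, key=scores.get)                 # best number of directions for this k_n
        if best is None or scores[s_hat] < best[2]:
            best = (k, s_hat, scores[s_hat])
    return best  # (k_n, estimated number of directions, criterion value)
```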
Citations: 0
Systematic comparison of Bayesian basket trial designs with unequal sample sizes and proposal of a new method based on power priors
Pub Date : 2024-09-16 DOI: arxiv-2409.10318
Sabrina Schmitt, Lukas Baumann
Basket trials examine the efficacy of an intervention in multiple patient subgroups simultaneously. The division into subgroups, called baskets, is based on matching medical characteristics, which may result in small sample sizes within baskets that are also likely to differ. Sparse data complicate statistical inference. Several Bayesian methods have been proposed in the literature that allow information sharing between baskets to increase statistical power. In this work, we provide a systematic comparison of five different Bayesian basket trial designs when sample sizes differ between baskets. We consider the power prior approach with both known and new weighting methods, a design by Fujikawa et al., as well as models based on Bayesian hierarchical modeling and Bayesian model averaging. The results of our simulation study show a high sensitivity to changing sample sizes for Fujikawa's design and the power prior approach. Limiting the amount of shared information was found to be decisive for the robustness to varying basket sizes. In combination with the power prior approach, this resulted in the best performance and the most reliable detection of an effect of the treatment under investigation and its absence.
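A minimal sketch of the power-prior borrowing idea for a binary endpoint, assuming conjugate Beta priors and fixed illustrative weights; the paper's known and new weighting methods are not reproduced here.

```python
import numpy as np
from scipy.stats import beta

def power_prior_posterior(j, x, n, delta, a=1.0, b=1.0):
    """Beta posterior for basket j with power-prior borrowing from the other baskets.
    x[k], n[k] are responses and sample size in basket k; delta[j, k] in [0, 1] is the
    (assumed, fixed) weight on basket k's likelihood when analysing basket j."""
    others = [k for k in range(len(x)) if k != j]
    a_post = a + x[j] + sum(delta[j, k] * x[k] for k in others)
    b_post = b + (n[j] - x[j]) + sum(delta[j, k] * (n[k] - x[k]) for k in others)
    return beta(a_post, b_post)

# Example: posterior probability that the response rate in basket 0 exceeds 0.2
x = np.array([4, 1, 7])
n = np.array([10, 24, 15])
delta = np.full((3, 3), 0.3)          # illustrative constant borrowing weights
post = power_prior_posterior(0, x, n, delta)
print(1 - post.cdf(0.2))
```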
Citations: 0
Nonlinear Causality in Brain Networks: With Application to Motor Imagery vs Execution
Pub Date : 2024-09-16 DOI: arxiv-2409.10374
Sipan Aslan, Hernando Ombao
One fundamental challenge of data-driven analysis in neuroscience is modeling causal interactions and exploring the connectivity of nodes in a brain network. Various statistical methods, relying on different perspectives and employing different data modalities, are being developed to examine and comprehend the underlying causal structures inherent to brain dynamics. This study introduces a novel statistical approach, TAR4C, to dissect causal interactions in multichannel EEG recordings. TAR4C uses the threshold autoregressive model to describe the causal interaction between nodes or clusters of nodes in a brain network. The perspective involves testing whether one node, which may represent a brain region, can control the dynamics of the other. The node that has such an impact on the other is called a threshold variable and can be classified as causative because its functionality is the leading source operating as an instantaneous switching mechanism that regulates the time-varying autoregressive structure of the other. This statistical concept is commonly referred to as threshold non-linearity. Once threshold non-linearity has been verified between a pair of nodes, the subsequent essential facet of TAR modeling is to assess the predictive ability of the causal node for the current activity of the other and to represent causal interactions in autoregressive terms. This predictive ability is what underlies Granger causality. The TAR4C approach can discover non-linear and time-dependent causal interactions without negating the G-causality perspective. The efficacy of the proposed approach is exemplified by analyzing the EEG signals recorded during a motor movement/imagery experiment. The similarities and differences between the causal interactions manifesting during the execution and the imagery of a given motor movement are demonstrated by analyzing EEG recordings from multiple subjects.
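A toy simulation of threshold non-linearity, in which a lagged value of one series switches the autoregressive regime of another; the coefficients, delay, and threshold below are arbitrary illustrative choices, not part of TAR4C.

```python
import numpy as np

rng = np.random.default_rng(0)
T, r, d = 500, 0.0, 1            # series length, threshold, delay (assumed values)
x = rng.standard_normal(T)       # candidate threshold (causal) series
y = np.zeros(T)

for t in range(1, T):
    # the regime of y at time t is switched by the lagged value of x (the threshold variable)
    phi = 0.8 if x[t - d] <= r else -0.5
    y[t] = phi * y[t - 1] + 0.3 * rng.standard_normal()
```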
Citations: 0
Partial Ordering Bayesian Logistic Regression Model for Phase I Combination Trials and Computationally Efficient Approach to Operational Prior Specification
Pub Date : 2024-09-16 DOI: arxiv-2409.10352
Weishi Chen, Pavel Mozgunov
Recent years have seen increased interest in combining drug agents and/or schedules. Several methods for Phase I combination-escalation trials have been proposed, among which the partial ordering continual reassessment method (POCRM) gained great attention for its simplicity and good operational characteristics. However, the one-parameter nature of the POCRM makes it restrictive in more complicated settings, such as the inclusion of a control group. This paper proposes a Bayesian partial ordering logistic model (POBLRM), which combines partial ordering with the more flexible (than the CRM) two-parameter logistic model. Simulation studies show that the POBLRM performs similarly to the POCRM in non-randomised settings. When patients are randomised between the experimental dose-combinations and a control, performance is drastically improved. Most designs require specifying hyper-parameters, often chosen from statistical considerations (operational prior). The conventional "grid search" calibration approach requires large simulations, which are computationally costly. A novel "cyclic calibration" has been proposed to reduce the computation from multiplicative to additive. Furthermore, calibration processes should consider wide ranges of scenarios of true toxicity probabilities to avoid bias. A method to reduce the number of scenarios based on scenario complexities is suggested. This can reduce the computation by more than 500-fold while retaining operational characteristics similar to those of the grid search.
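The two-parameter logistic dose-toxicity curve that BLRM-type designs build on can be written as a small helper. The parameterization below (intercept plus an exponentiated slope on log-dose, relative to a reference dose) is a common convention and is assumed here; the partial ordering over dose-combinations is not shown.

```python
import numpy as np

def toxicity_prob(dose, log_alpha, log_beta, d_ref=1.0):
    """Two-parameter logistic dose-toxicity curve (common BLRM-style convention):
    logit P(tox | dose) = log_alpha + exp(log_beta) * log(dose / d_ref)."""
    eta = log_alpha + np.exp(log_beta) * np.log(dose / d_ref)
    return 1.0 / (1.0 + np.exp(-eta))

# Example: toxicity probabilities at three illustrative dose levels
print(toxicity_prob(np.array([0.5, 1.0, 2.0]), log_alpha=-1.0, log_beta=0.0))
```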
Citations: 0
On the Proofs of the Predictive Synthesis Formula
Pub Date : 2024-09-15 DOI: arxiv-2409.09660
Riku Masuda, Kaoru Irie
Bayesian predictive synthesis is useful in synthesizing multiple predictive distributions coherently. However, the proof for the fundamental equation of the synthesized predictive density has been missing. In this technical report, we review the series of research on predictive synthesis, then fill the gap between the known results and the equation used in modern applications. We provide two proofs and clarify the structure of predictive synthesis.
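For reference, the synthesized predictive density usually takes the following form in the Bayesian predictive synthesis literature, where the agents report densities $h_j$ and $\alpha$ is the decision maker's synthesis function; the notation is assumed and may differ from the report's.

```latex
% Synthesized predictive density (standard BPS form; notation assumed):
p(y \mid H) \;=\; \int \alpha(y \mid \mathbf{x}) \prod_{j=1}^{J} h_j(x_j)\, d\mathbf{x},
\qquad \mathbf{x} = (x_1, \ldots, x_J).
```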
Citations: 0
A general approach to fitting multistate cure models based on an extended-long-format data structure
Pub Date : 2024-09-15 DOI: arxiv-2409.09865
Yilin Jiang, Harm van Tinteren, Marta Fiocco
A multistate cure model is a statistical framework used to analyze and represent the transitions that individuals undergo between different states over time, taking into account the possibility of being cured by initial treatment. This model is particularly useful in pediatric oncology, where a fraction of the patient population achieves cure through treatment and therefore will never experience certain events. Our study develops a generalized algorithm based on the extended long data format, an extension of the long data format in which a transition can be split into up to two rows, each with an assigned weight reflecting the posterior probability of its cure status. The multistate cure model is fit on top of the current frameworks of the multistate model and the mixture cure model. The proposed algorithm makes use of the Expectation-Maximization (EM) algorithm and a weighted likelihood representation such that it is easy to implement with standard packages. As an example, the proposed algorithm is applied to data from the European Society for Blood and Marrow Transplantation (EBMT). Standard errors of the estimated parameters are obtained via a non-parametric bootstrap procedure, while a method involving the calculation of the second-derivative matrix of the observed log-likelihood is also presented.
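A schematic of the extended long data format: one transition record is duplicated into a cured and an uncured row weighted by the posterior cure probability. Column names and the example record are hypothetical, and in a real fit the weights would be updated in the E-step of the EM algorithm.

```python
import pandas as pd

def expand_transition(row, p_cured):
    """Split one transition record into two weighted rows (extended long format sketch):
    a 'cured' row with weight p_cured and an 'uncured' row with weight 1 - p_cured."""
    return pd.DataFrame([
        {**row, "cure_status": "cured", "weight": p_cured},
        {**row, "cure_status": "uncured", "weight": 1.0 - p_cured},
    ])

# Hypothetical transition record for one patient
row = {"id": 1, "from": "treated", "to": "relapse", "tstart": 0.0, "tstop": 2.3, "status": 0}
print(expand_transition(row, p_cured=0.6))
```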
Citations: 0
Dynamic quantification of player value for fantasy basketball
Pub Date : 2024-09-15 DOI: arxiv-2409.09884
Zach Rosenof
Previous work on fantasy basketball quantifies player value for category leagues without taking draft circumstances into account. Quantifying value in this way is convenient, but inherently limited as a strategy, because it precludes the possibility of dynamic adaptation. This work introduces a framework for dynamic algorithms, dubbed "H-scoring", and describes an implementation of the framework for head-to-head formats, dubbed $H_0$. $H_0$ models many of the main aspects of category league strategy, including category weighting, positional assignments, and format-specific objectives. Head-to-head simulations provide evidence that $H_0$ outperforms static ranking lists. Category-level results from the simulations reveal that one component of $H_0$'s strategy is punting a subset of categories, which it learns to do implicitly.
Citations: 0
Off-Policy Evaluation with Irregularly-Spaced, Outcome-Dependent Observation Times
Pub Date : 2024-09-14 DOI: arxiv-2409.09236
Xin Chen, Wenbin Lu, Shu Yang, Dipankar Bandyopadhyay
While the classic off-policy evaluation (OPE) literature commonly assumes decision time points to be evenly spaced for simplicity, in many real-world scenarios, such as those involving user-initiated visits, decisions are made at irregularly-spaced and potentially outcome-dependent time points. For a more principled evaluation of dynamic policies, this paper constructs a novel OPE framework, which concerns not only the state-action process but also an observation process dictating the time points at which decisions are made. The framework is closely connected to the Markov decision process in computer science and to the renewal process in the statistical literature. Within the framework, two distinct value functions, derived from cumulative reward and integrated reward respectively, are considered, and statistical inference for each value function is developed under revised Markov and time-homogeneous assumptions. The validity of the proposed method is further supported by theoretical results, simulation studies, and a real-world application from electronic health records (EHR) evaluating periodontal disease treatments.
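Two illustrative per-trajectory summaries, one cumulative and one integrated over time, computed from irregularly spaced observations; these are stand-ins to convey the distinction between the two value functions, not the paper's definitions.

```python
import numpy as np

def cumulative_and_integrated_reward(times, rewards, gamma=0.95):
    """Illustrative summaries for one trajectory observed at irregular times:
    a discounted sum of rewards at the observation times, and a piecewise-linear
    time-integral of the reward path (assumed stand-ins, not the paper's estimands)."""
    times, rewards = np.asarray(times, dtype=float), np.asarray(rewards, dtype=float)
    cumulative = np.sum(gamma ** times * rewards)   # discount depends on elapsed time, not index
    integrated = np.trapz(rewards, times)           # integral of the reward path over time
    return cumulative, integrated

print(cumulative_and_integrated_reward([0.0, 0.8, 2.5, 4.0], [1.0, 0.5, 0.7, 0.2]))
```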
Citations: 0
Doubly robust and computationally efficient high-dimensional variable selection
Pub Date : 2024-09-14 DOI: arxiv-2409.09512
Abhinav Chakraborty, Jeffrey Zhang, Eugene Katsevich
The variable selection problem is to discover which of a large set of predictors is associated with an outcome of interest, conditionally on the other predictors. This problem has been widely studied, but existing approaches lack either power against complex alternatives, robustness to model misspecification, computational efficiency, or quantification of evidence against individual hypotheses. We present tower PCM (tPCM), a statistically and computationally efficient solution to the variable selection problem that does not suffer from these shortcomings. tPCM adapts the best aspects of two existing procedures that are based on similar functionals: the holdout randomization test (HRT) and the projected covariance measure (PCM). The former is a model-X test that utilizes many resamples and few machine learning fits, while the latter is an asymptotic doubly-robust style test for a single hypothesis that requires no resamples and many machine learning fits. Theoretically, we demonstrate the validity of tPCM and, perhaps surprisingly, the asymptotic equivalence of HRT, PCM, and tPCM. In so doing, we clarify the relationship between two methods from two separate literatures. An extensive simulation study verifies that tPCM can have significant computational savings compared to HRT and PCM, while maintaining nearly identical power.
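A generic model-X resampling test in the spirit of the HRT; the function interface, the choice of test statistic, and the one-sided p-value convention are assumptions for illustration, and tPCM itself, which avoids resamples, is not shown.

```python
import numpy as np

def holdout_randomization_test(stat, y, xj, x_rest, resample_xj, B=500, rng=None):
    """Schematic model-X resampling test for one predictor: compare the observed
    statistic with statistics computed after resampling x_j from an (assumed known)
    conditional distribution given the remaining predictors."""
    rng = rng if rng is not None else np.random.default_rng()
    t_obs = stat(y, xj, x_rest)
    t_null = np.array([stat(y, resample_xj(x_rest, rng), x_rest) for _ in range(B)])
    return (1 + np.sum(t_null >= t_obs)) / (B + 1)   # one-sided resampling p-value
```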
Citations: 0
Group Sequential Testing of a Treatment Effect Using a Surrogate Marker
Pub Date : 2024-09-14 DOI: arxiv-2409.09440
Layla Parast, Jay Bartroff
The identification of surrogate markers is motivated by their potential to support earlier decisions about a treatment effect. However, few methods have been developed to actually use a surrogate marker to test for a treatment effect in a future study. Most existing methods consider combining surrogate marker and primary outcome information to test for a treatment effect, rely on fully parametric methods where strict parametric assumptions are made about the relationship between the surrogate and the outcome, and/or assume the surrogate marker is measured at only a single time point. Recent work has proposed a nonparametric test for a treatment effect using only surrogate marker information measured at a single time point by borrowing information learned from a prior study where both the surrogate and primary outcome were measured. In this paper, we utilize this nonparametric test and propose group sequential procedures that allow for early stopping of treatment effect testing in a setting where the surrogate marker is measured repeatedly over time. We derive the properties of the correlated surrogate-based nonparametric test statistics at multiple time points and compute stopping boundaries that allow for early stopping for a significant treatment effect, or for futility. We examine the performance of our testing procedure using a simulation study and illustrate the method using data from two distinct AIDS clinical trials.
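A schematic group sequential monitoring rule with pre-specified efficacy and futility boundaries at each interim analysis; the boundary and statistic values in the example are arbitrary illustrations, not the boundaries derived in the paper.

```python
def group_sequential_decision(z_stats, efficacy_bounds, futility_bounds):
    """Check the test statistic against pre-specified boundaries at each analysis:
    stop for efficacy if it crosses the upper bound, for futility if it falls below
    the lower bound, otherwise continue to the next analysis."""
    for k, (z, e, f) in enumerate(zip(z_stats, efficacy_bounds, futility_bounds), start=1):
        if z >= e:
            return k, "stop for efficacy"
        if z <= f:
            return k, "stop for futility"
    return len(z_stats), "no early stop"

# Example: continue at the first analysis, stop for efficacy at the second
print(group_sequential_decision([1.1, 2.9], [3.0, 2.5], [-0.5, 0.0]))
```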
Citations: 0