
Annals of Statistics: Latest Articles

COUNTERFACTUAL INFERENCE IN SEQUENTIAL EXPERIMENTS.
IF 3.7 Zone 1 (Mathematics) Q1 STATISTICS & PROBABILITY Pub Date: 2025-12-01 Epub Date: 2025-12-22 DOI: 10.1214/25-aos2519
Raaz Dwivedi, Katherine Tian, Sabina Tomkins, Predrag Klasnja, Susan Murphy, Devavrat Shah

We consider after-study statistical inference for sequentially designed experiments wherein multiple units are assigned treatments for multiple time points using treatment policies that adapt over time. Our goal is to provide inference guarantees for the counterfactual mean at the smallest possible scale (the mean outcome under different treatments for each unit and each time) with minimal assumptions on the adaptive treatment policy. Without any structural assumptions on the counterfactual means, this challenging task is infeasible because there are more unknowns than observed data points. To make progress, we introduce a latent factor model over the counterfactual means that serves as a non-parametric generalization of the non-linear mixed effects model and the bilinear latent factor model considered in prior works. For estimation, we use a non-parametric method, namely a variant of nearest neighbors, and establish a non-asymptotic high probability error bound for the counterfactual mean for each unit and each time. Under regularity conditions, this bound leads to asymptotically valid confidence intervals for the counterfactual mean as the number of units and time points grows to ∞ together at suitable rates. We illustrate our theory via several simulations and a case study involving data from a mobile health clinical trial, HeartSteps.
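The nearest-neighbors idea in this abstract can be sketched roughly as follows. This is a simplified illustration, not the authors' estimator: the function name `nn_counterfactual_mean`, the mean-squared-difference distance, and the threshold `eta` are all assumptions made for the example. To estimate the mean outcome of unit i at time t under treatment a, it averages the time-t outcomes of units that received a at time t and whose past outcomes under a resemble those of unit i.

```python
import numpy as np


def nn_counterfactual_mean(Y, A, i, t, a, eta):
    """Estimate the mean outcome of unit i at time t under treatment a.

    Y[j, s] is the observed outcome and A[j, s] the treatment that unit j
    received at time s. `eta` is a hypothetical distance threshold.
    """
    took_a = (A == a)
    neighbor_outcomes = []
    for j in range(Y.shape[0]):
        if not took_a[j, t]:
            continue  # unit j's outcome under a at time t was not observed
        # Compare units only on times where both received treatment a,
        # leaving out the target time t itself.
        shared = took_a[i] & took_a[j]
        shared[t] = False
        if not shared.any():
            continue
        dist = np.mean((Y[i, shared] - Y[j, shared]) ** 2)
        if dist <= eta:
            neighbor_outcomes.append(Y[j, t])
    if not neighbor_outcomes:
        return float("nan")  # no usable neighbor for this (i, t, a)
    return float(np.mean(neighbor_outcomes))
```

The threshold `eta` trades bias for variance: a small value keeps only very similar units, while a large value averages over more, possibly dissimilar, units.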

Annals of Statistics, vol. 53, no. 6, pp. 2380-2406. Journal Article. PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12758907/pdf/
Citations: 0
REINFORCEMENT LEARNING FOR INDIVIDUAL OPTIMAL POLICY FROM HETEROGENEOUS DATA.
IF 3.7 Zone 1 (Mathematics) Q1 STATISTICS & PROBABILITY Pub Date: 2025-08-01 DOI: 10.1214/25-aos2512
Rui Miao, Babak Shahbaba, Annie Qu

Offline reinforcement learning (RL) aims to find optimal policies in dynamic environments in order to maximize the expected total rewards by leveraging pre-collected data. Learning from heterogeneous data is one of the fundamental challenges in offline RL. Traditional methods focus on learning an optimal policy for all individuals with pre-collected data from a single episode or homogeneous batch episodes, and thus, may result in a suboptimal policy for a heterogeneous population. In this paper, we propose an individualized offline policy optimization framework for heterogeneous time-stationary Markov decision processes (MDPs). The proposed heterogeneous model with individual latent variables enables us to efficiently estimate the individual Q-functions, and our Penalized Pessimistic Personalized Policy Learning (P4L) algorithm guarantees a fast rate on the average regret under a weak partial coverage assumption on behavior policies. In addition, our simulation studies and a real data application demonstrate the superior numerical performance of the proposed method compared with existing methods.

Annals of Statistics, vol. 53, no. 4, pp. 1513-1534. Journal Article. PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12439830/pdf/
Citations: 0
NONLINEAR GLOBAL FRÉCHET REGRESSION FOR RANDOM OBJECTS VIA WEAK CONDITIONAL EXPECTATION.
IF 3.7 Zone 1 (Mathematics) Q1 STATISTICS & PROBABILITY Pub Date: 2025-02-01 Epub Date: 2025-02-13 DOI: 10.1214/24-aos2457
Satarupa Bhattacharjee, Bing Li, Lingzhou Xue

Random objects are complex non-Euclidean data taking values in general metric spaces, possibly devoid of any underlying vector space structure. Such data are becoming increasingly abundant with the rapid advancement in technology. Examples include probability distributions, positive semidefinite matrices and data on Riemannian manifolds. However, except for regression for object-valued response with Euclidean predictors and distribution-on-distribution regression, there has been limited development of a general framework for object-valued response with object-valued predictors in the literature. To fill this gap, we introduce the notion of a weak conditional Fréchet mean based on Carleman operators and then propose a global nonlinear Fréchet regression model through the reproducing kernel Hilbert space (RKHS) embedding. Furthermore, we establish the relationships between the conditional Fréchet mean and the weak conditional Fréchet mean for both Euclidean and object-valued data. We also show that the state-of-the-art global Fréchet regression developed by Petersen and Müller (Ann. Statist. 47 (2019) 691-719) emerges as a special case of our method by choosing a linear kernel. We require that the metric space for the predictor admits a reproducing kernel, while the intrinsic geometry of the metric space for the response is utilized to study the asymptotic properties of the proposed estimates. Numerical studies, including extensive simulations and a real application, are conducted to investigate the finite-sample performance.

Annals of Statistics, vol. 53, no. 1, pp. 117-143. Journal Article. PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12407180/pdf/
Citations: 0
A Statistical Framework of Watermarks for Large Language Models: Pivot, Detection Efficiency and Optimal Rules.
IF 3.7 Zone 1 (Mathematics) Q1 STATISTICS & PROBABILITY Pub Date: 2025-02-01 Epub Date: 2025-02-13 DOI: 10.1214/24-aos2468
Xiang Li, Feng Ruan, Huiyuan Wang, Qi Long, Weijie J Su

Since ChatGPT was introduced in November 2022, embedding (nearly) unnoticeable statistical signals into text generated by large language models (LLMs), also known as watermarking, has been used as a principled approach to provable detection of LLM-generated text from its human-written counterpart. In this paper, we introduce a general and flexible framework for reasoning about the statistical efficiency of watermarks and designing powerful detection rules. Inspired by the hypothesis testing formulation of watermark detection, our framework starts by selecting a pivotal statistic of the text and a secret key (provided by the LLM to the verifier) to control the false positive rate (the error of mistakenly detecting human-written text as LLM-generated). Next, this framework allows one to evaluate the power of watermark detection rules by obtaining a closed-form expression of the asymptotic false negative rate (the error of incorrectly classifying LLM-generated text as human-written). Our framework further reduces the problem of determining the optimal detection rule to solving a minimax optimization program. We apply this framework to two representative watermarks, one of which has been internally implemented at OpenAI, and obtain several findings that can be instrumental in guiding the practice of implementing watermarks. In particular, we derive optimal detection rules for these watermarks under our framework. These theoretically derived detection rules are demonstrated to be competitive and sometimes enjoy a higher power than existing detection approaches through numerical experiments.
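The role of a pivotal statistic in controlling the false positive rate can be illustrated with a small sketch. It assumes, purely for illustration, that under the null hypothesis (human-written text) the per-token pivots are independent Uniform(0, 1) draws, and it uses the score -log(1 - y), which then sums to a Gamma(n, 1) variable. The function names and the score are choices made for this example, not the optimal rules derived in the paper.

```python
import math


def gamma_sf_integer_shape(s, n):
    """P(S >= s) for S ~ Gamma(shape=n, scale=1) with integer n >= 1."""
    if s <= 0.0:
        return 1.0
    # Closed form: exp(-s) * sum_{k<n} s^k / k!, evaluated in log space.
    log_terms = [-s + k * math.log(s) - math.lgamma(k + 1) for k in range(n)]
    return min(1.0, math.fsum(math.exp(lt) for lt in log_terms))


def watermark_p_value(pivots):
    """p-value of the sum score under the assumed Uniform(0, 1) null.

    Each pivot must lie in [0, 1); values near 1 count as evidence
    of a watermark.
    """
    score = sum(-math.log(1.0 - y) for y in pivots)
    return gamma_sf_integer_shape(score, len(pivots))


def detect(pivots, alpha=0.01):
    """Flag text as watermarked while keeping the false positive rate <= alpha."""
    return watermark_p_value(pivots) <= alpha
```

Because the null distribution of the score does not depend on the human-written text itself, the threshold implied by `alpha` is valid for any such text; that is the point of working with a pivot.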

Annals of Statistics, vol. 53, no. 1, pp. 322-351. Journal Article. PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12467635/pdf/
Citations: 0
ON BLOCKWISE AND REFERENCE PANEL-BASED ESTIMATORS FOR GENETIC DATA PREDICTION IN HIGH DIMENSIONS.
IF 3.7 Zone 1 (Mathematics) Q1 STATISTICS & PROBABILITY Pub Date: 2024-06-01 Epub Date: 2024-08-11 DOI: 10.1214/24-aos2378
Bingxin Zhao, Shurong Zheng, Hongtu Zhu

Genetic prediction holds immense promise for translating genetic discoveries into medical advances. As the high-dimensional covariance matrix (or the linkage disequilibrium (LD) pattern) of genetic variants often presents a block-diagonal structure, numerous methods account for the dependence among variants in predetermined local LD blocks. Moreover, due to privacy considerations and data protection concerns, genetic variant dependence in each LD block is typically estimated from external reference panels rather than the original training data set. This paper presents a unified analysis of blockwise and reference panel-based estimators in a high-dimensional prediction framework without sparsity restrictions. We find that, surprisingly, even when the covariance matrix has a block-diagonal structure with well-defined boundaries, blockwise estimation methods adjusting for local dependence can be substantially less accurate than methods controlling for the whole covariance matrix. Further, estimation methods built on the original training data set and external reference panels are likely to have varying performance in high dimensions, which may reflect the cost of having only access to summary level data from the training data set. This analysis is based on novel results in random matrix theory for block-diagonal covariance matrix. We numerically evaluate our results using extensive simulations and real data analysis in the UK Biobank.

Annals of Statistics, vol. 52, no. 3, pp. 948-965. Journal Article. PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11391480/pdf/
Citations: 0
Minimax rates for heterogeneous causal effect estimation.
IF 3.2 Zone 1 (Mathematics) Q1 STATISTICS & PROBABILITY Pub Date: 2024-04-01 Epub Date: 2024-05-09 DOI: 10.1214/24-aos2369
Edward H Kennedy, Sivaraman Balakrishnan, James M Robins, Larry Wasserman

Estimation of heterogeneous causal effects - i.e., how effects of policies and treatments vary across subjects - is a fundamental task in causal inference. Many methods for estimating conditional average treatment effects (CATEs) have been proposed in recent years, but questions surrounding optimality have remained largely unanswered. In particular, a minimax theory of optimality has yet to be developed, with the minimax rate of convergence and construction of rate-optimal estimators remaining open problems. In this paper we derive the minimax rate for CATE estimation, in a Hölder-smooth nonparametric model, and present a new local polynomial estimator, giving high-level conditions under which it is minimax optimal. Our minimax lower bound is derived via a localized version of the method of fuzzy hypotheses, combining lower bound constructions for nonparametric regression and functional estimation. Our proposed estimator can be viewed as a local polynomial R-Learner, based on a localized modification of higher-order influence function methods. The minimax rate we find exhibits several interesting features, including a non-standard elbow phenomenon and an unusual interpolation between nonparametric regression and functional estimation rates. The latter quantifies how the CATE, as an estimand, can be viewed as a regression/functional hybrid.

Annals of Statistics, vol. 52, no. 2, pp. 793-816. Journal Article. PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11960818/pdf/
Citations: 0
RANK-BASED INDICES FOR TESTING INDEPENDENCE BETWEEN TWO HIGH-DIMENSIONAL VECTORS.
IF 3.2 Zone 1 (Mathematics) Q1 STATISTICS & PROBABILITY Pub Date: 2024-02-01 Epub Date: 2024-03-07 DOI: 10.1214/23-aos2339
Yeqing Zhou, Kai Xu, Liping Zhu, Runze Li

To test independence between two high-dimensional random vectors, we propose three tests based on the rank-based indices derived from Hoeffding's D, Blum-Kiefer-Rosenblatt's R and Bergsma-Dassios-Yanagimoto's τ*. Under the null hypothesis of independence, we show that the distributions of the proposed test statistics converge to normal ones if the dimensions diverge arbitrarily with the sample size. We further derive an explicit rate of convergence. Thanks to the monotone transformation-invariant property, these distribution-free tests can be readily applied to generally distributed random vectors, including heavy-tailed ones. We further study the local power of the proposed tests and compare their relative efficiencies with two classic distance covariance/correlation based tests in high-dimensional settings. We establish explicit relationships between D, R, τ* and Pearson's correlation for bivariate normal random variables. The relationships serve as a basis for power comparison. Our theoretical results show that under a Gaussian equicorrelation alternative, (i) the proposed tests are superior to the two classic distance covariance/correlation based tests if the components of random vectors have very different scales; (ii) the asymptotic efficiencies of the proposed tests based on D, τ* and R are sorted in descending order.

Annals of Statistics, vol. 52, no. 1, pp. 184-206. Journal Article. PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11064990/pdf/
Citations: 0
SUPERVISED HOMOGENEITY FUSION: A COMBINATORIAL APPROACH.
IF 3.7 Zone 1 (Mathematics) Q1 STATISTICS & PROBABILITY Pub Date: 2024-02-01 Epub Date: 2024-03-07 DOI: 10.1214/23-aos2347
Wen Wang, Shihao Wu, Ziwei Zhu, Ling Zhou, Peter X-K Song

Fusing regression coefficients into homogeneous groups can unveil those coefficients that share a common value within each group. Such groupwise homogeneity reduces the intrinsic dimension of the parameter space and unleashes sharper statistical accuracy. We propose and investigate a new combinatorial grouping approach called L0-Fusion that is amenable to mixed integer optimization (MIO). On the statistical aspect, we identify a fundamental quantity called MSE grouping sensitivity that underpins the difficulty of recovering the true groups. We show that L0-Fusion achieves grouping consistency under the weakest possible requirement of the grouping sensitivity: if this requirement is violated, then the minimax risk of group misspecification will fail to converge to zero. Moreover, we show that in the high-dimensional regime, one can apply L0-Fusion with a sure screening set of features without any essential loss of statistical efficiency, while reducing the computational cost substantially. On the algorithmic aspect, we provide an MIO formulation for L0-Fusion along with a warm start strategy. Simulation and real data analysis demonstrate that L0-Fusion exhibits superiority over its competitors in terms of grouping accuracy.
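A toy special case may help make coefficient fusion concrete. The sketch below assumes (the abstract does not spell this out) that the objective is least squares subject to a cap of k distinct coefficient values, and that the design has orthonormal columns. Under that assumption the residual sum of squares equals a constant plus the squared distance between the fused and the ordinary least-squares coefficients, so the problem reduces to one-dimensional grouping, which dynamic programming solves exactly. The name `fuse_coefficients` is hypothetical, and the paper's general MIO formulation is not reproduced here.

```python
def fuse_coefficients(beta, k):
    """Group coefficients into exactly k groups, minimizing squared error.

    Returns (fused, labels): each coefficient replaced by its group mean,
    and a group label in 0..k-1, with labels ordered by group value.
    """
    p = len(beta)
    if not 1 <= k <= p:
        raise ValueError("k must satisfy 1 <= k <= len(beta)")
    # An optimal 1-D grouping is contiguous once the values are sorted.
    order = sorted(range(p), key=lambda j: beta[j])
    b = [beta[j] for j in order]
    s1, s2 = [0.0], [0.0]  # prefix sums of values and of squares
    for x in b:
        s1.append(s1[-1] + x)
        s2.append(s2[-1] + x * x)

    def sse(lo, hi):
        """Within-group sum of squared deviations for b[lo:hi]."""
        total = s1[hi] - s1[lo]
        return (s2[hi] - s2[lo]) - total * total / (hi - lo)

    inf = float("inf")
    cost = [[inf] * (p + 1) for _ in range(k + 1)]
    cut = [[0] * (p + 1) for _ in range(k + 1)]
    cost[0][0] = 0.0
    for g in range(1, k + 1):
        for hi in range(g, p + 1):
            for lo in range(g - 1, hi):
                c = cost[g - 1][lo] + sse(lo, hi)
                if c < cost[g][hi]:
                    cost[g][hi], cut[g][hi] = c, lo

    # Walk the stored cut points backwards to recover the groups.
    fused, labels = [0.0] * p, [0] * p
    hi = p
    for g in range(k, 0, -1):
        lo = cut[g][hi]
        mean = (s1[hi] - s1[lo]) / (hi - lo)
        for j in order[lo:hi]:
            fused[j], labels[j] = mean, g - 1
        hi = lo
    return fused, labels
```

For example, `fuse_coefficients([1.0, 5.1, 0.9, 4.9, 1.1], 2)` places the three values near 1 in one group and the two values near 5 in the other. With a general, non-orthonormal design the reduction to one dimension no longer holds, which is where a combinatorial solver becomes necessary.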

Annals of Statistics, vol. 52, no. 1, pp. 285-310. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12327361/pdf/
Citations: 0
Order-of-addition orthogonal arrays to study the effect of treatment ordering
Zone 1 Mathematics Q1 STATISTICS & PROBABILITY Pub Date: 2023-08-01 DOI: 10.1214/23-aos2317
Eric D. Schoen, Robert W. Mee
The effect of the order in which a set of m treatments is applied can be modeled by relative-position factors that indicate whether treatment i is carried out before or after treatment j, or by the absolute position for treatment i in the sequence. A design with the same normalized information matrix as the design with all m! sequences is D- and G-optimal for the main-effects model involving the relative-position factors. We prove that such designs are also I-optimal for this model and D-optimal as well as G- and I-optimal for the first-order model in the absolute-position factors. We propose a methodology for a complete or partial enumeration of nonequivalent designs that are optimal for both models.
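As an illustration of the relative-position factors described above, the following sketch builds the main-effects model matrix for all m! sequences and computes its normalized information matrix. The ±1 coding, the helper names, and the choice m = 3 are assumptions made for the example, not details taken from the paper.

```python
from fractions import Fraction
from itertools import combinations, permutations


def pairwise_order_row(sequence):
    """Intercept plus one relative-position factor per pair i < j:
    +1 if treatment i is applied before treatment j, otherwise -1."""
    position = {t: idx for idx, t in enumerate(sequence)}
    m = len(sequence)
    return [1] + [
        1 if position[i] < position[j] else -1
        for i, j in combinations(range(m), 2)
    ]


def normalized_information(design):
    """Return X'X / N as exact fractions for a list of model-matrix rows."""
    n, p = len(design), len(design[0])
    return [
        [Fraction(sum(row[a] * row[b] for row in design), n) for b in range(p)]
        for a in range(p)
    ]


m = 3
full_design = [pairwise_order_row(s) for s in permutations(range(m))]
M_full = normalized_information(full_design)

# Columns are: intercept, z01, z02, z12.
# Factors sharing their first treatment (z01, z02) have moment 1/3;
# factors chained through a treatment (z01, z12) have moment -1/3.
for row in M_full:
    print([str(x) for x in row])


def matches_full_design(sequences):
    """Check whether a subset of sequences reproduces the full-design
    normalized information matrix."""
    rows = [pairwise_order_row(s) for s in sequences]
    return normalized_information(rows) == M_full


# Two reversed sequences balance every single factor but not the
# cross-moments, so they do not match the full design.
print(matches_full_design([(0, 1, 2), (2, 1, 0)]))
```

A check like `matches_full_design` is the kind of criterion a candidate fraction would need to pass under the abstract's optimality statement; enumerating nonequivalent designs that pass it is the harder problem the paper addresses.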
Citations: 0
Matching recovery threshold for correlated random graphs
Zone 1 Mathematics Q1 STATISTICS & PROBABILITY Pub Date: 2023-08-01 DOI: 10.1214/23-aos2305
Jian Ding, Hang Du
For two correlated graphs which are independently sub-sampled from a common Erdős–Rényi graph G(n,p), we wish to recover their latent vertex matching from the observation of these two graphs without labels. When p = n^{−α+o(1)} for α ∈ (0,1], we establish a sharp information-theoretic threshold for whether it is possible to correctly match a positive fraction of vertices. Our result sharpens a constant factor in a recent work by Wu, Xu and Yu.
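The sampling model in this abstract can be sketched in a few lines. The code below is an illustrative simulation only: the subsampling probability `s`, the function names, and the parameter values are assumptions for the example, and the sketch generates data rather than implementing any matching estimator from the paper.

```python
import random
from itertools import combinations


def correlated_graphs(n, p, s, seed=0):
    """Sample a parent G(n, p), keep each parent edge independently with
    probability s in each of two children, then hide the labels of the
    second child behind a uniformly random permutation."""
    rng = random.Random(seed)
    parent = {e for e in combinations(range(n), 2) if rng.random() < p}
    g1 = {e for e in sorted(parent) if rng.random() < s}
    g2_true_labels = {e for e in sorted(parent) if rng.random() < s}
    perm = list(range(n))
    rng.shuffle(perm)  # perm[v] is the observed label of vertex v in graph 2
    g2 = {tuple(sorted((perm[u], perm[v]))) for u, v in g2_true_labels}
    return parent, g1, g2, perm


def edge_overlap(g1, g2, mapping):
    """Count edges of g1 that land on edges of g2 under a vertex mapping."""
    return sum(
        1 for u, v in g1 if tuple(sorted((mapping[u], mapping[v]))) in g2
    )


n, alpha, s = 200, 0.5, 0.8
p = n ** (-alpha)  # sparse regime p = n^{-alpha}, ignoring the o(1) term
parent, g1, g2, perm = correlated_graphs(n, p, s, seed=1)

identity = list(range(n))
print("overlap under true matching:", edge_overlap(g1, g2, perm))
print("overlap under identity map: ", edge_overlap(g1, g2, identity))
```

Comparing the edge overlap under the hidden permutation with the overlap under an arbitrary map gives an intuition for why the latent matching leaves a detectable signal; whether a positive fraction of vertices can actually be recovered is the threshold question the abstract addresses.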
Citations: 13