arXiv - MATH - Statistics Theory: Latest Papers

Multivariate Unified Skew-t Distributions And Their Properties
Pub Date : 2023-11-30 DOI: arxiv-2311.18294
Kesen Wang, Maicon J. Karling, Reinaldo B. Arellano-Valle, Marc G. Genton
The unified skew-t (SUT) is a flexible parametric multivariate distribution that accounts for skewness and heavy tails in the data. A few of its properties can be found scattered in the literature or in a parameterization that does not follow the original one for unified skew-normal (SUN) distributions, yet a systematic study is lacking. In this work, explicit properties of the multivariate SUT distribution are presented, such as its stochastic representations, moments, SUN-scale mixture representation, linear transformation, additivity, marginal distribution, canonical form, quadratic form, conditional distribution, change of latent dimensions, Mardia measures of multivariate skewness and kurtosis, and non-identifiability issue. These results are given in a parametrization that reduces to the original SUN distribution as a sub-model, hence facilitating the use of the SUT for applications. Several models based on the SUT distribution are provided for illustration.
Bayesian nonparametric inference in PDE models: asymptotic theory and implementation
Pub Date : 2023-11-30 DOI: arxiv-2311.18322
Matteo Giordano
Parameter identification problems in partial differential equations (PDEs) consist in determining one or more unknown functional parameters in a PDE. Here, the Bayesian nonparametric approach to such problems is considered. Focusing on the representative example of inferring the diffusivity function in an elliptic PDE from noisy observations of the PDE solution, the performance of Bayesian procedures based on Gaussian process priors is investigated. Recent asymptotic theoretical guarantees establishing posterior consistency and convergence rates are reviewed and expanded upon. An implementation of the associated posterior-based inference is provided, and illustrated via a numerical simulation study where two different discretisation strategies are devised. The reproducible code is available at: https://github.com/MattGiord.
Wasserstein GANs are Minimax Optimal Distribution Estimators
Pub Date : 2023-11-30 DOI: arxiv-2311.18613
Arthur Stéphanovitch, Eddie Aamari, Clément Levrard
We provide non-asymptotic rates of convergence of the Wasserstein Generative Adversarial Networks (WGAN) estimator. We build neural network classes representing the generators and discriminators which yield a GAN that achieves the minimax optimal rate for estimating a certain probability measure $\mu$ with support in $\mathbb{R}^p$. The probability $\mu$ is considered to be the push forward of the Lebesgue measure on the $d$-dimensional torus $\mathbb{T}^d$ by a map $g^\star:\mathbb{T}^d\rightarrow \mathbb{R}^p$ of smoothness $\beta+1$. Measuring the error with the $\gamma$-Hölder Integral Probability Metric (IPM), we obtain, up to logarithmic factors, the minimax optimal rate $O(n^{-\frac{\beta+\gamma}{2\beta +d}}\vee n^{-\frac{1}{2}})$, where $n$ is the sample size, $\beta$ determines the smoothness of the target measure $\mu$, $\gamma$ is the smoothness of the IPM ($\gamma=1$ is the Wasserstein case) and $d\leq p$ is the intrinsic dimension of $\mu$. In the process, we derive a sharp interpolation inequality between Hölder IPMs. This novel result of the theory of function spaces generalizes classical interpolation inequalities to the case where the measures involved have densities on different manifolds.
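To get a feel for the rate quoted in this abstract, the following minimal Python sketch evaluates the two competing terms for given values of the sample size, the smoothness parameters, and the intrinsic dimension. The function name `wgan_minimax_rate` is a hypothetical helper introduced here for illustration; constants and logarithmic factors, which the abstract does not specify, are ignored.

```python
def wgan_minimax_rate(n: int, beta: float, gamma: float, d: int) -> float:
    """Evaluate n^{-(beta+gamma)/(2*beta+d)} v n^{-1/2} (constants and logs dropped).

    Hypothetical helper: only the exponents come from the abstract.
    """
    if n < 1 or beta <= 0 or gamma <= 0 or d < 1:
        raise ValueError("expected n >= 1, beta > 0, gamma > 0, d >= 1")
    # Term driven by the smoothness of the target and the intrinsic dimension.
    smoothness_term = n ** (-(beta + gamma) / (2 * beta + d))
    # Parametric term, which cannot be improved upon.
    parametric_term = n ** (-0.5)
    # The symbol "v" in the rate denotes the maximum of the two terms.
    return max(smoothness_term, parametric_term)


if __name__ == "__main__":
    for d in (2, 6, 20):
        print(d, wgan_minimax_rate(n=10_000, beta=1.0, gamma=1.0, d=d))
```

Comparing the exponents shows that the parametric term $n^{-1/2}$ dominates exactly when $d \leq 2\gamma$; for larger intrinsic dimension the smoothness-dependent term is the slower one and sets the rate.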
An analysis of multivariate measures of skewness and kurtosis of skew-elliptical distributions
Pub Date : 2023-11-30 DOI: arxiv-2311.18176
Baishuai Zuo, Narayanaswamy Balakrishnan, Chuancun Yin
This paper examines eight measures of skewness and the Mardia measure of kurtosis for skew-elliptical distributions. Multivariate measures of skewness considered include Mardia, Malkovich-Afifi, Isogai, Song, Balakrishnan-Brito-Quiroz, Móri, Rohatgi and Székely, Kollo and Srivastava measures. We first study the canonical form of skew-elliptical distributions, and then derive exact expressions of all measures of skewness and kurtosis for the family of skew-elliptical distributions, except for Song's measure. Specifically, the formulas of these measures for skew normal, skew $t$, skew logistic, skew Laplace, skew Pearson type II and skew Pearson type VII distributions are obtained. Next, as in Malkovich and Afifi (1973), test statistics based on a random sample are constructed for illustrating the usefulness of the established results. In a Monte Carlo simulation study, different measures of skewness and kurtosis for $2$-dimensional skewed distributions are calculated and compared. Finally, real data is analyzed to demonstrate all the results.
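As a concrete reference point for the Mardia measures named in this abstract, here is a minimal pure-Python sketch of the classical sample versions for bivariate data, $b_{1} = n^{-2}\sum_{i,j} g_{ij}^3$ and $b_{2} = n^{-1}\sum_i g_{ii}^2$ with $g_{ij} = (x_i-\bar{x})^\top S^{-1}(x_j-\bar{x})$ and $S$ the divide-by-$n$ covariance. It shows the textbook sample statistics only, not the paper's population formulas for skew-elliptical families; the function name `mardia_bivariate` is an assumption made for this example.

```python
def mardia_bivariate(points):
    """Return (b1, b2): Mardia's sample skewness and kurtosis for 2-D data.

    Illustrative helper (hypothetical name); `points` is a list of (x, y) pairs.
    """
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    centered = [(x - mean_x, y - mean_y) for x, y in points]

    # Biased (divide-by-n) sample covariance matrix entries.
    sxx = sum(u * u for u, _ in centered) / n
    syy = sum(v * v for _, v in centered) / n
    sxy = sum(u * v for u, v in centered) / n
    det = sxx * syy - sxy * sxy
    if det <= 0:
        raise ValueError("sample covariance matrix is singular")

    # Entries of the inverse of the 2x2 covariance matrix.
    ixx, iyy, ixy = syy / det, sxx / det, -sxy / det

    def g(a, b):
        # Bilinear form a^T S^{-1} b.
        return a[0] * (ixx * b[0] + ixy * b[1]) + a[1] * (ixy * b[0] + iyy * b[1])

    b1 = sum(g(a, b) ** 3 for a in centered for b in centered) / n ** 2
    b2 = sum(g(a, a) ** 2 for a in centered) / n
    return b1, b2


if __name__ == "__main__":
    # Four corners of a square: symmetric, so the skewness vanishes.
    print(mardia_bivariate([(1, 1), (1, -1), (-1, 1), (-1, -1)]))
```

For comparison, the population value of Mardia's kurtosis under a $p$-variate normal distribution is $p(p+2)$, i.e. 8 in the bivariate case, and the population skewness is 0.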
Perturbation-based Analysis of Compositional Data
Pub Date : 2023-11-30 DOI: arxiv-2311.18501
Anton Rask Lundborg, Niklas Pfister
Existing statistical methods for compositional data analysis are inadequate for many modern applications for two reasons. First, modern compositional datasets, for example in microbiome research, display traits such as high-dimensionality and sparsity that are poorly modelled with traditional approaches. Second, assessing, in an unbiased way, how summary statistics of a composition (e.g., racial diversity) affect a response variable is not straightforward. In this work, we propose a framework based on hypothetical data perturbations that addresses both issues. Unlike existing methods for compositional data, we do not transform the data and instead use perturbations to define interpretable statistical functionals on the compositions themselves, which we call average perturbation effects. These average perturbation effects, which can be employed in many applications, naturally account for confounding that biases frequently used marginal dependence analyses. We show how average perturbation effects can be estimated efficiently by deriving a perturbation-dependent reparametrization and applying semiparametric estimation techniques. We analyze the proposed estimators empirically on simulated data and demonstrate advantages over existing techniques on US census and microbiome data. For all proposed estimators, we provide confidence intervals with uniform asymptotic coverage guarantees.
Global Convergence of Online Identification for Mixed Linear Regression
Pub Date : 2023-11-30 DOI: arxiv-2311.18506
Yujing Liu, Zhixin Liu, Lei Guo
Mixed linear regression (MLR) is a powerful model for characterizing nonlinear relationships by utilizing a mixture of linear regression sub-models. The identification of MLR is a fundamental problem, where most of the existing results focus on offline algorithms, rely on independent and identically distributed (i.i.d.) data assumptions, and provide local convergence results only. This paper investigates the online identification and data clustering problems for two basic classes of MLRs, by introducing two corresponding new online identification algorithms based on the expectation-maximization (EM) principle. It is shown that both algorithms will converge globally without resorting to the traditional i.i.d. data assumptions. The main challenge in our investigation lies in the fact that the gradient of the maximum likelihood function does not have a unique zero, and a key step in our analysis is to establish the stability of the corresponding differential equation in order to apply the celebrated Ljung's ODE method. It is also shown that the within-cluster error and the probability that the new data is categorized into the correct cluster are asymptotically the same as those in the case of known parameters. Finally, numerical simulations are provided to verify the effectiveness of our online algorithms.
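To make the EM principle mentioned in this abstract concrete, the sketch below runs the standard batch EM iteration for the simplest symmetric two-component model with a scalar regressor, $y_i = z_i \beta x_i + e_i$ with hidden signs $z_i \in \{-1, +1\}$ and Gaussian noise of known standard deviation $\sigma$. This is a textbook offline variant written under those assumptions, not the online algorithms proposed in the paper; the function name `em_symmetric_mlr` and the toy data are hypothetical.

```python
import math


def em_symmetric_mlr(xs, ys, beta0, sigma=1.0, iters=50):
    """Batch EM for y = z * beta * x + noise, z in {-1, +1} with equal probability.

    Illustrative sketch only (hypothetical helper, scalar regressor, known sigma).
    """
    beta = beta0
    sum_x2 = sum(x * x for x in xs)
    for _ in range(iters):
        # E-step: posterior mean of the hidden sign z_i given the current beta.
        weights = [math.tanh(y * x * beta / sigma ** 2) for x, y in zip(xs, ys)]
        # M-step: weighted least squares with the soft sign assignments.
        beta = sum(w * y * x for w, x, y in zip(weights, xs, ys)) / sum_x2
    return beta


if __name__ == "__main__":
    xs = [1.0, 2.0, 3.0, 4.0, 5.0]
    ys = [2.0, -4.0, 6.0, -8.0, 10.0]  # noise-free data with |beta| = 2
    print(em_symmetric_mlr(xs, ys, beta0=0.5))   # close to +2
    print(em_symmetric_mlr(xs, ys, beta0=-0.5))  # close to -2 (sign ambiguity)
    print(em_symmetric_mlr(xs, ys, beta0=0.0))   # stays at the stationary point 0
```

The third call illustrates, in this toy setting, the difficulty the abstract points to: the likelihood has more than one stationary point, so an iteration started at the wrong place need not move toward the true parameter.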
The Functional Average Treatment Effect
Pub Date : 2023-11-30 DOI: arxiv-2312.00219
Shane Sparkes, Erika Garcia, Lu Zhang
This paper establishes the functional average as an important estimand for causal inference. The significance of the estimand lies in its robustness against traditional issues of confounding. We prove that this robustness holds even when the probability distribution of the outcome, conditional on treatment or some other vector of adjusting variables, differs almost arbitrarily from its counterfactual analogue. This paper also examines possible estimators of the functional average, including the sample mid-range, and proposes a new type of bootstrap for robust statistical inference: the Hoeffding bootstrap. After this, the paper explores a new class of variables, the $\mathcal{U}$ class of variables, that simplifies the estimation of functional averages. This class of variables is also used to establish mean exchangeability in some cases and to provide the results of elementary statistical procedures, such as linear regression and the analysis of variance, with causal interpretations. Simulation evidence is provided. The methods of this paper are also applied to a National Health and Nutrition Survey data set to investigate the causal effect of exercise on the blood pressure of adult smokers.
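The sample mid-range named in this abstract is simple to compute: it is the average of the smallest and largest observed values. The Python sketch below shows that statistic, together with a difference of mid-ranges between two groups. The second function is only an assumed illustration of how such a contrast might be formed; the abstract does not state the paper's exact estimator, and both function names are hypothetical.

```python
def mid_range(sample):
    """Sample mid-range: the midpoint of the minimum and maximum observations."""
    values = list(sample)
    if not values:
        raise ValueError("sample must be non-empty")
    return (min(values) + max(values)) / 2


def mid_range_contrast(treated, control):
    """Difference of group mid-ranges (an assumed, illustrative contrast)."""
    return mid_range(treated) - mid_range(control)


if __name__ == "__main__":
    print(mid_range([1, 5, 3]))
    print(mid_range_contrast([2, 4, 10], [1, 3, 5]))
```

Because the mid-range depends only on the two extreme observations, it is unaffected by how the remaining data are distributed between them, and correspondingly it is sensitive to outlying values.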
Fully lifted interpolating comparisons of bilinearly indexed random processes
Pub Date : 2023-11-29 DOI: arxiv-2311.18092
Mihailo Stojnic
A powerful statistical interpolating concept, which we call \emph{fully lifted} (fl), is introduced and presented while establishing a connection between bilinearly indexed random processes and their corresponding fully decoupled (linearly indexed) comparative alternatives. Despite on occasion very involved technical considerations, the final interpolating forms and their underlying relations admit rather elegant expressions that provide a conceivably highly desirable and useful tool for further studying various different aspects of random processes and their applications. We also discuss the generality of the considered models and show that they encompass many well known random structures and optimization problems to which the obtained results then automatically apply.
Interaction tests with covariate-adaptive randomization
Pub Date : 2023-11-29 DOI: arxiv-2311.17445
Likun Zhang, Wei Ma
Treatment-covariate interaction tests are commonly applied by researchers to examine whether the treatment effect varies across patient subgroups defined by baseline characteristics. The objective of this study is to explore treatment-covariate interaction tests involving covariate-adaptive randomization. Without assuming a parametric data generation model, we investigate usual interaction tests and observe that they tend to be conservative: specifically, their limiting rejection probabilities under the null hypothesis do not exceed the nominal level and are typically strictly lower than it. To address this problem, we propose modifications to the usual tests to obtain corresponding exact tests. Moreover, we introduce a novel class of stratified-adjusted interaction tests that are simple, broadly applicable, and more powerful than the usual and modified tests. Our findings are relevant to two types of interaction tests: one involving stratification covariates and the other involving additional covariates that are not used for randomization.
{"title":"Interaction tests with covariate-adaptive randomization","authors":"Likun Zhang, Wei Ma","doi":"arxiv-2311.17445","DOIUrl":"https://doi.org/arxiv-2311.17445","url":null,"abstract":"Treatment-covariate interaction tests are commonly applied by researchers to\u0000examine whether the treatment effect varies across patient subgroups defined by\u0000baseline characteristics. The objective of this study is to explore\u0000treatment-covariate interaction tests involving covariate-adaptive\u0000randomization. Without assuming a parametric data generation model, we\u0000investigate usual interaction tests and observe that they tend to be\u0000conservative: specifically, their limiting rejection probabilities under the\u0000null hypothesis do not exceed the nominal level and are typically strictly\u0000lower than it. To address this problem, we propose modifications to the usual\u0000tests to obtain corresponding exact tests. Moreover, we introduce a novel class\u0000of stratified-adjusted interaction tests that are simple, broadly applicable,\u0000and more powerful than the usual and modified tests. Our findings are relevant\u0000to two types of interaction tests: one involving stratification covariates and\u0000the other involving additional covariates that are not used for randomization.","PeriodicalId":501330,"journal":{"name":"arXiv - MATH - Statistics Theory","volume":"92 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-11-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138521278","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Binary perceptrons capacity via fully lifted random duality theory
Pub Date : 2023-11-29 DOI: arxiv-2312.00073
Mihailo Stojnic
We study the statistical capacity of the classical binary perceptrons with general thresholds $\kappa$. After recognizing the connection between the capacity and the bilinearly indexed (bli) random processes, we utilize recent progress in studying such processes to characterize the capacity. In particular, we rely on \emph{fully lifted} random duality theory (fl RDT) established in \cite{Stojnicflrdt23} to create a general framework for studying the perceptrons' capacities. Successful underlying numerical evaluations are required for the framework (and ultimately the entire fl RDT machinery) to become fully practically operational. We present results obtained in that direction and uncover that the capacity characterizations are achieved on the second (first non-trivial) level of \emph{stationarized} full lifting. The obtained results \emph{exactly} match the replica symmetry breaking predictions obtained through statistical physics replica methods in \cite{KraMez89}. Most notably, for the famous zero-threshold scenario, $\kappa=0$, we uncover the well known $\alpha\approx0.8330786$ scaled capacity.
Citations: 0