In this paper, we prove strong consistency of an estimator based on the truncated singular value decomposition (SVD) for a multivariate errors-in-variables linear regression model with collinearity. This result extends Gleser's proof of the strong consistency of total least squares (TLS) solutions to the case with modern rank constraints. Whereas the usual treatment of consistency in the absence of solution uniqueness concerns the minimal-norm solution, the contribution of this study is a theory establishing the strong consistency of the entire set of solutions. The proof rests on properties of orthogonal projections, specifically properties of the Rayleigh-Ritz procedure for computing eigenvalues, which makes it well suited to problems in which some row vectors of the matrices are noise-free. Accordingly, this paper proves consistency for the regression model under this condition on the row vectors, yielding a natural generalization of the strong consistency of the standard TLS estimator.
{"title":"Strong consistency of an estimator by the truncated singular value decomposition for an errors-in-variables regression model with collinearity","authors":"Kensuke Aishima","doi":"arxiv-2311.17407","DOIUrl":"https://doi.org/arxiv-2311.17407","url":null,"abstract":"In this paper, we prove strong consistency of an estimator by the truncated\u0000singular value decomposition for a multivariate errors-in-variables linear\u0000regression model with collinearity. This result is an extension of Gleser's\u0000proof of the strong consistency of total least squares solutions to the case\u0000with modern rank constraints. While the usual discussion of consistency in the\u0000absence of solution uniqueness deals with the minimal norm solution, the\u0000contribution of this study is to develop a theory that shows the strong\u0000consistency of a set of solutions. The proof is based on properties of\u0000orthogonal projections, specifically properties of the Rayleigh-Ritz procedure\u0000for computing eigenvalues. This makes it suitable for targeting problems where\u0000some row vectors of the matrices do not contain noise. Therefore, this paper\u0000gives a proof for the regression model with the above condition on the row\u0000vectors, resulting in a natural generalization of the strong consistency for\u0000the standard TLS estimator.","PeriodicalId":501330,"journal":{"name":"arXiv - MATH - Statistics Theory","volume":"93 5","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-11-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138521270","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Our companion paper \cite{Stojnicnflgscompyx23} introduced a powerful \emph{fully lifted} (fl) statistical interpolating/comparison mechanism for bilinearly indexed random processes. Here, we present a particular realization of this fl mechanism that relies on the concept of stationarization along the interpolating path. A collection of fundamental relations among the interpolating parameters is uncovered, contextualized, and presented. As a bonus, we show that in particular special cases the introduced machinery admits simplifications to forms readily usable in practice. Given how many well-known random structures and optimization problems critically rely on results of the type considered here, the range of applications is practically unlimited; we briefly point to some of these opportunities as well.
{"title":"Bilinearly indexed random processes -- emph{stationarization} of fully lifted interpolation","authors":"Mihailo Stojnic","doi":"arxiv-2311.18097","DOIUrl":"https://doi.org/arxiv-2311.18097","url":null,"abstract":"Our companion paper cite{Stojnicnflgscompyx23} introduced a very powerful\u0000emph{fully lifted} (fl) statistical interpolating/comparison mechanism for\u0000bilinearly indexed random processes. Here, we present a particular realization\u0000of such fl mechanism that relies on a stationarization along the interpolating\u0000path concept. A collection of very fundamental relations among the\u0000interpolating parameters is uncovered, contextualized, and presented. As a nice\u0000bonus, in particular special cases, we show that the introduced machinery\u0000allows various simplifications to forms readily usable in practice. Given how\u0000many well known random structures and optimization problems critically rely on\u0000the results of the type considered here, the range of applications is pretty\u0000much unlimited. We briefly point to some of these opportunities as well.","PeriodicalId":501330,"journal":{"name":"arXiv - MATH - Statistics Theory","volume":"86 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-11-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138521321","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Researchers often hold the belief that random forests are "the cure to the world's ills" (Bickel, 2010). But how exactly do they achieve this? Focusing on the recently introduced causal forests (Athey and Imbens, 2016; Wager and Athey, 2018), this manuscript contributes to an ongoing research trend toward answering this question by proving that causal forests can adapt to the unknown covariate manifold structure. In particular, our analysis shows that a causal forest estimator can achieve the optimal rate of convergence for estimating the conditional average treatment effect, with the covariate dimension automatically replaced by the manifold dimension. These findings align with analogous observations in the realm of deep learning and resonate with the insights presented in Peter Bickel's 2004 Rietz lecture.
{"title":"On the adaptation of causal forests to manifold data","authors":"Yiyi Huo, Yingying Fan, Fang Han","doi":"arxiv-2311.16486","DOIUrl":"https://doi.org/arxiv-2311.16486","url":null,"abstract":"Researchers often hold the belief that random forests are \"the cure to the\u0000world's ills\" (Bickel, 2010). But how exactly do they achieve this? Focused on\u0000the recently introduced causal forests (Athey and Imbens, 2016; Wager and\u0000Athey, 2018), this manuscript aims to contribute to an ongoing research trend\u0000towards answering this question, proving that causal forests can adapt to the\u0000unknown covariate manifold structure. In particular, our analysis shows that a\u0000causal forest estimator can achieve the optimal rate of convergence for\u0000estimating the conditional average treatment effect, with the covariate\u0000dimension automatically replaced by the manifold dimension. These findings\u0000align with analogous observations in the realm of deep learning and resonate\u0000with the insights presented in Peter Bickel's 2004 Rietz lecture.","PeriodicalId":501330,"journal":{"name":"arXiv - MATH - Statistics Theory","volume":"92 2","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-11-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138521277","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Nonparametric estimation of nonlocal interaction kernels is crucial in various applications involving interacting particle systems. The inference challenge, situated at the nexus of statistical learning and inverse problems, stems from the nonlocal dependency. A central question is whether the optimal minimax rate of convergence for this problem matches the rate $M^{-\frac{2\beta}{2\beta+1}}$ of classical nonparametric regression, where $M$ is the sample size and $\beta$ is the smoothness exponent of the radial kernel. Our study confirms this alignment for systems with a finite number of particles. We introduce a tamed least squares estimator (tLSE) that attains the optimal convergence rate for a broad class of exchangeable distributions. The tLSE bridges the smallest eigenvalue of random matrices and Sobolev embedding, relying on nonasymptotic estimates for the left-tail probability of the smallest eigenvalue of the normal matrix. The minimax lower rate is derived using the Fano-Tsybakov hypothesis-testing method. Our findings reveal that, provided the inverse problem in the large-sample limit satisfies a coercivity condition, the left-tail probability does not alter the bias-variance tradeoff, and the optimal minimax rate remains intact. The tLSE method offers a straightforward approach to establishing the optimal minimax rate for models with either local or nonlocal dependency.
{"title":"Optimal minimax rate of learning interaction kernels","authors":"Xiong Wang, Inbar Seroussi, Fei Lu","doi":"arxiv-2311.16852","DOIUrl":"https://doi.org/arxiv-2311.16852","url":null,"abstract":"Nonparametric estimation of nonlocal interaction kernels is crucial in\u0000various applications involving interacting particle systems. The inference\u0000challenge, situated at the nexus of statistical learning and inverse problems,\u0000comes from the nonlocal dependency. A central question is whether the optimal\u0000minimax rate of convergence for this problem aligns with the rate of\u0000$M^{-frac{2beta}{2beta+1}}$ in classical nonparametric regression, where $M$\u0000is the sample size and $beta$ represents the smoothness exponent of the radial\u0000kernel. Our study confirms this alignment for systems with a finite number of\u0000particles. We introduce a tamed least squares estimator (tLSE) that attains the optimal\u0000convergence rate for a broad class of exchangeable distributions. The tLSE\u0000bridges the smallest eigenvalue of random matrices and Sobolev embedding. This\u0000estimator relies on nonasymptotic estimates for the left tail probability of\u0000the smallest eigenvalue of the normal matrix. The lower minimax rate is derived\u0000using the Fano-Tsybakov hypothesis testing method. Our findings reveal that\u0000provided the inverse problem in the large sample limit satisfies a coercivity\u0000condition, the left tail probability does not alter the bias-variance tradeoff,\u0000and the optimal minimax rate remains intact. Our tLSE method offers a\u0000straightforward approach for establishing the optimal minimax rate for models\u0000with either local or nonlocal dependency.","PeriodicalId":501330,"journal":{"name":"arXiv - MATH - Statistics Theory","volume":"91 2","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-11-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138521282","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In this study, variable acceptance sampling plans under Type I hybrid censoring are designed for a lot of independent and identical units with exponential lifetimes, using a Bayesian estimate of the parameter $\vartheta$. This approach differs from the conventional methods in acceptance sampling, which rely on the maximum likelihood estimate and on minimising the Bayes risk. The Bayesian estimate is obtained under the squared error and Linex loss functions. An optimisation problem is solved to minimise the testing cost under each method, and optimal values of the plan parameters $n$, $t_1$, and $t_2$ are calculated. The proposed plans are illustrated with various examples, and a real-life case study is also conducted. The expected testing cost of the sampling plan obtained under the squared error loss function is much lower than the cost of existing plans based on the maximum likelihood estimate.
{"title":"Optimal variable acceptance sampling plan for exponential distribution using Bayesian estimate under Type I hybrid censoring","authors":"Ashlyn Maria Mathai, Mahesh Kumar","doi":"arxiv-2311.16693","DOIUrl":"https://doi.org/arxiv-2311.16693","url":null,"abstract":"In this study, variable acceptance sampling plans under Type I hybrid\u0000censoring is designed for a lot of independent and identical units with\u0000exponential lifetimes using Bayesian estimate of the parameter $vartheta$.\u0000This approach is new from the conventional methods in acceptance sampling plan\u0000which relay on maximum likelihood estimate and minimising of Bayes risk.\u0000Bayesian estimate is obtained using squared error loss and Linex loss\u0000functions. Optimisation problem is solved for minimising the testing cost under\u0000each methods and optimal values of the plan parameters $n, t_1$ and $t_2$ are\u0000calculated. The proposed plans are illustrated using various examples and a\u0000real life case study is also conducted. Expected testing cost of the sampling\u0000plan obtained using squared error loss function is much lower than the cost of\u0000existing plans using maximum likelihood estimate.","PeriodicalId":501330,"journal":{"name":"arXiv - MATH - Statistics Theory","volume":"85 5","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-11-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138521322","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We study a multi-server queueing system with a periodic arrival rate and customers whose joining decision is based on their patience and a delay proxy. Specifically, each customer has a patience level sampled from a common distribution. Upon arrival, they receive an estimate of their delay before joining service and join the system only if this delay does not exceed their patience; otherwise they balk. The main objective is to estimate the parameters of the arrival rate and the patience distribution. The complicating factor is that this inference must be performed from the observed process only; that is, balking customers remain unobserved. We set up a likelihood function for the state-dependent effective arrival process (i.e., the process of customers who join), establish strong consistency of the MLE, and derive the asymptotic distribution of the estimation error. Due to the intrinsic non-stationarity of the Poisson arrival process, the proof techniques used in previous work become inapplicable. The novelty of our proof lies in constructing i.i.d. objects from dependent samples by decomposing the sample path into i.i.d. regeneration cycles. The feasibility of the MLE approach is demonstrated through a sequence of numerical experiments for multiple choices of the delay-estimate function. In particular, we observe that the arrival rate is best estimated at high service capacities, while the patience distribution is best estimated at lower service capacities.
{"title":"Statistical inference for a service system with non-stationary arrivals and unobserved balking","authors":"Shreehari Anand Bodas, Michel Mandjes, Liron Ravner","doi":"arxiv-2311.16884","DOIUrl":"https://doi.org/arxiv-2311.16884","url":null,"abstract":"We study a multi-server queueing system with a periodic arrival rate and\u0000customers whose joining decision is based on their patience and a delay proxy.\u0000Specifically, each customer has a patience level sampled from a common\u0000distribution. Upon arrival, they receive an estimate of their delay before\u0000joining service and then join the system only if this delay is not more than\u0000their patience, otherwise they balk. The main objective is to estimate the\u0000parameters pertaining to the arrival rate and patience distribution. Here the\u0000complication factor is that this inference should be performed based on the\u0000observed process only, i.e., balking customers remain unobserved. We set up a\u0000likelihood function of the state dependent effective arrival process (i.e.,\u0000corresponding to the customers who join), establish strong consistency of the\u0000MLE, and derive the asymptotic distribution of the estimation error. Due to the\u0000intrinsic non-stationarity of the Poisson arrival process, the proof techniques\u0000used in previous work become inapplicable. The novelty of the proving mechanism\u0000in this paper lies in the procedure of constructing i.i.d. objects from\u0000dependent samples by decomposing the sample path into i.i.d. regeneration\u0000cycles. The feasibility of the MLE-approach is discussed via a sequence of\u0000numerical experiments, for multiple choices of functions which provide delay\u0000estimates. In particular, it is observed that the arrival rate is best\u0000estimated at high service capacities, and the patience distribution is best\u0000estimated at lower service capacities.","PeriodicalId":501330,"journal":{"name":"arXiv - MATH - Statistics Theory","volume":"82 6","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-11-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138521425","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In an acceptance monitoring system, acceptance sampling techniques are used to increase production, enhance control, and deliver higher-quality products at a lower cost. It may not always be possible to specify the acceptance sampling plan parameters as exact values, especially when the data are subject to uncertainty. In this work, acceptance sampling plans for a large number of identical units with exponential lifetimes are obtained by treating the acceptable quality life, rejectable quality life, consumer's risk, and producer's risk as fuzzy parameters. To obtain the plan parameters of sequential sampling plans and repetitive group sampling plans, a fuzzy hypothesis test is employed. Some examples are presented to validate the sampling plans obtained in this work, and our results are compared with existing results in the literature. Finally, a real-life case study is presented to demonstrate the application of the resulting sampling plans.
{"title":"Design of variable acceptance sampling plan for exponential distribution under uncertainty","authors":"Mahesh Kumar, Ashlyn Maria Mathai","doi":"arxiv-2311.17111","DOIUrl":"https://doi.org/arxiv-2311.17111","url":null,"abstract":"In an acceptance monitoring system, acceptance sampling techniques are used\u0000to increase production, enhance control, and deliver higher-quality products at\u0000a lesser cost. It might not always be possible to define the acceptance\u0000sampling plan parameters as exact values, especially, when data has\u0000uncertainty. In this work, acceptance sampling plans for a large number of\u0000identical units with exponential lifetimes are obtained by treating acceptable\u0000quality life, rejectable quality life, consumer's risk, and producer's risk as\u0000fuzzy parameters. To obtain plan parameters of sequential sampling plans and\u0000repetitive group sampling plans, fuzzy hypothesis test is considered. To\u0000validate the sampling plans obtained in this work, some examples are presented.\u0000Our results are compared with existing results in the literature. Finally, to\u0000demonstrate the application of the resulting sampling plans, a real-life case\u0000study is presented.","PeriodicalId":501330,"journal":{"name":"arXiv - MATH - Statistics Theory","volume":"82 3","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-11-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138521426","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
With the growing digital transformation of the worldwide economy, cyber risk has become a major issue. As 1% of the world's GDP (around $1,000 billion) is allegedly lost to cybercrime every year, IT systems continue to become increasingly interconnected, making them vulnerable to accumulation phenomena that undermine the pooling mechanism of insurance. As highlighted in the literature, Hawkes processes are suitable models for capturing the contagion and clustering features of cyber events. This paper extends the standard Hawkes modeling of cyber risk frequency by adding external shocks, modeled by the publication of cyber vulnerabilities that are deemed to increase the likelihood of attacks in the short term. The aim of the proposed model is a better quantification of contagion effects: while the standard Hawkes model attributes all clustering to self-excitation, our model captures the external common factors that may explain part of the systemic pattern. We propose a Hawkes model with two kernels, one for the endogenous factor (contagion from other cyber events) and one for the exogenous component (cyber vulnerability publications). We use parametric exponential specifications for both the endogenous and exogenous intensity kernels, and we compare different inference methods on public datasets containing features of cyber attacks from the Hackmageddon database and cyber vulnerabilities from the Known Exploited Vulnerabilities database and the National Vulnerability Database. By refining the selection of the external excitation database, the degree of endogeneity of the model is nearly halved. We illustrate our model with simulations and discuss the impact of taking into account the external factor driven by vulnerabilities. Once an attack has occurred, response measures are implemented to limit its effects, including patching vulnerabilities and reducing the attack's contagion. We therefore augment the model with a second phase that models the reduction in contagion resulting from these remediation measures. Based on this model, we explore various scenarios and quantify the effect of mitigation measures taken by an insurance company aiming to mitigate the effects of a cyber pandemic in its insured portfolio.
{"title":"Cyber risk modeling using a two-phase Hawkes process with external excitation","authors":"Alexandre BoumezouedCREST, Yousra CherkaouiCREST, Caroline HillairetCREST","doi":"arxiv-2311.15701","DOIUrl":"https://doi.org/arxiv-2311.15701","url":null,"abstract":"With the growing digital transformation of the worldwide economy, cyber risk\u0000has become a major issue. As 1 % of the world's GDP (around $1,000 billion) is\u0000allegedly lost to cybercrime every year, IT systems continue to get\u0000increasingly interconnected, making them vulnerable to accumulation phenomena\u0000that undermine the pooling mechanism of insurance. As highlighted in the\u0000literature, Hawkes processes appear to be suitable models to capture contagion\u0000phenomena and clustering features of cyber events. This paper extends the\u0000standard Hawkes modeling of cyber risk frequency by adding external shocks,\u0000modelled by the publication of cyber vulnerabilities that are deemed to\u0000increase the likelihood of attacks in the short term. The aim of the proposed\u0000model is to provide a better quantification of contagion effects since, while\u0000the standard Hawkes model allocates all the clustering phenomena to\u0000self-excitation, our model allows to capture the external common factors that\u0000may explain part of the systemic pattern. We propose a Hawkes model with two\u0000kernels, one for the endogenous factor (the contagion from other cyber events)\u0000and one for the exogenous component (cyber vulnerability publications). We use\u0000parametric exponential specifications for both the internal and exogenous\u0000intensity kernels, and we compare different methods to tackle the inference\u0000problem based on public datasets containing features of cyber attacks found in\u0000the Hackmageddon database and cyber vulnerabilities from the Known Exploited\u0000Vulnerability database and the National Vulnerability Dataset. By refining the\u0000external excitation database selection, the degree of endogeneity of the model\u0000is nearly halved. We illustrate our model with simulations and discuss the\u0000impact of taking into account the external factor driven by vulnerabilities.\u0000Once an attack has occurred, response measures are implemented to limit the\u0000effects of an attack. These measures include patching vulnerabilities and\u0000reducing the attack's contagion. We use an augmented version of the model by\u0000adding a second phase modeling a reduction in the contagion pattern from the\u0000remediation measures. Based on this model, we explore various scenarios and\u0000quantify the effect of mitigation measures of an insurance company that aims to\u0000mitigate the effects of a cyber pandemic in its insured portfolio.","PeriodicalId":501330,"journal":{"name":"arXiv - MATH - Statistics Theory","volume":"63 10","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-11-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138526165","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A key challenge for modern machine learning systems is to achieve Out-of-Distribution (OOD) generalization -- generalizing to target data whose distribution differs from that of the source data. Despite its significant importance, the fundamental question of ``what are the most effective algorithms for OOD generalization'' remains open even under the standard setting of covariate shift. This paper addresses this fundamental question by proving that, surprisingly, classical Maximum Likelihood Estimation (MLE) using source data alone (without any modification) achieves minimax optimality for covariate shift under the well-specified setting. That is, no algorithm performs better than MLE in this setting (up to a constant factor), justifying that MLE is all you need. Our result holds for a very rich class of parametric models and does not require any boundedness condition on the density ratio. We illustrate the wide applicability of our framework by instantiating it in three concrete examples -- linear regression, logistic regression, and phase retrieval. This paper further complements the study by proving that, under the misspecified setting, MLE is no longer the optimal choice, whereas the Maximum Weighted Likelihood Estimator (MWLE) emerges as minimax optimal in certain scenarios.
{"title":"Maximum Likelihood Estimation is All You Need for Well-Specified Covariate Shift","authors":"Jiawei Ge, Shange Tang, Jianqing Fan, Cong Ma, Chi Jin","doi":"arxiv-2311.15961","DOIUrl":"https://doi.org/arxiv-2311.15961","url":null,"abstract":"A key challenge of modern machine learning systems is to achieve\u0000Out-of-Distribution (OOD) generalization -- generalizing to target data whose\u0000distribution differs from that of source data. Despite its significant\u0000importance, the fundamental question of ``what are the most effective\u0000algorithms for OOD generalization'' remains open even under the standard\u0000setting of covariate shift. This paper addresses this fundamental question by\u0000proving that, surprisingly, classical Maximum Likelihood Estimation (MLE)\u0000purely using source data (without any modification) achieves the minimax\u0000optimality for covariate shift under the well-specified setting. That is, no\u0000algorithm performs better than MLE in this setting (up to a constant factor),\u0000justifying MLE is all you need. Our result holds for a very rich class of\u0000parametric models, and does not require any boundedness condition on the\u0000density ratio. We illustrate the wide applicability of our framework by\u0000instantiating it to three concrete examples -- linear regression, logistic\u0000regression, and phase retrieval. This paper further complement the study by\u0000proving that, under the misspecified setting, MLE is no longer the optimal\u0000choice, whereas Maximum Weighted Likelihood Estimator (MWLE) emerges as minimax\u0000optimal in certain scenarios.","PeriodicalId":501330,"journal":{"name":"arXiv - MATH - Statistics Theory","volume":"41 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-11-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138526155","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The recently proposed fixed-X knockoff is a powerful variable selection procedure that controls the false discovery rate (FDR) in any finite-sample setting, yet its theoretical guarantees are difficult to establish beyond Gaussian linear models. In this paper, we make the first attempt to extend the fixed-X knockoff to partially linear models by using generalized knockoff features, and we propose a new stability generalized knockoff (Stab-GKnock) procedure that incorporates the selection probability as the feature importance score. We provide FDR control and a power guarantee under some regularity conditions. In addition, we propose a two-stage method for the high-dimensional setting by introducing a new joint feature screening procedure with a guaranteed sure screening property. Extensive simulation studies are conducted to evaluate the finite-sample performance of the proposed method. A real data example is also provided for illustration.
{"title":"Stab-GKnock: Controlled variable selection for partially linear models using generalized knockoffs","authors":"Han Su, Panxu Yuan, Qingyang Sun, Mengxi Yi, Gaorong Li","doi":"arxiv-2311.15982","DOIUrl":"https://doi.org/arxiv-2311.15982","url":null,"abstract":"The recently proposed fixed-X knockoff is a powerful variable selection\u0000procedure that controls the false discovery rate (FDR) in any finite-sample\u0000setting, yet its theoretical insights are difficult to show beyond Gaussian\u0000linear models. In this paper, we make the first attempt to extend the fixed-X\u0000knockoff to partially linear models by using generalized knockoff features, and\u0000propose a new stability generalized knockoff (Stab-GKnock) procedure by\u0000incorporating selection probability as feature importance score. We provide FDR\u0000control and power guarantee under some regularity conditions. In addition, we\u0000propose a two-stage method under high dimensionality by introducing a new joint\u0000feature screening procedure, with guaranteed sure screening property. Extensive\u0000simulation studies are conducted to evaluate the finite-sample performance of\u0000the proposed method. A real data example is also provided for illustration.","PeriodicalId":501330,"journal":{"name":"arXiv - MATH - Statistics Theory","volume":"45 3","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-11-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138526156","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}