We show a deviation inequality for U-statistics of independent data taking values in a separable Banach space satisfying some smoothness assumptions. We then provide applications to rates in the law of large numbers for U-statistics, a Hölderian functional central limit theorem and a moment inequality for incomplete $U$-statistics.
{"title":"Deviation and moment inequalities for Banach-valued $U$-statistics","authors":"Davide GiraudoIRMA, UNISTRA UFR MI","doi":"arxiv-2405.01902","DOIUrl":"https://doi.org/arxiv-2405.01902","url":null,"abstract":"We show a deviation inequality for U-statistics of independent data taking\u0000values in a separable Banach space which satisfies some smoothness assumptions.\u0000We then provide applications to rates in the law of large numbers for\u0000U-statistics, a H{\"o}lderian functional central limit theorem and a moment\u0000inequality for incomplete $U$-statistics.","PeriodicalId":501330,"journal":{"name":"arXiv - MATH - Statistics Theory","volume":"19 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-05-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140889073","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A new family of distributions indexed by the class of matrix variate elliptically contoured distributions is proposed as an extension of some bimatrix variate distributions. The multimatrix variate distributions, as they are termed, open new perspectives for classical distribution theory, which is usually based on probabilistically independent models and on preferred but untested fitting laws. Most of the multimatrix models derived here are invariant under the spherical family, a fact that settles the testing and prior knowledge of the underlying distributions and clarifies the statistical methodology, in contrast with some weaknesses of current approaches such as copulas. The paper also includes a number of special cases, properties and generalisations. The new joint distributions allow combinations that copulas cannot accommodate, such as scalars, vectors and matrices, all of them adjustable to the models required by the experts. The proposed joint distributions are also easily computable, so several applications are feasible. In particular, an exhaustive example in molecular docking on SARS-CoV-2 presents results on matrix-dependent samples.
{"title":"Multimatrix variate distributions","authors":"José A. Díaz-García, Francisco J. Caro-Lopera","doi":"arxiv-2405.02498","DOIUrl":"https://doi.org/arxiv-2405.02498","url":null,"abstract":"A new family of distributions indexed by the class of matrix variate\u0000contoured elliptically distribution is proposed as an extension of some\u0000bimatrix variate distributions. The termed emph{multimatrix variate\u0000distributions} open new perspectives for the classical distribution theory,\u0000usually based on probabilistic independent models and preferred untested\u0000fitting laws. Most of the multimatrix models here derived are invariant under\u0000the spherical family, a fact that solves the testing and prior knowledge of the\u0000underlying distributions and elucidates the statistical methodology in\u0000contrasts with some weakness of current studies as copulas. The paper also\u0000includes a number of diverse special cases, properties and generalisations. The\u0000new joint distributions allows several unthinkable combinations for copulas,\u0000such as scalars, vectors and matrices, all of them adjustable to the required\u0000models of the experts. The proposed joint distributions are also easily\u0000computable, then several applications are plausible. In particular, an\u0000exhaustive example in molecular docking on SARS-CoV-2 presents the results on\u0000matrix dependent samples.","PeriodicalId":501330,"journal":{"name":"arXiv - MATH - Statistics Theory","volume":"6 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-05-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140889036","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Antoine Godichon-Baggioni (LPSM), Wei Lu (LMI), Bruno Portier (LMI)
A novel approach is given to overcome the computational challenges of the full-matrix Adaptive Gradient algorithm (Full AdaGrad) in stochastic optimization. By developing a recursive method that estimates the inverse of the square root of the covariance of the gradient, alongside a streaming variant for parameter updates, the study offers efficient and practical algorithms for large-scale applications. This innovative strategy significantly reduces the complexity and resource demands typically associated with full-matrix methods, enabling more effective optimization processes. Moreover, the convergence rates of the proposed estimators and their asymptotic efficiency are given. Their effectiveness is demonstrated through numerical studies.
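The abstract does not spell out the recursive estimator, so the following is only a minimal sketch of what a full-matrix AdaGrad update computes, with the inverse square root taken by an O(d^3) eigendecomposition; the function name, step size and toy problem are assumptions, and the paper's O(Nd) recursive, streaming estimate of the inverse root replaces precisely the eigendecomposition step.

```python
import numpy as np

def full_adagrad_step(x, grad, G, lr=0.1, eps=1e-8):
    """One naive full-matrix AdaGrad step.

    Costs O(d^3) per iteration via eigendecomposition; the paper's
    contribution (not reproduced here) is a recursive, streaming
    estimate of G^{-1/2} that avoids this cubic cost.
    """
    G += np.outer(grad, grad)                  # accumulate outer products of gradients
    vals, vecs = np.linalg.eigh(G)             # symmetric eigendecomposition
    G_inv_sqrt = (vecs / np.sqrt(vals + eps)) @ vecs.T
    return x - lr * G_inv_sqrt @ grad, G

# toy usage: a least-squares objective
rng = np.random.default_rng(0)
A, b = rng.normal(size=(50, 5)), rng.normal(size=50)
x, G = np.zeros(5), np.zeros((5, 5))
for _ in range(500):
    grad = A.T @ (A @ x - b) / len(b)
    x, G = full_adagrad_step(x, grad, G)
```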
{"title":"A Full Adagrad algorithm with O(Nd) operations","authors":"Antoine Godichon-BaggioniLPSM, Wei LuLMI, Bruno PortierLMI","doi":"arxiv-2405.01908","DOIUrl":"https://doi.org/arxiv-2405.01908","url":null,"abstract":"A novel approach is given to overcome the computational challenges of the\u0000full-matrix Adaptive Gradient algorithm (Full AdaGrad) in stochastic\u0000optimization. By developing a recursive method that estimates the inverse of\u0000the square root of the covariance of the gradient, alongside a streaming\u0000variant for parameter updates, the study offers efficient and practical\u0000algorithms for large-scale applications. This innovative strategy significantly\u0000reduces the complexity and resource demands typically associated with\u0000full-matrix methods, enabling more effective optimization processes. Moreover,\u0000the convergence rates of the proposed estimators and their asymptotic\u0000efficiency are given. Their effectiveness is demonstrated through numerical\u0000studies.","PeriodicalId":501330,"journal":{"name":"arXiv - MATH - Statistics Theory","volume":"165 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-05-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140889005","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Blanca E. Monroy-Castillo, M. A. Jácome, Ricardo Cao
Distance correlation is a novel class of multivariate dependence measures, taking values between 0 and 1, and applicable to random vectors of arbitrary and not necessarily equal dimensions. It offers several advantages over the well-known Pearson correlation coefficient, the most important being that the distance correlation equals zero if and only if the random vectors are independent. Two different estimators of the distance correlation are available in the literature. The first one, proposed by Székely et al. (2007), is based on an asymptotically unbiased estimator of the distance covariance which turns out to be a V-statistic. The second one builds on an unbiased estimator of the distance covariance proposed in Székely et al. (2014), proved to be a U-statistic by Székely and Huo (2016). This study evaluates their efficiency (mean squared error) and compares computational times for both methods under different dependence structures. Under independence or near-independence, the V-estimates are biased, while the U-estimates frequently cannot be computed due to negative values. To address this challenge, a convex linear combination of the two estimators is proposed and studied, yielding good results regardless of the level of dependence.
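For concreteness, here is a minimal sketch of the two estimators being compared, following the double centering of Székely et al. (2007) and the U-centering of Székely et al. (2014); the function names are mine, and the convex combination proposed in the paper is omitted because its weights are not specified in the abstract.

```python
import numpy as np
from scipy.spatial.distance import cdist

def dcov2_v(x, y):
    """V-statistic (biased) estimate of squared distance covariance.

    x, y: arrays of shape (n, p) and (n, q) with the same number of rows.
    """
    a, b = cdist(x, x), cdist(y, y)                      # pairwise Euclidean distances
    A = a - a.mean(0) - a.mean(1)[:, None] + a.mean()    # double centering
    B = b - b.mean(0) - b.mean(1)[:, None] + b.mean()
    return (A * B).mean()

def dcov2_u(x, y):
    """U-statistic (unbiased) estimate; may be negative in small samples (needs n > 3)."""
    n = len(x)
    def u_center(d):
        D = (d - d.sum(0) / (n - 2) - d.sum(1)[:, None] / (n - 2)
             + d.sum() / ((n - 1) * (n - 2)))
        np.fill_diagonal(D, 0.0)                         # U-centering excludes i == j
        return D
    A, B = u_center(cdist(x, x)), u_center(cdist(y, y))
    return (A * B).sum() / (n * (n - 3))

def dcor2(x, y, dcov2=dcov2_v):
    """Squared distance correlation. With dcov2_u the radicand can be
    negative, which is exactly the failure mode described above."""
    v = dcov2(x, x) * dcov2(y, y)
    return dcov2(x, y) / np.sqrt(v) if v > 0 else np.nan
```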
{"title":"Improved distance correlation estimation","authors":"Blanca E. Monroy-Castillo, M. A, Jácome, Ricardo Cao","doi":"arxiv-2405.01958","DOIUrl":"https://doi.org/arxiv-2405.01958","url":null,"abstract":"Distance correlation is a novel class of multivariate dependence measure,\u0000taking positive values between 0 and 1, and applicable to random vectors of\u0000arbitrary dimensions, not necessarily equal. It offers several advantages over\u0000the well-known Pearson correlation coefficient, the most important is that\u0000distance correlation equals zero if and only if the random vectors are\u0000independent. There are two different estimators of the distance correlation available in\u0000the literature. The first one, proposed by Sz'ekely et al. (2007), is based on\u0000an asymptotically unbiased estimator of the distance covariance which turns out\u0000to be a V-statistic. The second one builds on an unbiased estimator of the\u0000distance covariance proposed in Sz'ekely et al. (2014), proved to be an\u0000U-statistic by Sz'ekely and Huo (2016). This study evaluates their efficiency\u0000(mean squared error) and compares computational times for both methods under\u0000different dependence structures. Under conditions of independence or\u0000near-independence, the V-estimates are biased, while the U-estimator frequently\u0000cannot be computed due to negative values. To address this challenge, a convex\u0000linear combination of the former estimators is proposed and studied, yielding\u0000good results regardless of the level of dependence.","PeriodicalId":501330,"journal":{"name":"arXiv - MATH - Statistics Theory","volume":"8 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-05-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140889007","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Recent studies show that transformer-based architectures emulate gradient descent during a forward pass, contributing to in-context learning capabilities: the ability of a model to adapt to new tasks from a sequence of prompt examples without being explicitly trained or fine-tuned to do so. This work investigates the generalization properties of a single step of gradient descent in the context of linear regression with well-specified models. A random design setting is considered, and analytical expressions are derived for the statistical properties of the generalization error in a non-asymptotic (finite sample) setting. These expressions are notable for avoiding arbitrary constants, and thus offer robust quantitative information and scaling relationships. The results are contrasted with those from classical least squares regression (for which analogous finite sample bounds are also derived), shedding light on systematic and noise components, as well as optimal step sizes. Additionally, identities involving high-order products of Gaussian random matrices are presented as a byproduct of the analysis.
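As a quick illustration of the setting (not of the paper's closed-form expressions, which the abstract does not reproduce), one can simulate a single gradient-descent step on a well-specified linear model under a Gaussian random design and estimate the generalization error by Monte Carlo; all constants below are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
d, n, sigma, trials = 10, 40, 0.5, 5000
step = 1.0 / d                                    # assumed step size; the paper derives optimal ones

excess = []
for _ in range(trials):
    beta = rng.normal(size=d) / np.sqrt(d)        # well-specified linear model
    X = rng.normal(size=(n, d))                   # random (Gaussian) design
    y = X @ beta + sigma * rng.normal(size=n)
    w = step * X.T @ y / n                        # one GD step on squared loss from w = 0
    x_new = rng.normal(size=d)                    # fresh test point
    excess.append((x_new @ (w - beta)) ** 2)
print("Monte Carlo excess generalization error:", np.mean(excess))
```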
{"title":"Finite Sample Analysis and Bounds of Generalization Error of Gradient Descent in In-Context Linear Regression","authors":"Karthik Duraisamy","doi":"arxiv-2405.02462","DOIUrl":"https://doi.org/arxiv-2405.02462","url":null,"abstract":"Recent studies show that transformer-based architectures emulate gradient\u0000descent during a forward pass, contributing to in-context learning capabilities\u0000- an ability where the model adapts to new tasks based on a sequence of prompt\u0000examples without being explicitly trained or fine tuned to do so. This work\u0000investigates the generalization properties of a single step of gradient descent\u0000in the context of linear regression with well-specified models. A random design\u0000setting is considered and analytical expressions are derived for the\u0000statistical properties of generalization error in a non-asymptotic (finite\u0000sample) setting. These expressions are notable for avoiding arbitrary\u0000constants, and thus offer robust quantitative information and scaling\u0000relationships. These results are contrasted with those from classical least\u0000squares regression (for which analogous finite sample bounds are also derived),\u0000shedding light on systematic and noise components, as well as optimal step\u0000sizes. Additionally, identities involving high-order products of Gaussian\u0000random matrices are presented as a byproduct of the analysis.","PeriodicalId":501330,"journal":{"name":"arXiv - MATH - Statistics Theory","volume":"16 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-05-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140889038","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This thesis studies some of the mathematical challenges that arise in the analysis of statistical sequential decision-making algorithms for postoperative patient follow-up. Stochastic bandits (multi-armed, contextual) model the learning of a sequence of actions (a policy) by an agent in an uncertain environment in order to maximise observed rewards. To learn optimal policies, bandit algorithms have to balance the exploitation of current knowledge against the exploration of uncertain actions. Such algorithms have largely been studied and deployed in industrial applications with large datasets, low-risk decisions and clear modelling assumptions, such as click-through rate maximisation in online advertising. By contrast, digital health recommendations call for a whole new paradigm of small samples, risk-averse agents and complex, nonparametric modelling. To this end, we developed new safe, anytime-valid concentration bounds (Bregman, empirical Chernoff), introduced a new framework for risk-aware contextual bandits (with elicitable risk measures) and analysed a novel class of nonparametric bandit algorithms under weak assumptions (Dirichlet sampling). In addition to the theoretical guarantees, these results are supported by in-depth empirical evidence. Finally, as a first step towards personalised postoperative follow-up recommendations, we developed, together with medical doctors and surgeons, an interpretable machine learning model to predict the long-term weight trajectories of patients after bariatric surgery.
{"title":"Mathematics of statistical sequential decision-making: concentration, risk-awareness and modelling in stochastic bandits, with applications to bariatric surgery","authors":"Patrick Saux","doi":"arxiv-2405.01994","DOIUrl":"https://doi.org/arxiv-2405.01994","url":null,"abstract":"This thesis aims to study some of the mathematical challenges that arise in\u0000the analysis of statistical sequential decision-making algorithms for\u0000postoperative patients follow-up. Stochastic bandits (multiarmed, contextual)\u0000model the learning of a sequence of actions (policy) by an agent in an\u0000uncertain environment in order to maximise observed rewards. To learn optimal\u0000policies, bandit algorithms have to balance the exploitation of current\u0000knowledge and the exploration of uncertain actions. Such algorithms have\u0000largely been studied and deployed in industrial applications with large\u0000datasets, low-risk decisions and clear modelling assumptions, such as\u0000clickthrough rate maximisation in online advertising. By contrast, digital\u0000health recommendations call for a whole new paradigm of small samples,\u0000risk-averse agents and complex, nonparametric modelling. To this end, we\u0000developed new safe, anytime-valid concentration bounds, (Bregman, empirical\u0000Chernoff), introduced a new framework for risk-aware contextual bandits (with\u0000elicitable risk measures) and analysed a novel class of nonparametric bandit\u0000algorithms under weak assumptions (Dirichlet sampling). In addition to the\u0000theoretical guarantees, these results are supported by in-depth empirical\u0000evidence. Finally, as a first step towards personalised postoperative follow-up\u0000recommendations, we developed with medical doctors and surgeons an\u0000interpretable machine learning model to predict the long-term weight\u0000trajectories of patients after bariatric surgery.","PeriodicalId":501330,"journal":{"name":"arXiv - MATH - Statistics Theory","volume":"28 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-05-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140889066","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In the past decades, most work in the area of data analysis and machine learning was focused on optimizing predictive models and obtaining better results than what was possible with existing models. Whether the metrics by which such improvements were measured accurately captured the intended goal, whether the numerical differences in the resulting values were significant, and whether uncertainty played a role and should have been taken into account, was of secondary importance. Whereas probability theory, be it frequentist or Bayesian, used to be the gold standard in science before the advent of the supercomputer, it was quickly set aside in favor of black box models and sheer computing power because of their ability to handle large data sets. This evolution sadly happened at the expense of interpretability and trustworthiness. However, while people are still trying to improve the predictive power of their models, the community is starting to realize that for many applications it is not so much the exact prediction that matters, but rather the variability or uncertainty.

The work in this dissertation furthers the quest for a world where everyone is aware of uncertainty, of how important it is and of how to embrace it instead of fearing it. A specific, though general, framework that allows anyone to obtain accurate uncertainty estimates is singled out and analysed. Certain aspects and applications of this framework, dubbed 'conformal prediction', are studied in detail. Whereas many approaches to uncertainty quantification make strong assumptions about the data, conformal prediction is, at the time of writing, the only framework that deserves the title 'distribution-free': no parametric assumptions have to be made, and the nonparametric results hold without having to resort to the law of large numbers in the asymptotic regime.
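Since split conformal prediction is the best-known instance of the framework studied here, a minimal sketch may help fix ideas; the function name and the scikit-learn-style model interface are assumptions.

```python
import numpy as np

def split_conformal(model, X_fit, y_fit, X_cal, y_cal, X_new, alpha=0.1):
    """Split conformal prediction intervals for regression.

    Guarantees marginal coverage >= 1 - alpha under exchangeability,
    with no distributional assumptions on the data.
    """
    model.fit(X_fit, y_fit)                          # any regressor with fit/predict
    scores = np.abs(y_cal - model.predict(X_cal))    # absolute-residual nonconformity scores
    n = len(scores)
    k = int(np.ceil((n + 1) * (1 - alpha)))          # conformal quantile rank
    if k > n:                                        # too few calibration points
        return np.full(len(X_new), -np.inf), np.full(len(X_new), np.inf)
    q = np.sort(scores)[k - 1]
    pred = model.predict(X_new)
    return pred - q, pred + q
```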
{"title":"A comparative study of conformal prediction methods for valid uncertainty quantification in machine learning","authors":"Nicolas Dewolf","doi":"arxiv-2405.02082","DOIUrl":"https://doi.org/arxiv-2405.02082","url":null,"abstract":"In the past decades, most work in the area of data analysis and machine\u0000learning was focused on optimizing predictive models and getting better results\u0000than what was possible with existing models. To what extent the metrics with\u0000which such improvements were measured were accurately capturing the intended\u0000goal, whether the numerical differences in the resulting values were\u0000significant, or whether uncertainty played a role in this study and if it\u0000should have been taken into account, was of secondary importance. Whereas\u0000probability theory, be it frequentist or Bayesian, used to be the gold standard\u0000in science before the advent of the supercomputer, it was quickly replaced in\u0000favor of black box models and sheer computing power because of their ability to\u0000handle large data sets. This evolution sadly happened at the expense of\u0000interpretability and trustworthiness. However, while people are still trying to\u0000improve the predictive power of their models, the community is starting to\u0000realize that for many applications it is not so much the exact prediction that\u0000is of importance, but rather the variability or uncertainty. The work in this dissertation tries to further the quest for a world where\u0000everyone is aware of uncertainty, of how important it is and how to embrace it\u0000instead of fearing it. A specific, though general, framework that allows anyone\u0000to obtain accurate uncertainty estimates is singled out and analysed. Certain\u0000aspects and applications of the framework -- dubbed `conformal prediction' --\u0000are studied in detail. Whereas many approaches to uncertainty quantification\u0000make strong assumptions about the data, conformal prediction is, at the time of\u0000writing, the only framework that deserves the title `distribution-free'. No\u0000parametric assumptions have to be made and the nonparametric results also hold\u0000without having to resort to the law of large numbers in the asymptotic regime.","PeriodicalId":501330,"journal":{"name":"arXiv - MATH - Statistics Theory","volume":"16 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-05-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140889068","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The Gapeev-Shiryaev conjecture (originating in Gapeev and Shiryaev (2011) and Gapeev and Shiryaev (2013)) can be broadly stated as follows: monotonicity of the signal-to-noise ratio implies monotonicity of the optimal stopping boundaries. The conjecture was originally formulated both within (i) sequential testing problems for diffusion processes (where one needs to decide which of two drifts is being indirectly observed) and (ii) quickest detection problems for diffusion processes (where one needs to detect when the initial drift changes to a new drift). In this paper we present proofs of the Gapeev-Shiryaev conjecture both in (i) the sequential testing setting (under Lipschitz/Hölder coefficients of the underlying SDEs) and (ii) the quickest detection setting (under analytic coefficients of the underlying SDEs). The method of proof in the sequential testing setting relies upon a stochastic time change and pathwise comparison arguments. Both arguments break down in the quickest detection setting and are replaced by arguments arising from a stochastic maximum principle for hypoelliptic equations (satisfying Hörmander's condition) that is of independent interest. Verification of the Gapeev-Shiryaev conjecture establishes that sequential testing and quickest detection problems with monotone signal-to-noise ratios are amenable to known methods of solution.
{"title":"The Gapeev-Shiryaev Conjecture","authors":"Philip A. Ernst, Goran Peskir","doi":"arxiv-2405.01685","DOIUrl":"https://doi.org/arxiv-2405.01685","url":null,"abstract":"The Gapeev-Shiryaev conjecture (originating in Gapeev and Shiryaev (2011) and\u0000Gapeev and Shiryaev (2013)) can be broadly stated as follows: Monotonicity of\u0000the signal-to-noise ratio implies monotonicity of the optimal stopping\u0000boundaries. The conjecture was originally formulated both within (i) sequential\u0000testing problems for diffusion processes (where one needs to decide which of\u0000the two drifts is being indirectly observed) and (ii) quickest detection\u0000problems for diffusion processes (where one needs to detect when the initial\u0000drift changes to a new drift). In this paper we present proofs of the\u0000Gapeev-Shiryaev conjecture both in (i) the sequential testing setting (under\u0000Lipschitz/Holder coefficients of the underlying SDEs) and (ii) the quickest\u0000detection setting (under analytic coefficients of the underlying SDEs). The\u0000method of proof in the sequential testing setting relies upon a stochastic time\u0000change and pathwise comparison arguments. Both arguments break down in the\u0000quickest detection setting and get replaced by arguments arising from a\u0000stochastic maximum principle for hypoelliptic equations (satisfying Hormander's\u0000condition) that is of independent interest. Verification of the Gapeev-Shiryaev\u0000conjecture establishes the fact that sequential testing and quickest detection\u0000problems with monotone signal-to-noise ratios are amenable to known methods of\u0000solution.","PeriodicalId":501330,"journal":{"name":"arXiv - MATH - Statistics Theory","volume":"7 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-05-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140889071","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Weibin Mo, Weijing Tang, Songkai Xue, Yufeng Liu, Ji Zhu
Modern complex datasets often consist of various sub-populations. To develop robust and generalizable methods in the presence of sub-population heterogeneity, it is important to guarantee a uniform learning performance rather than an average one. In many applications, prior information is available on which sub-population or group the data points belong to. Given the observed groups of data, we develop a min-max-regret (MMR) learning framework for general supervised learning, which aims to minimize the worst-group regret. Motivated by the regret-based decision-theoretic framework, the proposed MMR is distinguished from the value-based or risk-based robust learning methods in the existing literature. The regret criterion features several robustness and invariance properties simultaneously. In terms of generalizability, we develop a theoretical guarantee for the worst-case regret over a super-population of the meta data, which incorporates the observed sub-populations, their mixtures, and other unseen sub-populations that could be approximated by the observed ones. We demonstrate the effectiveness of our method through extensive simulation studies and an application to kidney transplantation data from hundreds of transplant centers.
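The abstract does not give the objective explicitly; a plausible formalization of worst-group regret, in notation of my own choosing rather than the paper's, is:

```latex
% Regret of a rule f on group g, relative to the group-wise best rule in class F:
R_g(f) \;=\; \mathbb{E}_{(X,Y)\sim P_g}\!\left[\ell\bigl(f(X),Y\bigr)\right]
        \;-\; \inf_{f'\in\mathcal{F}} \mathbb{E}_{(X,Y)\sim P_g}\!\left[\ell\bigl(f'(X),Y\bigr)\right]

% The min-max-regret (MMR) rule minimizes the worst-group regret:
\hat{f} \;\in\; \arg\min_{f\in\mathcal{F}} \;\max_{g\in\mathcal{G}}\; R_g(f)
```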
{"title":"Minimax Regret Learning for Data with Heterogeneous Subgroups","authors":"Weibin Mo, Weijing Tang, Songkai Xue, Yufeng Liu, Ji Zhu","doi":"arxiv-2405.01709","DOIUrl":"https://doi.org/arxiv-2405.01709","url":null,"abstract":"Modern complex datasets often consist of various sub-populations. To develop\u0000robust and generalizable methods in the presence of sub-population\u0000heterogeneity, it is important to guarantee a uniform learning performance\u0000instead of an average one. In many applications, prior information is often\u0000available on which sub-population or group the data points belong to. Given the\u0000observed groups of data, we develop a min-max-regret (MMR) learning framework\u0000for general supervised learning, which targets to minimize the worst-group\u0000regret. Motivated from the regret-based decision theoretic framework, the\u0000proposed MMR is distinguished from the value-based or risk-based robust\u0000learning methods in the existing literature. The regret criterion features\u0000several robustness and invariance properties simultaneously. In terms of\u0000generalizability, we develop the theoretical guarantee for the worst-case\u0000regret over a super-population of the meta data, which incorporates the\u0000observed sub-populations, their mixtures, as well as other unseen\u0000sub-populations that could be approximated by the observed ones. We demonstrate\u0000the effectiveness of our method through extensive simulation studies and an\u0000application to kidney transplantation data from hundreds of transplant centers.","PeriodicalId":501330,"journal":{"name":"arXiv - MATH - Statistics Theory","volume":"152 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-05-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140889725","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This paper introduces a Bayesian inference framework for two-dimensional steady-state heat conduction, focusing on the estimation of unknown distributed heat sources in a thermally-conducting medium with uniform conductivity. The goal is to infer heater locations, strengths, and shapes using temperature assimilation in the Euclidean space, employing a Fourier series to represent each heater's shape. The Markov chain Monte Carlo (MCMC) method, incorporating the random-walk Metropolis-Hastings algorithm and parallel tempering, is utilized for posterior distribution exploration in both unbounded and wall-bounded domains. Strong correlations between heat strength and heater area prompt caution against simultaneously estimating these two quantities. It is found that multiple solutions arise in cases where the number of temperature sensors is less than the number of unknown states. Moreover, smaller heaters introduce greater uncertainty in the estimated strength. The diffusive nature of heat conduction smooths out any deformations in the temperature contours, especially in the presence of multiple heaters positioned near each other, impacting convergence. In wall-bounded domains with Neumann boundary conditions, the inference of heater parameters tends to be more accurate than in unbounded domains.
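As a toy illustration of the sampler (a single chain with a point source in free space instead of the paper's Fourier-parametrized heaters; every constant below is an assumption), here is a random-walk Metropolis sketch:

```python
import numpy as np

rng = np.random.default_rng(2)

def forward(src, sensors, strength=1.0):
    """Temperatures from a point source in free space: T ~ -strength * ln(r)."""
    r = np.linalg.norm(sensors - src, axis=1)
    return -strength * np.log(r)

# synthetic data: true source at (0.3, -0.2), 12 sensors, Gaussian noise
sensors = rng.uniform(-1.0, 1.0, size=(12, 2))
T_obs = forward(np.array([0.3, -0.2]), sensors) + 0.01 * rng.normal(size=12)

def log_post(src, noise=0.01):
    """Gaussian likelihood with a flat prior on [-2, 2]^2."""
    if np.abs(src).max() > 2.0:
        return -np.inf
    resid = T_obs - forward(src, sensors)
    return -0.5 * np.sum(resid**2) / noise**2

# random-walk Metropolis (the paper additionally uses parallel tempering)
src, lp, samples = np.zeros(2), log_post(np.zeros(2)), []
for _ in range(20000):
    prop = src + 0.05 * rng.normal(size=2)     # Gaussian random-walk proposal
    lp_prop = log_post(prop)
    if np.log(rng.uniform()) < lp_prop - lp:   # Metropolis accept/reject
        src, lp = prop, lp_prop
    samples.append(src)
print("posterior mean source location:", np.mean(samples[5000:], axis=0))
```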
{"title":"Bayesian Inference for Estimating Heat Sources through Temperature Assimilation","authors":"Hanieh Mousavi, Jeff D. Eldredge","doi":"arxiv-2405.02319","DOIUrl":"https://doi.org/arxiv-2405.02319","url":null,"abstract":"This paper introduces a Bayesian inference framework for two-dimensional\u0000steady-state heat conduction, focusing on the estimation of unknown distributed\u0000heat sources in a thermally-conducting medium with uniform conductivity. The\u0000goal is to infer heater locations, strengths, and shapes using temperature\u0000assimilation in the Euclidean space, employing a Fourier series to represent\u0000each heater's shape. The Markov Chain Monte Carlo (MCMC) method, incorporating\u0000the random-walk Metropolis-Hasting algorithm and parallel tempering, is\u0000utilized for posterior distribution exploration in both unbounded and\u0000wall-bounded domains. Strong correlations between heat strength and heater area\u0000prompt caution against simultaneously estimating these two quantities. It is\u0000found that multiple solutions arise in cases where the number of temperature\u0000sensors is less than the number of unknown states. Moreover, smaller heaters\u0000introduce greater uncertainty in estimated strength. The diffusive nature of\u0000heat conduction smooths out any deformations in the temperature contours,\u0000especially in the presence of multiple heaters positioned near each other,\u0000impacting convergence. In wall-bounded domains with Neumann boundary\u0000conditions, the inference of heater parameters tends to be more accurate than\u0000in unbounded domains.","PeriodicalId":501330,"journal":{"name":"arXiv - MATH - Statistics Theory","volume":"44 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-04-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140889070","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}