Federated $\mathcal{X}$-armed Bandit with Flexible Personalisation
Ali Arabzadeh, James A. Grant, David S. Leslie
This paper introduces a novel approach to personalised federated learning within the $\mathcal{X}$-armed bandit framework, addressing the challenge of optimising both local and global objectives in a highly heterogeneous environment. Our method employs a surrogate objective function that combines individual client preferences with aggregated global knowledge, allowing for a flexible trade-off between personalisation and collective learning. We propose a phase-based elimination algorithm that achieves sublinear regret with logarithmic communication overhead, making it well-suited for federated settings. Theoretical analysis and empirical evaluations demonstrate the effectiveness of our approach compared to existing methods. Potential applications of this work span various domains, including healthcare, smart home devices, and e-commerce, where balancing personalisation with global insights is crucial.
{"title":"Federated $mathcal{X}$-armed Bandit with Flexible Personalisation","authors":"Ali Arabzadeh, James A. Grant, David S. Leslie","doi":"arxiv-2409.07251","DOIUrl":"https://doi.org/arxiv-2409.07251","url":null,"abstract":"This paper introduces a novel approach to personalised federated learning\u0000within the $mathcal{X}$-armed bandit framework, addressing the challenge of\u0000optimising both local and global objectives in a highly heterogeneous\u0000environment. Our method employs a surrogate objective function that combines\u0000individual client preferences with aggregated global knowledge, allowing for a\u0000flexible trade-off between personalisation and collective learning. We propose\u0000a phase-based elimination algorithm that achieves sublinear regret with\u0000logarithmic communication overhead, making it well-suited for federated\u0000settings. Theoretical analysis and empirical evaluations demonstrate the\u0000effectiveness of our approach compared to existing methods. Potential\u0000applications of this work span various domains, including healthcare, smart\u0000home devices, and e-commerce, where balancing personalisation with global\u0000insights is crucial.","PeriodicalId":501340,"journal":{"name":"arXiv - STAT - Machine Learning","volume":"71 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142206612","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Is merging worth it? Securely evaluating the information gain for causal dataset acquisition
Jake Fawkes, Lucile Ter-Minassian, Desi Ivanova, Uri Shalit, Chris Holmes
Merging datasets across institutions is a lengthy and costly procedure, especially when it involves private information. Data hosts may therefore want to prospectively gauge which datasets are most beneficial to merge with, without revealing sensitive information. For causal estimation this is particularly challenging, as the value of a merge depends not only on the reduction in epistemic uncertainty but also on the improvement in overlap. To address this challenge, we introduce the first cryptographically secure information-theoretic approach for quantifying the value of a merge in the context of heterogeneous treatment effect estimation. We do this by evaluating the Expected Information Gain (EIG) and utilising multi-party computation to ensure it can be computed securely without revealing any raw data. As we demonstrate, this can be combined with differential privacy (DP) to satisfy privacy requirements whilst preserving more accurate computation than naive DP alone. To the best of our knowledge, this work presents the first privacy-preserving method for dataset acquisition tailored to causal estimation. We demonstrate the effectiveness and reliability of our method on a range of simulated and realistic benchmarks. The code is available anonymously.
{"title":"Is merging worth it? Securely evaluating the information gain for causal dataset acquisition","authors":"Jake Fawkes, Lucile Ter-Minassian, Desi Ivanova, Uri Shalit, Chris Holmes","doi":"arxiv-2409.07215","DOIUrl":"https://doi.org/arxiv-2409.07215","url":null,"abstract":"Merging datasets across institutions is a lengthy and costly procedure,\u0000especially when it involves private information. Data hosts may therefore want\u0000to prospectively gauge which datasets are most beneficial to merge with,\u0000without revealing sensitive information. For causal estimation this is\u0000particularly challenging as the value of a merge will depend not only on the\u0000reduction in epistemic uncertainty but also the improvement in overlap. To\u0000address this challenge, we introduce the first cryptographically secure\u0000information-theoretic approach for quantifying the value of a merge in the\u0000context of heterogeneous treatment effect estimation. We do this by evaluating\u0000the Expected Information Gain (EIG) and utilising multi-party computation to\u0000ensure it can be securely computed without revealing any raw data. As we\u0000demonstrate, this can be used with differential privacy (DP) to ensure privacy\u0000requirements whilst preserving more accurate computation than naive DP alone.\u0000To the best of our knowledge, this work presents the first privacy-preserving\u0000method for dataset acquisition tailored to causal estimation. We demonstrate\u0000the effectiveness and reliability of our method on a range of simulated and\u0000realistic benchmarks. The code is available anonymously.","PeriodicalId":501340,"journal":{"name":"arXiv - STAT - Machine Learning","volume":"49 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142206634","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Weather-Informed Probabilistic Forecasting and Scenario Generation in Power Systems
Hanyu Zhang, Reza Zandehshahvar, Mathieu Tanneau, Pascal Van Hentenryck
The integration of renewable energy sources (RES) into power grids presents significant challenges due to their intrinsic stochasticity and uncertainty, necessitating the development of new techniques for reliable and efficient forecasting. This paper proposes a method combining probabilistic forecasting with a Gaussian copula for day-ahead prediction and scenario generation of load, wind, and solar power in high-dimensional contexts. By incorporating weather covariates and restoring spatio-temporal correlations, the proposed method enhances the reliability of probabilistic forecasts for RES. Extensive numerical experiments compare the effectiveness of different time series models, with performance evaluated using comprehensive metrics on a real-world, high-dimensional dataset from the Midcontinent Independent System Operator (MISO). The results highlight the importance of weather information and demonstrate the efficacy of the Gaussian copula in generating realistic scenarios, with the proposed weather-informed Temporal Fusion Transformer (WI-TFT) model showing superior performance.
{"title":"Weather-Informed Probabilistic Forecasting and Scenario Generation in Power Systems","authors":"Hanyu Zhang, Reza Zandehshahvar, Mathieu Tanneau, Pascal Van Hentenryck","doi":"arxiv-2409.07637","DOIUrl":"https://doi.org/arxiv-2409.07637","url":null,"abstract":"The integration of renewable energy sources (RES) into power grids presents\u0000significant challenges due to their intrinsic stochasticity and uncertainty,\u0000necessitating the development of new techniques for reliable and efficient\u0000forecasting. This paper proposes a method combining probabilistic forecasting\u0000and Gaussian copula for day-ahead prediction and scenario generation of load,\u0000wind, and solar power in high-dimensional contexts. By incorporating weather\u0000covariates and restoring spatio-temporal correlations, the proposed method\u0000enhances the reliability of probabilistic forecasts in RES. Extensive numerical\u0000experiments compare the effectiveness of different time series models, with\u0000performance evaluated using comprehensive metrics on a real-world and\u0000high-dimensional dataset from Midcontinent Independent System Operator (MISO).\u0000The results highlight the importance of weather information and demonstrate the\u0000efficacy of the Gaussian copula in generating realistic scenarios, with the\u0000proposed weather-informed Temporal Fusion Transformer (WI-TFT) model showing\u0000superior performance.","PeriodicalId":501340,"journal":{"name":"arXiv - STAT - Machine Learning","volume":"183 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142206609","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A Scalable Algorithm for Active Learning
Youguang Chen, Zheyu Wen, George Biros
FIRAL is a recently proposed deterministic active learning algorithm for multiclass classification using logistic regression. It was shown to outperform the state of the art in terms of accuracy and robustness, and it comes with theoretical performance guarantees. However, its scalability suffers when dealing with datasets featuring a large number of points $n$, dimensions $d$, and classes $c$, due to its $\mathcal{O}(c^2 d^2 + n c^2 d)$ storage and $\mathcal{O}(c^3(n d^2 + b d^3 + b n))$ computational complexity, where $b$ is the number of points to select in active learning. To address these challenges, we propose an approximate algorithm with storage requirements reduced to $\mathcal{O}(n(d+c) + c d^2)$ and a computational complexity of $\mathcal{O}(b n c d^2)$. Additionally, we present a parallel implementation on GPUs. We demonstrate the accuracy and scalability of our approach using MNIST, CIFAR-10, Caltech101, and ImageNet. The accuracy tests reveal no deterioration in accuracy compared to FIRAL. We report strong and weak scaling tests on up to 12 GPUs for a synthetic dataset of three million points.
{"title":"A Scalable Algorithm for Active Learning","authors":"Youguang Chen, Zheyu Wen, George Biros","doi":"arxiv-2409.07392","DOIUrl":"https://doi.org/arxiv-2409.07392","url":null,"abstract":"FIRAL is a recently proposed deterministic active learning algorithm for\u0000multiclass classification using logistic regression. It was shown to outperform\u0000the state-of-the-art in terms of accuracy and robustness and comes with\u0000theoretical performance guarantees. However, its scalability suffers when\u0000dealing with datasets featuring a large number of points $n$, dimensions $d$,\u0000and classes $c$, due to its $mathcal{O}(c^2d^2+nc^2d)$ storage and\u0000$mathcal{O}(c^3(nd^2 + bd^3 + bn))$ computational complexity where $b$ is the\u0000number of points to select in active learning. To address these challenges, we\u0000propose an approximate algorithm with storage requirements reduced to\u0000$mathcal{O}(n(d+c) + cd^2)$ and a computational complexity of\u0000$mathcal{O}(bncd^2)$. Additionally, we present a parallel implementation on\u0000GPUs. We demonstrate the accuracy and scalability of our approach using MNIST,\u0000CIFAR-10, Caltech101, and ImageNet. The accuracy tests reveal no deterioration\u0000in accuracy compared to FIRAL. We report strong and weak scaling tests on up to\u000012 GPUs, for three million point synthetic dataset.","PeriodicalId":501340,"journal":{"name":"arXiv - STAT - Machine Learning","volume":"13 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142206699","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Learning Deep Kernels for Non-Parametric Independence Testing
Nathaniel Xu, Feng Liu, Danica J. Sutherland
The Hilbert-Schmidt Independence Criterion (HSIC) is a powerful tool for nonparametric detection of dependence between random variables. It crucially depends, however, on the selection of reasonable kernels; commonly-used choices like the Gaussian kernel, or the kernel that yields the distance covariance, are sufficient only for amply sized samples from data distributions with relatively simple forms of dependence. We propose a scheme for selecting the kernels used in an HSIC-based independence test, based on maximizing an estimate of the asymptotic test power. We prove that maximizing this estimate indeed approximately maximizes the true power of the test, and demonstrate that our learned kernels can identify forms of structured dependence between random variables in various experiments.
{"title":"Learning Deep Kernels for Non-Parametric Independence Testing","authors":"Nathaniel Xu, Feng Liu, Danica J. Sutherland","doi":"arxiv-2409.06890","DOIUrl":"https://doi.org/arxiv-2409.06890","url":null,"abstract":"The Hilbert-Schmidt Independence Criterion (HSIC) is a powerful tool for\u0000nonparametric detection of dependence between random variables. It crucially\u0000depends, however, on the selection of reasonable kernels; commonly-used choices\u0000like the Gaussian kernel, or the kernel that yields the distance covariance,\u0000are sufficient only for amply sized samples from data distributions with\u0000relatively simple forms of dependence. We propose a scheme for selecting the\u0000kernels used in an HSIC-based independence test, based on maximizing an\u0000estimate of the asymptotic test power. We prove that maximizing this estimate\u0000indeed approximately maximizes the true power of the test, and demonstrate that\u0000our learned kernels can identify forms of structured dependence between random\u0000variables in various experiments.","PeriodicalId":501340,"journal":{"name":"arXiv - STAT - Machine Learning","volume":"100 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142206639","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Geometry of the Space of Partitioned Networks: A Unified Theoretical and Computational Framework
Stephen Y Zhang, Fangfei Lan, Youjia Zhou, Agnese Barbensi, Michael P H Stumpf, Bei Wang, Tom Needham
Interactions and relations between objects may be pairwise or higher-order in nature, and so network-valued data are ubiquitous in the real world. The "space of networks", however, has a complex structure that cannot be adequately described using conventional statistical tools. We introduce a measure-theoretic formalism for modeling generalized network structures such as graphs, hypergraphs, or graphs whose nodes come with a partition into categorical classes. We then propose a metric that extends the Gromov-Wasserstein distance between graphs and the co-optimal transport distance between hypergraphs. We characterize the geometry of this space, thereby providing a unified theoretical treatment of generalized networks that encompasses both pairwise and higher-order relations. In particular, we show that the resulting metric space is an Alexandrov space of non-negative curvature, and we leverage this structure to define gradients for certain functionals commonly arising in geometric data analysis tasks. We extend our analysis to the setting where vertices carry additional label information, and derive efficient computational schemes for use in practice. Equipped with these theoretical and computational tools, we demonstrate the utility of our framework in a suite of applications, including hypergraph alignment, clustering and dictionary learning from ensemble data, multi-omics alignment, and multiscale network alignment.
{"title":"Geometry of the Space of Partitioned Networks: A Unified Theoretical and Computational Framework","authors":"Stephen Y Zhang, Fangfei Lan, Youjia Zhou, Agnese Barbensi, Michael P H Stumpf, Bei Wang, Tom Needham","doi":"arxiv-2409.06302","DOIUrl":"https://doi.org/arxiv-2409.06302","url":null,"abstract":"Interactions and relations between objects may be pairwise or higher-order in\u0000nature, and so network-valued data are ubiquitous in the real world. The \"space\u0000of networks\", however, has a complex structure that cannot be adequately\u0000described using conventional statistical tools. We introduce a\u0000measure-theoretic formalism for modeling generalized network structures such as\u0000graphs, hypergraphs, or graphs whose nodes come with a partition into\u0000categorical classes. We then propose a metric that extends the\u0000Gromov-Wasserstein distance between graphs and the co-optimal transport\u0000distance between hypergraphs. We characterize the geometry of this space,\u0000thereby providing a unified theoretical treatment of generalized networks that\u0000encompasses the cases of pairwise, as well as higher-order, relations. In\u0000particular, we show that our metric is an Alexandrov space of non-negative\u0000curvature, and leverage this structure to define gradients for certain\u0000functionals commonly arising in geometric data analysis tasks. We extend our\u0000analysis to the setting where vertices have additional label information, and\u0000derive efficient computational schemes to use in practice. Equipped with these\u0000theoretical and computational tools, we demonstrate the utility of our\u0000framework in a suite of applications, including hypergraph alignment,\u0000clustering and dictionary learning from ensemble data, multi-omics alignment,\u0000as well as multiscale network alignment.","PeriodicalId":501340,"journal":{"name":"arXiv - STAT - Machine Learning","volume":"74 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142206689","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Limit Order Book Simulation and Trade Evaluation with $K$-Nearest-Neighbor Resampling
Michael Giegrich, Roel Oomen, Christoph Reisinger
In this paper, we show how $K$-nearest neighbor ($K$-NN) resampling, an off-policy evaluation method proposed in Giegrich et al. (2023), can be applied to simulate limit order book (LOB) markets and how it can be used to evaluate and calibrate trading strategies. Using historical LOB data, we demonstrate that our simulation method is capable of recreating realistic LOB dynamics and that synthetic trading within the simulation leads to a market impact in line with the corresponding literature. Compared to other statistical LOB simulation methods, our algorithm has theoretical convergence guarantees under general conditions, does not require optimization, and is easy to implement and computationally efficient. Furthermore, we show that in a benchmark comparison our method outperforms a deep learning-based algorithm for several key statistics. In the context of a LOB with pro-rata type matching, we demonstrate how our algorithm can calibrate the size of limit orders for a liquidation strategy. Finally, we describe how $K$-NN resampling can be modified for higher-dimensional state spaces.
{"title":"Limit Order Book Simulation and Trade Evaluation with $K$-Nearest-Neighbor Resampling","authors":"Michael Giegrich, Roel Oomen, Christoph Reisinger","doi":"arxiv-2409.06514","DOIUrl":"https://doi.org/arxiv-2409.06514","url":null,"abstract":"In this paper, we show how $K$-nearest neighbor ($K$-NN) resampling, an\u0000off-policy evaluation method proposed in cite{giegrich2023k}, can be applied\u0000to simulate limit order book (LOB) markets and how it can be used to evaluate\u0000and calibrate trading strategies. Using historical LOB data, we demonstrate\u0000that our simulation method is capable of recreating realistic LOB dynamics and\u0000that synthetic trading within the simulation leads to a market impact in line\u0000with the corresponding literature. Compared to other statistical LOB simulation\u0000methods, our algorithm has theoretical convergence guarantees under general\u0000conditions, does not require optimization, is easy to implement and\u0000computationally efficient. Furthermore, we show that in a benchmark comparison\u0000our method outperforms a deep learning-based algorithm for several key\u0000statistics. In the context of a LOB with pro-rata type matching, we demonstrate\u0000how our algorithm can calibrate the size of limit orders for a liquidation\u0000strategy. Finally, we describe how $K$-NN resampling can be modified for\u0000choices of higher dimensional state spaces.","PeriodicalId":501340,"journal":{"name":"arXiv - STAT - Machine Learning","volume":"95 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142206653","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Robust semi-parametric signal detection in particle physics with classifiers decorrelated via optimal transport
Purvasha Chakravarti, Lucas Kania, Olaf Behnke, Mikael Kuusela, Larry Wasserman
Searches for new signals in particle physics are usually done by training a supervised classifier to separate a signal model from the known Standard Model physics (also called the background model). However, even when the signal model is correct, systematic errors in the background model can influence supervised classifiers and might adversely affect the signal detection procedure. To tackle this problem, one approach is to use the (possibly misspecified) classifier only to perform a preliminary signal-enrichment step and then to carry out a bump hunt on the signal-rich sample using only the real experimental data. For this procedure to work, we need a classifier constrained to be decorrelated with one or more protected variables used for the signal detection step. We do this by considering an optimal transport map of the classifier output that makes it independent of the protected variable(s) for the background. We then fit a semi-parametric mixture model to the distribution of the protected variable after making cuts on the transformed classifier to detect the presence of a signal. We compare and contrast this decorrelation method with previous approaches, show that the decorrelation procedure is robust to moderate background misspecification, and analyse the power of the signal detection test as a function of the cut on the classifier.
{"title":"Robust semi-parametric signal detection in particle physics with classifiers decorrelated via optimal transport","authors":"Purvasha Chakravarti, Lucas Kania, Olaf Behnke, Mikael Kuusela, Larry Wasserman","doi":"arxiv-2409.06399","DOIUrl":"https://doi.org/arxiv-2409.06399","url":null,"abstract":"Searches of new signals in particle physics are usually done by training a\u0000supervised classifier to separate a signal model from the known Standard Model\u0000physics (also called the background model). However, even when the signal model\u0000is correct, systematic errors in the background model can influence supervised\u0000classifiers and might adversely affect the signal detection procedure. To\u0000tackle this problem, one approach is to use the (possibly misspecified)\u0000classifier only to perform a preliminary signal-enrichment step and then to\u0000carry out a bump hunt on the signal-rich sample using only the real\u0000experimental data. For this procedure to work, we need a classifier constrained\u0000to be decorrelated with one or more protected variables used for the signal\u0000detection step. We do this by considering an optimal transport map of the\u0000classifier output that makes it independent of the protected variable(s) for\u0000the background. We then fit a semi-parametric mixture model to the distribution\u0000of the protected variable after making cuts on the transformed classifier to\u0000detect the presence of a signal. We compare and contrast this decorrelation\u0000method with previous approaches, show that the decorrelation procedure is\u0000robust to moderate background misspecification, and analyse the power of the\u0000signal detection test as a function of the cut on the classifier.","PeriodicalId":501340,"journal":{"name":"arXiv - STAT - Machine Learning","volume":"45 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142206687","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Advancing Causal Inference: A Nonparametric Approach to ATE and CATE Estimation with Continuous Treatments
Hugo Gobato Souto, Francisco Louzada Neto
This paper introduces a generalized ps-BART model for the estimation of the Average Treatment Effect (ATE) and the Conditional Average Treatment Effect (CATE) under continuous treatments, addressing limitations of the Bayesian Causal Forest (BCF) model. The ps-BART model's nonparametric nature allows it to flexibly capture nonlinear relationships between treatment and outcome variables. Across three distinct sets of Data Generating Processes (DGPs), the ps-BART model consistently outperforms the BCF model, particularly in highly nonlinear settings. Its robustness in uncertainty estimation and its accuracy in both point-wise and probabilistic estimation demonstrate its utility for real-world applications. This research fills a gap in the causal inference literature by providing a tool better suited to nonlinear treatment-outcome relationships and opening avenues for further exploration in continuous treatment effect estimation.
{"title":"Advancing Causal Inference: A Nonparametric Approach to ATE and CATE Estimation with Continuous Treatments","authors":"Hugo Gobato Souto, Francisco Louzada Neto","doi":"arxiv-2409.06593","DOIUrl":"https://doi.org/arxiv-2409.06593","url":null,"abstract":"This paper introduces a generalized ps-BART model for the estimation of\u0000Average Treatment Effect (ATE) and Conditional Average Treatment Effect (CATE)\u0000in continuous treatments, addressing limitations of the Bayesian Causal Forest\u0000(BCF) model. The ps-BART model's nonparametric nature allows for flexibility in\u0000capturing nonlinear relationships between treatment and outcome variables.\u0000Across three distinct sets of Data Generating Processes (DGPs), the ps-BART\u0000model consistently outperforms the BCF model, particularly in highly nonlinear\u0000settings. The ps-BART model's robustness in uncertainty estimation and accuracy\u0000in both point-wise and probabilistic estimation demonstrate its utility for\u0000real-world applications. This research fills a crucial gap in causal inference\u0000literature, providing a tool better suited for nonlinear treatment-outcome\u0000relationships and opening avenues for further exploration in the domain of\u0000continuous treatment effect estimation.","PeriodicalId":501340,"journal":{"name":"arXiv - STAT - Machine Learning","volume":"5 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142206694","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Functionally Constrained Algorithm Solves Convex Simple Bilevel Problems
Huaqing Zhang, Lesi Chen, Jing Xu, Jingzhao Zhang
This paper studies simple bilevel problems, in which a convex upper-level function is minimized over the set of optimal solutions of a convex lower-level problem. We first establish a fundamental hardness result: the approximate optimal value of such problems cannot be obtained by first-order zero-respecting algorithms. We therefore follow recent work and pursue weak approximate solutions. For this goal, we propose novel near-optimal methods for smooth and nonsmooth problems by reformulating them as functionally constrained problems.
{"title":"Functionally Constrained Algorithm Solves Convex Simple Bilevel Problems","authors":"Huaqing Zhang, Lesi Chen, Jing Xu, Jingzhao Zhang","doi":"arxiv-2409.06530","DOIUrl":"https://doi.org/arxiv-2409.06530","url":null,"abstract":"This paper studies simple bilevel problems, where a convex upper-level\u0000function is minimized over the optimal solutions of a convex lower-level\u0000problem. We first show the fundamental difficulty of simple bilevel problems,\u0000that the approximate optimal value of such problems is not obtainable by\u0000first-order zero-respecting algorithms. Then we follow recent works to pursue\u0000the weak approximate solutions. For this goal, we propose novel near-optimal\u0000methods for smooth and nonsmooth problems by reformulating them into\u0000functionally constrained problems.","PeriodicalId":501340,"journal":{"name":"arXiv - STAT - Machine Learning","volume":"50 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142206652","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}