Wasserstein Distributionally Robust Multiclass Support Vector Machine

arXiv - STAT - Machine Learning | Pub Date: 2024-09-12 | DOI: arxiv-2409.08409

Michael Ibrahim, Heraldo Rozas, Nagi Gebraeel



Abstract

We study the problem of multiclass classification in settings where the data features $\mathbf{x}$ and their labels $\mathbf{y}$ are uncertain. We observe that distributionally robust one-vs-all (OVA) classifiers often struggle in settings with imbalanced data. To address this issue, we use Wasserstein distributionally robust optimization to develop a robust version of the multiclass support vector machine (SVM) characterized by the Crammer-Singer (CS) loss. First, we prove that the CS loss is bounded from above by a Lipschitz continuous function for all $\mathbf{x} \in \mathcal{X}$ and $\mathbf{y} \in \mathcal{Y}$. We then exploit strong duality results to express the dual of the worst-case risk problem, and we show that the worst-case risk minimization problem admits a tractable convex reformulation due to the regularity of the CS loss. Moreover, we develop a kernel version of our proposed model to account for nonlinear class separation, and we show that it admits a tractable convex upper bound. We also propose a projected subgradient algorithm for a special case of our proposed linear model to improve scalability. Our numerical experiments demonstrate that our model outperforms state-of-the-art OVA models in settings where the training data is highly imbalanced. We also show through experiments on popular real-world datasets that our proposed model often outperforms its regularized counterpart, because the former accounts for uncertain labels whereas the latter does not.
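For readers unfamiliar with the Crammer-Singer loss named in the abstract, the following is a minimal sketch of its standard (non-robust) form for a linear classifier, $\max\{0,\, 1 + \max_{k \neq y} \mathbf{w}_k^\top \mathbf{x} - \mathbf{w}_y^\top \mathbf{x}\}$. The abstract does not give the paper's notation, so the function name `crammer_singer_loss`, the weight matrix `W`, the integer label encoding, and the toy data are all assumptions made for illustration. This is not the authors' robust reformulation.

```python
import numpy as np


def crammer_singer_loss(W, x, y):
    """Standard Crammer-Singer multiclass hinge loss for one sample.

    W: (K, d) array with one weight vector per class (hypothetical layout).
    x: (d,) feature vector.
    y: integer index of the true class, in [0, K).
    """
    scores = W @ x                      # one score per class
    margins = scores - scores[y] + 1.0  # margin violation of each class vs. the true class
    margins[y] = 0.0                    # the true class contributes the zero floor
    return float(np.max(margins))


# Toy example with three classes and two features (illustrative values only).
W = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]])
x = np.array([2.0, 0.5])

loss_correct = crammer_singer_loss(W, x, 0)  # true class has the top score by a margin of at least 1
loss_wrong = crammer_singer_loss(W, x, 1)    # true class is outscored by class 0
```

Unlike a one-vs-all scheme, which trains one binary classifier per class, this loss compares the true class against its strongest competitor in a single joint objective.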

