Smoothed Analysis for Learning Concepts with Low Intrinsic Dimension
Gautam Chandrasekaran, Adam Klivans, Vasilis Kontonis, Raghu Meka, Konstantinos Stavropoulos
arXiv:2407.00966 (arXiv - CS - Computational Complexity), 2024-07-01
Abstract
In traditional models of supervised learning, the goal of a learner, given examples from an arbitrary joint distribution on $\mathbb{R}^d \times \{\pm 1\}$, is to output a hypothesis that is competitive (to within $\epsilon$) with the best-fitting concept from some class. To escape strong hardness results for learning even simple concept classes, we introduce a smoothed-analysis framework that requires a learner to compete only with the best classifier that is robust to small random Gaussian perturbations.
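Concretely, one natural way to state the smoothed benchmark (a sketch of the guarantee as we read it from this abstract; the paper's formal definition, including the choice of perturbation scale $\sigma$, may differ) is that the learner must output a hypothesis $h$ with

$$\Pr_{(x,y)\sim \mathcal{D}}\left[h(x)\neq y\right] \;\le\; \min_{f\in\mathcal{C}}\; \mathbb{E}_{(x,y)\sim \mathcal{D}}\;\Pr_{z\sim \mathcal{N}(0,\sigma^2 I_d)}\left[f(x+z)\neq y\right] \;+\; \epsilon,$$

so a concept whose accuracy survives a small Gaussian perturbation of its inputs remains a meaningful benchmark, while brittle concepts are discounted.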
This subtle change allows us to give a wide array of learning results for any concept that (1) depends on a low-dimensional subspace (a.k.a. a multi-index model) and (2) has bounded Gaussian surface area. This class includes functions of halfspaces and (low-dimensional) convex sets, cases that, in non-smoothed settings, are known to be learnable only with respect to highly structured distributions such as Gaussians.
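Here, Gaussian surface area refers to the standard notion (used, e.g., by Klivans, O'Donnell, and Servedio for learning geometric concepts): for a set $K \subseteq \mathbb{R}^d$, it is the Gaussian-weighted measure of the boundary, $\Gamma(K) = \int_{\partial K} \varphi_d(x)\, dx$, where $\varphi_d$ is the standard Gaussian density and the integral is taken with respect to surface measure. For instance, a single halfspace has $\Gamma(K) = O(1)$, and an intersection of $k$ halfspaces has $\Gamma(K) = O(\sqrt{\log k})$ by a theorem of Nazarov.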
Surprisingly, our analysis also yields new results for traditional non-smoothed frameworks such as learning with margin. In particular, we obtain the first algorithm for agnostically learning intersections of $k$ halfspaces in time $k^{\mathrm{poly}(\frac{\log k}{\epsilon \gamma})}$, where $\gamma$ is the margin parameter. Before our work, the best-known runtime was exponential in $k$ (Arriaga and Vempala, 1999).
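As a purely illustrative aside (not code from the paper; every name and parameter choice below is our own), the smoothed benchmark is straightforward to estimate empirically: average a concept's error over Gaussian perturbations of its inputs. A minimal NumPy sketch for an intersection of $k$ halfspaces:

```python
import numpy as np

def intersection_of_halfspaces(W, b):
    """Toy concept: f(x) = +1 iff w_i . x <= b_i for every halfspace i."""
    def f(X):
        # X has shape (n, d); returns labels in {+1, -1}.
        inside = (X @ W.T <= b).all(axis=1)
        return np.where(inside, 1, -1)
    return f

def smoothed_error(f, X, y, sigma, trials=100, seed=0):
    """Monte Carlo estimate of E_{(x,y)} Pr_{z ~ N(0, sigma^2 I)}[f(x+z) != y]."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    mistakes = np.zeros(n)
    for _ in range(trials):
        Z = rng.normal(scale=sigma, size=(n, d))
        mistakes += f(X + Z) != y
    return (mistakes / trials).mean()

# Toy usage: k = 3 random halfspaces in d = 5 dimensions.
rng = np.random.default_rng(1)
d, k, n = 5, 3, 2000
W, b = rng.normal(size=(k, d)), rng.normal(size=k)
f = intersection_of_halfspaces(W, b)
X = rng.normal(size=(n, d))
y = f(X)  # labels consistent with f, so the non-smoothed error is exactly 0
print(smoothed_error(f, X, y, sigma=0.3))  # > 0: points near the boundary flip
```

Even for a concept that fits the sample perfectly, the smoothed error is bounded away from zero by the probability mass near the decision boundary; that slack is exactly what the framework grants the learner to compete against.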