Benign Overfitting for $\alpha$ Sub-exponential Input

arXiv - STAT - Statistics Theory, Pub Date: 2024-09-01, DOI: arxiv-2409.00733

Kota Okudo, Kei Kobayashi


Abstract

This paper investigates the phenomenon of benign overfitting in binary classification problems with heavy-tailed input distributions. We extend the analysis of maximum margin classifiers to $\alpha$ sub-exponential distributions, where $\alpha \in (0,2]$, generalizing previous work that focused on sub-gaussian inputs. Our main result provides generalization error bounds for linear classifiers trained using gradient descent on unregularized logistic loss in this heavy-tailed setting; on linearly separable data, such training converges in direction to the maximum margin classifier. We prove that under certain conditions on the dimensionality $p$ and feature vector magnitude $\|\mu\|$, the misclassification error of the maximum margin classifier asymptotically approaches the noise level. This work contributes to the understanding of benign overfitting under weaker distributional assumptions and demonstrates that the phenomenon persists even with heavier-tailed inputs than previously studied.
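The setting described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' experiment; it is a minimal illustration under assumptions that do not come from the paper: inputs are $x = \tilde{y}\mu + z$ with a clean label $\tilde{y} \in \{-1, +1\}$, the noise $z$ has i.i.d. symmetric Weibull coordinates (one standard example of a law with tails of order $\exp(-t^\alpha)$), observed labels are flipped with probability `eta`, and the values of `n`, `p`, `mu_norm`, and `steps` are arbitrary. All function names are hypothetical.

```python
import math
import numpy as np


def sample_noise(rng, shape, alpha):
    """Symmetric Weibull noise, rescaled to unit variance.

    If E ~ Exp(1), then P(E**(1/alpha) > t) = exp(-t**alpha), so the
    coordinates have alpha sub-exponential tails (alpha = 2 is sub-gaussian).
    """
    magnitude = rng.exponential(size=shape) ** (1.0 / alpha)
    signs = rng.choice([-1.0, 1.0], size=shape)
    # E[E**(2/alpha)] = Gamma(1 + 2/alpha), so this gives unit variance.
    return signs * magnitude / math.sqrt(math.gamma(1.0 + 2.0 / alpha))


def sample_data(rng, n, mu, alpha, eta):
    """Draw n points x = clean_label * mu + noise, with labels flipped w.p. eta."""
    clean = rng.choice([-1.0, 1.0], size=n)
    X = clean[:, None] * mu[None, :] + sample_noise(rng, (n, mu.size), alpha)
    flips = rng.random(n) < eta
    y = np.where(flips, -clean, clean)
    return X, y


def train_logistic_gd(X, y, steps=500):
    """Plain gradient descent on the unregularized mean logistic loss."""
    n, p = X.shape
    # The loss is smooth with constant ||X||_2^2 / (4n); use step size 1/L.
    lr = 4.0 * n / np.linalg.norm(X, 2) ** 2
    w = np.zeros(p)
    for _ in range(steps):
        margins = y * (X @ w)
        # sigmoid(-margin), computed stably via logaddexp.
        weights = np.exp(-np.logaddexp(0.0, margins))
        w += lr * (X.T @ (weights * y)) / n
    return w


def error_rate(w, X, y):
    """Fraction of points whose predicted sign disagrees with the label."""
    return float(np.mean(np.sign(X @ w) != y))


def run_demo(seed=0, n=40, p=4000, mu_norm=8.0, alpha=1.0, eta=0.1, n_test=500):
    """Train on noisy labels with p >> n; report train and test error."""
    rng = np.random.default_rng(seed)
    mu = np.full(p, mu_norm / math.sqrt(p))  # mean vector with norm mu_norm
    X_train, y_train = sample_data(rng, n, mu, alpha, eta)
    X_test, y_test = sample_data(rng, n_test, mu, alpha, eta)
    w = train_logistic_gd(X_train, y_train)
    return error_rate(w, X_train, y_train), error_rate(w, X_test, y_test)


if __name__ == "__main__":
    train_err, test_err = run_demo()
    print(f"train error: {train_err:.3f}, test error: {test_err:.3f}")
```

With $p$ much larger than $n$, the classifier is expected to fit the flipped training labels as well as the clean ones, while its test error against noisy test labels should stay near `eta`. Lowering `alpha` makes the noise heavier-tailed, which is a simple way to probe how far the effect extends in this toy setup.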


