Maximum Likelihood Estimation is All You Need for Well-Specified Covariate Shift
Jiawei Ge, Shange Tang, Jianqing Fan, Cong Ma, Chi Jin
arXiv:2311.15961 [Statistics Theory], 2023-11-27
Abstract
A key challenge of modern machine learning systems is to achieve Out-of-Distribution (OOD) generalization -- generalizing to target data whose distribution differs from that of the source data. Despite its significant importance, the fundamental question of "what are the most effective algorithms for OOD generalization" remains open even under the standard setting of covariate shift. This paper addresses this fundamental question by proving that, surprisingly, classical Maximum Likelihood Estimation (MLE) using source data alone (without any modification) achieves minimax optimality for covariate shift under the well-specified setting. That is, no algorithm performs better than MLE in this setting (up to a constant factor), justifying that MLE is all you need. Our result holds for a very rich class of parametric models and does not require any boundedness condition on the density ratio. We illustrate the wide applicability of our framework by instantiating it in three concrete examples -- linear regression, logistic regression, and phase retrieval. This paper further complements the study by proving that, under the misspecified setting, MLE is no longer the optimal choice; instead, the Maximum Weighted Likelihood Estimator (MWLE) emerges as minimax optimal in certain scenarios.
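The contrast between the two estimators is easy to state: the MLE maximizes the source log-likelihood, θ̂_MLE = argmax_θ Σ_i log p_θ(y_i | x_i), while the MWLE reweights each term by the density ratio p_T(x_i)/p_S(x_i) between target and source covariate densities (the standard importance-weighted likelihood). Below is a minimal Python sketch of that contrast for linear regression, one of the paper's three examples. The Gaussian covariate-shift setup, the noise scale, and all variable names are illustrative assumptions, not the paper's experiments; under Gaussian noise, MLE reduces to ordinary least squares and MWLE to density-ratio-weighted least squares.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 5

# Hypothetical well-specified covariate shift: source and target draw x
# from different Gaussians, but y = x @ theta_star + noise under both.
theta_star = rng.normal(size=d)
mu_src, mu_tgt = np.zeros(d), 0.5 * np.ones(d)

def sample(mu, n):
    x = rng.normal(loc=mu, size=(n, d))
    y = x @ theta_star + rng.normal(scale=0.1, size=n)
    return x, y

x_src, y_src = sample(mu_src, 2000)

# MLE on source data: with Gaussian noise this is ordinary least squares,
# with no reweighting toward the target distribution.
theta_mle, *_ = np.linalg.lstsq(x_src, y_src, rcond=None)

# MWLE: weight each source point by the density ratio p_T(x)/p_S(x) before
# fitting (weighted least squares). For isotropic Gaussians the log-ratio
# has a closed form.
log_ratio = x_src @ (mu_tgt - mu_src) - 0.5 * (mu_tgt @ mu_tgt - mu_src @ mu_src)
sw = np.sqrt(np.exp(log_ratio))
theta_mwle, *_ = np.linalg.lstsq(x_src * sw[:, None], y_src * sw, rcond=None)

# Compare prediction error on fresh target data.
x_tgt, y_tgt = sample(mu_tgt, 2000)
for name, th in [("MLE", theta_mle), ("MWLE", theta_mwle)]:
    print(name, "target MSE:", np.mean((x_tgt @ th - y_tgt) ** 2))
```

In this well-specified setup both estimators recover theta_star, consistent with the paper's claim that unweighted MLE already suffices; reweighting becomes relevant only once the model is misspecified.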