Minimax Regret Learning for Data with Heterogeneous Subgroups

Weibin Mo, Weijing Tang, Songkai Xue, Yufeng Liu, Ji Zhu
{"title":"Minimax Regret Learning for Data with Heterogeneous Subgroups","authors":"Weibin Mo, Weijing Tang, Songkai Xue, Yufeng Liu, Ji Zhu","doi":"arxiv-2405.01709","DOIUrl":null,"url":null,"abstract":"Modern complex datasets often consist of various sub-populations. To develop\nrobust and generalizable methods in the presence of sub-population\nheterogeneity, it is important to guarantee a uniform learning performance\ninstead of an average one. In many applications, prior information is often\navailable on which sub-population or group the data points belong to. Given the\nobserved groups of data, we develop a min-max-regret (MMR) learning framework\nfor general supervised learning, which targets to minimize the worst-group\nregret. Motivated from the regret-based decision theoretic framework, the\nproposed MMR is distinguished from the value-based or risk-based robust\nlearning methods in the existing literature. The regret criterion features\nseveral robustness and invariance properties simultaneously. In terms of\ngeneralizability, we develop the theoretical guarantee for the worst-case\nregret over a super-population of the meta data, which incorporates the\nobserved sub-populations, their mixtures, as well as other unseen\nsub-populations that could be approximated by the observed ones. We demonstrate\nthe effectiveness of our method through extensive simulation studies and an\napplication to kidney transplantation data from hundreds of transplant centers.","PeriodicalId":501330,"journal":{"name":"arXiv - MATH - Statistics Theory","volume":"152 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-05-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - MATH - Statistics Theory","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2405.01709","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

Modern complex datasets often consist of various sub-populations. To develop robust and generalizable methods in the presence of sub-population heterogeneity, it is important to guarantee uniform learning performance rather than merely average performance. In many applications, prior information is available on which sub-population or group each data point belongs to. Given the observed groups of data, we develop a min-max-regret (MMR) learning framework for general supervised learning, which aims to minimize the worst-group regret. Motivated by the regret-based decision-theoretic framework, the proposed MMR is distinguished from the value-based and risk-based robust learning methods in the existing literature. The regret criterion enjoys several robustness and invariance properties simultaneously. In terms of generalizability, we develop a theoretical guarantee for the worst-case regret over a super-population of the meta data, which incorporates the observed sub-populations, their mixtures, and other unseen sub-populations that can be approximated by the observed ones. We demonstrate the effectiveness of our method through extensive simulation studies and an application to kidney transplantation data from hundreds of transplant centers.
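To make the worst-group-regret objective concrete: for groups g = 1, ..., G with group-specific risks R_g(f), the regret of a model f on group g is R_g(f) - inf_{f'} R_g(f'), and MMR solves min_f max_g regret_g(f). The following is a minimal sketch of this idea for linear least-squares regression; the simulated data, the per-group OLS "oracle" baselines, and the subgradient heuristic are our own illustrative assumptions, not the authors' estimator or theory.

```python
# Illustrative sketch of worst-group-regret minimization for squared-error
# regression with a shared linear model. All choices below (data-generating
# process, per-group OLS oracles, subgradient heuristic) are assumptions
# made for illustration, not the method proposed in the paper.
import numpy as np

rng = np.random.default_rng(0)

# Simulate G heterogeneous sub-populations with group-specific coefficients.
G, n, p = 3, 200, 5
betas = [rng.normal(size=p) for _ in range(G)]
Xs = [rng.normal(size=(n, p)) for _ in range(G)]
ys = [X @ b + rng.normal(scale=0.5, size=n) for X, b in zip(Xs, betas)]

def risk(beta, X, y):
    """Empirical squared-error risk on one group."""
    return np.mean((y - X @ beta) ** 2)

# Group-wise oracle risks: the best achievable empirical risk within each
# group (ordinary least squares fitted separately per group).
oracle = [risk(np.linalg.lstsq(X, y, rcond=None)[0], X, y)
          for X, y in zip(Xs, ys)]

def regrets(beta):
    """Per-group regret: excess risk of beta over the group's own oracle."""
    return np.array([risk(beta, X, y) - r0
                     for X, y, r0 in zip(Xs, ys, oracle)])

# Subgradient descent on the worst-group regret max_g regret_g(beta):
# at each step, take a gradient step on the risk of the currently worst group.
beta = np.zeros(p)
lr = 0.05
for t in range(2000):
    g = int(np.argmax(regrets(beta)))          # currently worst group
    X, y = Xs[g], ys[g]
    grad = 2 * X.T @ (X @ beta - y) / len(y)   # gradient of that group's risk
    beta -= lr * grad

print("per-group regrets:", np.round(regrets(beta), 3))
print("worst-group regret:", round(regrets(beta).max(), 3))
```

This sketch only controls regret over the observed groups; as described in the abstract, the paper's guarantee additionally covers mixtures of the observed sub-populations and unseen sub-populations that can be approximated by the observed ones.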