Region-Based Active Learning with Hierarchical and Adaptive Region Construction

Zhipeng Luo, Milos Hauskrecht
Proceedings of the ... SIAM International Conference on Data Mining, vol. 2019, pp. 441-449. Published 2019-05-01.
DOI: 10.1137/1.9781611975673.50 (https://doi.org/10.1137/1.9781611975673.50)
Citations: 3

Abstract

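In practice, learning classification models often relies on human annotators who assign class labels to individual data instances. Because this process can be very time-consuming and costly, finding effective ways to reduce annotation cost is critical for building such models. To address this problem, we explore region-based annotation as the form of human feedback, instead of soliciting instance-based annotation. A region is defined as a hyper-cubic subspace of the input space X, and it covers the subpopulation of data instances that fall into it. In the binary classification setting, each region is labeled with a number in [0,1] that represents a human estimate of the proportion of the positive (or negative) class in that subpopulation. To quickly discover pure regions (in terms of class proportion) in the data, we have developed a novel active learning framework that constructs regions in a hierarchical and adaptive way. Hierarchical means that regions are incrementally built into a hierarchical tree by repeatedly splitting the input space. Adaptive means that the framework adaptively chooses the best heuristic for each region split. Experiments on numerous datasets demonstrate that the framework can identify pure regions with very few region queries, which shows that the approach is effective for learning classification models from very limited human feedback.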
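The notions of a region, a proportion label, and hierarchical splitting can be illustrated with a minimal Python sketch. It is not the authors' implementation, and everything in it is an assumption made for illustration: the names `Region`, `make_oracle`, and `find_pure_regions`, the purity tolerance `eps`, the query budget, the breadth-first order, and the fixed "halve the widest dimension" split rule. That rule stands in for the adaptive choice among split heuristics, which the abstract mentions but does not specify. The human annotator is simulated from hidden instance labels.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Point = Sequence[float]


@dataclass(frozen=True)
class Region:
    """Axis-aligned box: lower[d] <= x[d] < upper[d] in every dimension d."""
    lower: Tuple[float, ...]
    upper: Tuple[float, ...]

    def contains(self, x: Point) -> bool:
        return all(lo <= v < hi for lo, v, hi in zip(self.lower, x, self.upper))

    def split(self) -> Tuple["Region", "Region"]:
        # Placeholder heuristic (an assumption): halve the widest dimension.
        widths = [hi - lo for lo, hi in zip(self.lower, self.upper)]
        d = widths.index(max(widths))
        mid = (self.lower[d] + self.upper[d]) / 2
        left_upper = self.upper[:d] + (mid,) + self.upper[d + 1:]
        right_lower = self.lower[:d] + (mid,) + self.lower[d + 1:]
        return Region(self.lower, left_upper), Region(right_lower, self.upper)


def make_oracle(points: List[Point], labels: List[int]) -> Callable[[Region], float]:
    """Simulated annotator: returns the true positive-class proportion in a region."""
    def oracle(region: Region) -> float:
        inside = [y for x, y in zip(points, labels) if region.contains(x)]
        return sum(inside) / len(inside) if inside else 0.0
    return oracle


def find_pure_regions(root, points, oracle, eps=0.05, max_queries=50):
    """Query regions top-down, splitting impure ones, until the budget runs out.

    Returns the pure regions found as (region, proportion) pairs, plus the
    number of region queries spent.
    """
    pure, frontier, queries = [], [root], 0
    while frontier and queries < max_queries:
        region = frontier.pop(0)  # breadth-first over the region tree
        if not any(region.contains(x) for x in points):
            continue  # empty region: nothing to annotate
        p = oracle(region)
        queries += 1
        if p <= eps or p >= 1 - eps:
            pure.append((region, p))  # pure enough: keep as a labeled leaf
        else:
            frontier.extend(region.split())  # impure: refine further
    return pure, queries


# Toy data: a 10 x 10 grid in the unit square, positive when x[0] >= 0.5.
points = [(i / 10 + 0.05, j / 10 + 0.05) for i in range(10) for j in range(10)]
labels = [1 if x[0] >= 0.5 else 0 for x in points]
root = Region((0.0, 0.0), (1.0, 1.0))
leaves, queries = find_pure_regions(root, points, make_oracle(points, labels))
for region, p in leaves:
    print(region, p)
print("queries:", queries)
```

On this toy data the class boundary coincides with the first split, so the root query plus one query per half is enough to label all 100 instances, compared with 100 instance-level queries. Real data would rarely align this neatly, which is where the choice of split heuristic matters.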

