Few-Sample and Adversarial Representation Learning for Continual Stream Mining

Zhuoyi Wang, Yigong Wang, Yu Lin, Evan Delord, L. Khan
DOI: 10.1145/3366423.3380153
Published in: Proceedings of The Web Conference 2020 (April 20, 2020)
Citations: 13

Abstract

Deep Neural Networks (DNNs) have primarily been demonstrated to be useful for closed-world classification problems where the number of categories is fixed. However, DNNs notoriously fail when tasked with label prediction in a non-stationary data stream scenario, where unknown or novel classes (categories not in the training set) continually emerge. For example, new topics continually appear in social media and e-commerce. To meet this challenge, a DNN should not only detect novel classes effectively but also incrementally learn new concepts from limited samples over time. Literature that addresses both problems simultaneously is limited. In this paper, we focus on improving the generalization of the model to novel classes and on making the model learn continually from only a few samples of the novel categories. Unlike existing approaches that rely on abundant labeled instances to re-train or update the model, we propose a new approach based on Few-Sample and Adversarial Representation Learning (FSAR). The key novelty is that we introduce an adversarial confusion term into both the representation learning and the few-sample learning process, which reduces the model's over-confidence on the seen classes and further enhances its generalization, allowing it to detect and learn new categories from only a few samples. FSAR is trained in two stages: first, it learns an intra-class compact and inter-class separated feature embedding to detect the novel classes; next, we collect a few labeled samples belonging to the new categories and use episodic training to exploit their intrinsic features for few-sample learning. We evaluated FSAR on different datasets, using extensive experimental results from various simulated stream benchmarks to show that FSAR effectively outperforms current state-of-the-art approaches.
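The abstract does not include code, but the two core ideas it describes, detecting novel classes via a compact class-wise embedding and penalizing over-confident predictions with a confusion term, can be illustrated with a minimal NumPy sketch. All function names and the distance-threshold detection rule below are hypothetical simplifications, not the authors' actual method:

```python
import numpy as np

def class_prototypes(embeddings, labels):
    """Mean embedding per seen class: a simple stand-in for an
    intra-class compact, inter-class separated representation."""
    return {c: embeddings[labels == c].mean(axis=0)
            for c in np.unique(labels)}

def detect_novel(x, prototypes, threshold):
    """Flag x as a novel-class sample if it lies far from every
    seen-class prototype (illustrative rule, not the paper's)."""
    dists = [np.linalg.norm(x - p) for p in prototypes.values()]
    return min(dists) > threshold

def confusion_entropy(logits):
    """Entropy of the softmax distribution. Adding a term that
    *raises* this entropy on ambiguous inputs discourages the
    over-confidence on seen classes that the abstract mentions."""
    z = logits - logits.max()          # stabilize the exponentials
    p = np.exp(z) / np.exp(z).sum()
    return float(-(p * np.log(p + 1e-12)).sum())
```

For example, with two tight clusters of seen-class embeddings, a point far from both prototypes is flagged as novel, while uniform logits yield maximal entropy (log 2 for two classes) and confident logits yield entropy near zero.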