Polishing the Right Apple: Anytime Classification Also Benefits Data Streams with Constant Arrival Times

2010 IEEE International Conference on Data Mining. Pub Date: 2010-12-13. DOI: 10.1109/ICDM.2010.120

Jin Shieh, Eamonn J. Keogh


Citations: 21

Abstract

Classification of items taken from data streams requires algorithms that operate in time-sensitive and computationally constrained environments. Often, the available time for classification is not known a priori and may change as a consequence of external circumstances. Many traditional algorithms are unable to provide satisfactory performance while supporting the highly variable response times that exemplify such applications. In such contexts, anytime algorithms, which are amenable to trading time for accuracy, have been found to be exceptionally useful and constitute an area of increasing research activity. Previous techniques for improving anytime classification have generally been concerned with optimizing the probability of correctly classifying individual objects. However, as we shall see, serially optimizing the probability of correctly classifying individual objects K times generally gives inferior results to batch optimizing the probability of correctly classifying K objects. In this work, we show that this simple observation can be exploited to improve overall classification performance by using an anytime framework to allocate resources among a set of objects buffered from a fast-arriving stream. Our ideas are independent of object arrival behavior, and, perhaps unintuitively, even in data streams with constant arrival rates our technique exhibits a marked improvement in performance. The utility of our approach is demonstrated with extensive experimental evaluations conducted on a wide range of diverse datasets.
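
The contrast between serial and batch optimization can be illustrated with a toy sketch. This is not the authors' algorithm: the `accuracy` curve, the per-object `difficulty` values, and the greedy allocation rule are all assumptions made up for illustration. The sketch compares giving each of K buffered objects an equal share of a compute budget against spending each unit of work on whichever object currently gains the most from it.

```python
import heapq

def accuracy(difficulty, steps):
    """Hypothetical anytime curve (an assumption, not from the paper):
    probability of a correct label after `steps` units of work.
    Starts at 0.5 and approaches 1.0 with diminishing returns."""
    return 1.0 - 0.5 * (difficulty ** steps)

def serial_allocate(difficulties, budget):
    """Treat each object independently: every object gets the same share."""
    share = budget // len(difficulties)
    return [share] * len(difficulties)

def batch_allocate(difficulties, budget):
    """Greedy batch allocation: each unit of work goes to the buffered
    object with the largest marginal gain in expected accuracy."""
    steps = [0] * len(difficulties)
    # Max-heap via negated gains: (negative marginal gain, object index).
    heap = [(-(accuracy(d, 1) - accuracy(d, 0)), i)
            for i, d in enumerate(difficulties)]
    heapq.heapify(heap)
    for _ in range(budget):
        _, i = heapq.heappop(heap)
        steps[i] += 1
        d = difficulties[i]
        next_gain = accuracy(d, steps[i] + 1) - accuracy(d, steps[i])
        heapq.heappush(heap, (-next_gain, i))
    return steps

def expected_correct(difficulties, steps):
    """Expected number of correctly classified objects in the buffer."""
    return sum(accuracy(d, s) for d, s in zip(difficulties, steps))

# Illustrative buffer of K = 4 objects; smaller values converge faster.
difficulties = [0.1, 0.5, 0.9, 0.95]
budget = 20

serial_steps = serial_allocate(difficulties, budget)
batch_steps = batch_allocate(difficulties, budget)
serial_score = expected_correct(difficulties, serial_steps)
batch_score = expected_correct(difficulties, batch_steps)
print("serial:", serial_steps, serial_score)
print("batch: ", batch_steps, batch_score)
```

Under these assumed curves, the batch allocation moves work away from objects that are already nearly certain and toward objects that still benefit from it, so its expected number of correct labels is higher than the equal split, even though both use the same total budget.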


