Liadan O'Callaghan, A. Meyerson, R. Motwani, Nina Mishra, S. Guha
{"title":"Streaming-data algorithms for high-quality clustering","authors":"Liadan O'Callaghan, A. Meyerson, R. Motwani, Nina Mishra, S. Guha","doi":"10.1109/ICDE.2002.994785","DOIUrl":null,"url":null,"abstract":"Streaming data analysis has recently attracted attention in numerous applications including telephone records, Web documents and click streams. For such analysis, single-pass algorithms that consume a small amount of memory are critical. We describe such a streaming algorithm that effectively clusters large data streams. We also provide empirical evidence of the algorithm's performance on synthetic and real data streams.","PeriodicalId":191529,"journal":{"name":"Proceedings 18th International Conference on Data Engineering","volume":"20 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2002-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"681","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings 18th International Conference on Data Engineering","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICDE.2002.994785","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 681
Abstract
Streaming data analysis has recently attracted attention in numerous applications including telephone records, Web documents and click streams. For such analysis, single-pass algorithms that consume a small amount of memory are critical. We describe such a streaming algorithm that effectively clusters large data streams. We also provide empirical evidence of the algorithm's performance on synthetic and real data streams.