Data stream clustering and modeling using context-trees
Authors: Wei Jiang, Pierre Brice
Published in: 2009 6th International Conference on Service Systems and Service Management
Publication date: 2009-06-08
DOI: https://doi.org/10.1109/ICSSSM.2009.5175016
Many applications, such as telecommunication and commercial video broadcasting, computer system logs, and web clicks, produce categorical or mixed-value data streams that exhibit context dependency. Models that try to capture this context dependency tend not to scale. This paper addresses the scalability problem by providing a method for building such models around relevant aggregates of the data streams rather than around individual samples. The approach extends existing clustering techniques for static categorical data sets to predictive models of data streams based on Variable Length Markov models of clusters. The paper includes theoretical and experimental evaluations of the technique, as well as a comparison with other prominent clustering techniques for categorical data streams.
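
As a rough illustration of what a Variable Length Markov model over clusters might look like, the sketch below keeps next-symbol counts for every context up to a fixed depth and predicts from the longest context that has been seen often enough. It is not the authors' algorithm: it assumes each stream item has already been mapped to a cluster label by some clustering step, and the names `ContextTree`, `max_depth`, and `min_count` are hypothetical choices made for this example.

```python
from collections import Counter, defaultdict


class ContextTree:
    """Toy variable-length Markov model over a stream of cluster labels.

    Illustrative only: the class name, max_depth, and min_count are
    assumptions for this sketch, not taken from the paper.
    """

    def __init__(self, max_depth=3, min_count=2):
        self.max_depth = max_depth          # longest context length kept
        self.min_count = min_count          # observations needed to trust a context
        self.counts = defaultdict(Counter)  # context tuple -> next-label counts
        self.history = []                   # last max_depth labels seen

    def update(self, label):
        """Record `label` under every suffix of the recent history."""
        n = len(self.history)
        for k in range(min(self.max_depth, n) + 1):
            context = tuple(self.history[n - k:])  # k = 0 gives the empty context
            self.counts[context][label] += 1
        self.history = (self.history + [label])[-self.max_depth:]

    def predict(self, recent):
        """Return next-label probabilities from the longest reliable context."""
        recent = list(recent)[-self.max_depth:]
        n = len(recent)
        for k in range(n, -1, -1):  # back off from the longest suffix to the empty one
            seen = self.counts.get(tuple(recent[n - k:]))
            if seen:
                total = sum(seen.values())
                if total >= self.min_count:
                    return {s: c / total for s, c in seen.items()}
        return {}


if __name__ == "__main__":
    tree = ContextTree(max_depth=2, min_count=2)
    for label in "ABABABAB":  # stand-in for a stream of cluster labels
        tree.update(label)
    print(tree.predict(["A"]))
    print(tree.predict(["A", "B"]))
```

Working on cluster labels rather than raw samples keeps the alphabet small, so the number of stored contexts grows with the number of clusters and the depth bound, not with the number of distinct raw values in the stream.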