Characterizing Web user accesses: a transactional approach to Web log clustering
F. Giannotti, C. Gozzi, G. Manco
Proceedings. International Conference on Information Technology: Coding and Computing, published 2002-04-08
https://doi.org/10.1109/ITCC.2002.1000408
We present a partitioning method for clustering Web log sessions. Sessions can be treated as transactions, i.e., variable-size tuples of categorical data. We adapt the standard definition of mathematical distance used in the K-Means algorithm to represent transaction dissimilarity, and redefine the notion of cluster centroid. The cluster centroid serves as the representative of the properties common to the elements of a cluster. We show that, by combining our concept of cluster centroid with the Jaccard distance, we obtain results comparable to those of standard approaches while substantially improving on their efficiency.
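The abstract does not spell out how the centroid is redefined, so the following is only a minimal sketch of a K-Means-style loop over sessions represented as sets of pages, using the Jaccard distance. The centroid rule used here (keep the items that occur in at least half of a cluster's sessions) is an assumption made for illustration, not necessarily the authors' definition; the function names and the `min_share` parameter are likewise hypothetical.

```python
import random
from collections import Counter


def jaccard_distance(a, b):
    """Return 1 - |a & b| / |a | b|; two empty sets are at distance 0."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def frequent_item_centroid(cluster, min_share=0.5):
    """Assumed centroid rule: items present in at least min_share of sessions."""
    if not cluster:
        return frozenset()
    counts = Counter(item for session in cluster for item in session)
    threshold = min_share * len(cluster)
    return frozenset(item for item, n in counts.items() if n >= threshold)


def cluster_sessions(sessions, k, max_iter=20, seed=0):
    """K-Means-style partitioning of set-valued sessions (illustrative only)."""
    rng = random.Random(seed)
    # Start from k sessions picked at random as initial centroids.
    centroids = [frozenset(s) for s in rng.sample(sessions, k)]
    assignment = None
    for _ in range(max_iter):
        # Assign each session to its nearest centroid (ties go to the lowest index).
        new_assignment = [
            min(range(k), key=lambda j: jaccard_distance(s, centroids[j]))
            for s in sessions
        ]
        if new_assignment == assignment:
            break  # no session changed cluster: converged
        assignment = new_assignment
        # Recompute each centroid from its current members.
        for j in range(k):
            members = [s for s, a in zip(sessions, assignment) if a == j]
            if members:  # keep the old centroid if a cluster empties out
                centroids[j] = frequent_item_centroid(members)
    return assignment, centroids


sessions = [
    frozenset({"/home", "/news"}),
    frozenset({"/home", "/news", "/sport"}),
    frozenset({"/cart", "/checkout"}),
    frozenset({"/cart", "/checkout", "/pay"}),
]
assignment, centroids = cluster_sessions(sessions, k=2)
```

Because the centroid is itself a set of items, it can be read directly as a summary of the pages that sessions in the cluster have in common, which matches the role the abstract gives it.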