An Efficient Algorithm for Mining Maximal Co-location Pattern Using Instance-trees
D. Le, Cao Dai Pham, Van Tuan Luu, Vanha Tran, Dang Hai Nguyen
2021 8th NAFOSTED Conference on Information and Computer Science (NICS), published 2021-12-21
DOI: 10.1109/NICS54270.2021.9701511 (https://doi.org/10.1109/NICS54270.2021.9701511)
Citations: 1
Abstract
Prevalent co-location patterns are groups of spatial features whose instances frequently appear together in nearby geographic space; mining them is one of the main branches of spatial data mining. As data volume continues to grow, discovering all prevalent patterns produces redundant results. Maximal co-location patterns (MCPs) are a compressed representation of all these patterns, and they provide new insight into the interactions among different spatial features, helping to discover more valuable knowledge from data sets. However, the growing volume of spatial data sets means that discovering MCPs remains very challenging. We dedicate this study to designing an efficient MCP mining algorithm. First, the features in size-2 patterns are regarded as a sparse graph, and MCP candidates are generated by enumerating the maximal cliques of this graph. Second, we design two instance-tree structures, a star neighbor-based instance-tree and a sibling node-based instance-tree, to store the neighbor relationships of instances. All maximal co-location instances of the candidates are obtained efficiently from these instance-tree structures. Finally, an MCP candidate is marked as prevalent if its participation index, which is calculated from the maximal co-location instances, is not smaller than a user-given minimum prevalence threshold. The efficiency of the proposed algorithm is demonstrated by comparison with previous algorithms on both synthetic and real data sets.
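The candidate-generation and prevalence-check steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the size-2 patterns are given as feature pairs, uses the standard Bron-Kerbosch algorithm with pivoting as a stand-in for whatever clique enumeration the paper uses, and computes the participation index by its usual definition (the minimum, over the features of a pattern, of the fraction of that feature's instances that take part in the pattern). The instance-tree structures are not reproduced here: the co-location instances are simply passed in as a list. All function names, instance identifiers, and the threshold value are hypothetical.

```python
from collections import defaultdict


def feature_graph(size2_patterns):
    """Build an undirected feature graph from size-2 patterns (feature pairs)."""
    adj = defaultdict(set)
    for a, b in size2_patterns:
        adj[a].add(b)
        adj[b].add(a)
    return dict(adj)


def maximal_cliques(adj):
    """Enumerate maximal cliques with Bron-Kerbosch (pivoting variant)."""
    cliques = []

    def expand(r, p, x):
        if not p and not x:
            cliques.append(frozenset(r))  # r cannot be extended: maximal
            return
        # Pivot on the vertex with the most neighbors in p to prune branches.
        pivot = max(p | x, key=lambda v: len(adj[v] & p))
        for v in list(p - adj[pivot]):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return cliques


def participation_index(candidate, row_instances, feature_counts):
    """Minimum participation ratio over the features of `candidate`.

    row_instances: list of dicts mapping feature -> instance id, one dict per
    co-location instance of the candidate (assumed to be given).
    feature_counts: total number of instances of each feature in the data set.
    """
    if not row_instances:
        return 0.0
    ratios = []
    for f in candidate:
        participating = {row[f] for row in row_instances}
        ratios.append(len(participating) / feature_counts[f])
    return min(ratios)


# Hypothetical toy input: four features and four size-2 patterns.
graph = feature_graph([("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")])
candidates = sorted(sorted(c) for c in maximal_cliques(graph))
print(candidates)  # [['A', 'B', 'C'], ['C', 'D']]

# Hypothetical co-location instances of the candidate {A, B, C}.
rows = [{"A": "A1", "B": "B1", "C": "C1"}, {"A": "A1", "B": "B2", "C": "C1"}]
counts = {"A": 2, "B": 2, "C": 4}
pi = participation_index(["A", "B", "C"], rows, counts)
min_prev = 0.2  # hypothetical user-given threshold
print(pi, pi >= min_prev)  # 0.25 True
```

In this toy input, feature C limits the participation index: only one of its four instances takes part in the pattern, so the index is 0.25, and the candidate is marked as prevalent only because the assumed threshold is 0.2.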