Approximating 4-cliques in streaming graphs: the power of dual sampling
Anmol Mann, Venkatesh Srinivasan, Alex Thomo
Proceedings of the 2021 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining
Published: 2021-11-08
DOI: https://doi.org/10.1145/3487351.3489471
Citation count: 0
Abstract
Clique counting is a challenging problem in graph mining because of combinatorial explosion: even moderately sized graphs with a few million edges can have clique counts on the order of many billions. In this paper, we propose a fast and scalable algorithm for approximating 4-clique counts in a single-pass streaming model. By leveraging a combination of sampling approaches, we estimate the 4-clique count with high accuracy. Our algorithm performs well on massive graphs containing several billion 4-cliques and terminates within a reasonable amount of time.
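
To make the idea of sampling-based estimation concrete, here is a minimal sketch of a generic single-pass baseline. It is not the algorithm proposed in the paper, whose details the abstract does not give. It assumes uniform edge sampling: each streamed edge is kept independently with probability `p`, 4-cliques are counted exactly in the sample, and the count is scaled by `1 / p**6`, because a 4-clique has six edges and survives sampling with probability `p**6`. The function names `count_4_cliques` and `estimate_4_cliques` are hypothetical, and vertices are assumed to be comparable values such as integers.

```python
import random
from itertools import combinations


def count_4_cliques(edges):
    """Exactly count 4-cliques in an undirected simple graph.

    `edges` is an iterable of (u, v) pairs; self-loops are ignored and
    duplicate edges are collapsed by the adjacency sets.
    """
    adj = {}
    for u, v in edges:
        if u == v:
            continue
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    total = 0
    for u in adj:
        for v in adj[u]:
            if v <= u:
                continue
            # Common neighbours larger than v, so each clique {u < v < w < x}
            # is found only from its two smallest vertices.
            common = sorted(w for w in adj[u] & adj[v] if w > v)
            for w, x in combinations(common, 2):
                if x in adj[w]:
                    total += 1
    return total


def estimate_4_cliques(edge_stream, p, seed=None):
    """Illustrative one-pass estimator based on uniform edge sampling.

    Assumption: this is a textbook baseline, not the paper's method.
    Each edge is kept with probability p; a 4-clique (6 edges) survives
    with probability p**6, so the sampled count is scaled by 1 / p**6.
    """
    rng = random.Random(seed)
    sample = [edge for edge in edge_stream if rng.random() < p]
    return count_4_cliques(sample) / p**6


if __name__ == "__main__":
    # The complete graph on 5 vertices contains C(5, 4) = 5 four-cliques.
    k5 = list(combinations(range(5), 2))
    print(count_4_cliques(k5))
    print(estimate_4_cliques(k5, p=1.0, seed=0))
```

With `p = 1.0` every edge is kept, so the estimate equals the exact count. For smaller `p` the estimator stays unbiased, but its variance grows quickly because of the `p**6` scaling, which is one reason plain edge sampling alone tends to be inadequate for 4-cliques.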