Georgios Kalogeras, Vassilios D. Tsakanikas, Ioannis Ballas, Vassilios Aggelopoulos, Vassilios Tampakas
Proceedings of the 26th Pan-Hellenic Conference on Informatics. Published 2022-11-25. DOI: 10.1145/3575879.3575961 (https://doi.org/10.1145/3575879.3575961)
Community Detection at scale: A comparison study among Apache Spark and Neo4j
The proliferation of data-generating devices, including IoT and edge computing devices, has led to the big data paradigm, which has placed considerable pressure on well-established relational databases over the last decade. Researchers have proposed several alternative database models to represent the captured data more efficiently. Among these approaches, graph databases appear to be the most promising candidate to supplement relational schemes. This study compares Neo4j, one of the leading graph databases, with Apache Spark, a unified engine for distributed large-scale data processing, in terms of processing limits. More specifically, the two frameworks are compared on their capacity to execute community detection algorithms.
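The abstract does not name the community detection algorithms that were benchmarked. As an illustration of the kind of task involved, the following is a minimal single-machine sketch of label propagation, one common community detection algorithm. The choice of algorithm, the function names (`build_adjacency`, `label_propagation`), the toy graph, and the tie-breaking rule are all assumptions made for this example and are not taken from the study.

```python
from collections import Counter


def build_adjacency(edges):
    """Build an undirected adjacency map from a list of (u, v) edges."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    return adjacency


def label_propagation(adjacency, max_iters=20):
    """Assign each node the most frequent label among its neighbours.

    Hypothetical simplification: nodes are visited in sorted order and
    ties are broken by taking the largest label, so the run is
    deterministic. Practical implementations usually randomise both.
    """
    # Every node starts in its own community.
    labels = {node: node for node in adjacency}
    for _ in range(max_iters):
        changed = False
        for node in sorted(adjacency):
            counts = Counter(labels[n] for n in adjacency[node])
            if not counts:
                continue  # isolated node keeps its own label
            top = max(counts.values())
            best = max(label for label, c in counts.items() if c == top)
            if labels[node] != best:
                labels[node] = best
                changed = True
        if not changed:
            break  # labels are stable, so stop early
    return labels


# Toy graph (an assumption for illustration): two triangles joined by one edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
labels = label_propagation(build_adjacency(edges))

# Group nodes by their final label to list the detected communities.
communities = {}
for node, label in labels.items():
    communities.setdefault(label, []).append(node)
```

On this toy graph the two triangles end up as separate communities. At the scale the study targets, the interest lies less in the algorithm itself than in how far each framework can push the graph size before hitting its processing limits.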