An evaluation of relational and NoSQL distributed databases on a low-power cluster.
Authors: Lucas Ferreira da Silva, João V F Lima
Journal: Journal of Supercomputing, pp. 1-19 (impact factor 2.5; JCR Q2, Computer Science, Hardware & Architecture)
Published: 2023-03-23
DOI: 10.1007/s11227-023-05166-7 (https://doi.org/10.1007/s11227-023-05166-7)
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10035467/pdf/
Citations: 3
Abstract
The constant growth of social media, unconventional web technologies, mobile applications, and Internet of Things (IoT) devices challenges cloud data systems to support huge datasets and very high request rates. NoSQL databases, such as Cassandra and HBase, and relational SQL databases with replication, such as Citus/PostgreSQL, have been used to increase the horizontal scalability and availability of data store systems. In this paper, we evaluate three distributed databases on a low-power, low-cost cluster of commodity Single-Board Computers (SBCs): the relational Citus/PostgreSQL and the NoSQL databases Cassandra and HBase. The cluster has 15 Raspberry Pi 3 nodes and uses the Docker Swarm orchestration tool for service deployment and ingress load balancing across the SBCs. We believe that a low-cost SBC cluster can support cloud serving goals such as scale-out, elasticity, and high availability. Experimental results show a trade-off between performance and replication, where replication provides availability and partition tolerance. Both properties are essential in distributed systems built on low-power boards. Cassandra attained better results with its client-specified consistency levels. Both Citus and HBase provide consistency, but at a performance cost that grows with the number of replicas.
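The trade-off between replication and client-specified consistency can be illustrated with a small calculation. The sketch below is not taken from the paper: it assumes the commonly documented Cassandra rule that QUORUM requires a majority of replicas, and that a read is guaranteed to observe the latest write when the read and write replica counts together exceed the replication factor. The function names and the replication factors used are hypothetical, chosen only for illustration.

```python
def quorum(replication_factor: int) -> int:
    """Replicas that must respond at QUORUM: a majority of the replicas."""
    return replication_factor // 2 + 1


def replicas_required(level: str, replication_factor: int) -> int:
    """Replicas that must acknowledge a request at a given consistency level."""
    required = {
        "ONE": 1,
        "QUORUM": quorum(replication_factor),
        "ALL": replication_factor,
    }
    return required[level]


def reads_see_latest_write(write_level: str, read_level: str,
                           replication_factor: int) -> bool:
    """True when read and write replica sets must overlap (R + W > RF)."""
    w = replicas_required(write_level, replication_factor)
    r = replicas_required(read_level, replication_factor)
    return r + w > replication_factor


if __name__ == "__main__":
    # Illustrative replication factors only; not the paper's configuration.
    for rf in (1, 3, 5):
        for level in ("ONE", "QUORUM", "ALL"):
            acks = replicas_required(level, rf)
            strong = reads_see_latest_write(level, level, rf)
            print(f"RF={rf} level={level}: waits for {acks} replica(s), "
                  f"overlapping reads/writes={strong}")
```

Under these assumptions, a client that picks a weaker level waits for fewer replicas per request but gives up the overlap guarantee once more than one replica exists, while stricter levels wait for more replicas as the replication factor grows.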
About the journal:
The Journal of Supercomputing publishes papers on the technology, architecture and systems, algorithms, languages and programs, performance measures and methods, and applications of all aspects of Supercomputing. Tutorial and survey papers are intended for workers and students in the fields associated with and employing advanced computer systems. The journal also publishes letters to the editor, especially in areas relating to policy, succinct statements of paradoxes, intuitively puzzling results, partial results and real needs.
Published theoretical and practical papers are advanced, in-depth treatments describing new developments and new ideas. Each includes an introduction summarizing prior, directly pertinent work that is useful for the reader to understand, in order to appreciate the advances being described.