Distributed Bayesian Inference for Large-Scale IoT Systems

Big Data and Cognitive Computing | IF 3.7 | JCR Q2 (Computer Science, Artificial Intelligence) | Pub Date: 2023-12-19 | DOI: 10.3390/bdcc8010001
Eleni Vlachou, Aristeidis Karras, Christos N. Karras, Leonidas Theodorakopoulos, C. Halkiopoulos, S. Sioutas

Abstract

In this work, we present a Distributed Bayesian Inference Classifier for large-scale systems and assess its performance and scalability in distributed environments such as PySpark. The classifier's inference time remains efficient regardless of the size of the test set, which suggests that it can handle growing data sizes without a proportional increase in computational demands. Memory usage does increase with test set size throughout the experiments, but the increase is sublinear, indicating that the classifier manages memory efficiently. This behavior is consistent with typical PySpark tasks, whose memory consumption grows with data partitioning and other data operations as datasets expand. CPU utilization, another crucial factor, also remains stable, showing that the classifier can manage larger computational workloads without significant resource strain. From a classification perspective, the Bayesian Logistic Regression Spark Classifier consistently achieves reliable performance metrics, in particular high specificity, which makes it well suited to applications where correctly identifying true negatives is crucial. In summary, across all experiments conducted under various data sizes, our classifier emerges as a strong candidate for scalability-driven applications in IoT systems, combining dependable performance, effective resource management, and consistent prediction accuracy.
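
The two quantities the abstract leans on, specificity and sublinear memory growth, can be made concrete with a small sketch. This is not the authors' code: the function names, the labels, and the numbers in the usage note are hypothetical, and fitting a log-log slope is only one simple way to check for sublinear growth (it assumes roughly power-law scaling, where a slope below 1 means memory grows more slowly than the test set).

```python
import math


def specificity(y_true, y_pred, negative_label=0):
    """True-negative rate TN / (TN + FP) for binary labels."""
    # Count only the examples whose true label is the negative class.
    tn = sum(1 for t, p in zip(y_true, y_pred)
             if t == negative_label and p == negative_label)
    fp = sum(1 for t, p in zip(y_true, y_pred)
             if t == negative_label and p != negative_label)
    if tn + fp == 0:
        raise ValueError("y_true contains no negative examples")
    return tn / (tn + fp)


def loglog_slope(sizes, usage):
    """Least-squares slope of log(usage) against log(size).

    Under the (assumed) model usage ~ c * size**k, the slope estimates k;
    k < 1 corresponds to sublinear growth.
    """
    xs = [math.log(s) for s in sizes]
    ys = [math.log(u) for u in usage]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den
```

With hypothetical labels `y_true = [0, 0, 0, 0, 1, 1]` and predictions `y_pred = [0, 0, 0, 1, 1, 0]`, three of the four true negatives are identified, so `specificity` returns 0.75. With made-up measurements where memory grows as the square root of the test set size, `loglog_slope([100, 10000, 1000000], [10, 100, 1000])` is approximately 0.5, which is below 1 and therefore sublinear.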
Source journal: Big Data and Cognitive Computing (Business, Management and Accounting: Management Information Systems)
CiteScore: 7.10
Self-citation rate: 8.10%
Articles published: 128
Review time: 11 weeks