
Journal of Parallel and Distributed Computing: Latest Articles

Front Matter 1 - Full Title Page (regular issues)/Special Issue Title page (special issues)
IF: 4; Zone 3 (Computer Science); Q1, Computer Science, Theory & Methods; Pub Date: 2025-11-17; DOI: 10.1016/S0743-7315(25)00161-3
Volume 207, Article 105194
Citations: 0
Energy and performance efficient NAND flash translation layer architecture for low-latency edge applications
IF: 4; Zone 3 (Computer Science); Q1, Computer Science, Theory & Methods; Pub Date: 2025-11-17; DOI: 10.1016/j.jpdc.2025.105200
Ranjeeth Sekhar CB, Diksha Shekhawat, Jugal Gandhi, M. Santosh, Jai Gopal Pandey
This paper presents the hardware architecture, design, and implementation of a flash translation layer (FTL) for NAND flash memory devices. The proposed FTL incorporates a hybrid logical-to-physical mapping scheme, wear leveling, garbage collection, bad block management, and error-correcting codes (ECC). The architecture is designed to optimize NAND flash memory management, enhancing overall performance and reliability. The hybrid logical-to-physical mapping and wear-leveling schemes manage data placement efficiently, mitigating inherent challenges of NAND flash memory such as limited write endurance and page-based programming constraints. Experimental evaluations validate the proposed hardware FTL and demonstrate its efficiency in terms of latency, dynamic power, and throughput. The hardware is implemented on the Xilinx Zynq UltraScale+ ZCU102 platform, which contains the xczu9eg-2ffvb1156-2-e FPGA device. The proposed work has comparable resource utilization and improved datapath delay, throughput, and dynamic power, with practical implications for edge computing applications.
Volume 209, Article 105200
Citations: 0
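The flash translation layer entry above names logical-to-physical mapping, wear leveling, and garbage collection as its building blocks. The following is a minimal software sketch of how those three pieces interact, not the paper's hardware design: it assumes a plain page-level map rather than the hybrid scheme, omits ECC and bad block management, and the class name `TinyFTL` and all of its fields are hypothetical.

```python
class TinyFTL:
    """Toy page-mapped FTL (illustrative only, all names are hypothetical)."""

    def __init__(self, num_blocks=4, pages_per_block=4):
        self.pages_per_block = pages_per_block
        self.l2p = {}                          # logical page -> (block, page)
        self.erase_count = [0] * num_blocks    # wear counter per block
        self.next_free = [0] * num_blocks      # next unwritten page per block
        self.valid = [set() for _ in range(num_blocks)]  # live pages per block

    def _allocate(self):
        # Wear leveling: among blocks that still have free pages,
        # pick the one erased the fewest times (ties go to the lowest index).
        open_blocks = [b for b, used in enumerate(self.next_free)
                       if used < self.pages_per_block]
        if not open_blocks:
            raise RuntimeError("no free pages: garbage collection required")
        block = min(open_blocks, key=lambda b: (self.erase_count[b], b))
        page = self.next_free[block]
        self.next_free[block] += 1
        return block, page

    def write(self, lpn):
        # NAND pages cannot be overwritten in place, so an update
        # invalidates the old physical page and writes to a fresh one.
        old = self.l2p.get(lpn)
        if old is not None:
            self.valid[old[0]].discard(old[1])
        block, page = self._allocate()
        self.valid[block].add(page)
        self.l2p[lpn] = (block, page)
        return block, page

    def collect_garbage(self):
        # Erase every full block that no longer holds valid data.
        erased = 0
        for b in range(len(self.next_free)):
            if self.next_free[b] == self.pages_per_block and not self.valid[b]:
                self.next_free[b] = 0
                self.erase_count[b] += 1
                erased += 1
        return erased
```

Rewriting the same logical page repeatedly fills a block with stale pages; once the block is full and holds no valid data, `collect_garbage` erases it, and later allocations prefer the less-worn blocks. A real FTL would also relocate valid pages out of partially stale blocks before erasing them, which this sketch leaves out.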
A new switch buffer architecture for dragonfly networks
IF: 4; Zone 3 (Computer Science); Q1, Computer Science, Theory & Methods; Pub Date: 2025-11-15; DOI: 10.1016/j.jpdc.2025.105199
Alejandro Cano, Cristóbal Camarero, Carmen Martínez, Ramón Beivide
Dragonfly networks offer a viable solution for large-scale supercomputers and datacenters. However, developing efficient routing mechanisms for these networks presents significant challenges. Current solutions often lead to unstable network behavior due to congestion and fairness issues, exacerbating performance variability and the tail-latency problem. An analysis of the topology and its standard deadlock avoidance mechanisms reveals that server access to global network links varies with server location in the network, resulting in throughput unfairness. To address this issue, this paper introduces a novel switch buffer architecture that reduces head-of-line blocking and enhances fairness, significantly improving overall network performance. At a cost comparable to existing solutions, the proposed buffer architecture delivers superior performance. Real-world and synthetic simulation scenarios further confirm these findings, showing performance improvements between 10% and 47% over conventional solutions in medium-sized Dragonflies.
Volume 209, Article 105199
Citations: 0
BMSES: Blockchain and mobile edge computing-based secure and energy-efficient system for healthcare data management
IF: 4; Zone 3 (Computer Science); Q1, Computer Science, Theory & Methods; Pub Date: 2025-11-13; DOI: 10.1016/j.jpdc.2025.105198
Md Nurul Hasan, Suyel Namasudra
Blockchain technology is rapidly being adopted across various sectors, including healthcare, finance, and agriculture, due to key features such as decentralization, immutability, and consensus mechanisms. These features ensure security, privacy, transparency, and accountability. Mobile Edge Computing (MEC), in turn, extends cloud computing capabilities to mobile devices through a distributed system. However, existing studies that combine blockchain with MEC often overlook the impact of data offloading on system performance. This paper proposes a Blockchain and MEC-based Secure and Energy-efficient System (BMSES) for sharing Internet of Medical Things (IoMT) data securely among patients and doctors, while optimizing energy consumption through task offloading to MEC servers. The Non-Orthogonal Multiple Access (NOMA) protocol is used for efficient channel sharing among multiple users, which offers low cost, reduced latency, and low power consumption. The proposed scheme optimizes energy consumption by efficiently managing task delegation and resource allocation in MEC. Additionally, smart contracts automate blockchain operations, enhancing efficiency and security. The proposed scheme is evaluated in terms of energy consumption, transmission rate, latency, and offloading delay. The experimental results show that the proposed scheme improves energy efficiency, transmission rate, offloading delay, and system latency compared to state-of-the-art offloading approaches.
Volume 208, Article 105198
Citations: 0
Reload: Deep reinforcement learning-based workload distribution for collaborative edges
IF: 4; Zone 3 (Computer Science); Q1, Computer Science, Theory & Methods; Pub Date: 2025-11-05; DOI: 10.1016/j.jpdc.2025.105191
Yu Liang, Jidong Ge, Jie Wu, Sheng Zhang, Shiwu Wen, Bin Luo
Edge computing is one of the promising technologies that aim to enable timely computation at the network edge. Major service providers started to deploy geographically distributed edge servers several years ago. A major challenge in geographically distributed edges is task scheduling, i.e., how to assign the various tasks submitted by mobile users to distributed edges so as to optimize a given metric. However, task scheduling in geographically distributed edges is not easy. We observe three major challenges in designing an efficient task scheduling solution for dynamic edge environments: heterogeneous tasks, dynamic edge networks, and heterogeneous edge servers. In this paper, we pursue a black-box solution for task scheduling in the collaborative edge environment that does not rely on detailed analytical performance modeling. We propose Reload, an intelligent deep reinforcement learning-based task scheduler. Reload learns a policy purely from known information, without foreseeing the future. Reload represents its policy as a neural network that maps “raw” observations to scheduling actions. During training, Reload starts out knowing nothing and gradually learns to make better scheduling decisions through reinforcement, in the form of reward signals for past decisions. Reload leverages Advantage Actor Critic to train the policy network. We evaluate Reload using extensive simulations.
Volume 209, Article 105191
Citations: 0
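The Reload entry above states that the policy network is trained with Advantage Actor Critic from reward signals for past decisions. As a rough illustration of the quantity that drives such training, the sketch below computes discounted returns and advantages for one episode. It assumes Monte Carlo returns and a critic whose value estimates are already given; the function names, the discount factor, and the sample numbers are illustrative and do not come from the paper.

```python
def discounted_returns(rewards, gamma=0.99):
    """Return G_t = r_t + gamma * G_{t+1} for every step of one episode."""
    returns = []
    running = 0.0
    # Walk backwards so each step can reuse the return of the next step.
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def advantages(rewards, values, gamma=0.99):
    """Advantage A_t = G_t - V(s_t): how much better a step turned out
    than the critic's estimate. The actor is pushed towards actions with
    positive advantage and away from those with negative advantage."""
    returns = discounted_returns(rewards, gamma)
    return [g - v for g, v in zip(returns, values)]


# Hypothetical episode: three scheduling decisions and their rewards.
episode_rewards = [1.0, 0.0, 2.0]
critic_values = [1.0, 1.0, 1.0]
print(advantages(episode_rewards, critic_values, gamma=0.5))
```

Subtracting the critic's estimate from the return reduces the variance of the policy gradient compared with using raw returns, which is the usual motivation for the actor-critic family.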
On the privacy preservation and secure communication in fog-based mobile crowdsensing
IF: 4; Zone 3 (Computer Science); Q1, Computer Science, Theory & Methods; Pub Date: 2025-11-02; DOI: 10.1016/j.jpdc.2025.105184
Sunday Oyinlola Ogundoyin, Ismaila Adeniyi Kamil, Isaac Adewale Ojedokun, John Oluwaseun Babalola, Vincent Omollo Nyangaresi
The rapid growth of mobile crowdsensing (MCS) presents significant opportunities for large-scale data collection through the collaborative use of smart devices. However, the increasing volume of sensitive data generated by MCS services poses critical privacy and security challenges, particularly for latency-sensitive applications. While some solutions exist, many still lack the scalability, robustness, efficiency, and privacy preservation required by MCS services. This paper proposes a privacy-preserving and secure fog-based MCS (FB-MCS) scheme, comprising a four-tier fog computing architecture that incorporates dynamic lower-tier fog (LTF) and static upper-tier fog (UTF) systems to enable efficient data aggregation, privacy protection, and secure communication. The proposed scheme uses secret sharing and homomorphic MAC techniques for efficient and verifiable aggregation of multi-dimensional data, allowing a data requester to verify the accuracy and integrity of the aggregated results. The scheme employs lightweight elliptic curve cryptography (ECC) to ensure secure authentication without overburdening resource-constrained devices. An adaptive fog node selection strategy, based on trust and mobility, is proposed for reliable real-time task allocation. Extensive security analysis demonstrates that the scheme not only guarantees privacy preservation, integrity of aggregation results, strong anonymity, unlinkability, traceability, and resistance to well-known attacks, but also achieves data confidentiality and unforgeability in the random oracle model under Type I and Type II adversaries, assuming the Computational Diffie-Hellman Problem (CDHP) and Discrete Logarithm Problem (DLP) are intractable. Moreover, performance assessments indicate that the proposed scheme surpasses previous advanced solutions, achieving a 48% to 280% improvement in efficiency.
Volume 208, Article 105184
Citations: 0
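The crowdsensing entry above relies on secret sharing so that readings can be aggregated without revealing individual values. The sketch below shows the general idea with plain additive secret sharing; it is an assumption-laden illustration, not the FB-MCS construction. It leaves out the homomorphic MAC that makes the aggregate verifiable, and the modulus, function names, and sample readings are all hypothetical.

```python
import secrets

MODULUS = 2**61 - 1  # any modulus larger than the true sum works here


def share(value, num_shares):
    """Split value into num_shares random pieces that sum to it mod MODULUS.
    Any subset of fewer than num_shares pieces looks uniformly random."""
    pieces = [secrets.randbelow(MODULUS) for _ in range(num_shares - 1)]
    last = (value - sum(pieces)) % MODULUS
    return pieces + [last]


def aggregate(all_shares):
    """all_shares[i][j] is the piece that participant i sends to aggregator j.
    Each aggregator adds up only the pieces it received; combining the
    partial sums yields the total without exposing any single reading."""
    partial_sums = [sum(column) % MODULUS for column in zip(*all_shares)]
    return sum(partial_sums) % MODULUS


# Hypothetical sensor readings from three participants.
readings = [12, 30, 7]
shares = [share(reading, 3) for reading in readings]
print(aggregate(shares))
```

Because addition commutes with the sharing step, the aggregators never need to reconstruct an individual reading, which is the property that makes this family of schemes attractive for privacy-preserving aggregation.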
Diverse attack detection in IoT using hybrid deep convolutional with capsule auto encoder for intrusion detection model
IF: 4; Zone 3 (Computer Science); Q1, Computer Science, Theory & Methods; Pub Date: 2025-10-28; DOI: 10.1016/j.jpdc.2025.105190
M. Dharmalingam, Kamalraj Subramaniam, Ashwin M, N. Nandhagopal
The Internet of Things (IoT) is a rapidly expanding technology that is employed in many different applications. Because of the enormous number of connected nodes, communication issues arise daily in an IoT network. The IoT platform uses a cloud service as a backend to process data and manage remote control. An effective intrusion detection system (IDS) that can track computer sources and generate data on suspicious or unusual activity is essential for managing the growing complexity of cyberattacks. As IoT technology becomes widely used, the security of the IoT network may increasingly become a major issue. Due to the large number and diversity of IoT devices, protecting IoT systems with a conventional IDS is difficult. The proposed approach can mitigate the problems of existing studies and achieve better results in detecting IoT attacks. First, min-max normalization is used to pre-process the inputs to speed up training and improve the efficiency of the proposed model. From the normalized data, a new Adaptive Eagle Cat Optimization (AECO) with unique hunting and search functions is used to select features. Finally, based on the selected features, a Hybrid Deep Convolutional with Capsule Auto Encoder (Hybrid_DCAE) is proposed to classify various intruders, and an Enhanced Gannet Optimization Algorithm (EGOA) is used to fine-tune the parameters for improved system performance. The results show that the proposed model achieved 96.8% accuracy on the BoT-IoT dataset, 97.21% on CICIDS-2017, 97.55% on UNSW-NB15, and 97.4% on the DS2OS dataset.
Volume 208, Article 105190
Citations: 0
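The intrusion detection entry above uses min-max normalization as its pre-processing step. A minimal sketch of that standard transformation follows, rescaling one feature column to the range [0, 1] with x' = (x - min) / (max - min). The function name and the handling of constant columns are choices made for this illustration, not details taken from the paper.

```python
def min_max_normalize(column):
    """Rescale a list of numbers to [0, 1] using (x - min) / (max - min)."""
    low, high = min(column), max(column)
    if high == low:
        # A constant feature carries no information; map it to zeros
        # instead of dividing by zero (an assumption of this sketch).
        return [0.0 for _ in column]
    span = high - low
    return [(x - low) / span for x in column]


# Hypothetical feature column, e.g. packet counts per flow.
print(min_max_normalize([2, 4, 6, 10]))  # [0.0, 0.25, 0.5, 1.0]
```

Putting every feature on the same scale keeps large-valued features from dominating gradient updates, which is why the step tends to speed up training.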
Multi-modal model partition strategy for end-edge collaborative inference
IF: 4; Zone 3 (Computer Science); Q1, Computer Science, Theory & Methods; Pub Date: 2025-10-28; DOI: 10.1016/j.jpdc.2025.105189
Dongkun Huo, Yingting Zhou, Yixue Hao, Long Hu, Yijun Mo, Min Chen, Iztok Humar
Advances in artificial intelligence (AI) have significantly boosted the application of intelligent models, and deploying deep neural network models for on-device inference is increasingly common. However, resource-constrained devices struggle to handle the huge computational load of neural networks. Partitioning models so that the edge cloud and terminals compute collaboratively therefore accelerates real-time inference at the edge. Existing research overlooks resource allocation and collaborative decision-making for dynamic edge networks, and high-dimensional features cause transmission delay. To address this, we propose a feature-sensitive compression algorithm that applies differentiated compression based on feature importance to reduce communication load while maintaining inference accuracy. Then, we design a reinforcement learning approach for resource allocation and an online model partition algorithm using contextual bandits, leveraging compressed features for adaptive decisions in dynamic environments. Finally, we conduct extensive experiments on different types of networks, and the results show that our approach can reduce inference delay by up to 65.4% and save up to 77.6% of energy consumption.
人工智能(AI)的进步极大地推动了智能模型的应用,部署深度神经网络模型进行设备推理越来越普遍。然而,资源受限的设备很难处理神经网络的巨大计算负荷。因此,在边缘云和终端上进行协同计算的划分模型加速了边缘的实时推理。现有研究忽略了动态边缘网络的资源分配和协同决策,高维特征导致传输延迟。为了解决这个问题,我们提出了一种特征敏感压缩算法,该算法基于特征重要性实现差异化压缩,以减少通信负载,同时保持推理准确性。然后,我们设计了一种用于资源分配的强化学习方法和使用上下文强盗的在线模型划分算法,利用压缩特征在动态环境中进行自适应决策。最后,我们在不同类型的网络上进行了大量的实验,结果表明,我们的方法可以将推理延迟降低高达65.4%,节省高达77.6%的能耗。
{"title":"Multi-modal model partition strategy for end-edge collaborative inference","authors":"Dongkun Huo ,&nbsp;Yingting Zhou ,&nbsp;Yixue Hao ,&nbsp;Long Hu ,&nbsp;Yijun Mo ,&nbsp;Min Chen ,&nbsp;Iztok Humar","doi":"10.1016/j.jpdc.2025.105189","DOIUrl":"10.1016/j.jpdc.2025.105189","url":null,"abstract":"<div><div>Advances in artificial intelligence(AI) have significantly boosted the application of intelligent models, and deploying deep neural network models for device inference is increasingly common. However, it is difficult for resource-constrained devices to handle the huge computational load of neural networks. So partitioning models to co-compute at the edge cloud and terminals accelerates real-time inference at the edge. Existing research overlooks resource allocation and collaborative decision-making for dynamic edge networks, and high-dimensional features cause transmission delay. To address this, we propose a feature-sensitive compression algorithm that implements differentiated compression based on feature importance to reduce communication load while maintaining inference accuracy. Then, we design a reinforcement learning approach for resource allocation and an online model partition algorithm using contextual bandits, leveraging compressed features for adaptive decisions in dynamic environments. 
Finally, we conduct a large number of experiments on different types of networks, and the results show that our approach can reduce the inference delay by up to 65.4 % and save up to 77.6 % of energy consumption.</div></div>","PeriodicalId":54775,"journal":{"name":"Journal of Parallel and Distributed Computing","volume":"208 ","pages":"Article 105189"},"PeriodicalIF":4.0,"publicationDate":"2025-10-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145435477","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
On the development of high-performance, multi-GPU applications on heterogeneous systems leveraging SYCL
IF 4 Zone 3 (Computer Science) Q1 COMPUTER SCIENCE, THEORY & METHODS Pub Date: 2025-10-24 DOI: 10.1016/j.jpdc.2025.105188
Francisco J. Andújar, Rocío Carratalá-Sáez, Yuri Torres, Arturo Gonzalez-Escribano, Diego R. Llanos
Computational platforms for high-performance scientific applications are increasingly heterogeneous, incorporating multiple GPU accelerators. However, differences in GPU vendors, architectures, and programming models challenge performance portability and ease of development. SYCL provides a unified programming approach, enabling applications to target NVIDIA and AMD GPUs simultaneously while offering higher-level abstractions for data and task management. This paper evaluates SYCL’s performance and development effort using the Finite Time Lyapunov Exponent (FTLE) calculation as a case study. We compare SYCL’s AdaptiveCpp (Ahead-Of-Time and Just-In-Time) and Intel oneAPI compilers, along with different data management strategies (Unified Shared Memory and buffers), against equivalent CUDA and HIP implementations. Our analysis considers single and multi-GPU execution, including heterogeneous setups with GPUs from different vendors. Results show that, while SYCL introduces additional development effort compared to native CUDA and HIP implementations, it enables multi-vendor portability with minimal performance overhead when using specific design options. Based on our findings, we provide development guidelines to help programmers decide when to use SYCL versus vendor-specific alternatives.
Journal of Parallel and Distributed Computing, vol. 207, Article 105188.
Citations: 0
VLSI design and its hardware implementation for optimal image dehazing with adaptive bilateral filtering
IF 4 Zone 3 (Computer Science) Q1 COMPUTER SCIENCE, THEORY & METHODS Pub Date: 2025-10-24 DOI: 10.1016/j.jpdc.2025.105186
A. Arul Edwin Raj, Nabihah Binti Ahmad, Jeffin Gracewell, Renugadevi R, C.T. Kalaivani
Fog and smog significantly hinder image processing by reducing visual output quality and disrupting the functionality of systems reliant on visual data. Existing dehazing methods face several challenges, including computational complexity, sensitivity to parameter settings, and limited optimization for diverse conditions. To overcome these limitations, this paper introduces Selective Bilateral Filtering and Color Attenuation Analysis (SBBFC), a new methodology for real-time image dehazing. SBBFC addresses the problems of prior methods by dynamically controlling window sizes and using color attenuation analysis, sustaining reliable performance as the haze level changes and preserving accurate color rendition in the dehazed image. The hardware-optimized design targets FPGA or ASIC technologies, offering high throughput, real-time response, better image quality, and considerably better detail reproduction. In the ASIC implementation, the proposed architecture delivers 350 MPixels/s at a cost of 15k gates and 5 mW of power consumption, with an area efficiency of 0.8 mm²/k. In the FPGA implementation, it delivers 100 MPixels/s at a clock frequency of 100 MHz. These figures indicate that the proposed architecture is well suited to real-time dehazing with high throughput and low power.
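To put the quoted throughput figures in context, the following minimal sketch derives pixels per clock cycle, frame rate, and energy per pixel from the numbers in the abstract. The function names and the 1920×1080 frame size are assumptions made for illustration; they do not come from the paper, and the derived values are back-of-the-envelope estimates rather than reported results.

```python
# Back-of-the-envelope figures derived from the throughput, clock, and power
# values quoted in the abstract. The helper names and the 1920x1080 frame size
# are illustrative assumptions, not taken from the paper.

def pixels_per_cycle(throughput_px_s: float, clock_hz: float) -> float:
    """Average number of pixels produced per clock cycle."""
    return throughput_px_s / clock_hz


def frames_per_second(throughput_px_s: float, width: int, height: int) -> float:
    """Frame rate sustained at a given pixel throughput and frame size."""
    return throughput_px_s / (width * height)


def energy_per_pixel_pj(power_w: float, throughput_px_s: float) -> float:
    """Energy spent per pixel, in picojoules."""
    return power_w / throughput_px_s * 1e12


if __name__ == "__main__":
    # FPGA figures from the abstract: 100 MPixels/s at 100 MHz.
    print("FPGA pixels/cycle:", pixels_per_cycle(100e6, 100e6))
    print("FPGA fps (assumed 1080p):", frames_per_second(100e6, 1920, 1080))
    # ASIC figures from the abstract: 350 MPixels/s at 5 mW.
    print("ASIC fps (assumed 1080p):", frames_per_second(350e6, 1920, 1080))
    print("ASIC pJ/pixel:", energy_per_pixel_pj(5e-3, 350e6))
```

Under these assumptions, the FPGA figures correspond to one pixel per clock cycle, which is what a fully pipelined streaming datapath would produce.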
Journal of Parallel and Distributed Computing, vol. 210, Article 105186.
Citations: 0