
Latest Articles in the Journal of Parallel and Distributed Computing

HoneyTwin: Securing smart cities with machine learning-enabled SDN edge and cloud-based honeypots
IF 3.8. CAS Tier 3 (Computer Science). Q1 (Mathematics). Pub Date: 2024-02-20. DOI: 10.1016/j.jpdc.2024.104866
Mohammed M. Alani

With the promise of higher throughput and better response times, 6G networks provide a significant enabler for smart cities to evolve. The rapidly growing reliance on connected devices within the smart city context encourages malicious actors to target these devices to achieve various malicious goals. In this paper, we present a novel defense technique that creates a cloud-based virtualized honeypot/twin designed to receive malicious traffic through an edge-based, machine learning-enabled detection system. The proposed system performs early identification of malicious traffic at a software-defined-network-enabled edge routing point to divert that traffic away from the 6G-enabled smart city endpoints. Testing of the proposed system showed an accuracy exceeding 99.8%, with an F1 score of 0.9984.
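As a rough illustration of the edge-side decision described in the abstract, the sketch below shows how a pre-trained flow classifier might be consulted at an SDN edge routing point to divert suspicious flows toward a cloud honeypot/twin instead of the protected endpoint. The classifier stub, feature names, and addresses are hypothetical placeholders, not the paper's implementation.

```python
# Illustrative sketch only: divert flows flagged as malicious by an edge-side
# ML classifier toward a cloud-hosted honeypot/twin. classify_flow(), the
# feature names, and the addresses below are hypothetical placeholders.

HONEYPOT_TWIN = "10.0.200.10"       # hypothetical cloud honeypot address
SMART_CITY_ENDPOINT = "10.0.1.25"   # hypothetical protected endpoint

def classify_flow(features: dict) -> float:
    """Stand-in for a trained binary classifier; returns P(malicious)."""
    # A real deployment would call something like model.predict_proba(vector).
    return 0.99 if features.get("syn_rate", 0) > 1000 else 0.01

def next_hop(features: dict, threshold: float = 0.5) -> str:
    """Early identification at the edge routing point: choose the destination."""
    p_malicious = classify_flow(features)
    return HONEYPOT_TWIN if p_malicious >= threshold else SMART_CITY_ENDPOINT

if __name__ == "__main__":
    benign = {"syn_rate": 12, "bytes_per_s": 3_000}
    attack = {"syn_rate": 5_000, "bytes_per_s": 90_000}
    print(next_hop(benign))   # forwarded to the protected endpoint
    print(next_hop(attack))   # diverted to the honeypot twin
```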

Citations: 0
Hierarchical sort-based parallel algorithm for dynamic interest matching
IF 3.8. CAS Tier 3 (Computer Science). Q1 (Mathematics). Pub Date: 2024-02-18. DOI: 10.1016/j.jpdc.2024.104867
Wenjie Tang, Yiping Yao, Lizhen Ou, Kai Chen

Publish–subscribe communication is a fundamental service for message passing between decoupled applications in distributed simulation. When abundant unnecessary data transfer is introduced, interest-matching services are needed to filter irrelevant message traffic. Frequent matching demands during simulation execution make interest matching a bottleneck as the simulation scale increases. Contemporary algorithms built for serial processing inadequately leverage multicore processor-based parallel resources, and existing parallel algorithmic improvements are insufficient for large-scale simulations. Therefore, we propose a hierarchical sort-based parallel algorithm for dynamic interest matching that embeds all update and subscription regions into two full binary trees, thereby transforming the region-matching task into one of node matching. It exploits the association between adjacent nodes and the hierarchical relation between parent–child nodes to eliminate redundant operations, and achieves incremental parallel matching that only compares changed regions. We analyze the time and space complexity of this process. The new algorithm performs better and is more scalable than state-of-the-art algorithms.
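As a rough companion to the abstract, the sketch below illustrates the general idea behind sort-based interest matching in one dimension: sort region endpoints once, then sweep them to pair overlapping update and subscription regions. It is a generic interval sweep under assumed 1-D regions, not the authors' hierarchical full-binary-tree algorithm or its incremental parallel variant.

```python
# Minimal sort-based matching of 1-D update regions against subscription
# regions: sort interval endpoints, then sweep. Generic sketch only; not the
# paper's hierarchical binary-tree algorithm.

def match_regions(updates, subscriptions):
    """updates, subscriptions: lists of (id, lo, hi); returns overlapping pairs."""
    events = []  # (coordinate, is_end, kind, id); starts sort before ends at ties
    for uid, lo, hi in updates:
        events.append((lo, 0, "U", uid))
        events.append((hi, 1, "U", uid))
    for sid, lo, hi in subscriptions:
        events.append((lo, 0, "S", sid))
        events.append((hi, 1, "S", sid))
    events.sort()

    open_u, open_s, matches = set(), set(), set()
    for _, is_end, kind, rid in events:
        if is_end:
            (open_u if kind == "U" else open_s).discard(rid)
        elif kind == "U":
            open_u.add(rid)
            matches.update((rid, s) for s in open_s)
        else:
            open_s.add(rid)
            matches.update((u, rid) for u in open_u)
    return matches

print(match_regions([("u1", 0, 5), ("u2", 8, 9)], [("s1", 3, 10)]))
# {('u1', 's1'), ('u2', 's1')}
```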

Citations: 0
Revisiting I/O bandwidth-sharing strategies for HPC applications
IF 3.8. CAS Tier 3 (Computer Science). Q1 (Mathematics). Pub Date: 2024-02-15. DOI: 10.1016/j.jpdc.2024.104863
Anne Benoit, Thomas Herault, Lucas Perotin, Yves Robert, Frédéric Vivien

This work revisits I/O bandwidth-sharing strategies for HPC applications. When several applications post concurrent I/O operations, well-known approaches include serializing these operations or fair-sharing the bandwidth across them (FairShare). Another recent approach, I/O-Sets, assigns priorities to the applications, which are classified into different sets based upon the average length of their iterations. We introduce several new bandwidth-sharing strategies, some of them simple greedy algorithms and some more complicated to implement, and we compare them with existing ones. Our new strategies do not rely on any a priori knowledge of the behavior of the applications, such as the length of work phases, the volume of I/O operations, or some expected periodicity. We introduce a rigorous framework, namely steady-state windows, which enables deriving bounds on the competitive ratio of all bandwidth-sharing strategies for three different objectives: minimum yield, platform utilization, and global efficiency. To the best of our knowledge, this work is the first to provide a quantitative assessment of the online competitiveness of any bandwidth-sharing strategy. This theory-oriented assessment is complemented by a comprehensive set of simulations, based upon both synthetic and realistic traces. The main conclusion is that two of our simple and low-complexity greedy strategies significantly outperform the serialization strategy, FairShare, and I/O-Sets, and we recommend that the I/O community implement them for further assessment.
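To make the two baseline policies concrete, the toy simulation below contrasts serializing concurrent I/O requests with fair-sharing the bandwidth among them. The request volumes and bandwidth value are invented for illustration, and the code does not reproduce the strategies, metrics, or traces studied in the paper.

```python
# Toy comparison of two baseline policies for concurrent I/O requests:
# serializing them one after another vs. fair-sharing the bandwidth.
# Volumes and bandwidth are arbitrary illustrative numbers.

def serialize(volumes, bandwidth):
    """Process requests one at a time; return each request's completion time."""
    t, done = 0.0, []
    for v in volumes:
        t += v / bandwidth
        done.append(t)
    return done

def fair_share(volumes, bandwidth):
    """Split bandwidth equally among active requests; recompute as requests finish."""
    remaining = list(volumes)
    done = [0.0] * len(volumes)
    t = 0.0
    active = [i for i, v in enumerate(remaining) if v > 0]
    while active:
        share = bandwidth / len(active)
        dt = min(remaining[i] for i in active) / share   # time to next completion
        t += dt
        for i in active:
            remaining[i] -= share * dt
        for i in active:
            if remaining[i] <= 1e-9:
                done[i] = t
        active = [i for i in active if remaining[i] > 1e-9]
    return done

volumes, bw = [40, 10, 30], 10.0   # e.g. GB and GB/s, illustrative only
print(serialize(volumes, bw))      # [4.0, 5.0, 8.0]
print(fair_share(volumes, bw))     # approximately [8.0, 3.0, 7.0]
```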

Citations: 0
Front Matter 1 - Full Title Page (regular issues)/Special Issue Title page (special issues)
IF 3.8. CAS Tier 3 (Computer Science). Q1 (Mathematics). Pub Date: 2024-02-12. DOI: 10.1016/S0743-7315(24)00023-6
{"title":"Front Matter 1 - Full Title Page (regular issues)/Special Issue Title page (special issues)","authors":"","doi":"10.1016/S0743-7315(24)00023-6","DOIUrl":"https://doi.org/10.1016/S0743-7315(24)00023-6","url":null,"abstract":"","PeriodicalId":54775,"journal":{"name":"Journal of Parallel and Distributed Computing","volume":null,"pages":null},"PeriodicalIF":3.8,"publicationDate":"2024-02-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S0743731524000236/pdfft?md5=8661326c859cab793505056ef1edee51&pid=1-s2.0-S0743731524000236-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139726370","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Exploring multiprocessor approaches to time series analysis
IF 3.8. CAS Tier 3 (Computer Science). Q1 (Mathematics). Pub Date: 2024-02-08. DOI: 10.1016/j.jpdc.2024.104855
Ricardo Quislant, Eladio Gutierrez, Oscar Plata

Time series analysis is a key technique for extracting and predicting events in domains as diverse as epidemiology, genomics, neuroscience, environmental sciences, and economics. Matrix Profile, a state-of-the-art algorithm for time series analysis, finds the most similar and dissimilar subsequences in a time series in deterministic time, and it is exact. Matrix Profile has low arithmetic intensity and operates on large amounts of time series data, which can be an issue in terms of memory requirements. On the other hand, Hardware Transactional Memory (HTM) is an alternative, optimistic synchronization method that executes transactions speculatively in parallel while keeping track of memory accesses to detect and resolve conflicts.

This work evaluates one of the best implementations of Matrix Profile exploring multiple multiprocessor variants and proposing new implementations that consider a variety of synchronization methods (HTM, locks, barriers), as well as algorithm organizations. We analyze these variants using real datasets, both short and large, in terms of speedup and memory requirements, the latter being a major issue when dealing with very large time series. The experimental evaluation shows that our proposals can achieve up to 100× speedup over the sequential algorithm for 128 threads, and up to 3× over the baseline, while keeping memory requirements low and even independent of the number of threads.
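For context on the algorithm named above: the Matrix Profile records, for each length-m subsequence of a time series, the z-normalized Euclidean distance to its nearest neighbor outside a trivial-match zone. The sketch below is the naive quadratic reference computation of that definition; it is not the optimized HTM/lock/barrier-based parallel implementations evaluated in the paper, and the exclusion-zone width is just a common choice.

```python
# Naive O(n^2 * m) Matrix Profile: for each length-m subsequence, the
# z-normalized Euclidean distance to its nearest neighbor, excluding a
# trivial-match zone around itself. Reference definition only.
import numpy as np

def znorm(x):
    s = x.std()
    return (x - x.mean()) / s if s > 0 else np.zeros_like(x)

def matrix_profile(ts, m):
    ts = np.asarray(ts, dtype=float)
    n = len(ts) - m + 1
    subs = np.array([znorm(ts[i:i + m]) for i in range(n)])
    excl = max(1, m // 2)            # trivial-match exclusion zone (common choice)
    mp = np.full(n, np.inf)
    for i in range(n):
        for j in range(n):
            if abs(i - j) < excl:
                continue
            d = np.linalg.norm(subs[i] - subs[j])
            mp[i] = min(mp[i], d)
    return mp

ts = np.sin(np.linspace(0, 8 * np.pi, 200)) + 0.05 * np.random.randn(200)
print(matrix_profile(ts, m=32)[:5])  # low values indicate repeated motifs
```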

Citations: 0
Fast recovery for large disk enclosures based on RAID2.0: Algorithms and evaluation
IF 3.8. CAS Tier 3 (Computer Science). Q1 (Mathematics). Pub Date: 2024-02-07. DOI: 10.1016/j.jpdc.2024.104854
Qiliang Li, Min Lyu, Liangliang Xu, Yinlong Xu

The RAID2.0 architecture, which uses dozens or even hundreds of disks, is widely adopted for large-capacity data storage. However, limited resources like memory and CPU cause RAID2.0 to execute batch recovery for disk failures. Traditional random data placement and recovery schemes result in highly skewed I/O access within a batch, which slows down the recovery speed. To address this issue, we propose DR-RAID, an efficient reconstruction scheme that balances local rebuilding workloads across all surviving disks within a batch. We dynamically select a batch of tasks with almost balanced read loads and make intra-batch adjustments for tasks with multiple choices of source chunks to read. Furthermore, we use a bipartite graph model to achieve a uniform distribution of write loads. DR-RAID can be applied with homogeneous or heterogeneous disk rebuilding bandwidth. Experimental results demonstrate that in offline rebuilding, DR-RAID enhances the rebuilding throughput by up to 61.90% compared to the random data placement scheme. With varied rebuilding bandwidth, the improvement can reach up to 65.00%.
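To give a feel for the load-balancing objective described above, the sketch below uses a much simpler greedy rule: assign each stripe's rebuild write to the currently least-loaded surviving disk that does not already hold one of that stripe's chunks. It is a didactic stand-in under assumed data structures, not DR-RAID's batch selection or bipartite-graph placement.

```python
# Didactic greedy balancer: assign each rebuild write to the least-loaded
# surviving disk that does not already hold a chunk of the same stripe.
# DR-RAID itself uses batch selection and a bipartite-graph model; this
# sketch only illustrates the load-balancing goal.
from collections import defaultdict

def place_rebuild_writes(stripes, surviving_disks):
    """stripes: {stripe_id: set of disks holding its surviving chunks}."""
    write_load = defaultdict(int)
    placement = {}
    for stripe_id, holders in stripes.items():
        candidates = [d for d in surviving_disks if d not in holders]
        target = min(candidates, key=lambda d: write_load[d])
        placement[stripe_id] = target
        write_load[target] += 1
    return placement, dict(write_load)

stripes = {0: {"d1", "d2"}, 1: {"d2", "d3"}, 2: {"d1", "d3"}, 3: {"d1", "d2"}}
placement, load = place_rebuild_writes(stripes, ["d1", "d2", "d3", "d4"])
print(placement)   # each stripe rebuilt onto an eligible disk
print(load)        # write counts stay close to uniform
```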

Citations: 0
Evaluating the effectiveness of Bat optimization in an adaptive and energy-efficient network-on-chip routing framework
IF 3.8. CAS Tier 3 (Computer Science). Q1 (Mathematics). Pub Date: 2024-02-05. DOI: 10.1016/j.jpdc.2024.104853
B. Naresh Kumar Reddy, Aruru Sai Kumar

Adaptive routing is effective in maintaining high processor performance in a multiprocessor system on chip, steering packets over minimal or non-minimal alternate routes to avoid congestion. However, many systems cannot cope with the fact that sending packets over an alternative path, rather than the shorter fixed-priority route, can result in packets arriving at the destination node out of order. This can occur if packets belonging to the same communication flow are adaptively routed through different paths. In real-world network systems, there are strategies and algorithms to efficiently handle out-of-order packets without requiring infinite memory: techniques such as buffering, sliding windows, and sequence-number management reorder packets while respecting the practical constraints of available memory and processing power, and the specific method used depends on the network protocol and the requirements of the application. This paper proposes a novel technique aimed at improving the performance of multiprocessor systems on chip by implementing adaptive routing based on the Bat algorithm. The framework employs a five-stage pipelined router that receives each packet and forwards it in the best direction in an adaptive mode. The Bat algorithm is used to enhance performance by optimizing the route over which packets are transmitted to their destination. Tests were carried out on various NoC sizes (6×6 and 8×8) under multimedia benchmarks, compared with related algorithms, and implemented on a Kintex-7 FPGA board. The simulation results show that the proposed algorithm reduces delay and improves throughput compared with traditional adaptive algorithms.
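For readers unfamiliar with the optimizer, the sketch below shows the core frequency, velocity, and position updates of the standard Bat algorithm applied to a placeholder cost function; the mapping from candidate solutions to concrete NoC routes used in the paper is not reproduced, and all parameter values here are illustrative assumptions.

```python
# Core of the standard Bat algorithm (frequency, velocity, position updates)
# minimizing a placeholder cost function. The mapping from positions to NoC
# routes used in the paper is not reproduced here.
import random

def bat_optimize(cost, dim=4, n_bats=20, iters=200, fmin=0.0, fmax=2.0):
    pos = [[random.uniform(-5, 5) for _ in range(dim)] for _ in range(n_bats)]
    vel = [[0.0] * dim for _ in range(n_bats)]
    best = min(pos, key=cost)[:]
    loudness, pulse_rate = 0.9, 0.5            # illustrative constants
    for _ in range(iters):
        for i in range(n_bats):
            f = fmin + (fmax - fmin) * random.random()           # frequency
            vel[i] = [v + (x - b) * f for v, x, b in zip(vel[i], pos[i], best)]
            cand = [x + v for x, v in zip(pos[i], vel[i])]        # fly
            if random.random() > pulse_rate:                      # local walk
                cand = [b + 0.01 * random.gauss(0, 1) for b in best]
            if cost(cand) <= cost(pos[i]) and random.random() < loudness:
                pos[i] = cand
            if cost(pos[i]) < cost(best):
                best = pos[i][:]
    return best, cost(best)

sphere = lambda x: sum(v * v for v in x)   # placeholder cost, not a NoC model
print(bat_optimize(sphere))                # converges toward the zero vector
```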

Citations: 0
Collaborative dispersion by silent robots
IF 3.8. CAS Tier 3 (Computer Science). Q1 (Mathematics). Pub Date: 2024-02-05. DOI: 10.1016/j.jpdc.2024.104852
Barun Gorain, Partha Sarathi Mandal, Kaushik Mondal, Supantha Pandit

In the dispersion problem, a set of k co-located mobile robots must relocate themselves to distinct nodes of an unknown network. The network is modeled as an anonymous graph G=(V,E), where the graph's nodes are not labeled. The edges incident to a node v with degree d are labeled at v with port numbers in the range {0, 1, …, d−1}. The robots have unique IDs in the range [0, L], where L ≥ k, and are initially placed at a source node s. The task of dispersion was traditionally achieved based on the assumption of two types of communication abilities: (a) when some robots are at the same node, they can communicate by exchanging messages between them, and (b) any two robots in the network can exchange messages between them. This paper investigates whether this communication ability among co-located robots is absolutely necessary to achieve dispersion. We establish that even in the absence of the ability to communicate, the task of dispersion by a set of mobile robots can be achieved in a much weaker model, where a robot at a node v has access only to the following very restricted information at the beginning of any round: (1) am I alone at v? (2) did the number of robots at v increase or decrease compared to the previous round?

We propose a deterministic distributed algorithm that achieves dispersion on any given graph G=(V,E) in time O(k log L + k² log Δ), where Δ is the maximum degree of a node in G. Further, each robot uses O(log L + log Δ) additional memory, i.e., memory other than that required to store its ID. We also prove that the task of dispersion cannot be achieved by a set of mobile robots with o(log L + log Δ) additional memory.
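To make the weak model concrete, the snippet below encodes the only two observations a robot at a node receives at the start of a round under this model, as described in the abstract; it is not the authors' dispersion algorithm, and the node/robot bookkeeping is a hypothetical harness.

```python
# The only information a robot at node v receives at the start of a round in
# this model: (1) is it alone at v, and (2) did the robot count at v increase,
# decrease, or stay the same since the previous round. Observation interface
# only; the dispersion algorithm itself is not shown.

def observe(node, robots_now, robots_prev):
    """robots_now/robots_prev: dict mapping node -> number of robots there."""
    count = robots_now.get(node, 0)
    prev = robots_prev.get(node, 0)
    alone = (count == 1)
    if count > prev:
        trend = "increased"
    elif count < prev:
        trend = "decreased"
    else:
        trend = "unchanged"
    return alone, trend

prev = {"s": 5}             # all k=5 robots start at the source node s
now = {"s": 3, "u": 2}      # two robots moved to a neighbor u
print(observe("s", now, prev))   # (False, 'decreased')
print(observe("u", now, prev))   # (False, 'increased')
```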

Citations: 0
DQS: A QoS-driven routing optimization approach in SDN using deep reinforcement learning
IF 3.8. CAS Tier 3 (Computer Science). Q1 (Mathematics). Pub Date: 2024-02-01. DOI: 10.1016/j.jpdc.2024.104851
Lizeth Patricia Aguirre Sanchez, Yao Shen, Minyi Guo

In recent decades, the exponential growth of applications has intensified traffic demands, posing challenges in ensuring optimal user experiences within modern networks. Traditional congestion avoidance and control mechanisms embedded in conventional routing struggle to adapt promptly to new-generation networks. Current routing approaches risk adverse outcomes such as (1) scalability constraints, (2) high convergence times, and (3) congestion due to inadequate real-time traffic prioritization. To address these issues, this paper introduces DQS, a QoS-driven routing optimization approach in Software-Defined Networking (SDN) that uses Deep Reinforcement Learning (DRL) to optimize routing and enhance QoS efficiency. DQS optimizes routing decisions by intelligently distributing traffic, guided by a DRL agent driven by a multi-objective function that considers both link and queue metrics. Despite the complexity of the network, DQS sustains scalability while significantly reducing convergence times. Through a Docker-based OpenFlow prototype, results highlight a substantial 20-30% reduction in end-to-end delay compared to baseline methods.
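To illustrate what a multi-objective function over link and queue metrics could look like in code, the snippet below sketches one possible per-step reward for a DRL routing agent; the metric names, normalization, and weights are assumptions for illustration and are not the reward actually used by DQS.

```python
# Illustrative multi-objective reward mixing link and queue metrics for a
# DRL routing agent. Weights and metric names are assumptions; the paper's
# actual reward formulation is not reproduced here.

def routing_reward(link_utilization, queue_delay_ms, loss_rate,
                   w_util=0.4, w_delay=0.4, w_loss=0.2,
                   max_delay_ms=100.0):
    """Higher is better: penalize congested links, long queues, and drops."""
    util_term = 1.0 - min(link_utilization, 1.0)            # remaining headroom
    delay_term = 1.0 - min(queue_delay_ms / max_delay_ms, 1.0)
    loss_term = 1.0 - min(loss_rate, 1.0)
    return w_util * util_term + w_delay * delay_term + w_loss * loss_term

print(routing_reward(0.30, 5.0, 0.00))   # lightly loaded path -> high reward
print(routing_reward(0.95, 80.0, 0.02))  # congested path -> low reward
```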

Citations: 0
Energy-efficient offloading for DNN-based applications in edge-cloud computing: A hybrid chaotic evolutionary approach
IF 3.8. CAS Tier 3 (Computer Science). Q1 (Mathematics). Pub Date: 2024-02-01. DOI: 10.1016/j.jpdc.2024.104850
Zengpeng Li, Huiqun Yu, Guisheng Fan, Jiayin Zhang, Jin Xu

The rapid development of Deep Neural Networks (DNNs) lays solid foundations for Internet of Things systems. However, mobile devices with limited processing capacity and short battery life face difficulties in executing complex DNNs. To satisfy different Quality of Service requirements, a feasible solution is offloading DNN layers to edge nodes and the cloud. The energy-efficient offloading problem for DNN-based applications with deadline and budget constraints in the edge-cloud environment is still an open and challenging issue. To this end, this paper proposes a Hybrid Chaotic Evolutionary Algorithm (HCEA) incorporating diversification and intensification strategies, along with a DVFS-enabled version of it (HCEA-DVFS). The Archimedes Optimization Algorithm-based diversification strategy exploits global and local guiding information to improve population diversity during the updating process and employs the Metropolis acceptance rule of Simulated Annealing to avoid premature convergence. The Genetic Algorithm-based chaotic intensification strategy is designed to enhance the local search capability of HCEA. Moreover, Dynamic Voltage Frequency Scaling-enabled adjustment strategies can be embedded into HCEA to further reduce energy consumption by resetting frequency levels and reallocating DNN layers. Experimental results on four DNN-based applications demonstrate that HCEA-DVFS reduces energy consumption under different deadlines, budgets, and workloads by an average of 7.93, 9.68, 11.02, 11.84, and 19.38 percent compared with HCEA, PSO-GA, MCEA, AOA, and Greedy, respectively.
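The Metropolis acceptance rule mentioned above is the standard simulated-annealing criterion: an inferior candidate is accepted with probability exp(-Δ/T). The sketch below shows that rule in isolation, with a toy energy function and a geometric cooling schedule chosen only for illustration; HCEA's actual operators and DVFS adjustments are not reproduced.

```python
# Metropolis acceptance rule as used in simulated annealing: always accept
# improvements, accept a worse candidate with probability exp(-delta/T).
# The energy model and cooling schedule here are illustrative only.
import math
import random

def metropolis_accept(current_cost, candidate_cost, temperature):
    delta = candidate_cost - current_cost
    if delta <= 0:
        return True                          # better (lower energy): accept
    return random.random() < math.exp(-delta / temperature)

def anneal(energy, neighbor, x0, t0=10.0, cooling=0.95, steps=500):
    x, t = x0, t0
    for _ in range(steps):
        cand = neighbor(x)
        if metropolis_accept(energy(x), energy(cand), t):
            x = cand
        t *= cooling                         # geometric cooling schedule
    return x

# Toy 1-D example: minimize (x - 3)^2 starting far from the optimum.
result = anneal(lambda x: (x - 3) ** 2,
                lambda x: x + random.uniform(-1, 1), x0=20.0)
print(round(result, 2))                      # close to 3 after annealing
```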

Citations: 0