
Latest publications in the Journal of Parallel and Distributed Computing

Mitigating DDoS attacks in containerized environments: A comparative analysis of Docker and Kubernetes
IF 3.4 · CAS Tier 3 (Computer Science) · Q1 COMPUTER SCIENCE, THEORY & METHODS · Pub Date: 2025-10-01 · Epub Date: 2025-06-11 · DOI: 10.1016/j.jpdc.2025.105130
Yung-Ting Chuang, Chih-Han Tu
Containerization has become the primary method for deploying applications, with web services being the most prevalent. However, exposing server IP addresses to external connections renders containerized services vulnerable to DDoS attacks, which can deplete server resources and hinder legitimate user access. To address this issue, we implement twelve different mitigation strategies, test them across three common types of web services, and conduct experiments on both Docker and Kubernetes deployment platforms. Furthermore, this study introduces a cross-platform, orchestration-aware evaluation framework that simulates realistic multi-service workloads and analyzes defense strategy performance under varying concurrency conditions. Experimental results indicate that Docker excels in managing white-listed traffic and delaying attacker responses, while Kubernetes achieves low completion times, minimum response times, and low failure rates by processing all requests simultaneously. Based on these findings, we provide actionable insights for selecting appropriate mitigation strategies tailored to different orchestration environments and workload patterns, offering practical guidance for securing containerized deployments against low-rate DDoS threats. Our work not only provides empirical performance evaluations but also reveals deployment-specific trade-offs, offering strategic recommendations for building resilient cloud-native infrastructures.
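Rate limiting with white-list awareness is one common class of low-rate DDoS mitigation in this space. As a hedged, illustrative sketch (not the authors' implementation; the IP addresses and rates below are invented), a per-client token-bucket limiter that privileges white-listed sources might look like:

```python
import time

class TokenBucket:
    """Simple token-bucket rate limiter: refills at `rate` tokens/sec up to `capacity`."""
    def __init__(self, rate: float, capacity: float):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

# Per-client buckets; white-listed clients get a higher sustained rate.
WHITELIST = {"10.0.0.5"}
buckets: dict[str, TokenBucket] = {}

def admit(client_ip: str) -> bool:
    """Admit a request if the client's bucket still has a token."""
    if client_ip not in buckets:
        rate = 100.0 if client_ip in WHITELIST else 5.0
        buckets[client_ip] = TokenBucket(rate=rate, capacity=rate)
    return buckets[client_ip].allow()
```

Under a low-rate attack, non-white-listed clients exhaust their small buckets quickly while white-listed traffic keeps flowing, which is the behavior the Docker/Kubernetes comparison above evaluates at the platform level.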
Citations: 0
Privacy-enabled academic certificate authentication and deep learning-based student performance prediction system using hyperledger blockchain technology
IF 3.4 · CAS Tier 3 (Computer Science) · Q1 COMPUTER SCIENCE, THEORY & METHODS · Pub Date: 2025-10-01 · Epub Date: 2025-06-05 · DOI: 10.1016/j.jpdc.2025.105119
Sangeetha A.S., Shunmugan S.
Blockchain does not rely on trust for electronic transactions and has emerged as a popular technology due to attributes such as immutability, transparency, distributed storage, and decentralized control. Student certificates and skill verification play crucial roles in job applications and other purposes. In traditional systems, certificate forgery is a common problem, especially in online education. Processes such as issuing and verifying student certificates, along with predicting student performance for higher education or job recruitment, are often lengthy and time-consuming. Integrating blockchain into certificate verification protocols ensures authenticity and significantly reduces processing times. Hence, this research introduces a novel secure, privacy-preservation-based academic certificate authentication system (CertAuthSystem) for verifying students' academic certificates. The CertAuthSystem comprises different entities: Student, System, University, Blockchain, and Company. The university issues certificates to students, which are stored in the Blockchain; when a student applies for a job or scholarship, he/she transmits the certificate and the blockID to the organization, which performs verification based on them. Moreover, the student's performance is predicted by a classifier named Deep Long Short-Term Memory (DLSTM). CertAuthSystem is then examined for its superiority on measures such as validation time, memory, throughput, and execution time, achieving values of 53.412 ms, 86.6 MB, 94.876 Mbps, and 73.57 ms, respectively, for block size 7. Finally, the prediction performance of the DLSTM classifier is analyzed using evaluation metrics such as precision, recall, and F-measure, attaining superior values of 90.77 %, 92.99 %, and 91.86 %.
Citations: 0
Edge metric basis and its fault tolerance over certain interconnection networks
IF 3.4 · CAS Tier 3 (Computer Science) · Q1 COMPUTER SCIENCE, THEORY & METHODS · Pub Date: 2025-10-01 · Epub Date: 2025-06-23 · DOI: 10.1016/j.jpdc.2025.105141
S. Prabhu , T. Jenifer Janany , M. Arulperumjothi , I.G. Yero
The surveillance of elements in an interconnection network is a classical problem in computer engineering. It is, moreover, closely related to uniquely identifying the elements of the network, which is a classical distance-related problem in graph theory. This surveillance can be considered for different kinds of elements in the network. The classical version centers attention on the nodes, while some recent variations also monitor the edges, or both vertices and edges at the same time. The first variation gave rise to graph structures called edge resolving sets and edge metric bases, which are used to uniquely identify the edges of a given network by means of distance vectors. A vertex x in a graph G uniquely recognizes (resolves or identifies) two edges e and f in G if dG[e,x] ≠ dG[f,x], where dG[e,x] stands for the distance between a vertex x and an edge e of G. A set S with the smallest number of vertices, such that every pair of edges is uniquely recognized by at least one vertex in S, is an edge metric basis, and the edge metric dimension refers to the cardinality of such an S. Fault tolerance of a working system is the ability of the system to keep functioning even if one of its parts stops working properly. The fault-tolerance property of the edge metric basis is considered in this work, resulting in a concept called the fault-tolerant edge metric basis. That is, an edge metric basis S of a graph G is fault-tolerant if every pair of edges of G is resolved by at least two vertices in S, and the minimum possible cardinality of such sets is coined as the fault-tolerant edge metric dimension of G. In this work, we present bounds for the edge metric dimension of graphs and its fault-tolerant version. In addition, we investigate these parameters for butterfly, Beneš, and fractal cubic networks, and find the exact values of their (fault-tolerant) edge metric dimensions.
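To make the resolving condition concrete, it can be brute-force checked on a small graph. The following is a hedged sketch (not from the paper; the 4-cycle example is ours), using the standard convention dG[e,x] = min over endpoints u of e of dG(u,x):

```python
from collections import deque
from itertools import combinations

def bfs_dist(adj, src):
    """Single-source shortest-path distances in an unweighted graph via BFS."""
    dist = {src: 0}
    q = deque([src])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    return dist

def is_edge_metric_generator(adj, edges, S):
    """True iff every pair of edges gets distinct distance vectors to S,
    where the distance from edge e to vertex x is min over e's endpoints."""
    dists = {x: bfs_dist(adj, x) for x in S}
    def vec(e):
        return tuple(min(dists[x][u] for u in e) for x in S)
    return all(vec(e) != vec(f) for e, f in combinations(edges, 2))

# 4-cycle C4 with vertices 0-1-2-3-0.
adj = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
```

On this C4, the sketch reports that the pair {0, 1} resolves all four edges while no single vertex does, so the brute force finds an edge metric basis of size 2 for this graph.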
Citations: 0
How to reduce the number of steps for (multi-valued validated) Byzantine agreement?
IF 3.4 · CAS Tier 3 (Computer Science) · Q1 COMPUTER SCIENCE, THEORY & METHODS · Pub Date: 2025-10-01 · Epub Date: 2025-06-17 · DOI: 10.1016/j.jpdc.2025.105132
Baohan Huang , Haibin Zhang , Chao Liu , Shengli Liu , Yong Yu , Fangguo Zhang , Liehuang Zhu
Multi-valued validated Byzantine agreement (MVBA), a notion of Cachin, Kursawe, Petzold, and Shoup (CKPS), is a core primitive in fault-tolerant distributed computing and can be used to build asynchronous Byzantine atomic broadcast, Byzantine fault-tolerant state machine replication, and asynchronous distributed key generation protocols. Recently, a major breakthrough by Abraham, Malkhi, and Spiegelman (AMS) improved the CKPS construction, achieving optimal word complexity and 19.5 steps on average. Lu, Lu, Tang, and Wang propose Dumbo-MVBA, which uses erasure coding to reduce the communication of AMS MVBA, and Dumbo-MVBA*, a self-bootstrapping framework transforming any MVBA into a communication-efficient implementation. This paper introduces a new way of building MVBA that has the same communication as Dumbo-MVBA but about 7/25 as many steps. A central building block of our MVBA is a new distributed computing primitive, verifiable and validated asynchronous consistent information dispersal (VVCID), which is of independent interest. We provide two instantiations, one based on fingerprinted cross-checksums (Hendricks, Ganger, and Reiter), and the other relying on erasure-coding proof systems (Alhaddad, Duan, Varia, and Zhang).
We further show that in the case where n>4f, we can build even more efficient protocols. In particular, we present the first asynchronous binary agreement (ABA) protocol that has strictly 2 steps in each round and achieves optimal word complexity, while prior such protocols require n>5f. Our ABA additionally has a new biased validity property allowing us to optimize our MVBA framework further: our new MVBA for n>4f has about one fifth as many steps as Dumbo-MVBA.
Citations: 0
Power, energy, and performance analysis of single- and multi-threaded applications in the ARM ThunderX2
IF 3.4 · CAS Tier 3 (Computer Science) · Q1 COMPUTER SCIENCE, THEORY & METHODS · Pub Date: 2025-10-01 · Epub Date: 2025-06-02 · DOI: 10.1016/j.jpdc.2025.105118
Ibai Calero, Salvador Petit, María E. Gómez, Julio Sahuquillo
Energy efficiency has been a major concern in data centers, and the problem is exacerbated as their size continues to grow. However, the lack of tools to measure and manage this energy at a fine granularity (e.g., per processor core or last-level cache) has translated into slow research progress on this topic. Understanding where (i.e., in which components) and when (at which point in time) energy consumption translates into only minor performance improvements is of paramount importance for designing any energy-aware scheduler. This paper characterizes the relationship between energy consumption and performance in a 28-core ARM ThunderX2 processor for both single-threaded and multi-threaded applications.
This paper shows that single-threaded applications with high CPU activity maintain their performance in spite of the inter-application interference at shared resources, but this comes at the expense of higher power consumption. Conversely, applications that heavily utilize the L3 cache and memory consume less power but suffer significant performance degradation as interference levels rise.
In contrast, multi-threaded applications show two distinct behaviors. On the one hand, some of them experience significant performance gains when they execute in a higher number of cores with more threads, which outweighs the increase in power consumption, leading to high energy efficiency.
Citations: 0
SHAP-based intrusion detection in IoT networks using quantum neural networks on IonQ hardware
IF 3.4 · CAS Tier 3 (Computer Science) · Q1 COMPUTER SCIENCE, THEORY & METHODS · Pub Date: 2025-10-01 · Epub Date: 2025-06-13 · DOI: 10.1016/j.jpdc.2025.105133
K Rajkumar, S. Mercy Shalinie
Securing IoT networks against cyber-attacks, especially Distributed Denial of Service (DDoS) attacks, is a growing challenge due to their ability to disrupt services and overwhelm network resources. This study introduces a novel post-processing methodology that integrates Explainable AI (XAI) with Quantum Neural Networks (QNN) to enhance the interpretability of DDoS attack detection. We utilize the CICFlowMeter tool for feature extraction, processing bidirectional network traffic data and generating up to 87 distinct features. Notably, CICFlowMeter removes potentially tampered features such as IP addresses and ports to prevent manipulation, addressing the limitations associated with the use of these features in the presence of attackers. After a QNN generates expectation values for a given input, SHAP (SHapley Additive exPlanations) values are applied to interpret the contributions of individual features in the decision-making process. Although the QNN output indicates whether a network flow is benign or malicious, the quantum model's complexity makes it difficult to interpret. By using SHAP values, we identify which features, such as IP addresses, ports, and traffic patterns, significantly influence the QNN's classification, providing human-understandable explanations for the model's predictions. For evaluation, we used the CIC-IoT 2022 and the proposed SDN-DDoS24 datasets, with SDN-DDoS24 outperforming the others when integrated with the proposed methodology. The QNN was implemented on IonQ quantum hardware through Amazon Braket, achieving an expectation value of 0.98 with a low latency of 113 milliseconds, making it suitable for applications requiring both precision and speed. This study demonstrates that integrating XAI with QNN not only improves DDoS attack detection accuracy but also enhances transparency, making the model more trustworthy for real-world cybersecurity applications.
By offering clear explanations of model behavior, the approach ensures that security experts can make informed decisions based on the quantum-enhanced detection system, improving its reliability and usability in dynamic network environments.
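The additive-attribution idea behind SHAP can be illustrated independently of the quantum model. The following is a hedged, from-scratch sketch (not the paper's pipeline; the toy linear "flow score" and its weights are invented) that computes exact Shapley values by averaging a feature's marginal contribution over all orderings:

```python
from itertools import permutations

def shapley_values(f, x, baseline):
    """Exact Shapley values of model f at point x against a baseline input.
    'Absent' features keep their baseline value; feasible only for few features."""
    n = len(x)
    phi = [0.0] * n
    perms = list(permutations(range(n)))
    for order in perms:
        present = list(baseline)          # start from the all-baseline input
        prev = f(present)
        for i in order:
            present[i] = x[i]             # reveal feature i
            cur = f(present)
            phi[i] += cur - prev          # marginal contribution of i in this order
            prev = cur
    return [p / len(perms) for p in phi]

# Toy "flow classifier" score: weighted sum of three traffic features.
def score(v):
    return 2.0 * v[0] + 1.0 * v[1] - 0.5 * v[2]

phi = shapley_values(score, x=[1.0, 1.0, 1.0], baseline=[0.0, 0.0, 0.0])
```

For a linear model the Shapley value of feature i reduces to its weight times its deviation from the baseline, and the attributions sum to f(x) minus f(baseline), which is the additivity property SHAP relies on when explaining the QNN's expectation values.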
Citations: 0
Front Matter 1 - Full Title Page (regular issues)/Special Issue Title page (special issues)
IF 3.4 CAS Tier 3 (Computer Science) Q1 COMPUTER SCIENCE, THEORY & METHODS Pub Date : 2025-10-01 Epub Date: 2025-07-11 DOI: 10.1016/S0743-7315(25)00116-9
{"title":"Front Matter 1 - Full Title Page (regular issues)/Special Issue Title page (special issues)","authors":"","doi":"10.1016/S0743-7315(25)00116-9","DOIUrl":"10.1016/S0743-7315(25)00116-9","url":null,"abstract":"","PeriodicalId":54775,"journal":{"name":"Journal of Parallel and Distributed Computing","volume":"204 ","pages":"Article 105149"},"PeriodicalIF":3.4,"publicationDate":"2025-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144604858","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
QPOPSS: Query and Parallelism Optimized Space-Saving for finding frequent stream elements
IF 3.4 CAS Tier 3 (Computer Science) Q1 COMPUTER SCIENCE, THEORY & METHODS Pub Date : 2025-10-01 Epub Date: 2025-06-17 DOI: 10.1016/j.jpdc.2025.105134
Victor Jarlow, Charalampos Stylianopoulos, Marina Papatriantafilou
The frequent elements problem, a key component in demanding stream-data analytics, involves selecting elements whose occurrence exceeds a user-specified threshold. Fast, memory-efficient ϵ-approximate synopsis algorithms select all frequent elements but may overestimate their counts, depending on the user-defined parameter ϵ. Evolving applications demand performance only achievable by parallelization. However, algorithmic guarantees concerning concurrent updates and queries have been overlooked. We propose Query and Parallelism Optimized Space-Saving (QPOPSS), providing concurrency guarantees. A cornerstone of the design is a new approach to the main data structure of the Space-Saving algorithm, enabling support for very fast queries. QPOPSS combines minimal overlap with concurrent updates, distributing work and using fine-grained thread synchronization to achieve high throughput, accuracy, and low memory use. Our analysis shows space and approximation bounds under various concurrency and data distribution conditions. Our empirical evaluation relative to representative state-of-the-art methods reveals that QPOPSS's multithreaded throughput scales linearly while maintaining the highest accuracy, with an orders-of-magnitude smaller memory footprint.
{"title":"QPOPSS: Query and Parallelism Optimized Space-Saving for finding frequent stream elements","authors":"Victor Jarlow ,&nbsp;Charalampos Stylianopoulos ,&nbsp;Marina Papatriantafilou","doi":"10.1016/j.jpdc.2025.105134","DOIUrl":"10.1016/j.jpdc.2025.105134","url":null,"abstract":"<div><div>The frequent elements problem, a key component in demanding stream-data analytics, involves selecting elements whose occurrence exceeds a user-specified threshold. Fast, memory-efficient <em>ϵ</em>-approximate synopsis algorithms select all frequent elements but may overestimate them depending on <em>ϵ</em> (user-defined parameter). Evolving applications demand performance only achievable by parallelization. However, algorithmic guarantees concerning concurrent updates and queries have been overlooked. We propose Query and Parallelism Optimized Space-Saving (QPOPSS ), providing concurrency guarantees. A cornerstone of the design is a new approach for the main data structure for the <em>Space-Saving</em> algorithm, enabling support of very fast queries. QPOPSS combines minimal overlap with concurrent updates, distributing work and using fine-grained thread synchronization to achieve high throughput, accuracy, and low memory use. Our analysis shows space and approximation bounds under various concurrency and data distribution conditions. 
Our empirical evaluation relative to representative state-of-the-art methods reveals that QPOPSS 's multithreaded throughput scales linearly while maintaining the highest accuracy, with orders of magnitude smaller memory footprint.</div></div>","PeriodicalId":54775,"journal":{"name":"Journal of Parallel and Distributed Computing","volume":"204 ","pages":"Article 105134"},"PeriodicalIF":3.4,"publicationDate":"2025-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144472089","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Leveraging Multi-Instance GPUs through moldable task scheduling
IF 3.4 CAS Tier 3 (Computer Science) Q1 COMPUTER SCIENCE, THEORY & METHODS Pub Date : 2025-10-01 Epub Date: 2025-06-06 DOI: 10.1016/j.jpdc.2025.105128
Jorge Villarrubia, Luis Costero, Francisco D. Igual, Katzalin Olcoz
NVIDIA MIG (Multi-Instance GPU) allows partitioning a physical GPU into multiple logical instances with fully-isolated resources, which can be dynamically reconfigured. This work highlights the untapped potential of MIG through moldable task scheduling with dynamic reconfigurations. Specifically, we propose a makespan minimization problem for multi-task execution under MIG constraints. Our profiling shows that assuming monotonicity of task work with respect to resources is not viable, as is usual in multicore scheduling. Relying on a state-of-the-art proposal that does not require such an assumption, we present FAR, a 3-phase algorithm to solve the problem. Phase 1 of FAR builds on a classical task moldability method, phase 2 combines Longest Processing Time First and List Scheduling with a novel repartitioning-tree heuristic tailored to MIG constraints, and phase 3 employs local search via task moves and swaps. FAR schedules tasks in batches offline, concatenating their schedules on the fly in an improved way that favors resource reuse. Excluding reconfiguration costs, the List Scheduling proof shows an approximation factor of 7/4 on the NVIDIA A30 model. We adapt the technique to the particular constraints of an NVIDIA A100/H100 to obtain an approximation factor of 2. Including the reconfiguration cost, our real-world experiments reveal a makespan no worse than 1.22× the optimum for a well-known suite of benchmarks, and 1.10× for synthetic inputs inspired by real kernels. We obtain good experimental results not only for each batch of tasks but also for the concatenation of batches, with large improvements over the state of the art and over proposals without GPU reconfiguration. Moreover, we show that the proposed heuristics adapt correctly to tasks of very different characteristics. Beyond the specific algorithm, the paper demonstrates the research potential of the MIG technology and suggests useful metrics, workload characterizations, and evaluation techniques for future work in this field.
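As a point of reference for phase 2, plain Longest Processing Time First on identical machines can be sketched as follows (a deliberate simplification: MIG instances come in heterogeneous sizes and tasks are moldable, both of which this toy version ignores):

```python
import heapq

def lpt_makespan(task_times, m):
    """Longest Processing Time First on m identical machines: sort tasks
    by decreasing time, then greedily assign each to the currently
    least-loaded machine. Classic (4/3 - 1/(3m))-approximation for
    makespan minimization (Graham)."""
    loads = [0.0] * m  # min-heap of per-machine loads
    heapq.heapify(loads)
    for t in sorted(task_times, reverse=True):
        lightest = heapq.heappop(loads)
        heapq.heappush(loads, lightest + t)
    return max(loads)

print(lpt_makespan([7, 5, 4, 3, 3, 2], m=2))  # -> 12.0 (optimal here: 12)
```

FAR departs from this baseline precisely where the MIG constraints bite: the "machines" are GPU instances whose sizes can be reconfigured between (and during) batches, which is what the repartitioning-tree heuristic handles.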
{"title":"Leveraging Multi-Instance GPUs through moldable task scheduling","authors":"Jorge Villarrubia,&nbsp;Luis Costero,&nbsp;Francisco D. Igual,&nbsp;Katzalin Olcoz","doi":"10.1016/j.jpdc.2025.105128","DOIUrl":"10.1016/j.jpdc.2025.105128","url":null,"abstract":"<div><div>NVIDIA MIG (Multi-Instance GPU) allows partitioning a physical GPU into multiple logical instances with fully-isolated resources, which can be dynamically reconfigured. This work highlights the untapped potential of MIG through moldable task scheduling with dynamic reconfigurations. Specifically, we propose a makespan minimization problem for multi-task execution under MIG constraints. Our profiling shows that assuming monotonicity in task work with respect to resources is not viable, as is usual in multicore scheduling. Relying on a state-of-the-art proposal that does not require such an assumption, we present <span>FAR</span>, a 3-phase algorithm to solve the problem. Phase 1 of FAR builds on a classical task moldability method, phase 2 combines Longest Processing Time First and List Scheduling with a novel repartitioning tree heuristic tailored to MIG constraints, and phase 3 employs local search via task moves and swaps. <span>FAR</span> schedules tasks in batches offline, concatenating their schedules on the fly in an improved way that favors resource reuse. Excluding reconfiguration costs, the List Scheduling proof shows an approximation factor of 7/4 on the NVIDIA A30 model. We adapt the technique to the particular constraints of an NVIDIA A100/H100 to obtain an approximation factor of 2. Including the reconfiguration cost, our real-world experiments reveal a makespan with respect to the optimum no worse than 1.22× for a well-known suite of benchmarks, and 1.10× for synthetic inputs inspired by real kernels. 
We obtain good experimental results for each batch of tasks, but also in the concatenation of batches, with large improvements over the state-of-the-art and proposals without GPU reconfiguration. Moreover, we show that the proposed heuristics allow a correct adaptation to tasks of very different characteristics. Beyond the specific algorithm, the paper demonstrates the research potential of the MIG technology and suggests useful metrics, workload characterizations and evaluation techniques for future work in this field.</div></div>","PeriodicalId":54775,"journal":{"name":"Journal of Parallel and Distributed Computing","volume":"204 ","pages":"Article 105128"},"PeriodicalIF":3.4,"publicationDate":"2025-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144254815","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Optimal scheduling algorithms for software-defined radio pipelined and replicated task chains on multicore architectures
IF 3.4 CAS Tier 3 (Computer Science) Q1 COMPUTER SCIENCE, THEORY & METHODS Pub Date : 2025-10-01 Epub Date: 2025-05-16 DOI: 10.1016/j.jpdc.2025.105106
Diane Orhan, Laércio Lima Pilla, Denis Barthou, Adrien Cassagne, Olivier Aumage, Romain Tajan, Christophe Jégo, Camille Leroux
Software-Defined Radio (SDR) represents a move from dedicated hardware to software implementations of digital communication standards. This approach offers flexibility, shorter time to market, maintainability, and lower costs, but it requires an optimized distribution of tasks in order to meet performance requirements. Thus, we study the problem of scheduling SDR linear task chains of stateless and stateful tasks for streaming processing. We model this problem as a pipelined workflow scheduling problem based on pipelined and replicated parallelism on homogeneous resources. We propose an optimal dynamic programming solution and an optimal greedy algorithm named OTAC for maximizing throughput while also minimizing resource utilization. Moreover, the optimality of the proposed scheduling algorithm is proved. We evaluate our solutions and compare their execution times and schedules to other algorithms using synthetic task chains and an implementation of the DVB-S2 communication standard in the AFF3CT SDR Domain Specific Language. Our results demonstrate how OTAC quickly finds optimal schedules, consistently leading to better results than other algorithms, or equivalent results with fewer resources.
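The core pipelined-chain trade-off — splitting a linear task chain into contiguous stages so that the slowest stage, which caps throughput, is as fast as possible — can be sketched as follows (a minimal illustration under integer stage costs, not OTAC itself: it ignores the replication of stateless tasks that the paper's model adds):

```python
def min_bottleneck_partition(costs, stages):
    """Split a linear task chain into at most `stages` contiguous pipeline
    stages so the slowest stage (the throughput bottleneck) is minimized.
    Binary search on the bottleneck value + a greedy feasibility check.
    Assumes non-negative integer task costs."""
    def feasible(limit):
        used, acc = 1, 0
        for c in costs:
            if c > limit:
                return False  # a single task already exceeds the limit
            if acc + c > limit:
                used, acc = used + 1, c  # open a new stage
            else:
                acc += c
        return used <= stages

    lo, hi = max(costs), sum(costs)
    while lo < hi:
        mid = (lo + hi) // 2
        if feasible(mid):
            hi = mid
        else:
            lo = mid + 1
    return lo

# Best 3-stage split of this chain is [4,2] | [3,5] | [1,2] -> bottleneck 8.
print(min_bottleneck_partition([4, 2, 3, 5, 1, 2], stages=3))  # -> 8
```

Replicating a stateless stage across p processing units effectively divides its cost by p, which is what makes the joint pipelining-and-replication problem addressed by OTAC richer than this plain chain partitioning.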
{"title":"Optimal scheduling algorithms for software-defined radio pipelined and replicated task chains on multicore architectures","authors":"Diane Orhan ,&nbsp;Laércio Lima Pilla ,&nbsp;Denis Barthou ,&nbsp;Adrien Cassagne ,&nbsp;Olivier Aumage ,&nbsp;Romain Tajan ,&nbsp;Christophe Jégo ,&nbsp;Camille Leroux","doi":"10.1016/j.jpdc.2025.105106","DOIUrl":"10.1016/j.jpdc.2025.105106","url":null,"abstract":"<div><div>Software-Defined Radio (SDR) represents a move from dedicated hardware to software implementations of digital communication standards. This approach offers flexibility, shorter time to market, maintainability, and lower costs, but it requires an optimized distribution tasks in order to meet performance requirements. Thus, we study the problem of scheduling SDR linear task chains of stateless and stateful tasks for streaming processing. We model this problem as a pipelined workflow scheduling problem based on pipelined and replicated parallelism on homogeneous resources. We propose an optimal dynamic programming solution and an optimal greedy algorithm named OTAC for maximizing throughput while also minimizing resource utilization. Moreover, the optimality of the proposed scheduling algorithm is proved. We evaluate our solutions and compare their execution times and schedules to other algorithms using synthetic task chains and an implementation of the DVB-S2 communication standard on the AFF3CT SDR Domain Specific Language. 
Our results demonstrate how OTAC quickly finds optimal schedules, leading consistently to better results than other algorithms, or equivalent results with fewer resources.</div></div>","PeriodicalId":54775,"journal":{"name":"Journal of Parallel and Distributed Computing","volume":"204 ","pages":"Article 105106"},"PeriodicalIF":3.4,"publicationDate":"2025-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144195960","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0