Pub Date : 2024-04-12 DOI: 10.1016/S0743-7315(24)00058-3
{"title":"Front Matter 1 - Full Title Page (regular issues)/Special Issue Title page (special issues)","authors":"","doi":"10.1016/S0743-7315(24)00058-3","DOIUrl":"https://doi.org/10.1016/S0743-7315(24)00058-3","url":null,"abstract":"","PeriodicalId":54775,"journal":{"name":"Journal of Parallel and Distributed Computing","volume":null,"pages":null},"PeriodicalIF":3.8,"publicationDate":"2024-04-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S0743731524000583/pdfft?md5=8c3c570a807bfaf1547376210bf18a64&pid=1-s2.0-S0743731524000583-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140549613","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-04-10 DOI: 10.1016/j.jpdc.2024.104891
Gerard Finol, Gerard París, Pedro García-López, Marc Sánchez-Artigas
The Function-as-a-Service (FaaS) execution model of serverless computing has been successful in running large-scale computations such as MapReduce, linear algebra, and machine learning. However, little attention has been given to executing highly dynamic parallel applications with unbalanced and irregular workloads. Such algorithms are difficult to execute with good parallel efficiency because the required computing resources are hard to provision in time, leading to resource over- and under-provisioning in clusters of static size. We propose that the elasticity and fine-grained pay-as-you-go billing of FaaS can be a key enabler for running these algorithms effectively in the cloud. We use a simple serverless executor pool abstraction and evaluate it on three algorithms with unbalanced and irregular workloads. Results show that their serverless implementations can outperform a static Spark cluster of large virtual machines by up to 55% at the same cost, and can even outperform a single large virtual machine running locally.
{"title":"Exploiting inherent elasticity of serverless in algorithms with unbalanced and irregular workloads","authors":"Gerard Finol, Gerard París, Pedro García-López, Marc Sánchez-Artigas","doi":"10.1016/j.jpdc.2024.104891","DOIUrl":"https://doi.org/10.1016/j.jpdc.2024.104891","url":null,"abstract":"<div><p>Function-as-a-Service execution model in serverless computing has been successful in running large-scale computations like MapReduce, linear algebra, and machine learning. However, little attention has been given to executing highly-dynamic parallel applications with <em>unbalanced</em> and <em>irregular</em> workloads. These algorithms are difficult to execute with good parallel efficiency due to the challenge of provisioning the required computing resources in time, leading to resource over- and under-provisioning in clusters of static size. We propose that the elasticity and fine-grained “pay-as-you-go model” of the FaaS model can be a key enabler for effectively running these algorithms in the cloud. We use a simple serverless executor pool abstraction, and evaluate it using three algorithms with <em>unbalanced</em> and <em>irregular</em> workloads. 
Results show that their serverless implementation can outperform a static Spark cluster of large virtual machines by up to 55% with the same cost, and can even outperform a single large virtual machine running locally.</p></div>","PeriodicalId":54775,"journal":{"name":"Journal of Parallel and Distributed Computing","volume":null,"pages":null},"PeriodicalIF":3.8,"publicationDate":"2024-04-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S0743731524000558/pdfft?md5=dfd5618d89af807a65e1b979fb557eaa&pid=1-s2.0-S0743731524000558-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140618804","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
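The executor-pool abstraction described above can be sketched in a few lines. The class below is our own illustration, not the paper's API: the name `ElasticExecutorPool` is invented, and a local thread pool stands in for real FaaS invocations.

```python
# A minimal sketch of an elastic executor-pool abstraction, in the spirit of
# the serverless pool described above (names are ours, not the authors' API).
# ThreadPoolExecutor stands in for FaaS invocations: each submitted task
# "spins up" a worker on demand, and you pay only for what actually runs.
from concurrent.futures import ThreadPoolExecutor, as_completed

class ElasticExecutorPool:
    def __init__(self, max_workers=128):
        # In a real FaaS backend, max_workers would be the provider's
        # concurrency limit rather than a local thread cap.
        self._pool = ThreadPoolExecutor(max_workers=max_workers)

    def map_unbalanced(self, fn, tasks):
        # Submit every task immediately: the platform scales to the
        # instantaneous parallelism of the workload, so irregular task
        # sizes do not leave statically provisioned workers idle.
        futures = [self._pool.submit(fn, t) for t in tasks]
        return [f.result() for f in as_completed(futures)]

pool = ElasticExecutorPool(max_workers=8)
# Highly skewed task sizes: a static cluster sized for the mean would
# over-provision for the small tasks and under-provision for the large one.
results = pool.map_unbalanced(lambda n: sum(range(n)), [10, 10, 10, 1_000_000])
print(sorted(results))
```

Results arrive in completion order (small tasks first), which is exactly the behavior that makes static sizing wasteful for irregular workloads.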
Pub Date : 2024-04-06 DOI: 10.1016/j.jpdc.2024.104890
Md Al Maruf, Akramul Azim, Nitin Auluck, Mansi Sahi
Deep Neural Networks (DNNs) have gained widespread popularity across application domains due to their dominant performance. Despite the prevalence of massively parallel multi-core processor architectures, adopting large DNN models in embedded systems remains challenging, as most embedded applications are designed with single-core processors in mind. This limits DNN adoption in embedded systems, since model parallelization and workload partitioning are leveraged inefficiently. Prior solutions attempt to address these challenges using data and model parallelism, but they fall short of finding optimal DNN model partitions and distributing them efficiently enough to improve performance.
This paper proposes a DNN model parallelism framework that accelerates model training by finding the optimal number of model partitions and resource provisions. The proposed framework combines data and model parallelism techniques to optimize the parallel processing of DNNs for embedded applications. In addition, it implements pipelined execution of the partitioned models and integrates a task controller to manage the computing resources. Experimental results for image object detection demonstrate the applicability of the proposed framework in estimating the latest execution time and reducing overall model training time by almost 44.87% compared to the baseline AlexNet convolutional neural network (CNN) model.
{"title":"Optimizing DNN training with pipeline model parallelism for enhanced performance in embedded systems","authors":"Md Al Maruf , Akramul Azim , Nitin Auluck , Mansi Sahi","doi":"10.1016/j.jpdc.2024.104890","DOIUrl":"https://doi.org/10.1016/j.jpdc.2024.104890","url":null,"abstract":"<div><p>Deep Neural Networks (DNNs) have gained widespread popularity in different domain applications due to their dominant performance. Despite the prevalence of massively parallel multi-core processor architectures, adopting large DNN models in embedded systems remains challenging, as most embedded applications are designed with single-core processors in mind. This limits DNN adoption in embedded systems due to inefficient leveraging of model parallelization and workload partitioning. Prior solutions attempt to address these challenges using data and model parallelism. However, they lack in finding optimal DNN model partitions and distributing them efficiently to achieve improved performance.</p><p>This paper proposes a DNN model parallelism framework to accelerate model training by finding the optimal number of model partitions and resource provisions. The proposed framework combines data and model parallelism techniques to optimize the parallel processing of DNNs for embedded applications. In addition, it implements the pipeline execution of the partitioned models and integrates a task controller to manage the computing resources. 
The experimental results for image object detection demonstrate the applicability of our proposed framework in estimating the latest execution time and reducing overall model training time by almost 44.87% compared to the baseline AlexNet convolutional neural network (CNN) model.</p></div>","PeriodicalId":54775,"journal":{"name":"Journal of Parallel and Distributed Computing","volume":null,"pages":null},"PeriodicalIF":3.8,"publicationDate":"2024-04-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S0743731524000546/pdfft?md5=d1af7342dc4b7d20a8dac857da5813c8&pid=1-s2.0-S0743731524000546-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140618805","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
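The pipelined execution of model partitions described above can be illustrated with a toy scheduler. This is a hedged sketch under our own assumptions, not the paper's framework: each "partition" is a placeholder function, and one thread per stage plays the role of a core, so successive micro-batches overlap across stages.

```python
# A minimal sketch of pipeline execution over model partitions (our own
# illustration, not the paper's framework). Each pipeline stage is a thread
# that pulls a micro-batch from its input queue, applies its partition, and
# pushes the result downstream, so micro-batch i+1 runs on stage 1 while
# micro-batch i runs on stage 2.
import queue
import threading

def run_pipeline(stages, micro_batches):
    qs = [queue.Queue() for _ in range(len(stages) + 1)]

    def worker(fn, q_in, q_out):
        while True:
            item = q_in.get()
            if item is None:          # poison pill: shut the stage down
                q_out.put(None)
                break
            q_out.put(fn(item))

    threads = [threading.Thread(target=worker, args=(fn, qs[i], qs[i + 1]))
               for i, fn in enumerate(stages)]
    for t in threads:
        t.start()
    for mb in micro_batches:
        qs[0].put(mb)
    qs[0].put(None)
    out = []
    while (item := qs[-1].get()) is not None:
        out.append(item)
    for t in threads:
        t.join()
    return out

# Two "partitions" of a toy model; four micro-batches flow through them.
print(run_pipeline([lambda x: x + 1, lambda x: x * 2], [1, 2, 3, 4]))
```

Because each stage has a single worker draining a FIFO queue, output order matches input order, which keeps gradient accumulation straightforward in a real training pipeline.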
Pub Date : 2024-04-05 DOI: 10.1016/j.jpdc.2024.104889
Chunhua Tang, Shuangyao Zhao, Binbin Chen, Xiaonong Lu, Qiang Zhang
Collaborative Filtering (CF) is one of the most successful techniques for quality-of-service (QoS) prediction and cloud service recommendation. However, individual QoS values are time-sensitive and fluctuating, causing the QoS predicted by CF to deviate from the actual values. In addition, existing CF approaches ignore inauthentic QoS values given by untrustworthy users. To address these problems, we develop a two-dimensional time-aware and trust-aware service recommendation approach (TaTruSR). First, considering both the timeliness and the fluctuation of service QoS, an integrative method incorporating time weight (time dimension) and temporal certainty (QoS dimension) is proposed to determine the contribution of co-invoked services. Time weight is computed by a personalized logistic decay function that measures QoS change by weighting the length of the time interval, while temporal certainty is defined by entropy to capture the degree of QoS fluctuation over a period of time. Second, a set of the most similar and trusted neighbors is identified using a time-aware similarity model and a trust model. In these models, direct similarity and local trust are calculated from the QoS ratings and the contribution of co-invoked services to improve prediction accuracy and eliminate unreliable QoS, while indirect similarity and global trust are estimated from user relationship networks to alleviate the data sparsity problem. Finally, missing QoS prediction and reliable service recommendation for the active user are achieved based on the enhanced similarity and trust. A case study and experimental evaluation on real-world datasets demonstrate the practicality and accuracy of the proposed approach.
{"title":"A two-dimensional time-aware cloud service recommendation approach with enhanced similarity and trust","authors":"Chunhua Tang , Shuangyao Zhao , Binbin Chen , Xiaonong Lu , Qiang Zhang","doi":"10.1016/j.jpdc.2024.104889","DOIUrl":"https://doi.org/10.1016/j.jpdc.2024.104889","url":null,"abstract":"<div><p>Collaborative Filtering (CF) is one of the most successful techniques for quality-of-service (QoS) prediction and cloud service recommendation. However, individual QoS are time-sensitive and fluctuating, resulting in the QoS predicted by CF to deviate from the actual values. In addition, existing CF approaches ignore inauthentic QoS values given by untrustworthy users. To address these problems, we develop a two-dimensional time-aware and trust-aware service recommendation approach (TaTruSR). First, considering both timeliness and fluctuation of service QoS, an integrative method incorporates time weight (time dimension) and temporal certainty (QoS dimension) are proposed to determine the contribution of co-invoked services. Time weight is computed by a personalized logistic decay function to measure QoS changes by weighting the length of the time interval, while temporal certainty is defined by entropy to acquire the degree of QoS fluctuation over a period of time. Second, a set of most similar and trusted neighbors can be identified from the view of the time-aware similarity model and trust model. In models, the direct similarity and local trust are calculated based on the QoS ratings and contribution of co-invoked services to improve the prediction accuracy and eliminate unreliable QoS. The indirect similarity and global trust are estimated based on user relationship networks to alleviate the data sparsity problem. Finally, missing QoS prediction and reliable service recommendation for the active user can be achieved based on enhanced similarity and trust. 
A case study and experimental evaluation on real-world datasets demonstrate the practicality and accuracy of the proposed approach.</p></div>","PeriodicalId":54775,"journal":{"name":"Journal of Parallel and Distributed Computing","volume":null,"pages":null},"PeriodicalIF":3.8,"publicationDate":"2024-04-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140605693","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
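The two ingredients named above, a logistic time decay and an entropy-based certainty, can be sketched directly. The parameterization below (steepness, midpoint, number of bins) is our own simplifying assumption; the paper's personalized functions may differ.

```python
# A sketch of time weight and temporal certainty under our own assumptions
# (the paper's exact parameterization may differ): a logistic decay that
# down-weights old QoS observations, and a normalized entropy over binned
# QoS values that measures how much a service's QoS fluctuates.
import math
from collections import Counter

def time_weight(delta_t, steepness=1.0, midpoint=5.0):
    # Logistic decay: recent observations (small delta_t) get weight near 1,
    # stale ones decay smoothly toward 0.
    return 1.0 / (1.0 + math.exp(steepness * (delta_t - midpoint)))

def temporal_certainty(qos_values, bins=5):
    # Shannon entropy of the binned QoS history, normalized to [0, 1] by
    # using log base `bins`; certainty = 1 - entropy, so a stable QoS
    # history yields high certainty and a fluctuating one low certainty.
    lo, hi = min(qos_values), max(qos_values)
    if lo == hi:
        return 1.0
    width = (hi - lo) / bins
    counts = Counter(min(int((v - lo) / width), bins - 1) for v in qos_values)
    n = len(qos_values)
    entropy = -sum((c / n) * math.log(c / n, bins) for c in counts.values())
    return 1.0 - entropy

print(round(time_weight(0.0), 3))                # fresh observation: near 1
print(temporal_certainty([1.0, 1.0, 1.0, 1.0]))  # perfectly stable QoS
```

A co-invoked service's contribution would then be weighted by the product of these two terms, so only recent and stable QoS records influence the similarity computation strongly.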
Pub Date : 2024-04-03 DOI: 10.1016/j.jpdc.2024.104888
Yi Ding, Linhe Zhu
Rumors now spread rapidly on the Internet. This paper first establishes a reaction-diffusion system with an Allee effect to describe the rumor-spreading process and derives the necessary conditions for the emergence of Turing bifurcation. Next, a parameter identification approach based on optimal control theory is presented. Then, the impact of the magnitudes of certain parameters in the objective function on parameter identification is examined through numerous identification experiments in continuous space and on various complex networks. Additionally, the convergence rates and error magnitudes of different parameter identification algorithms are compared across different spatial structures.
{"title":"Parameter identification method of a reaction-diffusion network information propagation system based on optimization theory","authors":"Yi Ding, Linhe Zhu","doi":"10.1016/j.jpdc.2024.104888","DOIUrl":"https://doi.org/10.1016/j.jpdc.2024.104888","url":null,"abstract":"<div><p>With the development of the times, rumors spread rapidly on the Internet. Firstly, this paper establishes a reaction-diffusion system with Allee effect to describe the rumor spreading process and derives the necessary conditions for the emergence of Turing bifurcation. Next, a parameter identification approach utilizing optimal control theory is shown. Ultimately, the impact of the magnitude of the certain parameters in the objective function on parameter identification is examined through numerous parameter identifications in continuous space and various complex networks. Additionally, the convergence rates and error magnitudes of different algorithms for parameter identification are studied across different spatial structures.</p></div>","PeriodicalId":54775,"journal":{"name":"Journal of Parallel and Distributed Computing","volume":null,"pages":null},"PeriodicalIF":3.8,"publicationDate":"2024-04-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140535044","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
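The identification task described above can be illustrated with a toy version: recover the diffusion coefficient of a 1-D diffusion step by minimizing the squared error between simulated and "observed" states. This is our own illustration with an exhaustive parameter scan, not the paper's optimal-control scheme.

```python
# A toy parameter-identification problem (our illustration, not the paper's
# method): simulate u_t = d * u_xx with a known d to create "observations",
# then recover d by minimizing the squared simulation-observation error.
import numpy as np

def diffuse(u0, d, steps=50, dt=0.1):
    # Explicit Euler for u_t = d * u_xx on a periodic 1-D grid (dx = 1);
    # dt * d <= 0.5 keeps the scheme stable over the scanned range.
    u = u0.copy()
    for _ in range(steps):
        u = u + dt * d * (np.roll(u, 1) - 2 * u + np.roll(u, -1))
    return u

rng = np.random.default_rng(0)
u0 = rng.random(32)
observed = diffuse(u0, d=0.3)          # ground-truth parameter: d = 0.3

# Identification by a coarse scan of the parameter space; a gradient-based
# or adjoint (optimal-control) method would replace this loop in practice.
candidates = np.linspace(0.0, 1.0, 101)
errors = [np.sum((diffuse(u0, d) - observed) ** 2) for d in candidates]
d_hat = candidates[int(np.argmin(errors))]
print(d_hat)
```

The same fit can be repeated on a graph Laplacian instead of the periodic grid to mimic identification over complex networks, which is the comparison the abstract describes.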
Pub Date : 2024-03-28 DOI: 10.1016/j.jpdc.2024.104885
Chun-Hee Lee, Dong-oh Kang, Hwa Jeon Song
Knowledge graphs can be used in many areas related to data semantics, such as question-answering systems and knowledge-based systems. However, currently constructed knowledge graphs need to be complemented with additional relations for better knowledge; this task is called knowledge graph completion. To add new relations to an existing knowledge graph using knowledge graph embedding models, we have to evaluate N × N × R vector operations, where N is the number of entities and R is the number of relation types, which is very costly.
In this paper, we provide an efficient knowledge graph completion framework on GPUs that derives new relations from knowledge graph embedding vectors. In the proposed framework, we first define the notion of being transformable to a metric space, and then provide a method that transforms the knowledge graph completion problem into a similarity join problem for any model that is transformable to a metric space. To process the similarity join efficiently, we derive formulas from the properties of a metric space and, based on these formulas, develop a fast knowledge graph completion algorithm. Finally, we show experimentally that our framework processes the knowledge graph completion problem efficiently.
{"title":"Fast knowledge graph completion using graphics processing units","authors":"Chun-Hee Lee , Dong-oh Kang , Hwa Jeon Song","doi":"10.1016/j.jpdc.2024.104885","DOIUrl":"https://doi.org/10.1016/j.jpdc.2024.104885","url":null,"abstract":"<div><p>Knowledge graphs can be used in many areas related to data semantics such as question-answering systems, knowledge based systems. However, the currently constructed knowledge graphs need to be complemented for better knowledge in terms of relations. It is called knowledge graph completion. To add new relations to the existing knowledge graph by using knowledge graph embedding models, we have to evaluate <span><math><mi>N</mi><mo>×</mo><mi>N</mi><mo>×</mo><mi>R</mi></math></span> vector operations, where <em>N</em> is the number of entities and <em>R</em> is the number of relation types. It is very costly.</p><p>In this paper, we provide an efficient knowledge graph completion framework on GPUs to get new relations using knowledge graph embedding vectors. In the proposed framework, we first define <em>transformable to a metric space</em> and then provide a method to transform the knowledge graph completion problem into the similarity join problem for a model which is <em>transformable to a metric space</em>. After that, to efficiently process the similarity join problem, we derive formulas using the properties of a metric space. Based on the formulas, we develop a fast knowledge graph completion algorithm. 
Finally, we experimentally show that our framework can efficiently process the knowledge graph completion problem.</p></div>","PeriodicalId":54775,"journal":{"name":"Journal of Parallel and Distributed Computing","volume":null,"pages":null},"PeriodicalIF":3.8,"publicationDate":"2024-03-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140348191","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
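The key idea can be sketched concretely: for a translation-style embedding model, the score ||(h + r) − t|| is a distance in a metric space, so the completion query "find tails within a threshold" becomes a similarity join that the triangle inequality can prune. This is our own small illustration; the paper's GPU algorithm and derived formulas are more elaborate.

```python
# Knowledge graph completion as a metric similarity join (our illustration).
# For a TransE-like model, candidate tails for (head, rel) are entities t
# with ||(h + r) - t|| <= threshold; a pivot entity and the triangle
# inequality discard most entities before any exact distance is computed.
import numpy as np

def complete(entity_emb, rel_emb, head, rel, threshold, pivot=0):
    query = entity_emb[head] + rel_emb[rel]
    # Triangle inequality: |d(t, p) - d(q, p)| <= d(t, q), so any tail t
    # with |d(t, p) - d(q, p)| > threshold cannot be within threshold of q.
    d_pivot = np.linalg.norm(entity_emb - entity_emb[pivot], axis=1)
    d_query_pivot = np.linalg.norm(query - entity_emb[pivot])
    candidates = np.flatnonzero(np.abs(d_pivot - d_query_pivot) <= threshold)
    # Exact check only on the surviving candidates.
    dists = np.linalg.norm(entity_emb[candidates] - query, axis=1)
    return candidates[dists <= threshold]

rng = np.random.default_rng(1)
entity_emb = rng.normal(size=(1000, 16))
rel_emb = rng.normal(size=(10, 16))
tails = complete(entity_emb, rel_emb, head=0, rel=0, threshold=4.0)
print(len(tails))
```

On a GPU, both the pivot distances and the exact checks are batched matrix operations, which is why the transformation to a metric space pays off at N × N × R scale.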
Pub Date : 2024-03-27 DOI: 10.1016/j.jpdc.2024.104884
Zijian Cao, Qiao Sun, Wenhao Yang, Changcheng Song, Zhe Wang, Huiyuan Li
HPL-AI, also known as HPL-MxP, is a new benchmark used to evaluate the upper-bound performance of AI-related tasks on a given computing cluster. It solves a large linear system in FP64, preconditioned by a complete LU factorization in lower precision. In this paper, we propose a new HPL-AI approach that factorizes the coefficient matrix in mixed precision: FP32 diagonals and FP16 off-diagonals. Without compromising the quality of the resulting LU preconditioner, the proposed approach uses only the FP16 dense matrix multiplication primitive on the accelerator, maximizing FP16 throughput. Numerical analysis and experiments validate the approach, ensuring that numerical underflow and overflow are avoided during factorization. We implement the proposed approach on Kunpeng+Ascend clusters, a novel AI-specific platform with exceedingly high FP16 peak performance. By applying various optimization techniques, including 2D lookahead, an HCCL-based communication pipeline, and SYCL-based task overlapping, we achieve 975 TFlops on a single node and nearly 100 PFlops on a cluster of 128 nodes, with a weak scalability of 79.8%.
{"title":"A novel HPL-AI approach for FP16-only accelerator and its instantiation on Kunpeng+Ascend AI-specific platform","authors":"Zijian Cao , Qiao Sun , Wenhao Yang , Changcheng Song , Zhe Wang , Huiyuan Li","doi":"10.1016/j.jpdc.2024.104884","DOIUrl":"https://doi.org/10.1016/j.jpdc.2024.104884","url":null,"abstract":"<div><p>HPL-AI, also known as HPL-MxP, is a new benchmark program used to evaluate the upper-bound performance of AI-related tasks on a specific computing cluster. It solves a large linear equation system in FP64, preconditioned by complete LU factorization in lower precision. In this paper, we propose a new HPL-AI approach that relies on the factorization of the coefficient matrix in mixed precision: FP32 diagonals and FP16 off-diagonals. Without compromising the quality of the resultant LU preconditioner, the proposed approach only utilizes the primitive of dense matrix multiplication in FP16 on the accelerator, maximizing the FP16 throughput. Numerical analysis and experiments validate our approach, ensuring avoidance of numerical underflow or overflow during factorization. We implement the proposed approach on Kunpeng+Ascend clusters, a novel AI-specific platform with exceedingly high FP16 peak performance. 
By applying various optimization techniques, including 2D lookahead, HCCL-based communication pipeline, and SYCL-based tasks overlapping, we achieve 975 TFlops on a single node and nearly 100 PFlops on a cluster of 128 nodes, with a weak scalability of 79.8%.</p></div>","PeriodicalId":54775,"journal":{"name":"Journal of Parallel and Distributed Computing","volume":null,"pages":null},"PeriodicalIF":3.8,"publicationDate":"2024-03-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140341161","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
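The mixed-precision principle behind HPL-AI can be shown at small scale: factorize (here, simply solve) in low precision, then recover FP64 accuracy with iterative refinement. In this sketch a plain FP32 solve stands in for the paper's FP32-diagonal / FP16-off-diagonal LU scheme; the low-precision result serves only as a preconditioner, not as the final answer.

```python
# Mixed-precision solve with FP64 iterative refinement (a small-scale sketch
# of the HPL-AI idea; FP32 stands in for the paper's FP32/FP16 LU split).
import numpy as np

def solve_mixed_precision(A, b, refinements=5):
    A32 = A.astype(np.float32)
    # Low-precision solve gives a cheap, approximate preconditioner.
    x = np.linalg.solve(A32, b.astype(np.float32)).astype(np.float64)
    for _ in range(refinements):
        r = b - A @ x                                  # residual in FP64
        # Correct with another low-precision solve on the residual; each
        # pass shrinks the error by roughly cond(A) * eps_fp32.
        dx = np.linalg.solve(A32, r.astype(np.float32))
        x += dx.astype(np.float64)
    return x

rng = np.random.default_rng(2)
A = rng.random((200, 200)) + 200 * np.eye(200)         # well-conditioned
b = rng.random(200)
x = solve_mixed_precision(A, b)
print(np.max(np.abs(A @ x - b)))
```

The benchmark's point is exactly this asymmetry: almost all flops happen in the cheap low-precision factorization, while a handful of FP64 residual passes restore full accuracy.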
Pub Date : 2024-03-27 DOI: 10.1016/j.jpdc.2024.104886
Si-Yu Li, Xiang-Jun Li, Meijie Ma
The g-restricted edge connectivity is an important measure for assessing the reliability of networks. The g-restricted edge connectivity of a connected graph G is the minimum size of a set of edges in G, if one exists, whose deletion disconnects G and leaves every vertex in the remaining components with at least g neighbors. The k-ary n-cube is an extension of the hypercube network with many desirable properties, and it has been used to build the architecture of the supercomputer Fugaku. This paper establishes that for g ≤ n, the g-restricted edge connectivity of 3-ary n-cubes is 3^⌊g/2⌋(1 + (g mod 2))(2n − g), and the g-restricted edge connectivity of k-ary n-cubes with k ≥ 4 is 2^g(2n − g). These results imply that in Q_n^3 with at most 3^⌊g/2⌋(1 + (g mod 2))(2n − g) − 1 faulty edges, or in Q_n^k (k ≥ 4) with at most 2^g(2n − g) − 1 faulty edges, if each vertex is incident with at least g fault-free edges, then the remaining network is connected.
{"title":"Reliability assessment for k-ary n-cubes with faulty edges","authors":"Si-Yu Li , Xiang-Jun Li , Meijie Ma","doi":"10.1016/j.jpdc.2024.104886","DOIUrl":"https://doi.org/10.1016/j.jpdc.2024.104886","url":null,"abstract":"<div><p>The <em>g</em>-restricted edge connectivity is an important measurement to assess the reliability of networks. The <em>g</em>-restricted edge connectivity of a connected graph <em>G</em> is the minimum size of a set of edges in <em>G</em>, if it exists, whose deletion separates <em>G</em> and leaves every vertex in the remaining components with at least <em>g</em> neighbors. The <em>k</em>-ary <em>n</em>-cube is an extension of the hypercube network and has many desirable properties. It has been used to build the architecture of the Supercomputer Fugaku. This paper establishes that for <span><math><mi>g</mi><mo>≤</mo><mi>n</mi></math></span>, the <em>g</em>-restricted edge connectivity of 3-ary <em>n</em>-cubes is <span><math><msup><mrow><mn>3</mn></mrow><mrow><mo>⌊</mo><mi>g</mi><mo>/</mo><mn>2</mn><mo>⌋</mo></mrow></msup><mo>(</mo><mn>1</mn><mo>+</mo><mo>(</mo><mi>g</mi><mrow><mspace></mspace><mtext>mod</mtext><mspace></mspace></mrow><mn>2</mn><mo>)</mo><mo>)</mo><mo>(</mo><mn>2</mn><mi>n</mi><mo>−</mo><mi>g</mi><mo>)</mo></math></span>, and the <em>g</em>-restricted edge connectivity of <em>k</em>-ary <em>n</em>-cubes with <span><math><mi>k</mi><mo>≥</mo><mn>4</mn></math></span> is <span><math><msup><mrow><mn>2</mn></mrow><mrow><mi>g</mi></mrow></msup><mo>(</mo><mn>2</mn><mi>n</mi><mo>−</mo><mi>g</mi><mo>)</mo></math></span>. 
These results imply that in <span><math><msubsup><mrow><mi>Q</mi></mrow><mrow><mi>n</mi></mrow><mrow><mn>3</mn></mrow></msubsup></math></span> with at most <span><math><msup><mrow><mn>3</mn></mrow><mrow><mo>⌊</mo><mi>g</mi><mo>/</mo><mn>2</mn><mo>⌋</mo></mrow></msup><mo>(</mo><mn>1</mn><mo>+</mo><mo>(</mo><mi>g</mi><mrow><mspace></mspace><mtext>mod</mtext><mspace></mspace></mrow><mn>2</mn><mo>)</mo><mo>)</mo><mo>(</mo><mn>2</mn><mi>n</mi><mo>−</mo><mi>g</mi><mo>)</mo><mo>−</mo><mn>1</mn></math></span> faulty edges, or <span><math><msubsup><mrow><mi>Q</mi></mrow><mrow><mi>n</mi></mrow><mrow><mi>k</mi></mrow></msubsup><mo>(</mo><mi>k</mi><mo>≥</mo><mn>4</mn><mo>)</mo></math></span> with at most <span><math><msup><mrow><mn>2</mn></mrow><mrow><mi>g</mi></mrow></msup><mo>(</mo><mn>2</mn><mi>n</mi><mo>−</mo><mi>g</mi><mo>)</mo><mo>−</mo><mn>1</mn></math></span> faulty edges, if each vertex is incident with at least <em>g</em> fault-free edges, then the remaining network is connected.</p></div>","PeriodicalId":54775,"journal":{"name":"Journal of Parallel and Distributed Computing","volume":null,"pages":null},"PeriodicalIF":3.8,"publicationDate":"2024-03-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140321045","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
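To make the closed forms above concrete, they can be evaluated directly. The function below is a plain transcription of the two formulas (the function name is ours); note that g = 0 recovers the ordinary edge connectivity 2n of the k-ary n-cube.

```python
# Direct transcription of the paper's closed forms, valid for 0 <= g <= n.
def g_restricted_edge_connectivity(k, n, g):
    if k == 3:
        # 3-ary n-cubes: 3^floor(g/2) * (1 + (g mod 2)) * (2n - g)
        return 3 ** (g // 2) * (1 + g % 2) * (2 * n - g)
    if k >= 4:
        # k-ary n-cubes, k >= 4: 2^g * (2n - g)
        return 2 ** g * (2 * n - g)
    raise ValueError("k must be at least 3")

# g = 0 gives the ordinary edge connectivity 2n; larger g tolerates more
# faulty edges as long as every vertex keeps g fault-free neighbors.
print(g_restricted_edge_connectivity(3, 4, 0),
      g_restricted_edge_connectivity(3, 4, 2),
      g_restricted_edge_connectivity(4, 4, 2))   # → 8 18 24
```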
Pub Date : 2024-03-27 DOI: 10.1016/j.jpdc.2024.104887
Hongbin Zhuang, Xiao-Yan Li, Jou-Ming Chang, Ximeng Liu
The k-ary n-cube Q_n^k serves as an indispensable interconnection network in the design of data center networks, networks-on-chip, and parallel computing systems, since it possesses numerous attractive properties. In these parallel architectures, the paired (or unpaired) many-to-many m-disjoint path cover (m-DPC) plays a significant role in message transmission. Nevertheless, the construction of an m-DPC is severely obstructed by large-scale edge faults as the system scale grows rapidly. In this paper, we investigate the existence of a paired 2-DPC in Q_n^k under the partitioned edge fault (PEF) model, a novel fault model for enhancing a network's fault tolerance with respect to path embedding problems. We exploit this model to evaluate the edge fault-tolerance of Q_n^k when a paired 2-DPC is embedded into it. Compared to other known works, our results can help Q_n^k achieve large-scale edge fault-tolerance.
{"title":"Paired 2-disjoint path covers of k-ary n-cubes under the partitioned edge fault model","authors":"Hongbin Zhuang , Xiao-Yan Li , Jou-Ming Chang , Ximeng Liu","doi":"10.1016/j.jpdc.2024.104887","DOIUrl":"https://doi.org/10.1016/j.jpdc.2024.104887","url":null,"abstract":"<div><p>The <em>k</em>-ary <em>n</em>-cube <span><math><msubsup><mrow><mi>Q</mi></mrow><mrow><mi>n</mi></mrow><mrow><mi>k</mi></mrow></msubsup></math></span> serves as an indispensable interconnection network in the design of data center networks, network-on-chips, and parallel computing systems since it possesses numerous attractive properties. In these parallel architectures, the paired (or unpaired) many-to-many <em>m</em>-disjoint path cover (<em>m</em>-DPC) plays a significant role in message transmission. Nevertheless, the construction of <em>m</em>-DPC is severely obstructed by large-scale edge faults due to the rapid growth of the system scale. In this paper, we investigate the existence of paired 2-DPC in <span><math><msubsup><mrow><mi>Q</mi></mrow><mrow><mi>n</mi></mrow><mrow><mi>k</mi></mrow></msubsup></math></span> under the partitioned edge fault (PEF) model, which is a novel fault model for enhancing the networks' fault-tolerance related to path embedding problem. We exploit this model to evaluate the edge fault-tolerance of <span><math><msubsup><mrow><mi>Q</mi></mrow><mrow><mi>n</mi></mrow><mrow><mi>k</mi></mrow></msubsup></math></span> when a paired 2-DPC is embedded into <span><math><msubsup><mrow><mi>Q</mi></mrow><mrow><mi>n</mi></mrow><mrow><mi>k</mi></mrow></msubsup></math></span>. 
Compared to the other known works, our results can help <span><math><msubsup><mrow><mi>Q</mi></mrow><mrow><mi>n</mi></mrow><mrow><mi>k</mi></mrow></msubsup></math></span> to achieve large-scale edge fault-tolerance.</p></div>","PeriodicalId":54775,"journal":{"name":"Journal of Parallel and Distributed Computing","volume":null,"pages":null},"PeriodicalIF":3.8,"publicationDate":"2024-03-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140344268","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-03-26 DOI: 10.1016/j.jpdc.2024.104883
Fatemeh Barani, Abdorreza Savadi, Hadi Sadoghi Yazdi
Outliers and noise are unavoidable factors that can severely degrade the performance of distributed learning algorithms. Developing a robust algorithm is vital in applications such as system identification and stock market forecasting, in which noise on the desired signals may strongly divert the solutions. In this paper, we propose a Robust Diffusion Stochastic Gradient Descent (RDSGD) algorithm based on the pseudo-Huber loss function, which can significantly suppress the effect of Gaussian and non-Gaussian noise on estimation performance in adaptive networks. The performance and convergence behavior of RDSGD are assessed in the presence of α-stable and mixed-Gaussian noise in stationary and non-stationary environments. Simulation results show that the proposed algorithm achieves both a higher convergence rate and lower steady-state misadjustment than conventional diffusion algorithms and several robust algorithms.
{"title":"A distributed learning based on robust diffusion SGD over adaptive networks with noisy output data","authors":"Fatemeh Barani , Abdorreza Savadi , Hadi Sadoghi Yazdi","doi":"10.1016/j.jpdc.2024.104883","DOIUrl":"https://doi.org/10.1016/j.jpdc.2024.104883","url":null,"abstract":"<div><p>Outliers and noises are unavoidable factors that cause performance of the distributed learning algorithms to be severely reduced. Developing a robust algorithm is vital in applications such as system identification and forecasting stock market, in which noise on the desired signals may intensely divert the solutions. In this paper, we propose a Robust Diffusion Stochastic Gradient Descent (RDSGD) algorithm based on the pseudo-Huber loss function which can significantly suppress the effect of Gaussian and non-Gaussian noises on estimation performances in the adaptive networks. Performance and convergence behavior of RDSGD are assessed in presence of the <em>α</em>-stable and Mixed-Gaussian noises in the stationary and non-stationary environments. Simulation results show that the proposed algorithm can achieve both higher convergence rate and lower steady-state misadjustment than the conventional diffusion algorithms and several robust algorithms.</p></div>","PeriodicalId":54775,"journal":{"name":"Journal of Parallel and Distributed Computing","volume":null,"pages":null},"PeriodicalIF":3.8,"publicationDate":"2024-03-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140328744","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
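The core mechanism, diffusion SGD with a pseudo-Huber gradient, can be sketched compactly. The code below is our own minimal adapt-then-combine version under invented parameters (step size, delta, combination matrix), not the paper's RDSGD analysis; the pseudo-Huber gradient r / sqrt(1 + (r/δ)²) behaves like ordinary LMS for small errors but saturates for outliers instead of amplifying them.

```python
# A compact sketch of diffusion SGD with a pseudo-Huber loss (our own
# minimal illustration of the idea). Each node adapts with a robust
# gradient on its local data, then combines with its neighbors' iterates
# (adapt-then-combine diffusion).
import numpy as np

def rdsgd(X, d, A, mu=0.05, delta=1.0, iters=2000):
    # X[k]: regressor stream of node k; d[k]: its noisy outputs;
    # A: doubly-stochastic combination matrix over the network.
    n_nodes, n_samples, dim = X.shape
    W = np.zeros((n_nodes, dim))
    for i in range(iters):
        t = i % n_samples
        psi = np.empty_like(W)
        for k in range(n_nodes):
            r = d[k, t] - X[k, t] @ W[k]              # prediction error
            # Pseudo-Huber gradient: bounded by delta, so a heavy-tailed
            # noise sample cannot throw the iterate far off course.
            g = r / np.sqrt(1.0 + (r / delta) ** 2)
            psi[k] = W[k] + mu * g * X[k, t]          # adapt step
        W = A @ psi                                   # combine step
    return W

rng = np.random.default_rng(3)
w_true = np.array([1.0, -2.0])
X = rng.normal(size=(4, 200, 2))
noise = rng.standard_t(df=1.5, size=(4, 200))         # heavy-tailed noise
d = X @ w_true + 0.1 * noise
A = np.full((4, 4), 0.25)                             # uniform combination weights
W = rdsgd(X, d, A)
print(np.max(np.abs(W - w_true)))
```

With plain LMS the infinite-variance noise would produce occasional huge updates; here the bounded gradient keeps every node's estimate near w_true.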