Boyu Zhang, Trilce Estrada, Pietro Cicotti, P. Balaji, M. Taufer
We present a scalable method to extensively search for and accurately select pharmaceutical drug candidates in large spaces of drug conformations computationally generated and stored across the nodes of a large distributed system. For each ligand conformation in the dataset, our method first extracts relevant geometrical properties and transforms them into a single metadata point in three-dimensional space. Then, it performs an octree-based clustering on the metadata to search for predominant clusters. Our method avoids the need to move ligand conformations among nodes because it extracts the relevant data properties locally and concurrently. By doing so, we can perform accurate and scalable distributed clustering analysis on large distributed datasets. We scale the analysis of our pharmaceutical datasets by a factor of 400X in performance and 500X in size over previous work. We also show that our clustering achieves higher accuracy than traditional clustering methods and conformational scoring based on minimum energy.
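The density search over 3D metadata points can be sketched with a simple octree-style bucketing; this is a hedged illustration on synthetic data (the function name `octree_dense_cells` and the data are invented here, not the paper's implementation):

```python
# Hedged sketch of octree-style density search on 3D metadata points;
# the densest cells approximate the "predominant clusters" the method seeks.
import random

def octree_dense_cells(points, depth=3):
    """Bucket 3D points in the unit cube into 8**depth octree cells
    and return per-cell counts."""
    n = 2 ** depth                                   # cells per axis
    counts = {}
    for x, y, z in points:
        cell = (min(int(x * n), n - 1),
                min(int(y * n), n - 1),
                min(int(z * n), n - 1))
        counts[cell] = counts.get(cell, 0) + 1
    return counts

random.seed(1)
# one tight cluster of conformational metadata plus uniform noise
pts = [(random.gauss(0.30, 0.02), random.gauss(0.70, 0.02),
        random.gauss(0.55, 0.02)) for _ in range(200)]
pts += [(random.random(), random.random(), random.random()) for _ in range(50)]

counts = octree_dense_cells(pts)
best = max(counts, key=counts.get)
print(best, counts[best])            # the clustered cell dominates
```

Because each node can bucket its local points independently and only the per-cell counts need to be merged, this kind of search avoids moving the conformations themselves.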
"Accurate Scoring of Drug Conformations at the Extreme Scale." 2015 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, pp. 817-822. doi:10.1109/CCGrid.2015.94
Andrej Podzimek, L. Bulej, L. Chen, Walter Binder, P. Tůma
While workload collocation is a necessity for increasing the energy efficiency of contemporary multi-core hardware, it also increases the risk of performance anomalies due to workload interference. Pinning certain workloads to a subset of CPUs is a simple approach to increasing workload isolation, but its effect depends on the workload type and system architecture. Apart from common-sense guidelines, the effect of pinning has not been extensively studied so far. In this paper we study the impact of CPU pinning on performance interference and energy efficiency for pairs of collocated workloads. Besides various combinations of workloads, virtualization, and resource isolation, we explore the effects of pinning depending on the level of background load. The presented results are based on more than 1000 experiments carried out on an Intel-based NUMA system, with all power management features enabled to reflect real-world settings. We find that less common CPU pinning configurations improve energy efficiency at partial background loads, indicating that systems hosting collocated workloads could benefit from dynamic CPU pinning based on CPU load and workload type.
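The pinning mechanism the study varies can be reproduced at the syscall level; a minimal Linux-only sketch using Python's `os.sched_setaffinity` (not the authors' experimental harness, which would typically drive `taskset` or cgroups):

```python
# Minimal Linux-only illustration of CPU pinning, the isolation
# mechanism whose performance/energy impact the paper measures.
import os

def pin_to_cpus(pid, cpus):
    """Restrict a process (pid 0 = calling process) to the given CPU set."""
    os.sched_setaffinity(pid, cpus)
    return os.sched_getaffinity(pid)

original = os.sched_getaffinity(0)   # remember the unpinned mask
pinned = pin_to_cpus(0, {0})         # pin this process to CPU 0 only
os.sched_setaffinity(0, original)    # restore the original mask
print(pinned)
```

Dynamic pinning, as suggested in the conclusion, amounts to calling such a primitive again whenever CPU load or workload type changes.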
"Analyzing the Impact of CPU Pinning and Partial CPU Loads on Performance and Energy Efficiency." 2015 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, pp. 1-10. doi:10.1109/CCGrid.2015.164
Giorgio Lucarelli, F. Mendonca, D. Trystram, Frédéric Wagner
We consider the classical First Come First Served/backfilling algorithm commonly used in actual batch schedulers. As HPC platforms grow in size and complexity, an interesting question is how to enhance this algorithm to improve global performance by reducing the overall amount of communication. In this direction, we study the impact of contiguity and locality allocation constraints on the behavior of batch schedulers. We provide a theoretical analysis of the cost of enforcing the contiguity and locality properties. More specifically, we show that neither property imposes a strong limit on the achievable makespan when comparing feasible optimal solutions under different settings; we describe the existing results on this topic and complete them with all combinations of constraints. We also propose a range of allocation algorithms for backfilling, choosing between strict and soft enforcement of locality and contiguity. Our approach is validated through an extensive series of simulations based on batch scheduler traces. Experiments show that our algorithms do not increase the makespan on average compared to current practice. Interestingly, we observe that enforcing contiguity efficiently improves locality.
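The core allocation decision under a contiguity constraint can be sketched in a few lines; this is an illustrative helper (`contiguous_slot` is a name invented here), not the paper's scheduler:

```python
# Sketch of the allocation choice studied: given a free/busy node map,
# find a contiguous block of free nodes for a backfilled job.
def contiguous_slot(free, size):
    """Return the first index of `size` consecutive free nodes, or None."""
    run = 0
    for i, is_free in enumerate(free):
        run = run + 1 if is_free else 0
        if run == size:
            return i - size + 1
    return None

free = [True, False, True, True, True, False, True]
print(contiguous_slot(free, 3))   # a 3-node job fits starting at node 2
print(contiguous_slot(free, 4))   # no contiguous 4-node block exists
```

A "soft" enforcement, in the spirit of the paper, would fall back to a scattered allocation when this search returns `None` instead of delaying the job.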
"Contiguity and Locality in Backfilling Scheduling." 2015 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, pp. 586-595. doi:10.1109/CCGrid.2015.143
The demand for parallel I/O performance continues to grow. However, modelling and generating parallel I/O workloads is challenging for several reasons, including the large number of processes, I/O request dependencies, and workload scalability. In this paper, we propose PIONEER, a complete solution to Parallel I/O workload characterization and gEnERation. The core of PIONEER is a generic workload path, an abstract and dense representation of the parallel I/O patterns of all processes in a High Performance Computing (HPC) application. The generic workload path is built by exploring inter-process correlations, I/O dependencies, and file open session properties. We demonstrate the effectiveness of PIONEER by faithfully generating synthetic workloads for two popular HPC benchmarks and one real HPC application.
Weiping He, D. Du, Sai B. Narasimhamurthy. "PIONEER: A Solution to Parallel I/O Workload Characterization and Generation." 2015 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, pp. 111-120. doi:10.1109/CCGrid.2015.32
Evangelos Tasoulas, Ernst Gunnar Gran, Bjørn Dag Johnsen, T. Skeie
In large InfiniBand subnets the Subnet Manager (SM) is a potential bottleneck. When an InfiniBand subnet grows in size, the number of paths between hosts increases polynomially, and the SM may not be able to serve the network in a timely manner when many concurrent path resolution requests are received. This scalability challenge is further amplified in a dynamic virtualized cloud environment. When a Virtual Machine (VM) with an InfiniBand interconnect live migrates, the VM addresses change. These address changes result in additional load on the SM as communicating peers send Subnet Administration (SA) path record queries to the SM to resolve the new path characteristics. In this paper we benchmark OpenSM to empirically demonstrate the SM scalability problems. Then we show that our novel SA Path Record Query caching scheme significantly reduces the load on the SM. In particular, we show by using the Reliable Datagram Socket protocol that only a single initial SA path query is needed per communicating peer, independent of any subsequent (re)connection attempts.
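The caching idea reduces to memoizing SA answers per destination; a hedged toy sketch (the function names and record fields here are invented stand-ins, not the actual SA wire format):

```python
# Hedged sketch: memoize SA path-record answers per destination so that
# repeated (re)connection attempts never reach the Subnet Manager again.
queries_to_sm = 0

def sa_path_query(dst):
    """Stand-in for an expensive SA path record query answered by the SM."""
    global queries_to_sm
    queries_to_sm += 1
    return {"dst": dst, "dlid": hash(dst) % 0xBFFF}  # toy record

cache = {}

def cached_path(dst):
    if dst not in cache:
        cache[dst] = sa_path_query(dst)
    return cache[dst]

for _ in range(5):                # five reconnects to the same peer
    rec = cached_path("node-17")
print(queries_to_sm)              # only the first lookup hit the SM
```

The real scheme must additionally invalidate cached records when a VM migrates and its addresses change, which is where the paper's contribution lies.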
"A Novel Query Caching Scheme for Dynamic InfiniBand Subnets." 2015 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, pp. 199-210. doi:10.1109/CCGrid.2015.10
This paper targets scientists and programmers who need to easily develop and run e-science applications on large scale distributed systems. We present a rich programming paradigm and environment used to develop and deploy high performance computing (HPC) applications on large scale distributed and heterogeneous platforms. We particularly target iterative e-science applications where (i) convergence conditions and the number of jobs are not known in advance, (ii) jobs are created on the fly, and (iii) jobs can be persistent. We propose two programming paradigms that provide intuitive statements enabling easy writing of HPC e-science applications. Non-expert developers (scientific researchers) can use them to guarantee fast development and efficient deployment of their applications.
M. B. Belgacem, N. Abdennadher. "Towards a High Level Programming Paradigm to Deploy e-Science Applications with Dynamic Workflows on Large Scale Distributed Systems." 2015 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, pp. 292-301. doi:10.1109/CCGrid.2015.147
S. Shen, A. Iosup, A. Israel, W. Cirne, D. Raz, D. Epema
Data centers are at the core of a wide variety of daily ICT utilities, ranging from scientific computing to online gaming. Due to the scale of today's data centers, the failure of computing resources is a common occurrence that may disrupt the availability of ICT services, leading to revenue loss. Although many high availability (HA) techniques have been proposed to mask resource failures, datacenter users -- who rent datacenter resources and use them to provide ICT utilities to a global population -- still have limited management options for dynamically selecting and configuring HA techniques. In this work, we propose Availability-on-Demand (AoD), a mechanism consisting of an API that allows datacenter users to specify availability requirements which can change dynamically, and an availability-aware scheduler that dynamically manages computing resources based on the user-specified requirements. The mechanism operates at the level of individual service instances, enabling fine-grained control of availability, for example during sudden requirement changes and periodic operations. Through realistic, trace-based simulations, we show that the AoD mechanism achieves high availability at low cost. The AoD approach consumes about the same CPU hours as approaches that apply HA techniques randomly, but with higher availability. Moreover, compared to an ideal approach with perfect failure predictions, it consumes 13% to 31% more CPU hours but achieves similar availability for critical parts of applications.
"An Availability-on-Demand Mechanism for Datacenters." 2015 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, pp. 495-504. doi:10.1109/CCGrid.2015.58
Power and performance are two potentially opposing objectives in the design of a supercomputer, where increases in performance often come at the cost of increased power consumption and vice versa. The task of simultaneously maximising both objectives is becoming an increasingly prominent challenge in the development of future exascale supercomputers. To gain some perspective on the scale of the challenge, we analyse the power and performance trends for the Top500 and Green500 supercomputer lists. We then present the PαPW metric, which we use to evaluate the scalability of power efficiency, projecting the development of an exascale system. From this analysis, we found that when both power and performance are considered, the projected date of achieving an exascale system falls far beyond the current target of 2020.
J. Mair, Zhiyi Huang, D. Eyers, Yawen Chen. "Quantifying the Energy Efficiency Challenges of Achieving Exascale Computing." 2015 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, pp. 943-950. doi:10.1109/CCGrid.2015.130
A critical feature of IaaS cloud computing is the ability to quickly disseminate the content of a shared dataset at large scale. In this context, a common pattern is the collective on-demand read, i.e., accessing the same VM image or dataset from a large number of VM instances concurrently. There are various techniques that avoid I/O contention on the storage service where the dataset is located without relying on pre-broadcast. Most such techniques employ peer-to-peer collaborative behavior in which the VM instances exchange information about the content accessed during runtime, such that it is possible to fetch missing data pieces directly from each other rather than from the storage system. However, such techniques are often limited to a group that performs a collective read. In light of the high data redundancy in large IaaS data centers, where multiple users simultaneously run VM instance groups that perform collective reads, an important opportunity arises: enabling unrelated VM instances belonging to different groups to collaborate and exchange common data in order to further reduce the I/O pressure on the storage system. This paper deals with the challenges posed by such a solution, which prompt the need for novel techniques to efficiently detect and leverage common data pieces across groups. To this end, we introduce a low-overhead fingerprint-based approach that we evaluate and demonstrate to be efficient in practice for a representative scenario on dozens of nodes and a variety of group configurations.
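The fingerprinting step can be illustrated by hashing fixed-size chunks and intersecting the fingerprint sets of two images; this is a toy sketch with invented data, not the paper's chunking or hashing scheme:

```python
# Hedged sketch of fingerprint-based detection of common content across
# groups: hash each chunk of an image, then intersect the fingerprint sets
# to find pieces one group could fetch from the other instead of storage.
import hashlib

def fingerprints(data, chunk=4):
    """Set of SHA-256 fingerprints of fixed-size chunks of `data`."""
    return {hashlib.sha256(data[i:i + chunk]).hexdigest()
            for i in range(0, len(data), chunk)}

image_a = b"kernel00initrd11rootfs22"   # image read by group A
image_b = b"kernel00initrd11appdata3"   # image read by group B
shared = fingerprints(image_a) & fingerprints(image_b)
print(len(shared))                      # chunks both groups already hold
```

Keeping only the compact fingerprint sets, rather than the content itself, is what keeps the cross-group comparison low-overhead.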
Bogdan Nicolae, Andrzej Kochut, A. Karve. "Discovering and Leveraging Content Similarity to Optimize Collective on-Demand Data Access to IaaS Cloud Storage." 2015 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, pp. 211-220. doi:10.1109/CCGrid.2015.156
The overall performance of a Network-on-Chip (NoC) is strongly affected by the efficiency of the on-chip routing algorithm. Among the factors associated with the design of a high-performance routing method, adaptivity is an important one. Moreover, deadlock- and livelock-freedom are necessary for a functional routing method. Despite the advantages that the diametrical mesh can bring to NoCs compared with the classical mesh topology, the literature records few research efforts to design pertinent routing methods for such networks. With the available routing algorithms, network performance degrades drastically, not only due to the deterministic paths but also due to the deadlocks created between packets. In this paper, we take advantage of the Hamiltonian routing strategy to adaptively route packets through deadlock-free paths in a diametrical 2D mesh network. The simulation results demonstrate the efficiency of the proposed approach in decreasing the likelihood of congestion and smoothly distributing the traffic across the network.
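The Hamiltonian strategy the paper builds on can be sketched for a plain 2D mesh: label the nodes along a snake-order Hamiltonian path, then only route packets in strictly increasing (or strictly decreasing) label order, which rules out cyclic channel dependencies. This is a hedged illustration of the general principle; the paper's diametrical-mesh variant adds extra links not modeled here:

```python
# Sketch of Hamiltonian-path (snake-order) labeling on a 2D mesh.
# Monotone-label routing cannot form a cyclic dependency, which is
# the basis of deadlock freedom in Hamiltonian routing strategies.
def hamiltonian_label(x, y, cols):
    """Snake-order (boustrophedon) label of node (x, y) in a cols-wide mesh."""
    return y * cols + (x if y % 2 == 0 else cols - 1 - x)

cols, rows = 4, 3
labels = {(x, y): hamiltonian_label(x, y, cols)
          for y in range(rows) for x in range(cols)}

src, dst = (0, 0), (3, 1)            # labels 0 and 4 in this mesh
route = sorted((n for n in labels
                if labels[src] <= labels[n] <= labels[dst]),
               key=lambda n: labels[n])
print([labels[n] for n in route])    # strictly increasing labels
```

Adaptivity then comes from allowing a packet to choose among several minimal next hops whose labels all move monotonically toward the destination.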
Poona Bahrebar, D. Stroobandt. "Hamiltonian Path Strategy for Deadlock-Free and Adaptive Routing in Diametrical 2D Mesh NoCs." 2015 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, pp. 1209-1212. doi:10.1109/CCGrid.2015.112