Multi-comparand associative processors are efficient in parallel processing of complex search problems that arise from many application areas including computational geometry, graph theory and list/matrix computations. In this paper we report new FPGA implementations of a multi-comparand multi-search associative processor. The architecture of the processor working in a combined bit-serial/bit-parallel word-parallel mode and its functions are described. Then, several implementations of associative processors in VHDL, using Xilinx Foundation ISE software and Digilent development boards with Xilinx FPGA devices are reported. Parameters of the implemented FPGA processors are presented and discussed.
"FPGA Implementations of a Parallel Associative Processor with Multi-Comparand Multi-Search Operations", Zbigniew Kokosinski and Bartlomiej Malus. 2008 International Symposium on Parallel and Distributed Computing. doi:10.1109/ISPDC.2008.42
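The masked compare at the heart of such a processor can be sketched in software. The following minimal example is our own illustration, not the paper's VHDL implementation: NumPy broadcasting stands in for the word-parallel compare hardware, matching several comparands against all memory words at once.

```python
import numpy as np

def multi_comparand_search(memory, comparands, mask):
    """Masked multi-comparand equality search over an associative memory.

    memory:     (n_words,) array of stored words
    comparands: (n_comp,) array of search keys
    mask:       bit mask selecting the compared bit field

    Returns an (n_comp, n_words) boolean match matrix -- the software
    analogue of the per-word tag bits the associative hardware would set,
    with broadcasting standing in for the word-parallel compare.
    """
    m = memory & mask          # mask every stored word
    c = comparands & mask      # mask every comparand
    return c[:, None] == m[None, :]

memory = np.array([0b1010, 0b1100, 0b1010, 0b0111], dtype=np.uint8)
comparands = np.array([0b1010, 0b0100], dtype=np.uint8)
matches = multi_comparand_search(memory, comparands, mask=0b1111)
```

Restricting the mask to a sub-field of the word would model the bit-serial/bit-parallel mode, where only the currently selected bit slice takes part in the comparison.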
This work highlights and takes aim at the most critical security aspects required for two different types of distributed systems for scientific computation. It covers two open-source systems written in Java: a demand-driven system, the general intensional programming system (GIPSY), and a pipelined system, the distributed modular audio recognition framework (DMARF); these distributed scientific computational engines serve as case studies with respect to the security aspects. More specific goals include data/demand integrity, data/demand origin authentication, confidentiality, high availability, and malicious code detection. We address some of these goals to a degree, some with the Java data security framework (JDSF), a work in progress.
"Towards Security Hardening of Scientific Demand-Driven and Pipelined Distributed Computing Systems", Serguei A. Mokhov. 2008 International Symposium on Parallel and Distributed Computing. doi:10.1109/ISPDC.2008.52
Computer architecture simulation and modeling require a huge amount of time and resources, not only for the simulation itself but also for the configuration and submission procedures. A widely used simulation toolset (SimpleScalar) has been used to model a variety of platforms, ranging from simple unpipelined processors to detailed dynamically scheduled microarchitectures with multi-level memory hierarchies. In this paper we propose a platform for automatically executing a massive number of simulations in parallel by exploiting a distributed computing approach. We developed a Web-based simulation system consisting of a front-end user interface and a back-end supported by a grid system. The front-end is responsible for configuring the simulations and parsing the results, while the back-end distributes the workload using the Condor scheduler. Experimental results show that the system is very easy to use, even when dealing with a huge number of simulations, and that it provides results in a convenient format. Moreover, a significant speedup can be achieved by exploiting parallelism at the benchmark level or by sampling each benchmark with the SimPoint tool.
"Distributed Web-based Platform for Computer Architecture Simulation", A. Ilic, F. Pratas and L. Sousa. 2008 International Symposium on Parallel and Distributed Computing. doi:10.1109/ISPDC.2008.39
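The benchmark-level parallelism described above can be sketched with a local executor standing in for the Condor back-end. The simulator invocation and benchmark names below are hypothetical placeholders, not the platform's actual interface.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins: in the actual platform each job would be a
# SimpleScalar invocation submitted to the Condor scheduler, not a
# local thread.
BENCHMARKS = ["gcc", "mcf", "bzip2"]

def run_simulation(benchmark):
    # Would shell out to the simulator here, e.g. sim-outorder <benchmark>;
    # we return a placeholder result instead.
    return benchmark, f"stats-for-{benchmark}"

def run_all(benchmarks):
    # One independent job per benchmark: the benchmark-level parallelism.
    with ThreadPoolExecutor() as pool:
        return dict(pool.map(run_simulation, benchmarks))

results = run_all(BENCHMARKS)
```

Since each benchmark simulation is independent, the speedup is limited mainly by the longest-running benchmark, which is what makes SimPoint sampling within each benchmark attractive as a second level of parallelism.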
A diagrammatic description of satellite imagery processing has been developed in the MedioGrid system through the gProcess toolset. The processing workflow supports flexible exploration of different algorithms and computing configurations over the grid. This paper concerns the experimental evaluation of the optimal mapping of the logical workflow onto the physical level. The execution time is evaluated for horizontal and vertical grouping techniques applied to the operational nodes.
"Graph Based Evaluation of Satellite Imagery Processing over Grid", V. Bâcu and D. Gorgan. 2008 International Symposium on Parallel and Distributed Computing. doi:10.1109/ISPDC.2008.50
A. Gómez-Iglesias, M. A. Vega-Rodríguez, Francisco Castejón-Magaña, M. Cárdenas-Montes, Enrique Morales-Ramos
Fusion energy is the next generation of energy. The devices that scientists use to carry out their research consume more energy than they produce, because fusion devices still present many open problems. In magnetic confinement devices, one of these problems is the transport of particles in the confined plasma. Modeling tools can be used to improve the transport levels, but the computational cost of these tools and the number of different configurations to simulate make it impossible to perform all the tests required to obtain good designs. With grid computing we obtain the computational resources needed to run the required number of tests, and with genetic algorithms we can search for a good result without exploring the entire solution space.
"Using a Genetic Algorithm and the Grid to Improve Transport Levels in the TJ-II Stellarator", A. Gómez-Iglesias, M. A. Vega-Rodríguez, Francisco Castejón-Magaña, M. Cárdenas-Montes and Enrique Morales-Ramos. 2008 International Symposium on Parallel and Distributed Computing. doi:10.1109/ISPDC.2008.30
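The search strategy can be illustrated with a minimal genetic algorithm. This is a generic sketch, not the paper's method: the fitness function is a hypothetical stand-in for the costly transport-level evaluation that would run on the grid, and all parameter names and values are ours.

```python
import random

random.seed(42)
N_PARAMS, POP, GENS = 4, 20, 40

def fitness(cfg):
    # Hypothetical stand-in for a costly transport simulation:
    # lower is better, optimum at cfg == [0.5, 0.5, 0.5, 0.5].
    return sum((x - 0.5) ** 2 for x in cfg)

def crossover(a, b):
    cut = random.randrange(1, N_PARAMS)      # one-point crossover
    return a[:cut] + b[cut:]

def mutate(cfg, rate=0.2):
    # Perturb each gene with probability `rate`, clamped to [0, 1].
    return [min(1.0, max(0.0, x + random.uniform(-0.1, 0.1)))
            if random.random() < rate else x for x in cfg]

population = [[random.random() for _ in range(N_PARAMS)] for _ in range(POP)]
for _ in range(GENS):
    population.sort(key=fitness)
    parents = population[:POP // 2]          # truncation selection
    children = [mutate(crossover(random.choice(parents),
                                 random.choice(parents)))
                for _ in range(POP - len(parents))]
    population = parents + children          # elitism: keep the best half

best = min(population, key=fitness)
```

On the grid, the fitness evaluations within one generation are independent and can be farmed out in parallel, which is what makes the combination of the two techniques attractive.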
Resource monitoring in distributed systems is required to understand the 'health' of the overall system and to help identify particular problems, such as dysfunctional hardware, a faulty system or application software. Desirable characteristics for monitoring systems are the ability to connect to any number of different types of monitoring agents and to provide different views of the system, based on a client's particular preferences. This paper outlines and discusses the ongoing activities within the GridRM wide-area resource-monitoring project.
"A Flexible Monitoring and Notification System for Distributed Resources", Garry Smith and M. Baker. 2008 International Symposium on Parallel and Distributed Computing. doi:10.1109/ISPDC.2008.29
Biagio Cosenza, G. Cordasco, R. D. Chiara, U. Erra, V. Scarano
We present a load-balancing technique that exploits the temporal coherence among successive computation phases in mesh-like computations mapped onto a cluster of processors. Our method partitions the computation into balanced tasks and distributes them to independent processors through a prediction binary tree (PBT). At each new phase, the current PBT is updated using each task's computing time in the previous phase as its cost estimate for the next phase. The PBT is designed so that it balances the load across the tasks and reduces dependency among processors for higher performance. Dependency is reduced by using rectangular, almost-square tiles of the mesh (i.e., one dimension is at most twice the other). By reducing dependency, one can reduce inter-processor communication or exploit local dependencies among tasks (such as data locality). Our strategy has been assessed on a significant problem, parallel ray tracing. Our implementation shows good scalability and improves over coherence-oblivious implementations. We report different measurements showing that the granularity of tasks is a key factor in the performance of our decomposition/mapping strategy.
"Load Balancing in Mesh-like Computations using Prediction Binary Trees", Biagio Cosenza, G. Cordasco, R. D. Chiara, U. Erra and V. Scarano. 2008 International Symposium on Parallel and Distributed Computing. doi:10.1109/ISPDC.2008.24
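The splitting idea can be sketched as follows. This is our simplified illustration of cost-balanced, almost-square tiling driven by previous-phase timings, not the paper's actual PBT implementation: a rectangular tile is recursively cut along its longer dimension at the point that best balances the measured cost of the two halves.

```python
import numpy as np

def split(costs, x0, y0, w, h, depth):
    """Recursively split tile (x0, y0, w, h) of a per-cell cost map.

    Cuts along the longer dimension at the point that best balances the
    accumulated cost of the two halves, producing almost-square tiles;
    the per-cell costs play the role of the previous phase's timings.
    """
    if depth == 0:
        return [(x0, y0, w, h)]
    tile = costs[y0:y0 + h, x0:x0 + w]
    if w >= h:                               # cut the longer dimension
        cum = np.cumsum(tile.sum(axis=0))    # cost left of each column cut
        cut = int(np.argmin(np.abs(cum - cum[-1] / 2))) + 1
        cut = max(1, min(w - 1, cut))
        return (split(costs, x0, y0, cut, h, depth - 1) +
                split(costs, x0 + cut, y0, w - cut, h, depth - 1))
    cum = np.cumsum(tile.sum(axis=1))        # cost above each row cut
    cut = int(np.argmin(np.abs(cum - cum[-1] / 2))) + 1
    cut = max(1, min(h - 1, cut))
    return (split(costs, x0, y0, w, cut, depth - 1) +
            split(costs, x0, y0 + cut, w, h - cut, depth - 1))

costs = np.ones((8, 8))              # uniform previous-phase timings
tiles = split(costs, 0, 0, 8, 8, 2)  # 4 tiles for 4 processors
```

Under temporal coherence, re-running the split with the next phase's timings moves the cut points only slightly, so most cells stay on the same processor between phases.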
This paper concerns speculative parallelization as a method of improving computation efficiency and of reducing the problem-solving time with respect to the sequential version. Speculative parallelization is proposed for a particular class of problems, described as recursive functions taking values from finite sets. It refers to the speculative execution of consecutive iteration steps, each of which (except the first) depends on the preceding iteration step before that step has finished. Assuming that in the sequential version one iteration is performed in one linear execution time step (hereinafter referred to as a computational step), the aim of speculative parallelization is to reduce the total number of computational steps and thus execute more than one iteration per time step. The essence of the problem is that we assume certain mapping schemes of arguments into the set of possible values of the function in speculative computing, i.e., there exists precise information about the possible values that the function can take for particular arguments. This paper presents simulation results for the chosen mapping schemes, illustrating how the number of steps required to compute the value of the function for a given argument depends on the structure of the mapping scheme and on the number of parallel threads used.
"Speculative Computing of Recursive Functions Taking Values from Finite Sets", M. Brzuszek, A. Sasak and M. Turek. 2008 International Symposium on Parallel and Distributed Computing. doi:10.1109/ISPDC.2008.23
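The scheme can be illustrated with a small step-counting model. This is our simplification, limited to one speculated iteration per step: while one thread performs the committed evaluation, the remaining threads evaluate the function on candidate arguments from the finite value set, so that when the true next argument turns out to be among them, its result is already available.

```python
def speculative_steps(f, x0, n_iters, values, n_threads):
    """Count computational steps for iterating x <- f(x) n_iters times.

    Simplified model: each step, one thread performs the committed
    evaluation f(x) while up to n_threads - 1 threads speculatively
    evaluate f on candidates from the finite value set.  If the true
    next argument was among the candidates, the following iteration's
    result is already computed and costs no extra step.
    """
    steps, x, done = 0, x0, 0
    while done < n_iters:
        steps += 1
        nxt = f(x)                                    # committed evaluation
        done += 1
        speculated = {v: f(v) for v in list(values)[:n_threads - 1]}
        if done < n_iters and nxt in speculated:
            nxt = speculated[nxt]                     # free extra iteration
            done += 1
        x = nxt
    return steps

f = {0: 1, 1: 2, 2: 0}.get                            # 3-valued state machine
fast = speculative_steps(f, 0, 6, [0, 1, 2], n_threads=4)
slow = speculative_steps(f, 0, 6, [0, 1, 2], n_threads=1)
```

With enough threads to cover the whole value set, each step completes two iterations (here, 6 iterations in 3 steps instead of 6); with fewer threads, the gain depends on how well the mapping scheme concentrates the likely next arguments, which is the trade-off the paper's simulations explore.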
The paper presents a new kind of parallel embedded system implemented in system-on-chip (SoC) technology, in which the inter-processor communication infrastructure is dynamically adjustable at run time to application program requirements. The new system architecture assumes processors with a large number of autonomous communication links, which enables look-ahead inter-processor connection reconfiguration that overlaps with current program execution, including data communication. The dynamic connection reconfiguration pattern is determined at compile time, as a result of application program graph analysis. Algorithms are presented for task scheduling and for program decomposition into sections executed with the look-ahead-created connections of processor links. Experimental results on structuring parallel numerical programs for the fast Fourier transform (FFT) are presented. The experiments compare the program structuring quality of look-ahead connection reconfiguration in a single crossbar switch, with multiple link subsets interchangeably reconfigured in advance, against the quality of reconfiguration in a single crossbar switch based on the classical on-request approach.
"Optimized Communication Control in Programs for Dynamic Look-Ahead Reconfigurable SoC Systems", E. Laskowski and M. Tudruj. 2008 International Symposium on Parallel and Distributed Computing. doi:10.1109/ISPDC.2008.54
We present in this paper a fault-tolerant, permission-based k-mutual exclusion algorithm, which is an extension of Raymond's algorithm. Tolerating up to n-1 failures, our algorithm remains effective despite failures. It uses information provided by unreliable failure detectors to dynamically detect node crashes. Performance evaluation experiments compare our algorithm to Raymond's when faults are injected.
"Fault Tolerant K-Mutual Exclusion Algorithm Using Failure Detector", Mathieu Bouillaguet, L. Arantes and Pierre Sens. 2008 International Symposium on Parallel and Distributed Computing. doi:10.1109/ISPDC.2008.57
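The permission rule underlying Raymond-style k-mutual exclusion can be sketched as follows. Class and method names are ours, and the paper's failure-detector extension is deliberately not modelled: a requester broadcasts its request to the other n - 1 processes and may enter the critical section once n - k replies have arrived, since at most k - 1 of the missing replies can be withheld by processes currently inside.

```python
class KMutexRequester:
    """Sketch of the permission rule of Raymond-style k-mutual exclusion.

    A simplified, message-free model (names are ours, not the paper's):
    the requester counts replies from the other n - 1 processes and may
    enter once n - k have arrived.  The paper's extension additionally
    uses unreliable failure detectors so that replies are not awaited
    from processes suspected to have crashed -- not modelled here.
    """

    def __init__(self, n, k):
        self.n, self.k = n, k
        self.replies = 0

    def on_reply(self):
        self.replies += 1

    def may_enter(self):
        # At most k - 1 missing replies can come from processes in the
        # critical section, so n - k replies suffice for safe entry.
        return self.replies >= self.n - self.k

req = KMutexRequester(n=5, k=2)   # 5 processes, up to 2 in the CS at once
for _ in range(3):                # replies arrive from 3 of the 4 others
    req.on_reply()
```

The failure-tolerance question the paper addresses is visible even in this sketch: a crashed process never replies, so without a failure detector a requester could wait forever for its n - k replies.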