Pub Date: 2004-03-08 DOI: 10.1109/EMPDP.2004.1271425
Rocco Aversa, B. D. Martino, Nicola Mazzocca, S. Venticinque
Mobile agents can provide a suitable framework for supporting resource and service discovery in grid platforms, and can support optimal access and interaction with the user through heterogeneous terminals that differ in memory capacity, computational resources, display characteristics, allowed connection mode, etc. We use Web services technology to discover and optimally access mobile grid resources and services within MAGDA, a mobile-agent-based grid architecture that we have designed and are implementing. The Web services paradigm and the SIP and UDDI technologies are used to implement a resource discovery service that allows users and mobile agents to look for and access distributed resources and applications through heterogeneous terminals, by dynamically configuring the interaction session and service functionalities based on the characteristics of the terminal and the QoS of the interconnection.
{"title":"Terminal-aware grid resource and service discovery and access based on agents technology","authors":"Rocco Aversa, B. D. Martino, Nicola Mazzocca, S. Venticinque","doi":"10.1109/EMPDP.2004.1271425","DOIUrl":"https://doi.org/10.1109/EMPDP.2004.1271425","journal":{"name":"12th Euromicro Conference on Parallel, Distributed and Network-Based Processing, 2004. Proceedings."},"publicationDate":"2004-03-08"}
Pub Date: 2004-03-08 DOI: 10.1109/EMPDP.2004.1271461
M. Díaz, B. Rubio, E. Soler, J. M. Troya
SBASCO is a new programming environment for the development of parallel and distributed high-performance scientific applications. The approach integrates skeleton-based and component technologies. The main goal of the proposal is to provide a high-level programmability system for the efficient development of numerical applications with performance portability across platforms. We present the system's programming model, which considers two different views of a component interface: one from the point of view of the application programmer, and another intended for use by a configuration tool in order to establish efficient implementations. This is possible because data distribution and processor layout inside each component are known at the interface level. The programming model borrows from software skeletons a cost model enhanced by run-time analysis, which makes it possible to automatically establish a suitable degree of parallelism and replication of the internal structure of a component.
{"title":"SBASCO: skeleton-based scientific components","authors":"M. Díaz, B. Rubio, E. Soler, J. M. Troya","doi":"10.1109/EMPDP.2004.1271461","DOIUrl":"https://doi.org/10.1109/EMPDP.2004.1271461","journal":{"name":"12th Euromicro Conference on Parallel, Distributed and Network-Based Processing, 2004. Proceedings."},"publicationDate":"2004-03-08"}
Pub Date: 2004-03-08 DOI: 10.1109/EMPDP.2004.1271431
E. M. Garzón, S. Tabik, I. García, A. Bretones
We deal with the computational aspects of a numerical method for solving the electric field integral equation (EFIE) for the analysis of the interaction of electromagnetic signals with thin-wire structures. Our main interest is to devise an efficient parallel implementation of this numerical method that helps physicists solve the electric field integral equation for very complex and large thin-wire structures. This parallel implementation has been developed on distributed memory multiprocessors, using the parallel programming library MPI and routines of PETSc (portable, extensible toolkit for scientific computation). These routines can solve sparse linear systems in parallel. Appropriate data partitions have been designed in order to optimize the performance of the parallel implementation. A parameter named relative efficiency has been defined to compare two parallel executions with different numbers of processors. This parameter allows us to better describe the superlinear performance behavior of our parallel implementation. The parallel implementation is evaluated in terms of the speed-up and the relative efficiency. Moreover, a discussion of memory requirements versus the number of processors is included. It will be shown that memory hierarchy management improves substantially as the number of processors increases, and that this is the reason why superlinear speed-up is obtained.
{"title":"Multiprocessing of the time domain analysis of thin-wire antennas and scatterers","authors":"E. M. Garzón, S. Tabik, I. García, A. Bretones","doi":"10.1109/EMPDP.2004.1271431","DOIUrl":"https://doi.org/10.1109/EMPDP.2004.1271431","journal":{"name":"12th Euromicro Conference on Parallel, Distributed and Network-Based Processing, 2004. Proceedings."},"publicationDate":"2004-03-08"}
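The speed-up and relative-efficiency evaluation mentioned in the abstract above can be illustrated with a small sketch. The abstract does not give the paper's formula for relative efficiency, so the definition used here (the ratio of processor-time products of two runs) is an assumption, and the timings are hypothetical values chosen only to show what superlinear behavior looks like.

```python
def speedup(t_serial: float, t_parallel: float) -> float:
    """Classical speed-up: serial run time divided by parallel run time."""
    return t_serial / t_parallel


def relative_efficiency(t_a: float, p_a: int, t_b: float, p_b: int) -> float:
    """ASSUMED definition (not taken from the paper): ratio of the
    processor-time products of two parallel runs. A value above 1 means the
    run on p_b processors uses its processors more efficiently than the run
    on p_a processors."""
    return (t_a * p_a) / (t_b * p_b)


# Hypothetical run times in seconds, keyed by processor count.
timings = {1: 100.0, 2: 45.0, 4: 20.0}

print(speedup(timings[1], timings[4]))
print(relative_efficiency(timings[2], 2, timings[4], 4))
```

With these made-up timings the speed-up on 4 processors is 5, which exceeds the processor count, and the relative efficiency of the 4-processor run against the 2-processor run is above 1. That is the kind of superlinear behavior the abstract attributes to better memory hierarchy usage.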
Pub Date: 2004-03-08 DOI: 10.1109/EMPDP.2004.1271421
V. Felea, B. Toursel
In the context of heterogeneous networks, such as clusters of workstations, the design of programming and execution environments aims both to make the parallelism and distribution of applications easy to express at the design level, and to adapt their execution automatically to fluctuations that may appear in the evolution of applications or in resource availability. We present the ADAJ environment (adaptive distributed applications in Java), targeted at Java applications, which addresses these aims through conceptual tools offered by the programming environment and through a dynamic load balancing mechanism integrated at the middleware level. At the design level, the developer can easily activate processing in a MIMD programming model using library calls. At the execution level, ADAJ achieves efficient execution through an observation mechanism that gathers information on processing behaviour in order to redistribute load dynamically by object migration.
{"title":"Adaptive distributed execution of Java applications","authors":"V. Felea, B. Toursel","doi":"10.1109/EMPDP.2004.1271421","DOIUrl":"https://doi.org/10.1109/EMPDP.2004.1271421","journal":{"name":"12th Euromicro Conference on Parallel, Distributed and Network-Based Processing, 2004. Proceedings."},"publicationDate":"2004-03-08"}
Pub Date: 2004-03-08 DOI: 10.1109/EMPDP.2004.1271426
J. Duato, J. Flich, Teresa Nachiondo Frinós
Head-of-line (HOL) blocking is one of the main problems arising in input-buffered switches. The best-known solution to this problem is to use virtual output queues (VOQs). However, this strategy does not scale: its implementation cost increases quadratically with the number of ports in the switch. Given current trends, the demand for larger numbers of ports in high-performance switches is likely to increase very rapidly in the near future. Therefore, a more scalable and cost-effective solution is required. We propose a very efficient and cost-effective technique, referred to as destination-based buffer management (DBBM), to reduce HOL blocking in single-stage and multistage switch fabrics. Results show that the DBBM technique with a reduced number of queues at each IA obtains roughly the same throughput as the VOQ mechanism. In particular, the number of queues can be reduced by a factor of up to 8 with the DBBM technique.
{"title":"A cost-effective technique to reduce HOL blocking in single-stage and multistage switch fabrics","authors":"J. Duato, J. Flich, Teresa Nachiondo Frinós","doi":"10.1109/EMPDP.2004.1271426","DOIUrl":"https://doi.org/10.1109/EMPDP.2004.1271426","journal":{"name":"12th Euromicro Conference on Parallel, Distributed and Network-Based Processing, 2004. Proceedings."},"publicationDate":"2004-03-08"}
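The scaling argument in the abstract above (quadratic VOQ cost versus a fixed, smaller number of destination-based queues) can be made concrete with a short sketch. The abstract does not say how DBBM maps destinations to queues, so the modulo mapping below is an assumption made for illustration, as are the function names and the port and queue counts.

```python
def voq_total_queues(num_ports: int) -> int:
    """VOQ keeps one queue per output at every input port, so the total
    number of queues in the switch grows quadratically with the port count."""
    return num_ports * num_ports


def dbbm_queue_index(destination: int, queues_per_port: int) -> int:
    """ASSUMED mapping (hypothetical): several destinations share one queue,
    selected here by the destination number modulo the queue count."""
    return destination % queues_per_port


def dbbm_total_queues(num_ports: int, queues_per_port: int) -> int:
    """With a fixed number of queues per port, the total grows linearly."""
    return num_ports * queues_per_port


# Hypothetical switch size, for illustration only.
ports, queues = 64, 8
print(voq_total_queues(ports), dbbm_total_queues(ports, queues))
print([dbbm_queue_index(d, queues) for d in (0, 7, 8, 63)])
```

For this hypothetical 64-port switch with 8 queues per port, the sketch uses 512 queues where VOQ needs 4096. Packets for destinations 0 and 8 share a queue, so some HOL blocking between them remains possible; that is the trade-off accepted in exchange for the lower queue count.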
Pub Date: 2004-03-08 DOI: 10.1109/EMPDP.2004.1271445
Tarek Hagras, J. Janecek
List-based scheduling is generally accepted as an attractive approach to static task scheduling in a homogeneous environment, since it pairs low complexity with good results. We present a low-complexity algorithm based on list scheduling and task duplication on a bounded number of fully connected homogeneous machines. The algorithm is called critical unlisted parents with fast duplicator (CUPFD). The CUPFD algorithm consists of two phases: the listing phase, which is a simple listing heuristic based on list scheduling, and a low-complexity machine assignment phase based on task duplication. Experiments have shown that CUPFD outperformed, on average, all the other, higher-complexity algorithms.
{"title":"A static task scheduling heuristic for homogeneous computing environments","authors":"Tarek Hagras, J. Janecek","doi":"10.1109/EMPDP.2004.1271445","DOIUrl":"https://doi.org/10.1109/EMPDP.2004.1271445","journal":{"name":"12th Euromicro Conference on Parallel, Distributed and Network-Based Processing, 2004. Proceedings."},"publicationDate":"2004-03-08"}
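To show what a list-scheduling heuristic of the kind mentioned in the abstract above looks like, here is a minimal generic sketch. It is not CUPFD: it performs no task duplication, it ignores communication costs, and the priority used (longest remaining path to an exit task) is a common textbook choice assumed here for illustration. The task graph is hypothetical.

```python
def upward_rank(cost, succ):
    """Priority of a task: its own cost plus the longest path to an exit task."""
    rank = {}

    def visit(task):
        if task not in rank:
            longest = max((visit(s) for s in succ.get(task, [])), default=0)
            rank[task] = cost[task] + longest
        return rank[task]

    for task in cost:
        visit(task)
    return rank


def list_schedule(cost, succ, num_machines):
    """Listing phase: sort tasks by decreasing rank (a valid topological
    order when all costs are positive). Assignment phase: place each task on
    the machine where it can start earliest."""
    rank = upward_rank(cost, succ)
    preds = {task: [] for task in cost}
    for task, children in succ.items():
        for child in children:
            preds[child].append(task)

    machine_free = [0] * num_machines   # time at which each machine is idle
    finish, placement = {}, {}
    for task in sorted(cost, key=lambda t: -rank[t]):
        ready = max((finish[p] for p in preds[task]), default=0)
        machine = min(range(num_machines),
                      key=lambda m: max(machine_free[m], ready))
        start = max(machine_free[machine], ready)
        finish[task] = start + cost[task]
        machine_free[machine] = finish[task]
        placement[task] = (machine, start)
    return placement, max(finish.values())


# Hypothetical diamond-shaped task graph: A -> {B, C} -> D.
cost = {"A": 2, "B": 3, "C": 1, "D": 2}
succ = {"A": ["B", "C"], "B": ["D"], "C": ["D"]}
placement, makespan = list_schedule(cost, succ, num_machines=2)
print(placement, makespan)
```

On this small graph, B and C run in parallel on different machines after A finishes, and the makespan equals the length of the critical path A, B, D.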
Pub Date: 2004-03-08 DOI: 10.1109/EMPDP.2004.1271452
Kaizar Amin, M. Hategan, G. Laszewski, N. Zaluzec
The Grid approach allows collaborative pooling of distributed resources across multiple domains. However, the benefits of the Grid are limited to those offered by the commodity application development framework used. Several elegant and flexible application development frameworks support only specific Grid architectures, and therefore do not allow applications to exploit the full potential of the Grid. To stimulate community interest in standardizing a high-level abstraction layer for different Grid architectures, we introduce a collection of abstractions and data structures that collectively form a basis for an open Grid computing environment.
{"title":"Abstracting the Grid","authors":"Kaizar Amin, M. Hategan, G. Laszewski, N. Zaluzec","doi":"10.1109/EMPDP.2004.1271452","DOIUrl":"https://doi.org/10.1109/EMPDP.2004.1271452","journal":{"name":"12th Euromicro Conference on Parallel, Distributed and Network-Based Processing, 2004. Proceedings."},"publicationDate":"2004-03-08"}
Pub Date: 2004-03-08 DOI: 10.1109/EMPDP.2004.1271455
D. Lioupis, D. Psihogiou, Michalis Stefanidakis
In everyday life we are surrounded by numerous embedded devices that serve our daily needs without being continuously in use. This fact, in conjunction with the tremendous growth in the number of these devices, results in considerable idle time in a home environment. In other words, we are surrounded by a significant amount of underutilized processing power. Here we investigate the idea of exploiting this unused embedded processing power to execute intensive global computing applications, and the hardware and software issues that arise from such an approach. We present a framework enabling the participation of home embedded devices in the global computing grid, which we call e-grid (embedded grid), along with a theoretical analysis of its performance gain. We also develop an experimental setup based on Jini technology and measure its actual performance, in order to explore the feasibility of the e-grid approach.
{"title":"Exporting processing power of home embedded devices to global computing applications","authors":"D. Lioupis, D. Psihogiou, Michalis Stefanidakis","doi":"10.1109/EMPDP.2004.1271455","DOIUrl":"https://doi.org/10.1109/EMPDP.2004.1271455","journal":{"name":"12th Euromicro Conference on Parallel, Distributed and Network-Based Processing, 2004. Proceedings."},"publicationDate":"2004-03-08"}
Pub Date: 2004-03-08 DOI: 10.1109/EMPDP.2004.1271462
I. Pardines, F. F. Rivera
In this work, we deal with the problem of minimizing the load redistribution cost in parallel implementations for cluster architectures. Because network latency is so important in this kind of system, the redistribution cost depends primarily on the maximum number of messages sent or received by a processor. Load redistribution is an NP-hard problem similar to the multiple knapsack problem. Three heuristics are proposed to solve the problem in a global context, and a comparison is made to emphasize their characteristics. In a parallel application, it is important to decide whether or not it is efficient to carry out the workload redistribution. This decision is made by comparing the cost of the load imbalance with the communication overhead associated with the load balancing heuristic. Depending on these costs, a theoretical imbalance threshold above which redistribution is profitable is defined. Experimental results show the accuracy of our proposals.
{"title":"Minimizing the load redistribution cost in cluster architectures","authors":"I. Pardines, F. F. Rivera","doi":"10.1109/EMPDP.2004.1271462","DOIUrl":"https://doi.org/10.1109/EMPDP.2004.1271462","journal":{"name":"12th Euromicro Conference on Parallel, Distributed and Network-Based Processing, 2004. Proceedings."},"publicationDate":"2004-03-08"}
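The redistribute-or-not decision described in the abstract above can be sketched as a simple cost comparison. The abstract does not give the paper's cost expressions, so the model below is an assumption: imbalance cost is the time the most loaded processor spends beyond the average in each iteration, and redistribution cost is the network latency multiplied by the maximum number of messages any processor handles. All names and numbers are hypothetical.

```python
def imbalance_cost(loads, time_per_unit):
    """ASSUMED model: extra time per iteration caused by the most loaded
    processor, measured against a perfectly balanced distribution."""
    mean_load = sum(loads) / len(loads)
    return (max(loads) - mean_load) * time_per_unit


def redistribution_cost(messages_per_processor, latency):
    """Latency-dominated cost: driven by the processor that sends or
    receives the largest number of messages."""
    return max(messages_per_processor) * latency


def should_redistribute(loads, time_per_unit, messages_per_processor,
                        latency, remaining_iterations):
    """Redistribute only if the imbalance time saved over the remaining
    iterations exceeds the one-off cost of moving the load."""
    saved = imbalance_cost(loads, time_per_unit) * remaining_iterations
    return saved > redistribution_cost(messages_per_processor, latency)


# Hypothetical values: times in milliseconds, loads in work units.
loads = [10, 10, 10, 30]
messages = [1, 1, 1, 3]
print(should_redistribute(loads, 1, messages, 50, remaining_iterations=20))
print(should_redistribute(loads, 1, messages, 50, remaining_iterations=5))
```

With these values the imbalance costs 15 ms per iteration and redistribution costs 150 ms, so under this assumed model redistribution pays off only when more than 10 iterations remain.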
Pub Date: 2004-03-08 DOI: 10.1109/EMPDP.2004.1271472
Oscar Ardaiz-Villanueva, L. Navarro
An application network consists of a number of application servers distributed throughout the Internet, connected and coordinated to provide services with low latency. By adding, removing and migrating servers, application networks adapt to variations in demand. To create new servers anywhere in the Internet, a programmable Internet service infrastructure is needed. In addition, application network servers must be deployed in a coordinated manner. We propose a framework for application network deployment that implements such functionality.
{"title":"Xweb: a framework for application network deployment in a programmable Internet service infrastructure","authors":"Oscar Ardaiz-Villanueva, L. Navarro","doi":"10.1109/EMPDP.2004.1271472","DOIUrl":"https://doi.org/10.1109/EMPDP.2004.1271472","journal":{"name":"12th Euromicro Conference on Parallel, Distributed and Network-Based Processing, 2004. Proceedings."},"publicationDate":"2004-03-08"}