Pub Date : 2002-08-18 DOI: 10.1109/ICPP.2002.1040858
Farag Azzedin, Muthucumaru Maheswaran
Grid computing systems, which have been the focus of much research in recent years, provide a virtual framework for controlled sharing of resources across institutional boundaries. Security is a major concern in any system that enables remote execution. Several techniques can be used to provide security in grid systems, including sandboxing, encryption, and other access control and authentication mechanisms. The additional overhead caused by these mechanisms may negate the performance advantages gained by grid computing. Hence, we contend that it is essential for the scheduler to consider the security implications while performing resource allocations. In this paper, we present a trust model for grid systems and show how the model can be used to incorporate security implications into scheduling algorithms. Three scheduling heuristics that can be used in a grid system are modified to incorporate the trust notion, and simulations are performed to evaluate the performance.
Title : Integrating trust into grid resource management systems
Venue : Proceedings International Conference on Parallel Processing
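The idea of letting the scheduler account for security overhead can be sketched roughly as follows. This is not the paper's trust model or any of its three heuristics: the overhead function, the trust scale from 0 to 5, and the greedy earliest-completion rule are all assumptions chosen only to show how a trust value might change an allocation decision.

```python
def security_overhead(trust_level, max_trust=5):
    """Assumed model: lower trust means more sandboxing/encryption overhead.

    Returns a fractional overhead between 0.0 (fully trusted) and 0.5.
    """
    return 0.5 * (max_trust - trust_level) / max_trust


def pick_machine(exec_times, ready_times, trust_levels):
    """Return (index, completion time) of the machine that finishes earliest
    once the assumed security overhead is added to the execution time."""
    best_index, best_completion = None, float("inf")
    for i, (exec_t, ready_t, trust) in enumerate(
            zip(exec_times, ready_times, trust_levels)):
        completion = ready_t + exec_t * (1.0 + security_overhead(trust))
        if completion < best_completion:
            best_index, best_completion = i, completion
    return best_index, best_completion


# Hypothetical task: machine 0 is faster but barely trusted,
# machine 1 is slower but fully trusted.
index, completion = pick_machine(exec_times=[10.0, 12.0],
                                 ready_times=[0.0, 0.0],
                                 trust_levels=[1, 5])
print(index, completion)
```

With these made-up numbers, a trust-unaware scheduler would pick machine 0 on raw execution time, while the trust-adjusted rule prefers machine 1 because the assumed overhead on machine 0 outweighs its speed advantage.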
Pub Date : 2002-08-18 DOI: 10.1109/ICPP.2002.1040868
Wu-chun Feng, Michael S. Warren, E. Weigle
In this paper, we present a novel twist on the Beowulf cluster - the Bladed Beowulf. Designed by RLX Technologies and integrated and configured at Los Alamos National Laboratory, our Bladed Beowulf consists of compute nodes made from commodity off-the-shelf parts mounted on motherboard blades measuring 14.7" × 4.7" × 0.58". Each motherboard blade (node) contains a 633 MHz Transmeta TM5600™ CPU, 256 MB memory, 10 GB hard disk, and three 100-Mb/s Fast Ethernet network interfaces. Using a chassis provided by RLX, twenty-four such nodes mount side-by-side in a vertical orientation to fit in a rack-mountable 3U space, i.e., 19" in width and 5.25" in height. A Bladed Beowulf can reduce the total cost of ownership (TCO) of a traditional Beowulf by a factor of three while providing Beowulf-like performance. Accordingly, rather than use the traditional definition of price-performance ratio where price is the cost of acquisition, we introduce a new metric called ToPPeR: total price-performance ratio, where total price encompasses TCO. We also propose two related (but more concrete) metrics: performance-space ratio and performance-power ratio.
Title : Honey, I shrunk the Beowulf!
Venue : Proceedings International Conference on Parallel Processing
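As a rough illustration of the three metrics named in the abstract, the sketch below computes a total price-performance ratio, a performance-space ratio, and a performance-power ratio. The class, its field names, the units, and every number are hypothetical placeholders rather than figures from the paper, and TCO is modeled simply as acquisition cost plus operating cost, which is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Cluster:
    """Hypothetical cluster description; every field is an illustrative assumption."""
    acquisition_cost: float  # purchase price, in dollars
    operating_cost: float    # power, cooling, space, administration over the lifetime
    performance: float       # sustained performance, e.g. in Gflops
    floor_space: float       # footprint, in square feet
    power_draw: float        # electrical power, in kilowatts

    def total_cost_of_ownership(self) -> float:
        # Assumed TCO model: acquisition plus lifetime operating cost.
        return self.acquisition_cost + self.operating_cost

    def topper(self) -> float:
        # Total price-performance ratio: TCO per unit of performance.
        return self.total_cost_of_ownership() / self.performance

    def performance_space_ratio(self) -> float:
        return self.performance / self.floor_space

    def performance_power_ratio(self) -> float:
        return self.performance / self.power_draw


example = Cluster(acquisition_cost=30_000.0, operating_cost=10_000.0,
                  performance=2.0, floor_space=5.0, power_draw=0.5)
print(example.topper())                   # 20000.0 dollars per Gflops
print(example.performance_space_ratio())  # 0.4 Gflops per square foot
print(example.performance_power_ratio())  # 4.0 Gflops per kilowatt
```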
Pub Date : 2002-08-18 DOI: 10.1109/ICPP.2002.1040891
A. Doğan, F. Özgüner
Finding an optimal solution to the problem of scheduling an application modeled by a directed acyclic graph (DAG) onto a set of heterogeneous machines is known to be an NP-hard problem. In this study, we present a duplication based scheduling algorithm, namely the levelized duplication based scheduling (LDBS) algorithm, which solves this problem efficiently. The primary goal of LDBS is to minimize the schedule length of applications. LDBS can accommodate different duplication heuristics, thanks to its modular design. Specifically, we have designed two different duplication heuristics with different time complexities. The simulation studies confirm that LDBS is a very competitive scheduling algorithm in terms of minimizing the schedule length of applications.
Title : LDBS: a duplication based scheduling algorithm for heterogeneous computing systems
Venue : Proceedings International Conference on Parallel Processing
Pub Date : 2002-08-18 DOI: 10.1109/ICPP.2002.1040866
Jeffrey Tang, A. Bilas
In this paper, we investigate how system area networks can deal with transient and permanent network failures. We design and implement a firmware-level retransmission scheme to tolerate transient failures and an on-demand network mapping scheme to deal with permanent failures. Both schemes are transparent to applications and are conceptually simple and suitable for low-level implementations, e.g., in firmware. We then examine how the retransmission scheme affects system performance and how various protocol parameters impact system behavior. We analyze and evaluate system performance by using a real implementation on a state-of-the-art cluster and both micro-benchmarks and real applications from the SPLASH-2 suite.
Title : Tolerating network failures in system area networks
Venue : Proceedings International Conference on Parallel Processing
Pub Date : 2002-08-18 DOI: 10.1109/ICPP.2002.1040889
A. Kalyanaraman, S. Aluru, S. Kothari
Expressed sequence tags (ESTs) are DNA molecules experimentally derived from expressed portions of genes. Clustering of ESTs is essential for gene recognition and for understanding important genetic variations such as those resulting in diseases. In this paper, we present the design and development of a parallel software system for EST clustering. To our knowledge, this is the first such effort to address the problem of EST clustering in parallel. The novel features of our approach include 1) design of space-efficient algorithms to keep the space requirement linear in the size of the input data set, 2) a combination of algorithmic techniques to reduce the total work without sacrificing the quality of EST clustering, and 3) use of parallel processing to reduce the run-time and facilitate the clustering of large datasets. Using a combination of these techniques, we report the clustering of 81,414 Arabidopsis ESTs in under 2.5 minutes on a 64-processor IBM SP, a problem that is estimated to take 9 hours of run-time with state-of-the-art software, provided the memory required to run the software can be made available.
Title : Space and time efficient parallel algorithms and software for EST clustering
Venue : Proceedings International Conference on Parallel Processing
Pub Date : 2002-08-18 DOI: 10.1109/ICPP.2002.1040863
R. Prodan, T. Fahringer
This paper describes ZEN, a directive-based language for the specification of arbitrarily complex program executions by varying the problem, system, or machine parameters for parallel and distributed applications. ZEN introduces directives to substitute strings and to insert assignment statements inside arbitrary files, such as program, input, script, or make-files. The programmer thus can invoke experiments for arbitrary value ranges of any problem parameter, including program variables, file names, compiler options, target machines, machine sizes, scheduling strategies, data distributions, etc. The number of experiments can be controlled through ZEN constraint directives. Finally, the programmer may request a large set of performance metrics to be computed for any code region of interest. The scope of ZEN directives can be restricted to arbitrary file or code regions. We implemented a prototype tool for automatic experiment management that is based on ZEN. We report results for the performance analysis of an ocean simulation application and for the parameter study of a computational finance code.
Title : ZEN: a directive-based language for automatic experiment management of distributed and parallel programs
Venue : Proceedings International Conference on Parallel Processing
Pub Date : 2002-08-18 DOI: 10.1109/ICPP.2002.1040865
Hairong Kuang, L. Bic, M. Dillencourt
We describe an environment for the distributed solution of iterative grid-based applications. The environment is built using the MESSENGERS mobile agent system. The main advantage of paradigm-oriented distributed computing is that the user only needs to specify the application-specific sequential code, while the underlying infrastructure takes care of the parallelization and distribution. The two paradigms discussed in this paper are the finite difference method and individual-based simulation. These paradigms present some interesting challenges, both in terms of performance (because they require frequent synchronized communication between nodes) and in terms of repeatability (because the mapping of the user space onto the network may change due to load balancing or due to changes in the underlying logical network). We describe their use, implementation, and performance within a mobile agent-based environment.
Title : Iterative grid-based computing using mobile agents
Venue : Proceedings International Conference on Parallel Processing
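To make "application-specific sequential code" for the finite difference paradigm more concrete, here is a minimal sequential sketch of one common case: Jacobi sweeps of the 5-point Laplace stencil on a small grid. It does not use or imitate the MESSENGERS interface, which the abstract does not describe; the function name, grid size, and boundary values are illustrative assumptions.

```python
def jacobi_step(grid):
    """Return a new grid after one Jacobi sweep of the 5-point Laplace stencil.

    Boundary cells are kept fixed; each interior cell becomes the average
    of its four neighbours from the previous iteration.
    """
    rows, cols = len(grid), len(grid[0])
    new_grid = [row[:] for row in grid]
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            new_grid[i][j] = 0.25 * (grid[i - 1][j] + grid[i + 1][j]
                                     + grid[i][j - 1] + grid[i][j + 1])
    return new_grid


# 4x4 grid: top edge held at 100.0, everything else starts at 0.0.
grid = [[100.0] * 4] + [[0.0] * 4 for _ in range(3)]
for _ in range(50):
    grid = jacobi_step(grid)
print(grid[1][1], grid[2][1])
```

Each sweep reads only the four neighbours of a cell, which is why a distributed version needs the frequent synchronized exchange of boundary values between nodes that the abstract mentions.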
Pub Date : 2002-08-18 DOI: 10.1109/ICPP.2002.1040901
Huanjing Wang, Guangbin Fan, Jingyuan Zhang
Location management deals with how to track mobile users within the cellular network. It consists of two basic operations: location update and paging. The total location management cost is the sum of the location update cost and the paging cost. Location areas and reporting centers are two popular location management schemes. The motivation for the study is the observation that the location update cost difference between the reporting centers scheme and the location areas scheme is small whereas the paging cost in the reporting centers scheme is larger than that in the location areas scheme. The paper compares the performance of the location areas scheme and the reporting centers scheme under aggregate movement behavior mobility models by simulations. Simulation results show that the location areas scheme performs about the same as the reporting centers scheme in two extreme cases, that is, when a few cells or almost all cells are selected as the reporting cells. However, the location areas scheme outperforms the reporting centers scheme at the 100% confidence level with all call-to-mobility ratios when the reporting cells divide the whole service area into several regions.
Title : Performance comparison of location areas and reporting centers under aggregate movement behavior mobility models
Venue : Proceedings International Conference on Parallel Processing
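The cost definition in the abstract (total cost equals location update cost plus paging cost) can be written down as a small sketch. The unit costs, the per-call paging model, and the counts for the two schemes are hypothetical values chosen for illustration, not results from the paper's simulations.

```python
def total_location_cost(num_updates, num_calls, cells_paged_per_call,
                        update_unit_cost=1.0, paging_unit_cost=1.0):
    """Total cost = location update cost + paging cost.

    Unit costs and the per-call paging model are illustrative assumptions.
    """
    update_cost = num_updates * update_unit_cost
    paging_cost = num_calls * cells_paged_per_call * paging_unit_cost
    return update_cost + paging_cost


# Hypothetical counts for two schemes over the same observation period.
location_areas = total_location_cost(num_updates=40, num_calls=10,
                                     cells_paged_per_call=9)
reporting_centers = total_location_cost(num_updates=38, num_calls=10,
                                        cells_paged_per_call=15)
print(location_areas, reporting_centers)  # 130.0 188.0
```

The made-up counts mirror the motivating observation: the update costs differ only slightly, so a larger paging cost dominates the comparison.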
Pub Date : 2002-08-18 DOI: 10.1109/ICPP.2002.1040878
D. Xiang, Ai Chen
A limited-global-safety-information-based metric called local safety is proposed to handle fault-tolerant routing in 2D tori (or meshes). Sufficient conditions for the existence of a minimum feasible path between the source and destination are presented based on local safety information in a 2D torus network. An efficient heuristic function is defined to guide fault-tolerant routing inside a 2D torus network. Unlike the conventional methods based on the block fault model, our method does not disable any fault-free nodes, and fault-free nodes inside a fault block can still be a source or a destination, which can greatly increase throughput and computational power of the system. Techniques for avoidance of deadlocks are introduced. Extensive simulation results are presented.
Title : Fault-tolerant routing in 2D tori or meshes using limited-global-safety information
Venue : Proceedings International Conference on Parallel Processing
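For context on what "minimum path" means in these topologies, the sketch below computes the minimum hop count between two nodes in a fault-free 2D torus and a fault-free 2D mesh. It does not implement the local safety metric or the routing heuristic, which the abstract does not define; the function names and coordinates are illustrative assumptions.

```python
def torus_distance(src, dst, width, height):
    """Minimum hop count between two nodes of a fault-free width x height 2D torus."""
    dx = abs(src[0] - dst[0])
    dy = abs(src[1] - dst[1])
    # Wraparound links let a packet go the short way around each ring.
    return min(dx, width - dx) + min(dy, height - dy)


def mesh_distance(src, dst):
    """Minimum hop count in a fault-free 2D mesh (no wraparound links)."""
    return abs(src[0] - dst[0]) + abs(src[1] - dst[1])


print(torus_distance((0, 0), (7, 1), width=8, height=8))  # 2
print(mesh_distance((0, 0), (7, 1)))                      # 8
```

A fault-tolerant router that still delivers along a path of this length despite faults is using a minimum feasible path in the sense of the abstract.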
Pub Date : 2002-08-18 DOI: 10.1109/ICPP.2002.1040917
Dakai Zhu, Nevine AbouGhazaleh, D. Mossé, R. Melhem
Power aware computing has become popular recently and many techniques have been proposed to manage the energy consumption for traditional real-time applications. We have previously proposed (2001) two greedy slack sharing scheduling algorithms for such applications on multi-processor systems. In this paper, we are concerned mainly with real-time applications that have different execution paths consisting of different numbers of tasks. The AND/OR graph model is used to represent the application data dependence and control flow. The contribution of this paper is twofold. First, we extend our greedy slack sharing algorithm for traditional applications to deal with applications represented by AND/OR graphs. Then, using the statistical information about the applications, we propose a few variations of speculative scheduling algorithms that intend to save energy by reducing the number of speed changes (and thus the overhead) while ensuring that the applications meet the timing constraints. The performance of the algorithms is analyzed with respect to energy savings. The results obtained show that the greedy scheme is better than some speculative schemes and that the greedy scheme is good enough when a reasonable minimal speed exists in the system.
Title : Power aware scheduling for AND/OR graphs in multiprocessor real-time systems
Venue : Proceedings International Conference on Parallel Processing
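The general idea of trading slack for energy can be sketched as follows. This is not the paper's greedy or speculative algorithm: it only shows a single task being slowed down to fill its slack, under the common textbook assumption that power grows with the cube of the normalized speed, so energy for a fixed amount of work grows with the square. The function names and the minimum speed of 0.1 are hypothetical.

```python
def scaled_speed(worst_case_time, slack, min_speed=0.1):
    """Normalized speed (1.0 = full speed) that stretches a task over its
    worst-case time plus the available slack, never below min_speed."""
    speed = worst_case_time / (worst_case_time + slack)
    return max(speed, min_speed)


def relative_energy(speed):
    """Energy relative to full speed for a fixed cycle count,
    assuming power is proportional to speed ** 3."""
    return speed ** 2


speed = scaled_speed(worst_case_time=4.0, slack=4.0)
print(speed, relative_energy(speed))  # 0.5 0.25
```

The `min_speed` floor corresponds loosely to the "reasonable minimal speed" mentioned in the abstract: once the floor is reached, extra slack yields no further savings in this simplified model.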