Pub Date: 1998-12-14 | DOI: 10.1109/ICPADS.1998.741109
Symbolic partitioning and scheduling of parameterized task graphs
M. Cosnard, E. Jeannot, Tao Yang
The DAG-based task graph model has been found effective in scheduling for performance prediction and optimization of parallel applications. However, the scheduling complexity and the resulting schedule normally depend on the problem size. We propose a symbolic scheduling scheme for a parameterized task graph that models coarse-grain DAG parallelism independently of the problem size. The algorithm first derives symbolic clusters of tasks in order to minimize communication while preserving parallelism, and then evenly assigns the task clusters to processors. The run-time system executes the clusters on each processor in a multithreaded fashion. The paper also presents preliminary experimental results that demonstrate the effectiveness of our techniques.
Pub Date: 1998-12-14 | DOI: 10.1109/ICPADS.1998.741154
Probability based replacement algorithm for WWW server arrays
A. Yeung, K. Suen
This paper describes a scalable Web server array architecture that uses a caching policy called the probability-based replacement (PBR) algorithm. The server array consists of a central server and several Web servers. The central server stores the whole document set and sends user-requested documents to the Web servers using a selective broadcast technique. Web documents are cached in the Web servers and replaced according to the PBR algorithm. A performance comparison between PBR server arrays and purely mirrored Web servers is carried out using NASA and ClarkNet access logs. The results show that with 10% document caching, the maximum throughput of the PBR array is nearly the same as that of mirrored Web servers, while the PBR array requires much less disk storage in the Web servers. The PBR server arrays are also much more scalable than the mirrored ones.
Pub Date: 1998-12-14 | DOI: 10.1109/ICPADS.1998.741025
The distributed program reliability analysis on star topologies
M. Chang, Deng-Jyi Chen, Min-Sheng Lin, K. Ku
We show that computing distributed program reliability on a star distributed computing system is NP-hard. We then identify a polynomially solvable case in which the file distribution on the star topology satisfies an additional restriction. We also propose a polynomial-time algorithm that computes an approximate distributed program reliability when the file distribution does not satisfy this restriction.
Pub Date: 1998-12-14 | DOI: 10.1109/ICPADS.1998.741041
Shrinking timestamp sizes of event ordering protocols
A. Mostéfaoui, Oliver E. Theel
Almost all published work on causal ordering mechanisms assumes theoretically unbounded counters for timestamps, ignoring the real-world problem that arises when one actually wants an operable implementation, since unbounded counters simply cannot be realized. A commonly encountered justification states that the counter size can be chosen such that counters practically never overflow or wrap around. For example, using matrix timestamps in a distributed computation involving no more than 50 processes, with 32 bits per integer, results in a timestamp size of almost 10 Kbytes. We present a solution, called the Factorized Timestamp Approach (FTA), that substantially reduces the amount of piggybacked control information. It is based on the notion of phases, within which much smaller timestamps are used. Simulation results given in the paper show the suitability of this approach.
Pub Date: 1998-12-14 | DOI: 10.1109/ICPADS.1998.741023
Implementation and evaluation of the parallel Mesa library
T. Mitra, T. Chiueh
Describes the implementation and performance evaluation of a 3D graphics library that can be readily linked with parallel applications to provide run-time visualization on large-scale message-passing parallel machines, such as the Intel Paragon. The prototype implementation is currently fully operational, and is based on Mesa, a public-domain OpenGL implementation, and on a sort-last parallelization strategy. Through a detailed performance analysis, we show that the scalability of the current prototype is close to the theoretical limit for the given hardware architecture. We have also developed a unified framework to describe parallel compositing algorithms and show that two popular parallel compositing algorithms, binary swapping and parallel pipeline compositing, are just two extreme instances of this framework. Such a framework is important because it allows users to tailor the compositing algorithm according to the computation/communication characteristics of specific parallel machines by tuning the parameters appropriately. The current parallel Mesa library prototype implements such a parameterizable family of compositing algorithms.
Pub Date: 1998-12-14 | DOI: 10.1109/ICPADS.1998.741117
A share assignment method to maximize the probability of secret sharing reconstruction under the Internet
Ching-Yun Lee, Y. Yeh, Deng-Jyi Chen, K. Ku
The use of the Internet for various business applications and resource sharing has grown tremendously over the last few years. In some applications, for security reasons an important document may have to be divided into pieces that are allocated to different locations over the Internet; examples include a map that gives access to a military base or a key used to issue a military order or command. To access such a document, one must reconstruct it from the divided pieces stored at different locations. In this paper, a probability model for reconstructing a shared secret over the Internet is proposed, and the problem of assigning the divided shares to different locations is studied. In particular, algorithms are proposed to perform the share assignment and to reconstruct the original secret from the divided pieces.
Pub Date: 1998-12-14 | DOI: 10.1109/ICPADS.1998.741114
Allocation time-based processor allocation scheme for 2D mesh architecture
Xiaomei Zhu, Wei-Ming Lin
The mesh is a widely used architecture in parallel computing systems, and efficient allocation of processors to incoming tasks on a mesh is very important for achieving high performance. The processor allocation strategy proposed in this paper is based on a well-known boundary search approach and treats allocation-time similarity as an additional primary decision-making factor. The proposed technique employs a novel heuristic that, whenever feasible, allocates tasks with similar allocation times to adjacent submeshes. This is expected to alleviate the external fragmentation problem, leading to better utilization and shorter task waiting times. Our simulation results demonstrate a substantial improvement.