Pub Date: 1998-12-14 DOI: 10.1109/ICPADS.1998.741160
Q. Gu, S. Peng
One of the fundamental routing problems in computer and communication networks is to find a path from a source node s to a target node t. In an n-connected network, a nonfaulty path from s to t exists if there are at most n-1 faulty nodes; however, n faulty nodes can disconnect the network. Since connectivity is a worst-case measure that is unlikely to be reached in practice, it is important to develop routing algorithms for the case where more than n-1 faulty nodes are present. We propose algorithms for finding a routing path from s to t in a hypercube with a large number of faulty nodes. Let $H_n$ be the n-dimensional hypercube and $H_n/F$ be the reduced graph obtained by removing the nodes of F from $H_n$. The reduced graph $H_n/F$ is called k-safe if each node of $H_n/F$ has degree at least k. Our first algorithm, given a set F of faulty nodes in $H_n$ such that $|F| \le 2^k(n-k)-1$ and $H_n/F$ is k-safe for $0 \le k \le n/2$, and $s, t \in H_n/F$, finds a fault-free path $s \to t$ of length $d(s,t)+O(k^2)$ in optimal $O(|F|+n)$ time, where $d(s,t)$ is the distance between s and t. We show that a lower bound on the length of the nonfaulty path $s \to t$ is $d(s,t)+2(k+1)$ for $0 \le k \le n/2$. Furthermore, for k=1 and k=2, we give $O(n)$ time algorithms that find a nonfaulty path $s \to t$ of length at most $d(s,t)+4$ and $d(s,t)+6$, respectively, which matches the lower bound.
Title: Routing in hypercubes with large number of faulty nodes. Published in: Proceedings 1998 International Conference on Parallel and Distributed Systems (Cat. No.98TB100250).
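The k-safe condition and the notion of a fault-free path can be made concrete with a small sketch. It assumes nodes of H_n are encoded as n-bit integers and uses a plain breadth-first search over the whole reduced graph, which is exponential in n; it is not the O(|F|+n) algorithm of the abstract. The function names are illustrative only.

```python
from collections import deque

def neighbors(v, n):
    """Yield the n neighbors of node v in the n-dimensional hypercube H_n."""
    for i in range(n):
        yield v ^ (1 << i)

def distance(s, t):
    """Hamming distance d(s, t) between two hypercube nodes."""
    return bin(s ^ t).count("1")

def is_k_safe(n, faulty, k):
    """True if every nonfaulty node keeps at least k nonfaulty neighbors."""
    for v in range(1 << n):
        if v in faulty:
            continue
        if sum(1 for u in neighbors(v, n) if u not in faulty) < k:
            return False
    return True

def fault_free_path(n, faulty, s, t):
    """Shortest fault-free path from s to t found by BFS, or None."""
    parent = {s: None}
    queue = deque([s])
    while queue:
        v = queue.popleft()
        if v == t:
            path = []
            while v is not None:
                path.append(v)
                v = parent[v]
            return path[::-1]
        for u in neighbors(v, n):
            if u not in faulty and u not in parent:
                parent[u] = v
                queue.append(u)
    return None

# Example instance (an assumption for illustration): in H_4, three of the
# four neighbors of node 0000 are faulty, so the reduced graph is 1-safe
# but not 2-safe, and every path out of 0000 must start with a detour.
F = {0b0001, 0b0010, 0b0100}
path = fault_free_path(4, F, 0b0000, 0b0111)
print(is_k_safe(4, F, 1), is_k_safe(4, F, 2))
print(len(path) - 1, "hops; d(s,t) =", distance(0b0000, 0b0111))
```

In this instance the detour costs two extra hops, which stays within the d(s,t)+4 bound stated for k=1.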
Pub Date: 1998-12-14 DOI: 10.1109/ICPADS.1998.741159
C. Sue, S. Kuo
In this paper, we propose a synchronous flow control mechanism for wormhole-routed optical networks. The shorter routing delay and smaller buffer requirement of wormhole routing are expected to be significant benefits in optical networks. Unlike traditional bidirectional asynchronous back-pressure flow control, the proposed flow control is unidirectional and synchronous. The size of the synchronized control slot does not depend on the routing path length; its number of bits is a constant equal to the total number of virtual channels and nodes. The proposed flow control takes advantage of the restricted order in which channels are accessed in deadlock-free routing to broadcast their control information in a corresponding restricted order. Furthermore, to reduce the buffer size to only one unit, the virtual channels that share the same physical channel must be able to transmit data simultaneously. The low channel utilization induced by this mechanism is overcome by our modified source routing. In summary, this paper introduces a flow control mechanism that brings the benefits of wormhole routing to resource-limited optical networks.
Title: Synchronous flow control in wormhole routed optical networks. Published in: Proceedings 1998 International Conference on Parallel and Distributed Systems (Cat. No.98TB100250).
Pub Date: 1998-12-14 DOI: 10.1109/ICPADS.1998.741022
Shin-ya Kobayashi, Koji Matsuura
Permanent timestamp ordering is one of the concurrency control mechanisms for distributed database systems. It guarantees that transactions are processed in their order of arrival. Under this method, a transaction is temporarily committed once it has been processed, and it is truly committed when its timestamp becomes less than the least timestamp of the transactions still being processed. In this paper, we propose a hierarchical commitment algorithm for permanent timestamp ordering, and we show that the new algorithm is superior to traditional centralized or distributed commitment algorithms with respect to transaction response time.
Title: Hierarchical commitment algorithm for permanent time stamp ordering. Published in: Proceedings 1998 International Conference on Parallel and Distributed Systems (Cat. No.98TB100250).
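The commit rule stated in the abstract can be sketched as follows. This covers only the rule itself (a temporarily committed transaction becomes truly committed once its timestamp is below the least timestamp still being processed); how the hierarchical algorithm gathers that least timestamp is not described in the abstract, so the flat per-site dictionary and the function names here are assumptions. Treating an empty processing set as "everything may commit" is also an assumption.

```python
def least_processing_timestamp(processing_by_site):
    """Least timestamp among transactions still being processed at any site.

    processing_by_site: hypothetical mapping site name -> list of timestamps.
    Returns None when no transaction is being processed anywhere.
    """
    all_ts = [ts for stamps in processing_by_site.values() for ts in stamps]
    return min(all_ts) if all_ts else None

def truly_committable(temp_committed, processing_by_site):
    """Timestamps of temporarily committed transactions that may now be
    truly committed under the rule quoted above."""
    low = least_processing_timestamp(processing_by_site)
    if low is None:
        # Assumption: with nothing in progress, nothing can block a commit.
        return sorted(temp_committed)
    return sorted(ts for ts in temp_committed if ts < low)

# Example: sites A and B are still processing transactions 12, 20 and 15,
# so only temporarily committed transactions older than 12 can be finalized.
sites = {"A": [12, 20], "B": [15]}
print(truly_committable([5, 11, 13, 30], sites))
```

Transactions 13 and 30 stay temporarily committed until transaction 12 (and, for 30, also 15 and 20) finishes processing.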
Pub Date: 1998-12-14 DOI: 10.1109/ICPADS.1998.741136
P. Kopriva, P. Tvrdík
This paper deals with the problem of decomposing de Bruijn graphs into isomorphic building blocks based on cover sets. The aim is to find so-called lowest-cost cover sets, which provide decompositions into building blocks with a minimal number of external edges. We present formulae for the costs of basic covers, and we give new results on lowest-cost cover set design and on the topology of graphs of building blocks based on basic cover sets. We also discuss several open problems.
Title: Decompositions of de Bruijn networks. Published in: Proceedings 1998 International Conference on Parallel and Distributed Systems (Cat. No.98TB100250).
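To make the cost measure concrete, the sketch below builds a binary de Bruijn graph and counts the external edges of a given partition into blocks. The abstract does not define cover sets, so the two partitions used here (grouping nodes by their high bits or by their low bits) are arbitrary illustrations, not cover-set decompositions from the paper, and no isomorphism of blocks is checked.

```python
def de_bruijn_edges(n):
    """Directed edges of the binary de Bruijn graph on 2**n nodes.

    Node v (an n-bit word) has edges to the words obtained by shifting
    left by one position and appending bit 0 or 1.
    """
    size = 1 << n
    return [(v, ((v << 1) & (size - 1)) | b) for v in range(size) for b in (0, 1)]

def external_edges(n, block_of):
    """Number of edges whose endpoints lie in different blocks.

    block_of: function mapping a node to a block label (a hypothetical
    partition supplied by the caller).
    """
    return sum(1 for u, v in de_bruijn_edges(n) if block_of(u) != block_of(v))

# Two illustrative partitions of the 8-node graph into four 2-node blocks.
by_high_bits = lambda v: v >> 1   # blocks {0,1}, {2,3}, {4,5}, {6,7}
by_low_bits = lambda v: v & 3     # blocks {0,4}, {1,5}, {2,6}, {3,7}
print(len(de_bruijn_edges(3)))
print(external_edges(3, by_high_bits), external_edges(3, by_low_bits))
```

A lowest-cost decomposition in the sense of the abstract would be one that minimizes the count returned by `external_edges` among the admissible decompositions.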
Pub Date: 1998-12-14 DOI: 10.1109/ICPADS.1998.741086
S. Lang, L. Mao
We extend a torus-based coterie structure for distributed mutual exclusion to allow up to k simultaneous entries to a critical section. In the original coterie, the system nodes are logically arranged in a rectangle, called a torus, in which the last row (column) is followed by the first row (column) using end wraparound. A torus quorum consists of a head and a tail, where the head contains one entire row and the tail contains one node from each of the s succeeding rows; $s \ge 1$ is a system parameter. It has been shown that by setting s=[h/2], where h is the number of rows, the collection of torus quorums forms an equal-sized, equal-responsibility coterie. In this paper we propose two extensions to k-coteries: the Div-Torus method divides the system nodes into k clusters and runs a separate instance of a torus coterie in each cluster; the k-Torus method uses quorums with tail length s=[h/(k+1)]. We compare the quorum size and quorum availability of the two proposed methods with each other and with the DIV method, which is based on majority quorums in each of the k divided clusters, assuming constant node reliability. Numerical data demonstrate that DIV and Div-Torus have similar system availability, better than that of k-Torus, although the availability of all three methods becomes comparable when the node reliability is higher than 0.9. However, Div-Torus has the smallest quorum size and k-Torus the second smallest, which can reduce network traffic when requesting permissions from a quorum.
Title: A comparison of two torus-based k-coteries. Published in: Proceedings 1998 International Conference on Parallel and Distributed Systems (Cat. No.98TB100250).
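The head-and-tail construction of a torus quorum can be sketched as below for the original (single-entry) coterie. Nodes are assumed to be (row, column) pairs on an h-by-w grid, and the bracket in s=[h/2] is read as the floor, which is an assumption since the abstract's notation is ambiguous. The sketch enumerates all quorums of a small torus and checks by brute force that any two of them intersect.

```python
from itertools import product

def torus_quorum(h, w, row, tail_cols):
    """Build one torus quorum.

    Head: every node of `row`.
    Tail: node (row + i, tail_cols[i - 1]) for i = 1..s, with the row index
    wrapping around modulo h, where s = len(tail_cols).
    """
    head = {(row, c) for c in range(w)}
    tail = {((row + i) % h, c) for i, c in enumerate(tail_cols, start=1)}
    return head | tail

def all_quorums(h, w, s):
    """Every quorum of an h-by-w torus with tail length s."""
    return [torus_quorum(h, w, row, cols)
            for row in range(h)
            for cols in product(range(w), repeat=s)]

def pairwise_intersecting(quorums):
    """True if every two quorums share at least one node."""
    return all(a & b for a in quorums for b in quorums)

h, w = 4, 3
full = all_quorums(h, w, h // 2)   # s = floor(h/2), an assumed reading
short = all_quorums(h, w, 1)       # tail too short for mutual exclusion
print(len(full), {len(q) for q in full})
print(pairwise_intersecting(full), pairwise_intersecting(short))
```

With s = floor(h/2) the tail of one quorum always reaches the head row of the other, which is why the intersection check succeeds; with s = 1 on four rows, quorums headed at rows 0 and 2 can be disjoint. Each quorum has w + s nodes.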
Pub Date: 1998-12-14 DOI: 10.1109/ICPADS.1998.741119
S.-K. Cheng, R.-Ming Shiu, J. Shann
In the new generation of x86 microprocessors, superscalar techniques are used to achieve higher performance by executing multiple instructions in parallel. For compatibility and higher execution parallelism, the decoding units of these microprocessors translate x86 instructions into primitive operations. These microprocessors all translate x86 instructions in a similar way, merging address generation into load/store operations. We develop a new translation strategy that translates address generation into isolated operations. Simulation results show that, in decoding units with a high issue rate, translating address generation into isolated operations improves performance by 20% to 25%. In addition, we find that enhancing the store buffer with the ability to snoop the result buses is important for decoding units with a high issue rate. Furthermore, considering the tradeoff between hardware cost and performance, we examine the decoding rules used to design a decoding unit. Based on the simulation results, we suggest a decoding rule suitable for current commercial programs.
Title: Decoding unit with high issue rate for x86 superscalar microprocessors. Published in: Proceedings 1998 International Conference on Parallel and Distributed Systems (Cat. No.98TB100250).
Pub Date: 1998-12-14 DOI: 10.1109/ICPADS.1998.741115
S. O. Hwang
This paper presents a new payment scheme for the Internet, the electronic exchange check system, based on Nguyen's (1997) scheme and Hwang's (1997) scheme. The electronic exchange check system is a kind of electronic cash system with a pre-designated receiver and effective time. The scheme satisfies all the basic security and privacy requirements for an electronic payment scheme. In particular, it provides prior restraint of double-spending as well as after-the-fact detection of the identity of a double-spender.
Title: Electronic exchange check system on the Internet. Published in: Proceedings 1998 International Conference on Parallel and Distributed Systems (Cat. No.98TB100250).
Pub Date: 1998-12-14 DOI: 10.1109/ICPADS.1998.741144
Renhao Lin, J.-M. Lin, H. J. Jiau
CORBA is becoming the most important middleware supporting the object-oriented and client/server paradigms in distributed computing systems. However, application systems based on CORBA are still scarce. One main reason is that only a few CORBA object services have been developed. Without additional support mechanisms, a programmer who wants a new CORBA application has to design a program with a CORBA interface from scratch. In our previous work (K. Liang et al., 1997), a reengineering approach was proposed to convert RPC-based programs to CORBA objects, which successfully sped up the development of CORBA applications. However, that approach requires the source code. In many cases software designers cannot obtain the source code, so it is necessary to adapt existing PC software applications, particularly Windows applications, in binary form to the object services under CORBA. Our study addresses this problem. A graphical factory temperature monitoring system, which integrates MS-Excel under MS-Windows, has been implemented to demonstrate the feasibility of our approach.
Title: Reusing MS-Windows software applications under CORBA environment. Published in: Proceedings 1998 International Conference on Parallel and Distributed Systems (Cat. No.98TB100250).
Pub Date: 1998-12-14 DOI: 10.1109/ICPADS.1998.741094
C. Yeh, Emmanouel Varvarigos, Hua Lee
The dynamic broadcast problem is the communication problem in which source packets to be broadcast to all other nodes are generated at each node of a parallel computer according to a random process, such as a Poisson process. The lower bound on the average reception delay required by any oblivious dynamic broadcast algorithm in a d-dimensional hypercube is $\Omega(d + 1/(1-\rho))$ when packets are generated according to a Poisson process, where $\rho$ is the load factor. The best previous algorithms for hypercubes only achieve $\Omega(d/(1-\rho))$ average reception delay. In this paper, we propose dynamic broadcast algorithms that require optimal $O(d + 1/(1-\rho))$ average reception delay in d-dimensional hypercubes and $n_1 \times n_2 \times \cdots \times n_d$ tori with $n_i = O(1)$. We apply the proposed broadcast scheme to a variety of other network topologies for efficient dynamic broadcast and present several methods for assigning priority classes to packets.
Title: An optimal routing scheme for multiple broadcast. Published in: Proceedings 1998 International Conference on Parallel and Distributed Systems (Cat. No.98TB100250).
Pub Date: 1998-12-14 DOI: 10.1109/ICPADS.1998.741146
Xinan Tang, G. Gao
In order to program multithreaded architectures effectively, compiler support to automatically partition programs into threads is essential. This paper proposes a remote-path-based thread partitioning framework, which can generate low-level threads from procedural programs automatically. The framework has been implemented in the EARTH-C compiler, which uses a data dependence graph (DDG) as an intermediate representation for thread partitioning. To make the compiler work fast, a practical $O(n^2)$ algorithm is designed to build a non-redundant DDG. To generate correct and efficient threaded code, the remote path heuristic is employed to satisfy thread partitioning constraints and to schedule threads to run quickly. Experimental results show that the DDG building algorithm is fast and the remote-path-based heuristic is very effective in partitioning programs into "optimized" threads.
Title: Automatically partitioning threads based on remote paths. Published in: Proceedings 1998 International Conference on Parallel and Distributed Systems (Cat. No.98TB100250).