Pub Date : 2002-08-07 DOI: 10.1109/ISPAN.2002.1004280
K. Qiu, Sajal K. Das
Interconnection networks of various topologies are used in parallel computing. It is important to study the graph theoretical/combinatorial properties of the underlying networks in order to better understand them and develop more efficient parallel algorithms as well as fault-tolerant communication/routing algorithms. In this paper, we approach this problem from a new angle by looking into the spectra (eigenvalues and their multiplicities) of these networks. Eigenvalues of the adjacency matrix of a graph can reveal certain properties of the graph since they are closely related to some of its combinatorial invariants. Specifically, for some of the popular interconnection networks, we study their eigenvalues and multiplicities by (1) summarizing the currently available results; (2) deriving some of these results in a more straightforward way; (3) obtaining new results; and (4) presenting experimental results on several interconnection networks. In addition, we briefly survey the results that relate spectra of graphs to their structural properties. Although much work remains to be done, by looking into the spectra of interconnection networks, we hope to bring about a more unified approach to studying their topological properties.
Title : Interconnection networks and their eigenvalues
Venue : Proceedings International Symposium on Parallel Architectures, Algorithms and Networks. I-SPAN'02
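As a small illustration of what the spectrum of an interconnection network looks like, the sketch below uses the n-dimensional hypercube. The hypercube is chosen here as an assumed example and is not taken from the paper's text. The sketch relies on the standard fact that the hypercube's adjacency eigenvalues are n - 2i with multiplicity C(n, i), and it checks this by verifying that each Walsh character is an eigenvector. The function names are hypothetical.

```python
from collections import Counter


def hypercube_neighbors(x, n):
    """Neighbors of node x in the n-cube: flip each of the n address bits."""
    return [x ^ (1 << b) for b in range(n)]


def hypercube_spectrum(n):
    """Return {eigenvalue: multiplicity} for the adjacency matrix of the n-cube.

    For every bit mask s, the vector v[x] = (-1)^popcount(s & x) is checked to
    be an eigenvector with eigenvalue n - 2 * popcount(s). The 2^n such vectors
    are mutually orthogonal, so together they account for the whole spectrum.
    """
    size = 1 << n
    spectrum = Counter()
    for s in range(size):
        v = [1 - 2 * (bin(s & x).count("1") % 2) for x in range(size)]
        lam = n - 2 * bin(s).count("1")
        for x in range(size):
            # Row x of (A v): sum of v over the neighbors of x.
            av = sum(v[y] for y in hypercube_neighbors(x, n))
            assert av == lam * v[x]
        spectrum[lam] += 1
    return dict(spectrum)


if __name__ == "__main__":
    print(hypercube_spectrum(3))
```

The check is brute force and only practical for small n, but it needs no linear-algebra library and makes the eigenvalue/multiplicity pairing explicit.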
Pub Date : 2002-08-07 DOI: 10.1109/ISPAN.2002.1004265
Jie Wu, Fei Dai
Routing based on a connected dominating set is a promising approach, where the search space for a route is reduced to the hosts in the set. A set is dominating if all the hosts in the system are either in the set or neighbors of hosts in the set. In this paper we first review the marking process, a distributed method for forming a connected dominating set, and dominating-set-based routing. Then we propose several ways to reduce the size of the dominating set and study the locality of the dominating set in ad hoc wireless networks with switch-on/off operations. Results show that the dominating set derived from the marking process exhibits good locality properties; i.e., a change in the status of a host, gateway (dominating) or non-gateway (dominated), affects only the status of hosts in a restricted vicinity.
Title : On locality of dominating set in ad hoc networks with switch-on/off operations
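The definition of a dominating set given in the abstract can be checked mechanically. The sketch below also includes a marking rule, under the assumption that the marking process marks a host as a gateway when it has two neighbors that are not directly connected to each other; the abstract does not spell out the rule, so treat that part as an assumption. The graph representation (a dict of neighbor sets) and the function names are hypothetical.

```python
def is_dominating(adj, subset):
    """True if every host is in `subset` or has a neighbor in `subset`."""
    return all(v in subset or bool(adj[v] & subset) for v in adj)


def marking_process(adj):
    """Mark a host as a gateway if two of its neighbors are not adjacent.

    Assumed rule, applied to the whole graph at once; in a distributed
    setting each host would decide from its own neighborhood information.
    """
    marked = set()
    for v, nbrs in adj.items():
        nb = sorted(nbrs)
        has_unconnected_pair = any(
            b not in adj[a] for i, a in enumerate(nb) for b in nb[i + 1:]
        )
        if has_unconnected_pair:
            marked.add(v)
    return marked


if __name__ == "__main__":
    # A path of four hosts: 0 - 1 - 2 - 3
    path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
    gateways = marking_process(path)
    print(gateways, is_dominating(path, gateways))
```

Because each host's decision depends only on its neighbors and their adjacency, switching one host on or off can only change the marks of hosts near it, which is the kind of locality the abstract refers to.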
Pub Date : 2002-08-07 DOI: 10.1109/ISPAN.2002.1004267
Yen-Chun Lin, Jun-Wei Hsiao
Prefix computation has many applications, and should be implemented as a primitive operation. Many combinational circuits for performing the prefix operation in parallel, called parallel prefix circuits, have been designed and studied. The size of a prefix circuit D, s(D), is the number of operation nodes in D, and the depth of D, d(D), is the maximum level of operation nodes in D. Smaller depth implies faster computation, while smaller size implies less power consumption and smaller area in VLSI implementation and thus less cost. D is depth-size optimal if d(D)+s(D)=2n-2, where n is the number of inputs. Another circuit parameter is fan-out. A circuit having a smaller fan-out is faster and smaller in VLSI implementation. Thus, a circuit should have a small fan-out for it to be of practical use. In this paper, we take a new approach to designing a depth-size optimal parallel prefix circuit, WE4, with fan-out 4 and small depth. For many values of n, WE4 has the smallest depth among all known prefix circuits.
Title : A new approach to constructing optimal prefix circuits with small depth
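The size, depth, and depth-size optimality condition d(D)+s(D)=2n-2 defined in the abstract can be computed for small circuits. The sketch below is not the WE4 construction; it only models a prefix circuit as an ordered list of operation nodes, where a pair (i, j) is assumed to mean "replace line j with line i combined with line j". That representation and the function names are hypothetical.

```python
def evaluate(ops, xs, op):
    """Apply the operation nodes in order and return the values on all lines."""
    vals = list(xs)
    for i, j in ops:
        vals[j] = op(vals[i], vals[j])
    return vals


def size_and_depth(ops, n):
    """Size is the number of operation nodes; depth is their maximum level."""
    level = [0] * n  # level of the latest operation node on each line
    for i, j in ops:
        level[j] = max(level[i], level[j]) + 1
    return len(ops), max(level)


def is_depth_size_optimal(ops, n):
    s, d = size_and_depth(ops, n)
    return d + s == 2 * n - 2


if __name__ == "__main__":
    # Serial circuit on 5 inputs: size 4, depth 4.
    serial = [(i, i + 1) for i in range(4)]
    # A shallower 4-input circuit: size 4, depth 2.
    shallow = [(0, 1), (2, 3), (1, 2), (1, 3)]
    print(size_and_depth(serial, 5), is_depth_size_optimal(serial, 5))
    print(size_and_depth(shallow, 4), is_depth_size_optimal(shallow, 4))
    print(evaluate(shallow, [1, 2, 3, 4], lambda a, b: a + b))
```

Both example circuits satisfy d(D)+s(D)=2n-2, which shows that the condition alone does not favor small depth; the abstract's goal is to meet it while also keeping depth and fan-out small.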
Pub Date : 2002-08-07 DOI: 10.1109/ISPAN.2002.1004282
Sunghyun Jee, K. Palaniappan
The dynamically instruction-scheduled VLIW (DISVLIW) processor architecture is designed for balancing scheduling effort more evenly between the compiler and the processor. The DISVLIW instruction format is augmented to allow dependency bit vectors to be placed in the same VLIW word. Dependency bit vectors are added to each instruction format within long instructions to enable synchronization between prior and subsequent instructions. The DISVLIW processor dynamically schedules each instruction in long instructions using functional unit and dynamic scheduler pairs. Each dynamic scheduler dynamically checks for data dependencies and resource collisions while scheduling each instruction. Features such as explicit parallelism, balanced scheduling effort and dynamic scheduling can be used to provide a sound infrastructure for supercomputing. We simulate the DISVLIW architecture and show that the DISVLIW processor performs significantly better than the VLIW processor for a wide range of cache sizes and across numerical benchmark applications.
Title : Compiler processor tradeoffs for DISVLIW architecture
Pub Date : 2002-08-07 DOI: 10.1109/ISPAN.2002.1004257
Thanasis Loukopoulos, D. Papadias, I. Ahmad
The proliferation of the Internet is leading to high expectations of fast turnaround time. Clients abandoning their connections due to excessive downloading delays translates directly into profit losses. Hence, minimizing the latency perceived by end-users has become the primary performance objective compared to more traditional issues, such as server utilization. Two promising techniques for improving Internet responsiveness are caching and replication. In this paper we present an overview of recent research in replication. We begin by arguing for the important role of replication in decreasing client-perceived response time and illustrate the main topics that affect its successful deployment on the Internet. We analyze and characterize existing research, providing taxonomies and classifications whenever possible. Our discussion reveals several open problems and research directions.
Title : An overview of data replication on the Internet
Pub Date : 2002-08-07 DOI: 10.1109/ISPAN.2002.1004298
S. Yeh, Chang-Biau Yang, Hon-Chan Chen
The concept of a safety vector can guide efficient fault-tolerant routing on interconnection networks. The safety vector on a hypercube is based on the distance between a pair of nodes. However, the distance measure cannot be applied on star graphs directly, since there are many routing path patterns when the distances between two pairs of nodes are the same. Thus, on star graphs, we define the safety vector based on the routing path patterns. Based on this concept of routing path patterns, we first define an undirected safety vector, which is a 1D vector on each node. In addition, we propose some methods for solving some problems concerning the safety vectors of the star graph, such as the length of the safety vectors and the ranking of the routing path patterns.
Title : Fault-tolerant routing on the star graph with safety vectors
Pub Date : 2002-08-07 DOI: 10.1109/ISPAN.2002.1004303
Tzu-Lun Huang, Der-Tsai Lee
In this paper we propose a delay-constrained distributed multicast routing algorithm based on token passing. This algorithm is fully distributed and generates a multicast routing tree, which not only meets the real-time requirement, but also has a sub-optimal network cost. Simulation results show that the multicast routing tree generated by our algorithm performs better than those in previously known results.
Title : A distributed multicast routing algorithm for real-time applications in wide area networks
Pub Date : 2002-08-07 DOI: 10.1109/ISPAN.2002.1004259
James A. Esquivel, Philip K. Chan
The use of a page-level join index in parallel join optimization requires a proper sequence for accessing data pages in the form of join components. The current approach to this method involves a strategy that first retrieves those components with a high number of page joins so as to keep all processors busy early in the join execution. However, problems regarding conflicts with other valid reading strategies and the choice of an appropriate component whenever several of them satisfy the selection criterion have not been specifically addressed. We call such conflicts the join component selection (JCS) problem. To resolve this problem, this paper proposes appropriate component retrieval strategies that will further optimize the parallel join execution. Simulation results demonstrate an improvement over the existing strategy.
Title : An algorithm for resolving the join component selection problem in parallel join optimization
Pub Date : 2002-08-07 DOI: 10.1109/ISPAN.2002.1004307
Sasthi C. Ghosh, B. Sinha, Nabanita Das
We first introduce the notion of a critical block of a hexagonal cellular network with 2-band buffering, where the channel interference does not extend beyond two cells. For a network with a given demand vector and frequency separation constraints, we present an algorithm for finding its critical block. We introduce a novel idea of partitioning the critical block into several smaller sub-networks with homogeneous demands, which provides an elegant way of assigning frequencies to the critical block. This idea of partitioning is then extended to the frequency assignment for the rest of the network. The proposed algorithm provides an optimal assignment for eight well-known benchmark instances, including the two most difficult ones. It is shown to be superior to the existing frequency assignment algorithms reported so far, in terms of both bandwidth and computation time.
Title : An efficient channel assignment technique for hexagonal cellular networks
Pub Date : 2002-08-07 DOI: 10.1109/ISPAN.2002.1004277
Yamin Li, S. Peng, Wanming Chu
This paper introduces a new interconnection network for very large parallel computers, called the metacube (MC). An MC network has a 2-level cube structure. An MC(k,m) network connects $2^{m 2^k + k}$ nodes with m + k links per node, where k is the dimension of a high-level cube and m is the dimension of low-level cubes (clusters). An MC network is a symmetric network with a short diameter and easy, efficient routing similar to that of hypercubes. However, an MC network can connect more than one hundred million nodes with only 6 links per node. Design of efficient routing algorithms for collective communications is the key issue for any interconnection network. In this paper we also show that total exchange (all-to-all personalized communication) can be done efficiently in the metacube.
Title : Efficient communication in metacube: a new interconnection network
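The node-count claim in the abstract can be checked with a few lines of arithmetic. The sketch below assumes the node count of MC(k,m) reads as 2 raised to the power (m * 2^k + k), with m + k links per node; the function names are hypothetical.

```python
def metacube_nodes(k, m):
    """Number of nodes in MC(k, m), read as 2 ** (m * 2**k + k)."""
    return 2 ** (m * 2 ** k + k)


def metacube_degree(k, m):
    """Links per node in MC(k, m)."""
    return m + k


if __name__ == "__main__":
    # Every (k, m) split that gives 6 links per node.
    for k in range(7):
        m = 6 - k
        print(k, m, metacube_degree(k, m), metacube_nodes(k, m))
```

With k = 3 and m = 3 the count is 2^27 = 134,217,728 nodes at 6 links per node, which matches the "more than one hundred million" figure. A hypercube of the same size would need 27 links per node.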