
Proceedings Sixth International Parallel Processing Symposium: Latest Papers

Comparisons and analysis of massively parallel SIMD architectures for parallel logic simulation
Pub Date : 1992-03-01 DOI: 10.1109/IPPS.1992.222986
Eunmi Choi, M. Chung, Yunmo Chung
This paper compares and analyzes massively parallel SIMD architectures as processing environments for parallel logic simulation. The CM-2 and the MP-1 are considered as target machines for the comparison. Detailed contrasts between the two parallel schemes are made based on actual simulation results and system performance. Distributed event-driven simulation protocols are used to obtain experimental results for the two massively parallel SIMD machines. According to the results, the MP-1 is 2 to 2.5 times faster than the CM-2 for benchmark circuits of up to 16K gates, while the CM-2 can accommodate circuits with a larger number of gates of processors. The presented comparisons and analysis of the two machines can be used to choose a SIMD machine for efficient parallel logic simulation.
Citations: 0
Efficient parallel algorithms for selection and searching on sorted matrices
Pub Date : 1992-03-01 DOI: 10.1109/IPPS.1992.223063
R. Sarnath, Xin He
Parallel algorithms for more general versions of the well-known selection and searching problems are formulated. The authors look at these problems when the set of elements can be represented as an n × n matrix with sorted rows and columns. The selection algorithm takes O(log n log log n log* n) time with O(n/log n log* n) processors on an EREW PRAM. The searching algorithm takes O(log log n) time with O(n/log log n) processors on a CREW PRAM, which is optimal. The authors also show that no algorithm using at most n log^c n processors, c ≥ 1, can solve the matrix search problem in time faster than Ω(log log n).
Citations: 16
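To make the sorted-matrix search problem in the entry above concrete, here is a minimal sequential sketch in Python. It is not the authors' parallel CREW PRAM algorithm; it is the classic serial "staircase" search, which inspects at most rows + columns entries of a matrix whose rows and columns are both sorted in ascending order. The function name `search_sorted_matrix` is hypothetical.

```python
from typing import List, Optional, Tuple


def search_sorted_matrix(matrix: List[List[int]], target: int) -> Optional[Tuple[int, int]]:
    """Serial staircase search (illustrative baseline, not the paper's algorithm).

    Assumes every row and every column of `matrix` is sorted ascending.
    Returns the (row, col) of one occurrence of `target`, or None.
    """
    if not matrix or not matrix[0]:
        return None
    row, col = 0, len(matrix[0]) - 1  # start at the top-right corner
    while row < len(matrix) and col >= 0:
        value = matrix[row][col]
        if value == target:
            return (row, col)
        if value > target:
            col -= 1  # everything below in this column is even larger
        else:
            row += 1  # everything to the left in this row is even smaller
    return None
```

Each comparison discards a whole row or a whole column, which is why the serial cost is linear in n for an n × n matrix; the parallel bounds quoted in the abstract concern how far that can be reduced with many processors.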
Asymmetrical multiconnection three-stage Clos networks
Pub Date : 1992-03-01 DOI: 10.1002/net.3230230423
A. Varma, S. Chalasani
The authors study routing problems in a general class of asymmetrical three-stage Clos networks. This class covers many asymmetrical three-stage networks considered by earlier researchers. They derive necessary and sufficient conditions under which this class of networks is rearrangeable with respect to a set of multiconnections, that is, connections where the paired entities are not limited to single terminals but can be arbitrary subsets of the terminals. They model the routing problem in these networks as a network-flow problem. If the number of switching elements in the first and last stages of the network is O(f) and the number of switching elements in the middle stage is m, then the network-flow model yields a routing algorithm with running time O(mf^3).
Citations: 14
A scheme for state change in a distributed environment using weighted throw counting
Pub Date : 1992-03-01 DOI: 10.1109/IPPS.1992.222992
K. Rokusawa, N. Ichiyoshi
This paper proposes a scheme for changing the execution state of a pool of processes in a distributed environment where there may be processes in transit. The scheme can detect the completion of state change using weighted throw counting and detect the termination as well. It works whether the communication channels are synchronous or asynchronous, FIFO or non-FIFO. The message complexity of the scheme is typically O(number of processing elements).
Citations: 2
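The abstract above does not spell out the scheme, so the following Python sketch only illustrates the general weight-based idea behind termination detection, under stated assumptions: the root task starts with total weight 1, a task that spawns (throws) a child hands half of its current weight to the child, and a finished task returns whatever weight it still holds to a detector. Termination is declared once the returned weight equals 1. The function `simulate`, the `spawn_plan` structure, and the use of exact fractions are all hypothetical simplifications, and the state-change part of the paper is not modelled.

```python
from collections import deque
from fractions import Fraction
from typing import Dict, List


def simulate(spawn_plan: Dict[str, List[str]], root: str = "root") -> List[bool]:
    """Run tasks one by one and record whether termination is detected after each.

    `spawn_plan` maps a task name to the children it throws; it is assumed
    to be acyclic. Weights are exact Fractions so no weight is ever lost.
    """
    returned = Fraction(0)                      # weight collected by the detector
    in_transit = deque([(root, Fraction(1))])   # tasks not yet executed
    history = []
    while in_transit:
        task, weight = in_transit.popleft()
        for child in spawn_plan.get(task, []):
            weight = weight / 2                 # keep half, throw half with the child
            in_transit.append((child, weight))
        returned += weight                      # finished task gives its weight back
        history.append(returned == 1)           # all weight back means nothing is left
    return history
```

Because weight is conserved, the detector cannot see the full weight while any task is still in transit, regardless of the order in which tasks arrive. This matches the abstract's claim of independence from FIFO ordering only in spirit; the sketch makes no claim about the paper's actual protocol or message complexity.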
A conceptual framework for implementing neural networks on massively parallel machines
Pub Date : 1992-03-01 DOI: 10.1109/IPPS.1992.222973
Magali E. Azema-Barac
This paper describes a framework for implementing neural networks on massively parallel machines. The framework is generic and applies to a range of neural networks (Multi Layer Perceptron, Competitive Learning, Self-Organising Map, etc.) as well as a range of massively parallel machines (Connection Machine, Distributed Array Processor, MasPar). It consists of two phases: an abstract decomposition of neural networks and a machine-specific decomposition. The abstract decomposition identifies the parallelism implemented by neural networks, and provides alternative distribution schemes according to the required exploitation of parallelism. The machine-specific decomposition considers the relevant machine criteria, and integrates these with the result of the abstract decomposition to form a 'decision' system. This system formalises the relative gain of each distribution scheme according to neural network and machine criteria. It then identifies their possible optimisations. Finally, it computes and ranks the absolute speedup of each distribution scheme.
Citations: 6
A functional execution model for a non-dataflow tagged token architecture
Pub Date : 1992-03-01 DOI: 10.1109/IPPS.1992.222978
G. Jennings
The author proposes a new execution model for a non-dataflow tagged-token architecture which is not Petri-net based but rather more closely related to the lambda calculus. The model exploits a functional programming style having applicative-order evaluation. The computation's execution graph is dynamically generated according to easily understood dynamic tagging rules which have been demonstrated to be implementable. The model permits conceptually unbounded parallelism for an interesting class of list-oriented computations. The author explains the model with the help of a simple dot-product computation as an example. He highlights some of the major differences between the dataflow paradigm and his own. Architectural issues toward implementation are briefly discussed.
Citations: 2
Optimal allocation of shared data over distributed memory hierarchies
Pub Date : 1992-03-01 DOI: 10.1109/IPPS.1992.222974
E. Haddad
Nonreplicated shared data of distributed applications is optimally allocated to pre-specified multilevel memory partitions at the sites of a heterogeneous multicomputer network to minimize a weighted combination of systemwide mean time delay performance and mean communication cost per access request. Greedy and fast optimization algorithms are presented for nonqueueing lightly-loaded as well as heavily-loaded multiqueue system models with channel, I/O, and memory hierarchy queues. Extensions to data exhibiting nonuniform access demand rates and distinct query and update statistics are presented.
Citations: 1
The bus-usage method for the analysis of reconfiguring networks algorithms
Pub Date : 1992-03-01 DOI: 10.1109/IPPS.1992.223056
Y. Ben-Asher, A. Schuster
Reconfigurable networks have attracted increased attention recently, as an extremely strong parallel model which is realizable in hardware. The authors consider the basic problem of gathering information which is dispersed among the nodes of the network. They analyze the complexity of the problem on reconfigurable linear arrays. The analysis introduces a novel criterion for the efficiency of reconfigurable network algorithms, namely the bus-usage. The bus-usage quantity measures the utilization of the network sub-buses by the algorithm. It is shown how this yields bounds on the algorithm run-time, by deriving a run-time to bus-usage trade-off.
Citations: 1
A hierarchical directory scheme for large-scale cache-coherent multiprocessors
Pub Date : 1992-03-01 DOI: 10.1109/IPPS.1992.223074
Y. Maa, D. Pradhan, D. Thiébaut
The cache coherence problem is a major design issue for shared-memory multiprocessors. As the system size scales, traditional bus-based snoopy cache coherence schemes are no longer adequate. Instead, the directory-based scheme is a promising approach to deal with the large-scale cache coherence problem. However, the storage overhead of directory schemes often becomes prohibitive as the system size increases. The paper proposes the hierarchical full-map directory to reduce the storage requirement while still achieving satisfactory performance. The key point is to exploit the inherent geographical interprocessor locality among shared data in parallel programs. Trace-driven evaluations show that the performance of the proposed scheme is competitive with the full-map directory scheme, while reducing the storage overhead by over 90%. The proposed hierarchical full-map directory scheme seems to be a promising hardware approach for handling cache coherence in the design of future large-scale multiprocessor memory systems.
Citations: 6
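The claim in the entry above that directory storage becomes prohibitive at scale can be illustrated with a little arithmetic. The Python sketch below assumes a conventional flat full-map directory that keeps one presence bit per processor plus a single state bit for every memory block; the block size, the one-bit state field, and the function name `full_map_overhead` are assumptions for illustration. It does not model the paper's hierarchical scheme, whose layout the abstract does not describe.

```python
def full_map_overhead(num_processors: int, block_bytes: int) -> float:
    """Directory bits per data bit for a flat full-map directory.

    Assumption: each memory block carries `num_processors` presence bits
    plus one state (dirty) bit; real designs may keep more state.
    """
    directory_bits = num_processors + 1
    data_bits = block_bytes * 8
    return directory_bits / data_bits


if __name__ == "__main__":
    # Overhead grows linearly with the processor count for a fixed block size.
    for processors in (63, 255, 1023):
        print(processors, full_map_overhead(processors, block_bytes=16))
```

With 16-byte blocks, the directory already needs more bits than the data it tracks once the machine has more than 127 processors, which is the kind of growth a storage reduction of over 90% is meant to counter.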
Serial and parallel algorithms for the medial axis transform
Pub Date : 1992-03-01 DOI: 10.1109/IPPS.1992.223025
J. Jenq, S. Sahni
The authors develop an O(n^2) time serial algorithm to obtain the medial axis transform (MAT) of an n × n image. An O(log n) time CREW PRAM algorithm and an O(log^2 n) time SIMD hypercube parallel algorithm for the MAT are also developed. Both of these use O(n^2) processors. Two problems associated with the MAT are also studied. These are the area and perimeter reporting problems. The authors develop an O(log n) time hypercube algorithm for both of these problems. Here n is the number of squares in the MAT and the algorithms use O(n^2) processors.
Citations: 40
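The abstract above does not define the MAT it uses, so the following Python sketch rests on an assumption: a square-based formulation in which every pixel of a binary image is labelled with the side length of the largest all-ones square whose top-left corner is that pixel. Under that assumption, a simple dynamic program fills the whole table in O(n^2) time for an n × n image, matching the serial bound quoted in the abstract. The function name `square_mat` and the top-left anchoring convention are hypothetical, and this is not claimed to be the authors' algorithm.

```python
from typing import List


def square_mat(image: List[List[int]]) -> List[List[int]]:
    """Largest all-ones square anchored at each pixel's top-left corner.

    `image` is a rectangular grid of 0/1 values. The table is filled from
    the bottom-right corner so that the three neighbours each cell depends
    on (below, right, diagonal) are already known.
    """
    rows = len(image)
    cols = len(image[0]) if rows else 0
    # One extra row and column of zeros avoids boundary checks.
    side = [[0] * (cols + 1) for _ in range(rows + 1)]
    for i in range(rows - 1, -1, -1):
        for j in range(cols - 1, -1, -1):
            if image[i][j]:
                side[i][j] = 1 + min(side[i + 1][j], side[i][j + 1], side[i + 1][j + 1])
    return [row[:cols] for row in side[:rows]]
```

A square of side k can start at a pixel only if squares of side k - 1 start at the pixels below, to the right, and diagonally below-right, which is what the `min` expresses.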