Title: Massively Parallel Computation of the Euler Equations
Authors: C. Grosch, M. Ghose, S.N. Gupta, T. L. Jackson, M. Zubair
Pub Date: 1990-04-08  DOI: 10.1109/DMCC.1990.555419
Venue: Proceedings of the Fifth Distributed Memory Computing Conference, 1990
Abstract: We present a systematic study of the applicability of massively parallel computers, the AMT DAP-510/610 and the TMC CM-2, to the solution of the two-dimensional unsteady Euler equations using a compact high-order scheme. The performance of these machines is compared to that of the Cray-2 and the Cray-YMP/832 using the same algorithm and the same test problem.
Title: Communication Parameter Tests and Parallel Back Propagation Algorithms on iPSC/2 Hypercube Multiprocessor
Authors: B. Mak, O. Egecioglu
Pub Date: 1990-04-08  DOI: 10.1109/DMCC.1990.556397
Venue: Proceedings of the Fifth Distributed Memory Computing Conference, 1990
Abstract: The communication complexity of Intel's second-generation iPSC/2 hypercube and its effect on the parallelization of Back Propagation-type training algorithms for neural networks are explored. On the iPSC/2, different broadcasting methods are tested, and three inter-node communication schemes are evaluated based on their performance on vector addition. These communication schemes are then utilized in parallel versions of the Back Propagation training algorithm. The performance of the resulting parallel variants of Back Propagation is analyzed using two medium-size problems: vowel classification and English text-to-speech conversion (NETtalk data).
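The abstract above does not name the broadcasting methods it tests, but the classic scheme on a hypercube is spanning-tree broadcast by recursive doubling: in step k, every node that already holds the message forwards it across dimension k, so a d-cube is covered in d steps. The sketch below is a generic simulator of that schedule, not necessarily one of the three schemes the authors evaluated.

```python
# Recursive-doubling (spanning-tree) broadcast on a d-dimensional
# hypercube.  Node ids are the 2**d corner labels; in step k every node
# that already holds the message sends it to its neighbor across
# dimension k.  The simulator records who sends to whom in each step.

def hypercube_broadcast_schedule(d, root=0):
    """Return a list of d steps; each step is a list of (src, dst) pairs."""
    have = {root}
    steps = []
    for k in range(d):
        sends = []
        for src in sorted(have):
            dst = src ^ (1 << k)        # neighbor across dimension k
            if dst not in have:
                sends.append((src, dst))
        have.update(dst for _, dst in sends)
        steps.append(sends)
    return steps

schedule = hypercube_broadcast_schedule(3)
reached = {0} | {dst for step in schedule for _, dst in step}
```

The send counts double each step (1, 2, 4, ...), which is why broadcast cost grows with the logarithm of the node count rather than linearly.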
Title: Evaluation of Dual Ported Memories from the Task Level
Authors: Rutger F. H. Hofman
Pub Date: 1990-04-08  DOI: 10.1109/DMCC.1990.556267
Venue: Proceedings of the Fifth Distributed Memory Computing Conference, 1990
Abstract: An architecture that is a hybrid of local memory and shared memory is described in this report: it uses dual-ported memories (DPMs), each accessed by two processors. Each processor is connected to a number of DPMs. The benefit of using a DPM as a shared memory between two processors is apparent from task-allocation results: task-transport costs are avoided when a task newly created in DPM d by one of d's two processors is allocated to the other processor at d. For a number of task-allocation strategies, simulation studies show that the fraction of tasks that benefit from this optimisation decreases with the number of processors in the multiprocessor; for larger numbers of processors, however, this fraction remains considerably higher than the fraction under random allocation.
Title: Molecular Dynamics Simulations of Short-Range Force Systems on 1024-Node Hypercubes
Authors: S. Plimpton
Pub Date: 1990-04-08  DOI: 10.1109/DMCC.1990.555423
Venue: Proceedings of the Fifth Distributed Memory Computing Conference, 1990
Abstract: Two parallel algorithms for classical molecular dynamics are presented. The first assigns each processor to a subset of particles; the second assigns each to a fixed region of 3-d space. The algorithms are implemented on 1024-node hypercubes for problems characterized by short-range forces, diffusion (so that each particle's neighbors change in time), and problem sizes ranging from 250 to 10000 particles. Timings for the algorithms on the 1024-node NCUBE/ten and the newer NCUBE 2 hypercubes are given. The latter is found to be competitive with a CRAY-XMP running an optimized serial algorithm. For smaller problems the NCUBE 2 and CRAY-XMP are roughly the same; for larger ones the NCUBE 2 (with 1024 nodes) is up to twice as fast. Parallel efficiencies of the algorithms and communication parameters for the two hypercubes are also examined.
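The fixed-region (spatial) decomposition in the second algorithm works because short-range forces mean a particle interacts only with particles in its own or adjacent cells of a grid whose cell size is at least the force cutoff. A minimal serial sketch of that cell-binning idea follows; the box size, cutoff, and function names are illustrative, not taken from the paper.

```python
# Bin particles into a cell grid and count pairs within the cutoff by
# scanning only each cell's 27-cell neighborhood (periodic wrap-around).
# A set of (p, q) pairs with p < q avoids double counting.

import itertools

def build_cells(positions, box, cutoff):
    """Map cell index (i, j, k) -> particle ids; cell size >= cutoff."""
    n = max(1, int(box / cutoff))       # cells per side
    size = box / n
    cells = {}
    for pid, (x, y, z) in enumerate(positions):
        key = (int(x / size) % n, int(y / size) % n, int(z / size) % n)
        cells.setdefault(key, []).append(pid)
    return cells, n

def short_range_pairs(positions, box, cutoff):
    """Count particle pairs closer than cutoff (no minimum-image shift)."""
    cells, n = build_cells(positions, box, cutoff)
    pairs = set()
    for (i, j, k), members in cells.items():
        for di, dj, dk in itertools.product((-1, 0, 1), repeat=3):
            for q in cells.get(((i + di) % n, (j + dj) % n, (k + dk) % n), []):
                for p in members:
                    if p < q:
                        d2 = sum((a - b) ** 2 for a, b in
                                 zip(positions[p], positions[q]))
                        if d2 < cutoff ** 2:
                            pairs.add((p, q))
    return len(pairs)
```

In the parallel version each processor owns a block of cells and only exchanges boundary-cell particles with neighboring processors, which is what keeps communication cost flat as the particle count grows.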
Title: Hypercube Dynamic Load Balancing
Authors: D. King, E. Wegman
Pub Date: 1990-04-08  DOI: 10.1109/DMCC.1990.556305
Venue: Proceedings of the Fifth Distributed Memory Computing Conference, 1990
Abstract: This paper reports the results of a preliminary study of dynamic load balancing on an Intel hypercube. The purpose of this research is to provide experimental data on how parallel algorithms should be constructed to obtain maximal utilization of a parallel architecture. This study is one aspect of an ongoing research project into the construction of an automated parallelization tool. This tool will take FORTRAN source as input and construct a parallel algorithm that produces the same results as the original serial input. The focus of this paper is the load-balancing aspect of that project. The basic idea is to reserve a certain percentage of the computational task, subdivide that percentage into arbitrarily fine tasks, and dole those small tasks out to nodes on request. If the percentage is chosen correctly, then a minority of nodes should be involved in consuming the filler tasks, and the overall throughput of the job should increase as a result of the increased efficiency of the individual nodes. This paper outlines our approach to performing dynamic load balancing on an Intel iPSC/2. We take the view that the problem of load balancing is really a problem of dividing a "computational task" into smaller components, each of roughly equal complexity and each an independent event. After this is done, the components of the task can be sent to a node for execution. The key to an optimally balanced load across all computational nodes is the ability to form a statistical profile of the individual components of each computational task. This statistical profile determines an initial sequence of execution. Our experience indicates that a speedup on the order of 80% is achievable with the judicious use of profiled load balancing. During execution, the initial profile is altered according to the actual behavior exhibited by the nodes. The difference between the actual and expected performance is used to determine how much additional time should be devoted to altering the current execution schedule. Currently, our work involves statically setting the load-balancing parameters; our load-balancing system determines the execution schedule.
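The reserved-percentage scheme the abstract describes can be sketched as a small simulation: most tasks are pre-scheduled statically, and a reserved fraction is cut into fine "filler" tasks claimed on demand by whichever node finishes first. The round-robin static phase, the cost values, and the fraction below are illustrative assumptions, not the paper's actual parameters.

```python
# Simulate static scheduling plus an on-demand filler-task pool.
# A min-heap keyed on each node's current finish time picks the node
# that becomes free first for every filler task.

import heapq

def simulate(costs, n_nodes, reserve_frac):
    """Return per-node finish times under static + on-demand filler tasks."""
    n_reserved = int(len(costs) * reserve_frac)
    cut = len(costs) - n_reserved
    static, filler = costs[:cut], costs[cut:]
    # Static phase: deal the pre-scheduled tasks round-robin.
    finish = [0.0] * n_nodes
    for i, c in enumerate(static):
        finish[i % n_nodes] += c
    # Dynamic phase: each filler task goes to the first node to fall idle.
    heap = [(t, node) for node, t in enumerate(finish)]
    heapq.heapify(heap)
    for c in filler:
        t, node = heapq.heappop(heap)
        finish[node] = t + c
        heapq.heappush(heap, (finish[node], node))
    return finish
```

With costs [4, 1, 4, 1] on two nodes, pure round-robin gives finish times of 8 and 2, while reserving half the tasks as fillers evens this out to 5 and 5, illustrating why the filler pool raises individual node efficiency.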
Title: Surface Reconstruction and Discontinuity Detection: A Fast Hierarchical Approach on a Two-Dimensional Mesh
Authors: R. Battiti
Pub Date: 1990-04-08  DOI: 10.1109/DMCC.1990.555382
Venue: Proceedings of the Fifth Distributed Memory Computing Conference, 1990
Abstract: Recently, multigrid techniques have been proposed for solving low-level vision problems in optimal time (i.e., time proportional to the number of pixels). In the present work this method is extended to incorporate a discontinuity-detection process cooperating with the smoothing phase on all scales. Activation of line-element detectors that signal the presence of relevant discontinuities is based on information gathered from neighboring points at the same and different scales. Because the required computation is local, parallelism can be profitably used. A mapping of the required data structure onto a two-dimensional mesh of processors is suggested. Domain decomposition is shown to be efficient on MIMD computers capable of containing many individual cells in each processor. Some examples of the proposed multiscale solution techniques are shown for two different applications: in the first case a surface is reconstructed from first-derivative information (extracted from the intensity data), in the second case from noisy depth constraints.
Title: Parallel Loops on Distributed Machines
Authors: C. Koelbel, P. Mehrotra, J. Saltz, H. Berryman
Pub Date: 1990-04-08  DOI: 10.1109/DMCC.1990.556322
Venue: Proceedings of the Fifth Distributed Memory Computing Conference, 1990
Abstract: Any programming environment for distributed-memory machines that allows the user to specify parallel do loops over globally defined data structures requires optimizations that go beyond the specification of appropriate data and workload partitionings. In this paper, we consider optimizations that are required for efficient execution of a code segment that consists of parallel loops over distributed data structures. On distributed-memory machines it is typically very expensive to fetch individual data elements. Instead, before a parallel loop executes, it is desirable to prefetch all off-processor data required in the loop. We specify a scheme for storing copies of fetched data along with a scheme for accessing copies of off-processor data during the computation of the loop. The performance of such optimizations on the iPSC/2 and the NCUBE is also presented.
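The prefetch-then-execute pattern the abstract describes is commonly called the inspector/executor model: an inspector pass scans the loop's index pattern, fetches each distinct off-processor element once into a local copy, and builds a translation table so the executor reads only local storage. The sketch below models that split in serial Python; the ownership function and dictionary-based "remote memory" are stand-ins for a real message-passing layer.

```python
# Inspector/executor split for a loop that gathers x[indices[j]].
# The inspector resolves ownership once per distinct index; the
# executor then runs the loop body with no per-element communication.

def inspector(indices, owner, my_rank, remote_data):
    """Prefetch off-processor values once; return (copies, where)."""
    copies, where = {}, {}
    for i in set(indices):
        if owner(i) == my_rank:
            where[i] = "local"
        else:
            copies[i] = remote_data[i]   # one fetch per distinct index
            where[i] = "copy"
    return copies, where

def executor(indices, local_data, copies, where):
    """Run the loop body from local data or the prefetched copies."""
    return [local_data[i] if where[i] == "local" else copies[i]
            for i in indices]
```

Note that repeated indices cost nothing extra: the inspector iterates over the *set* of indices, so an element referenced many times in the loop is fetched exactly once.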
Title: Parallel Distributed-Memory Implementation of the Corrective Switching Problem
Authors: J. Blanc, D. Trystram, J. Ryckbosch
Pub Date: 1990-04-08  DOI: 10.1109/DMCC.1990.555358
Venue: Proceedings of the Fifth Distributed Memory Computing Conference, 1990
Affiliations: LMC-IMAG, EDF-DER
Abstract: For the past 20 years, increasing interest has been devoted to the sequential Conjugate Gradient method for solving large linear systems arising from the modeling of physical problems (especially very large systems with sparse matrices). This paper deals with the implementation on parallel supercomputers of a preconditioned conjugate gradient method for solving the corrective switching problem obtained while modeling the behavior of power systems in electrical networks. This problem consists in finding the successive solutions of many close linear systems (not too large) with very ill-conditioned matrices (sometimes even singular). We present a new method based on the Preconditioned Conjugate Gradient algorithm with an original preconditioning and study its parallelization on both shared- and distributed-memory computers.
1. Setting of the problem. During the control of electrical networks, the operator must ensure that the system is in a safe state (i.e., be able to protect the system against incidents liable to occur in real time). The demand and the capacity of the plants are such that energy flows between plants through the various nodes of the network. The loss of one element could jeopardize the security of the whole system by chain tripping: in such a case a line becomes overloaded and, without any operator action, the protective devices will act and the line will trip out. In actual operating conditions, the switching actions that the operator applies to the electrical network ensure that overloads disappear before the delayed protective devices go into action. Such actions are shown in the figure at the end of the paper. The computation of switching actions is a combinatorial problem, very hard to solve. The connections of the switching elements are described as discrete variables. The corrective switching problem corresponds to determining the various possible solutions of the load-flow calculation. Each such situation requires solving a linear system where the matrices differ from each other in only a few elements. Let us consider the N consecutive linear systems (S_i): A_i x_i = b_i, 1 <= i <= N, where the matrices A_i (of size n by n) are "close" to each other, viz. A_{i+1} = A_i + dA_i, with dA_i of small norm. The solutions x_i will be close to each other in this sense, and we want to take full advantage of this. Note that this problem also occurs in adaptive filtering and finite-element modeling.
Title: An Adaptive Multiscale Scheme for Real-Time Motion Field Estimation
Authors: R. Battiti
Pub Date: 1990-04-08  DOI: 10.1109/DMCC.1990.555383
Venue: Proceedings of the Fifth Distributed Memory Computing Conference, 1990
Abstract: The problem considered in this work is that of estimating the motion field (i.e., the projection of the velocity field onto the image plane) from a temporal sequence of images. Generic images contain different objects with diverse spatial frequencies and motion amplitudes. To deal with this complex environment in a fast and effective way, biological visual systems use parallel processing, visual channels at different resolutions, and adaptive mechanisms. In this paper a new adaptive multiscale scheme is proposed, in which the spatial discretization scale is based on a local estimate of the errors involved. Considering the constraints for real-time operation, flexibility, and portability, the scheme can be implemented with high efficiency on MIMD parallel computers with medium-size grains. Tests with ray-traced and video-acquired images for different motion ranges show that this method produces a better estimation than the homogeneous (non-adaptive) multiscale method.
Title: Hot-Spot Performance of Single-Stage and Multistage Interconnection Networks
Authors: K. Gunter, E. Gehringer
Pub Date: 1990-04-08  DOI: 10.1109/DMCC.1990.556269
Venue: Proceedings of the Fifth Distributed Memory Computing Conference, 1990