Proceedings of the Fifth Distributed Memory Computing Conference, 1990. Latest papers

Massively Parallel Computation of the Euler Equations
Pub Date : 1990-04-08 DOI: 10.1109/DMCC.1990.555419
C. Grosch, M. Ghose, S.N. Gupta, T. L. Jackson, M. Zubair
We present a systematic study of the applicability of massively parallel computers, the AMT DAP-510/610 and the TMC CM-2, to the solution of the two-dimensional unsteady Euler equations using a compact high-order scheme. The performance of these machines is compared to that of the Cray-2 and the Cray-YMP/832 using the same algorithm and for the same test problem.
Citations: 0
Communication Parameter Tests and Parallel Back Propagation Algorithms on iPSC/2 Hypercube Multiprocessor
Pub Date : 1990-04-08 DOI: 10.1109/DMCC.1990.556397
B. Mak, O. Egecioglu
The communication complexity on Intel’s second generation iPSC/2 hypercube and its effect on parallelization of Back Propagation type training algorithms for neural networks are explored. On the iPSC/2, different broadcasting methods are tested and three inter-node communication schemes are evaluated based on their performance on vector addition. These communication schemes are then utilized on parallel versions of the Back Propagation training algorithm. The performance of the resulting parallel variants of Back Propagation is analyzed using two medium size problems: vowel classification and English text-to-speech conversion (NETtalk data).
Citations: 6
Evaluation of Dual Ported Memories from the Task Level
Pub Date : 1990-04-08 DOI: 10.1109/DMCC.1990.556267
Rutger F. H. Hofman
An architecture, which is a hybrid of local memory and shared memory, is described in this report: it uses dual ported memories (DPMs), each accessed by two processors. Each processor is connected to a number of DPMs. The profit that is gained by using a DPM as a shared memory between two processors appears from task allocation results: task transport costs are avoided when a task, newly created in DPM d by one of d’s two processors, is allocated to the other processor at d. For a number of task allocation strategies, simulation studies show that the fraction of the tasks that benefit from this optimisation decreases with the number of processors in the multiprocessor. For larger numbers of processors, this fraction is considerably higher than the fraction under random allocation.
Citations: 3
Molecular Dynamics Simulations of Short-Range Force Systems on 1024-Node Hypercubes
Pub Date : 1990-04-08 DOI: 10.1109/DMCC.1990.555423
S. Plimpton
Two parallel algorithms for classical molecular dynamics are presented. The first assigns each processor to a subset of particles; the second assigns each to a fixed region of 3d space. The algorithms are implemented on 1024-node hypercubes for problems characterized by short-range forces, diffusion (so that each particle’s neighbors change in time), and problem size ranging from 250 to 10000 particles. Timings for the algorithms on the 1024-node NCUBE/ten and the newer NCUBE 2 hypercubes are given. The latter is found to be competitive with a CRAY-XMP, running an optimized serial algorithm. For smaller problems the NCUBE 2 and CRAY-XMP are roughly the same; for larger ones the NCUBE 2 (with 1024 nodes) is up to twice as fast. Parallel efficiencies of the algorithms and communication parameters for the two hypercubes are also examined.
Citations: 18
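The two decompositions named in the Plimpton abstract above (one processor per subset of particles, one processor per fixed region of 3D space) can be illustrated with a minimal sketch. This is not the paper's implementation: the function names, the contiguous block assignment of particle indices, and the regular grid of cubic cells are all assumptions made for illustration.

```python
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


def assign_by_particle(num_particles: int, num_procs: int) -> List[int]:
    """Particle decomposition: each processor owns a contiguous block of indices.

    Ownership never changes, even when particles diffuse through the box.
    """
    block = -(-num_particles // num_procs)  # ceiling division
    return [i // block for i in range(num_particles)]


def assign_by_region(positions: List[Vec3], box: float,
                     cells_per_side: int) -> List[int]:
    """Spatial decomposition: each processor owns one cubic cell of the box.

    Ownership follows the particle's current position, so it must be
    recomputed as particles move (hypothetical layout: cells_per_side**3 cells).
    """
    owners = []
    for x, y, z in positions:
        # Map each coordinate to a cell index; clamp points on the upper face.
        ix = min(int(x / box * cells_per_side), cells_per_side - 1)
        iy = min(int(y / box * cells_per_side), cells_per_side - 1)
        iz = min(int(z / box * cells_per_side), cells_per_side - 1)
        owners.append((ix * cells_per_side + iy) * cells_per_side + iz)
    return owners
```

In this toy form the trade-off is visible: the particle assignment is fixed but a particle's neighbours may live on any processor, while the region assignment keeps neighbours on the same or adjacent processors at the cost of reassigning particles as they move.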
Hypercube Dynamic Load Balancing
Pub Date : 1990-04-08 DOI: 10.1109/DMCC.1990.556305
D. King, E. Wegman
This paper reports on the results of a preliminary study in dynamic load balancing on an Intel Hypercube. The purpose of this research is to provide experimental data in how parallel algorithms should be constructed to obtain maximal utilization of a parallel architecture. This study is one aspect of an ongoing research project into the construction of an automated parallelization tool. This tool will take FORTRAN source as input, and construct a parallel algorithm that will produce the same results as the original serial input. The focus of this paper is on the load balancing aspect of that project. The basic idea is to reserve a certain percentage of the computation task, subdivide that percentage into arbitrarily fine tasks, and dole those small tasks out to nodes on request. If the percentage is chosen correctly, then a minority of nodes should be involved in consuming the filler tasks, and the overall throughput of the job should increase as a result of the individual node efficiencies having increased. This paper will outline our approach to performing dynamic load balancing on an Intel iPSC/2. We take the view that the problem of load balancing is really a problem of dividing a “computational task” into smaller components, each of roughly equal complexity, and each an independent event. After this is done, the components of the task can be sent to a node for execution. The key to an optimally balanced load across all computational nodes is the ability to form a statistical profile of the individual components of each computational task. This statistical profile will determine an initial sequence of execution. Our experience indicates that a speedup on the order of 80% is achievable with the judicious use of profiled load balancing. During the process of execution, the initial profile will be altered according to the actual behavior exhibited by the nodes. The difference between the actual and expected performance will be used to determine how much additional time should be devoted to altering the current execution schedule. Currently, our work involves statically setting the load balancing parameters. Our load balancing system determines the execution schedule
Citations: 2
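The "reserve a percentage and dole it out on request" idea in the King and Wegman abstract above can be sketched as a small simulation. This is a toy model, not the authors' system: the node speeds, the even static split, and the names `makespan_static` and `makespan_with_filler` are assumptions introduced here.

```python
import heapq
from typing import List


def makespan_static(work: float, speeds: List[float]) -> float:
    """All work is split evenly up front; the slowest node sets the finish time."""
    share = work / len(speeds)
    return max(share / s for s in speeds)


def makespan_with_filler(work: float, speeds: List[float],
                         reserve_fraction: float,
                         num_filler_tasks: int) -> float:
    """Split (1 - reserve_fraction) of the work evenly, then hand out the
    reserved part as small filler tasks to whichever node goes idle first."""
    static_share = work * (1.0 - reserve_fraction) / len(speeds)
    filler_size = work * reserve_fraction / num_filler_tasks
    # Heap of (time at which the node becomes idle, node index).
    idle = [(static_share / s, i) for i, s in enumerate(speeds)]
    heapq.heapify(idle)
    for _ in range(num_filler_tasks):
        t, i = heapq.heappop(idle)  # the earliest idle node requests a task
        heapq.heappush(idle, (t + filler_size / speeds[i], i))
    return max(t for t, _ in idle)
```

With two hypothetical nodes of speed 1 and 2 and 12 units of work, the even static split finishes when the slow node does, while reserving half the work as six filler tasks lets the fast node absorb the slack. How large the reserved percentage should be is exactly the tuning question the abstract raises; the sketch does not answer it.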
Surface Reconstruction and Discontinuity Detection: A Fast Hierarchical Approach on a Two-Dimensional Mesh
Pub Date : 1990-04-08 DOI: 10.1109/DMCC.1990.555382
R. Battiti
Recently multigrid techniques have been proposed for solving low-level vision problems in optimal time (i.e. time proportional to the number of pixels). In the present work this method is extended to incorporate a discontinuity detection process cooperating with the smoothing phase on all scales. Activation of line element detectors that signal the presence of relevant discontinuities is based on information gathered from neighboring points at the same and different scales. Because the required computation is local, parallelism can be profitably used. A mapping of the required data structure onto a two dimensional mesh of processors is suggested. Domain decomposition is shown to be efficient on MIMD computers capable of containing many individual cells in each processor. Some examples of the proposed multiscale solution techniques are shown for two different applications. In the first case a surface is reconstructed from first derivative information (extracted from the intensity data), in the second case from noisy depth constraints.
Citations: 5
Parallel Loops on Distributed Machines
Pub Date : 1990-04-08 DOI: 10.1109/DMCC.1990.556322
C. Koelbel, P. Mehrotra, J. Saltz, H. Berryman
Any programming environment for distributed memory machines that allows the user to specify parallel do loops over globally defined data structures requires optimizations that go beyond the specification of appropriate data and workload partitionings. In this paper, we consider optimizations that are required for efficient execution of a code segment that consists of parallel loops over distributed data structures. On distributed memory machines it is typically very expensive to fetch individual data elements. Instead, before a parallel loop executes, it is desirable to prefetch all off-processor data required in the loop. We specify a scheme for storing copies of fetched data along with a scheme for accessing copies of off-processor data during the computation of the loop. The performance of such optimizations on the iPSC/2 and the NCUBE is also presented.
Citations: 30
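The prefetch-then-execute pattern described in the Koelbel et al. abstract above can be sketched in a few lines. The abstract does not give the authors' storage or access scheme, so everything here is an assumption: a block distribution, a dictionary as the store for fetched copies, and a plain list index standing in for a batched remote fetch.

```python
from typing import Dict, List, Tuple


def owner(global_index: int, block: int) -> int:
    """Block distribution: processor p owns indices [p*block, (p+1)*block)."""
    return global_index // block


def prefetch(indices: List[int], me: int, block: int,
             global_array: List[float]) -> Dict[int, float]:
    """Scan the loop's index list once and fetch each off-processor element once."""
    needed = sorted({g for g in indices if owner(g, block) != me})
    # Stand-in for batched messages to the remote owners (hypothetical).
    return {g: global_array[g] for g in needed}


def run_loop(indices: List[int], me: int, block: int,
             global_array: List[float]) -> Tuple[float, int]:
    """Sum the referenced elements using only local data and prefetched copies."""
    cache = prefetch(indices, me, block, global_array)
    local = global_array[me * block:(me + 1) * block]
    total = 0.0
    for g in indices:
        if owner(g, block) == me:
            total += local[g - me * block]  # on-processor access
        else:
            total += cache[g]               # copy fetched before the loop
    return total, len(cache)
```

The point of the sketch is only that the loop body never triggers an individual remote fetch: repeated references to the same off-processor element cost one fetch, done before the loop starts.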
Parallel Distributed-Memory Implementation of the Corrective Switching Problem
Pub Date : 1990-04-08 DOI: 10.1109/DMCC.1990.555358
J. Blanc, D. Trystram, J. Ryckbosch
Affiliations: LMC-IMAG, EDF-DER.

Abstract. For the past 20 years, increasing interest has been devoted to the sequential Conjugate Gradient Method for solving large linear systems arising from the modeling of physical problems (especially for very large systems with sparse matrices). This paper deals with the implementation on parallel supercomputers of a preconditioned conjugate gradient method for solving the corrective switching problem obtained while modeling the behavior of power systems in electrical networks. This problem consists in finding the successive solutions of many close linear systems (not too large) with very ill-conditioned matrices (sometimes even singular). We present a new method based on the Preconditioned Conjugate Gradient algorithm with an original preconditioning and study its parallelization on both shared and distributed memory computers.

1. Setting of the problem

During the control of electrical networks, the operator must ensure that the system is in a safe state (i.e. that it can be protected against incidents liable to occur in real time). The demand and the possibility of the plants are such that nuclear energy between two plants flows from various nodes of the network. The loss of one element could jeopardize the security of the whole system by chain tripping: in such a case, a line becomes overloaded and, without any operation, the protective devices will act and the line will trip out. In actual operating conditions, the switching actions that the operator applies to the electrical network ensure that overloads disappear before the delayed protective devices go into action. Such actions are shown in the picture at the end of the paper. The computation of switching actions is a combinatorial problem that is very hard to solve. The connections of the switching elements are described as discrete variables. The corrective switching problem amounts to determining the various possible solutions of the load flow calculation. Each such situation requires solving a linear system, and the matrices differ from each other in only a few elements. Let us consider the N consecutive linear systems below: $(S_i)\ A_i x_i = b_i$, $1 \le i \le N$, where the matrices $A_i$ (of size n by n) are "close" to each other, viz. $A_{i+1} = A_i + \Delta_i$, with $\Delta_i$ of small norm. The solutions $x_i$ will be close to each other in this sense, and we want to take full advantage of this. Note that this problem also occurs in Adaptive Filtering or Finite Element modeling.
Citations: 0
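One simple way to exploit the closeness of consecutive systems described in the Blanc, Trystram and Ryckbosch abstract above is to start each solve from the previous solution. The sketch below does this with plain, unpreconditioned conjugate gradient on small dense matrices. It is not the paper's method: the original preconditioning is not described in the abstract, the warm start is an assumption about how closeness might be used, and the code assumes symmetric positive definite matrices, whereas the abstract mentions ill-conditioned and even singular ones.

```python
from typing import List, Tuple

Matrix = List[List[float]]
Vector = List[float]


def matvec(A: Matrix, x: Vector) -> Vector:
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def dot(u: Vector, v: Vector) -> float:
    return sum(a * b for a, b in zip(u, v))


def conjugate_gradient(A: Matrix, b: Vector, x0: Vector,
                       tol: float = 1e-10,
                       max_iter: int = 1000) -> Tuple[Vector, int]:
    """Plain CG for a symmetric positive definite A, started from x0."""
    x = list(x0)
    r = [bi - yi for bi, yi in zip(b, matvec(A, x))]  # initial residual
    p = list(r)
    rs = dot(r, r)
    iterations = 0
    while rs ** 0.5 > tol and iterations < max_iter:
        Ap = matvec(A, p)
        alpha = rs / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
        iterations += 1
    return x, iterations


def solve_sequence(matrices: List[Matrix], rhs: List[Vector]) -> List[Vector]:
    """Solve A_i x_i = b_i in order, warm-starting each solve from x_{i-1}."""
    solutions = []
    x = [0.0] * len(rhs[0])
    for A, b in zip(matrices, rhs):
        x, _ = conjugate_gradient(A, b, x)
        solutions.append(x)
    return solutions
```

When consecutive matrices differ by a perturbation of small norm, the previous solution already gives a small initial residual, which is the intuition behind the warm start; whether this saves iterations in practice depends on the conditioning of the systems.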
An Adaptive Multiscale Scheme for Real-Time Motion Field Estimation
Pub Date : 1990-04-08 DOI: 10.1109/DMCC.1990.555383
R. Battiti
The problem considered in this work is that of estimating the motion field (i.e. the projection of the velocity field onto the image plane) from a temporal sequence of images. Generic images contain different objects with diverse spatial frequencies and motion amplitudes. To deal with this complex environment in a fast and effective way, biological visual systems use parallel processing, visual channels at different resolutions and adaptive mechanisms. In this paper a new adaptive multiscale scheme is proposed, in which the spatial discretization scale is based on a local estimate of the errors involved. Considering the constraints for real-time operation, flexibility and portability, the scheme can be implemented on MIMD parallel computers with medium size grains with high efficiency. Tests with ray-traced and video-acquired images for different motion ranges show that this method produces a better estimation with respect to the homogeneous (non-adaptive) multiscale method.
Citations: 0
Hot-Spot Performance of Single-Stage and Multistage Interconnection Networks
Pub Date : 1990-04-08 DOI: 10.1109/DMCC.1990.556269
K. Gunter, E. Gehringer
Citations: 0