Microprocessing and Microprogramming: Latest Articles

A parallel computer based on simple DSP modules

Microprocessing and Microprogramming

Pub Date : 1995-08-01 DOI: 10.1016/0165-6074(95)00013-E

F. Mayer-Lindenberg

This article reports on an engineering project at the TUHH aimed at providing a massively parallel experimental computer system to support a number of research projects. The computer, nicknamed the PENTAGON, is an MIMD system containing a number of identical processing elements (PEs) linked via interfaces. The network is a 3-D torus, and the nodes are based on off-the-shelf signal processor chips, namely TMS 320C40s from TI. The design adds to these standard ingredients an engineering discipline of keeping things as simple as possible, and a corresponding, quite unusual physical setup of the total system. Together these make for a very cost-effective system, showing how simple it may be to build a powerful parallel machine.

Although based on a standard architecture, the PENTAGON design makes some special choices, the most important being the complete distribution of I/O capabilities. This provides for an unlimited I/O bandwidth, support for realtime applications, and excellent expansion capabilities. A graphics interface has been designed to provide direct realtime output from the DSPs. Another recent extension is a set of Power-PC modules on top of the DSP nodes.

Besides standard commercial compilers for 'C40 networks, the author's functional language Fifth has been implemented on the PENTAGON. Fifth provides facilities such as distributed objects and the automatic distribution of parallel programs. For highly parallelizable applications such as the calculation of a Mandelbrot set, high processor utilization has been obtained.
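As a rough illustration of the 3-D torus topology mentioned in the abstract, the sketch below computes the six neighbours of a node with wrap-around in each dimension. The function name `torus_neighbours` and the 4 x 4 x 4 grid are assumptions made for illustration only; the abstract does not state the PENTAGON's actual size or node addressing scheme.

```python
def torus_neighbours(node, dims):
    """Return the six neighbours of `node` in a 3-D torus of size `dims`.

    `node` and `dims` are (x, y, z) tuples; coordinates wrap around
    modulo the size of each dimension.
    """
    neighbours = []
    for axis in range(3):
        for step in (-1, 1):
            coord = list(node)
            coord[axis] = (coord[axis] + step) % dims[axis]
            neighbours.append(tuple(coord))
    return neighbours


if __name__ == "__main__":
    # Hypothetical 4 x 4 x 4 torus: even a "corner" node has six neighbours.
    print(torus_neighbours((0, 0, 0), (4, 4, 4)))
```

Because every coordinate wraps, all nodes have the same number of links, which is one reason a torus suits a machine built from identical modules.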


Microprocessing and Microprogramming, Volume 41 (4), Pages 301-314

Citations: 3

The multi-associative branch target buffer: a cost effective BTB mechanism

Microprocessing and Microprogramming

Pub Date : 1995-06-01 DOI: 10.1016/0165-6074(95)00009-D

Weili Chu, Stamatis Vassiliadis, José G. Delgado-Frias

A new branch target buffer hardware organization, denoted the multi-associative branch target buffer (MBTB), is presented for efficient branch handling in pipelined central processing units (CPUs). The proposed organization consists of multiple arrays of different sizes addressed via a bit selection addressing mechanism. These arrays are used to maintain information pertinent to the branches, including information usually contained in traditional branch target buffers, such as the branch instruction address and the branch target address. The proposed configuration and its bit extraction mechanism, which is used to increase the hit ratio of the buffers, provide the capability of dynamically increasing the associativity of the branch target buffers. Simulation results suggest that, owing to the multiple-array structure and the new addressing scheme, improvements with reduced hardware can be expected when a multi-associative branch target buffer is installed in a CPU implementation.
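To make the idea of several differently sized arrays indexed by bit selection more concrete, here is a toy sketch. It is not the MBTB design from the paper: the class name `MultiArrayBTB`, the array sizes, the use of the low word-address bits as index, and the placement and replacement policy are all assumptions chosen only to show how extra arrays can absorb branches that would conflict in a single direct-mapped table.

```python
class MultiArrayBTB:
    """Toy branch target buffer built from direct-mapped arrays of different sizes."""

    def __init__(self, index_bits=(6, 5, 4)):
        # One array per entry in index_bits; array i has 2**index_bits[i] slots.
        self.index_bits = index_bits
        self.arrays = [[None] * (1 << bits) for bits in index_bits]

    def _index(self, branch_addr, bits):
        # Bit selection (assumed): the low `bits` bits of the word address.
        return (branch_addr >> 2) & ((1 << bits) - 1)

    def lookup(self, branch_addr):
        """Return the stored target address, or None on a miss."""
        for bits, array in zip(self.index_bits, self.arrays):
            entry = array[self._index(branch_addr, bits)]
            if entry is not None and entry[0] == branch_addr:
                return entry[1]
        return None

    def insert(self, branch_addr, target_addr):
        """Store a branch, preferring a matching or empty slot (assumed policy)."""
        slots = [(array, self._index(branch_addr, bits))
                 for bits, array in zip(self.index_bits, self.arrays)]
        for array, idx in slots:
            if array[idx] is not None and array[idx][0] == branch_addr:
                array[idx] = (branch_addr, target_addr)
                return
        for array, idx in slots:
            if array[idx] is None:
                array[idx] = (branch_addr, target_addr)
                return
        # Every candidate slot is occupied: overwrite the slot in the first array.
        array, idx = slots[0]
        array[idx] = (branch_addr, target_addr)


if __name__ == "__main__":
    btb = MultiArrayBTB()
    btb.insert(0x1000, 0x2000)
    btb.insert(0x1100, 0x3000)   # same index as 0x1000 in the first array
    print(btb.lookup(0x1000), btb.lookup(0x1100))
```

In this sketch the two branches map to the same slot of the first array, yet both remain resident because the second one falls through to a smaller array, which loosely mirrors the notion of increasing associativity on demand.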


Citations: 0

Calendar of forthcoming conferences and events

Microprocessing and Microprogramming

Pub Date : 1995-06-01 DOI: 10.1016/0165-6074(95)90004-7

Citations: 0

Modified straight division: A computer implementation of multiple-precision division

Microprocessing and Microprogramming

Pub Date : 1995-06-01 DOI: 10.1016/0165-6074(94)00091-N

Ranjani Parthasarathi, Ashok Jhunjhunwala

The ‘Straight division’ algorithm is an in-place division technique that has been known in India as a mental computation technique. This technique is found to be suitable for implementing multiple-precision division on computers using existing single-precision operations. This paper presents some modifications carried out on this basic technique to improve the efficiency of the algorithm. It also discusses an implementation of multiple-precision division using this modified straight division technique on two different processor architectures. This is followed by an analysis of these implementations in comparison with other existing division techniques. It is found that the modified straight division is superior in performance to other known methods.


Citations: 2

Parset: A language construct for system independent parallel programming on distributed systems

Microprocessing and Microprogramming

Pub Date : 1995-06-01 DOI: 10.1016/0165-6074(95)00006-A

Rushikesh K. Joshi, D. Janaki Ram

Parallel programming on loosely coupled distributed systems involves many system dependent tasks such as sensing node availability, creating remote processes, programming inter-process communication and synchronization, etc. Very often these system-dependent tasks are handled at the programmer level. This has complicated the process of parallel programming on distributed systems. The portability of these programs is also severely affected. The programmer may also start his remote processes on heavily loaded nodes, thereby degrading the overall performance of the system. To overcome these difficulties, we introduce a language construct called parset at the programming level. Parset captures various kinds of coarse grain parallelism occurring in distributed systems. It also provides scalability to distributed programs. We show that this construct greatly simplifies writing programs on distributed systems providing transparency to various system dependent tasks.


Citations: 3

Two-dimensional specification of queries in object-oriented databases

Microprocessing and Microprogramming

Pub Date : 1995-06-01 DOI: 10.1016/0165-6074(95)00005-9

Jae-Cheol Kwak, Songchun Moon

Visual queries based on schema graphs simplify access to databases for technical and non-technical users. Unlike in relational databases, in object-oriented databases the basic entity in a query, i.e. a class, is frequently considered as a compound of several entities to which the query operations may apply, which makes it difficult to designate the intended entity. In this paper, we propose a visual query language, the object query diagram (OQD), for object-oriented databases, where a class is decomposed into a number of object sets, each of which is a set of values of one of the attributes of the other classes. By representing each class and the object sets in the class using the well-known Venn diagram in a query, OQD explicitly presents all the entities to which the operations in a query can apply. We describe the syntax and semantics of OQD through a number of illustrative examples.


Citations: 1

Efficient fault tolerant cache memory design

Microprocessing and Microprogramming

Pub Date : 1995-05-01 DOI: 10.1016/0165-6074(95)00004-8

H.T. Verges, D. Nikolos

In this paper we first discuss the consequences of cache memory defects/faults for the operation of the system, and we show that cache tag defects/faults, compared to cache data defects/faults, may have significantly more serious consequences for the integrity and performance of the system. A possible solution is the use of a single error correcting-double error detecting (SEC/DED) code in the cache tag memory. However, the classical implementation of the SEC/DED code proves to be inappropriate for the tag memory due to the required silicon area and time delays. We propose a new way of exploiting the SEC/DED code that is well suited to cache tag memories. During fault-free operation the proposed technique does not add any delay on the critical path of the cache, while in the case of a single error the delay is so small that the cache access time is increased by at most one CPU cycle. An example design shows the superiority of the proposed technique over the classical one. The application of the proposed scheme to real and virtual addressed caches of one or two levels is also discussed.
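For readers unfamiliar with SEC/DED codes, the following minimal sketch shows the classical scheme on a 4-bit value using an extended Hamming (8,4) code. It illustrates only the textbook encode/decode behaviour, not the paper's proposed technique; the function names and the 4-bit data width are assumptions chosen for brevity (a real tag memory would protect a much wider word).

```python
def secded_encode(nibble):
    """Encode a 4-bit value as an 8-bit extended Hamming codeword (list of bits)."""
    d = [(nibble >> i) & 1 for i in range(4)]
    code = [0] * 8                          # code[0] is the overall parity bit
    code[3], code[5], code[6], code[7] = d  # data bits at non-power-of-two positions
    code[1] = code[3] ^ code[5] ^ code[7]   # parity over positions 1, 3, 5, 7
    code[2] = code[3] ^ code[6] ^ code[7]   # parity over positions 2, 3, 6, 7
    code[4] = code[5] ^ code[6] ^ code[7]   # parity over positions 4, 5, 6, 7
    code[0] = sum(code[1:]) % 2             # makes the total parity even
    return code


def secded_decode(code):
    """Return (status, nibble); status is 'ok', 'corrected' or 'double-error'."""
    code = list(code)
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos                 # XOR of the positions of all set bits
    overall = sum(code) % 2
    if syndrome == 0 and overall == 0:
        status = "ok"
    elif overall == 1:
        # Odd total parity: a single error at position `syndrome`
        # (syndrome 0 means the overall parity bit itself was flipped).
        code[syndrome] ^= 1
        status = "corrected"
    else:
        # Even total parity but non-zero syndrome: two bits are wrong.
        return "double-error", None
    nibble = code[3] | (code[5] << 1) | (code[6] << 2) | (code[7] << 3)
    return status, nibble
```

The decode path has to compute the syndrome before the data can be trusted, which hints at why a straightforward implementation can add delay to a tag comparison.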


Citations: 16

A software-controlled prefetching mechanism for software-managed TLBs

Microprocessing and Microprogramming

Pub Date : 1995-05-01 DOI: 10.1016/0165-6074(95)00003-7

Jang Suk Park, Gwang Seon Ahn

TLB (Translation Lookaside Buffer) miss handling has traditionally been hidden from the operating system, but some new RISC architectures manage the TLB in software. Since software-managed TLBs give an operating system flexibility in page translation, they are considered an important factor in the design of microprocessors for open system environments. However, software-managed TLBs suffer from a larger miss penalty than hardware-managed TLBs, since they incur more context-switching overhead.

This paper introduces a new technique for reducing the miss penalty of software-managed TLBs by prefetching the necessary TLB entries before they are used. This technique is not inherently limited to specific applications. The key to the scheme is to perform prefetch operations that update the TLB entries before their first access, so that TLB misses can be avoided. Using trace-driven simulation and a quantitative analysis, the proposed scheme is evaluated in terms of the miss rate and the total miss penalty. Our results show that the proposed scheme reduces the TLB miss rate by 6% to 77%, depending on TLB characteristics and page sizes. In addition, it is found that reducing the miss rate with the prefetching scheme reduces the total miss penalty and the bus traffic in software-managed TLBs.
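The effect of prefetching on the miss count can be illustrated with a very small trace-driven simulation. This is only a sketch under stated assumptions: the TLB is modelled as fully associative with LRU replacement, and the prefetch policy (always install the next sequential page) is a hypothetical stand-in, since the abstract does not describe how the paper chooses which entries to prefetch. The function name `count_tlb_misses` is likewise invented here.

```python
from collections import OrderedDict


def count_tlb_misses(page_trace, tlb_entries, prefetch_next=False):
    """Count misses of a fully associative LRU TLB on a trace of virtual page numbers."""
    tlb = OrderedDict()                     # keys ordered from least to most recently used
    misses = 0

    def install(page):
        if page in tlb:
            tlb.move_to_end(page)           # refresh the LRU position
            return
        if len(tlb) >= tlb_entries:
            tlb.popitem(last=False)         # evict the least recently used entry
        tlb[page] = True

    for page in page_trace:
        if page not in tlb:
            misses += 1
        install(page)
        if prefetch_next:
            install(page + 1)               # assumed policy: prefetch the next page
    return misses


if __name__ == "__main__":
    trace = list(range(8))                  # purely sequential page accesses
    print(count_tlb_misses(trace, 4), count_tlb_misses(trace, 4, prefetch_next=True))
```

A sequential trace is the best case for this naive policy; on irregular traces the prefetched entries can evict useful ones, which is why the reported gains depend on TLB characteristics and page size.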


Microprocessing and Microprogramming, Volume 41 (2), Pages 121-136

Citations: 9

Incorporating job sizes in distributed load balancing

Microprocessing and Microprogramming

Pub Date : 1995-05-01 DOI: 10.1016/0165-6074(95)00008-C

John G. Vaughan

The load index most frequently used for load balancing in distributed systems is the job queue length. This work examines some of the implications of scheduling jobs according to an additional abstract dimension attribute called job size. The load balancing algorithm is supported by a virtual ring structure which organises the network nodes in groups and defines the information-gathering activities to take place within and between such groups. A two-phase approach to information gathering and decision making is adopted. This enables the selection of jobs for transfer to be delayed until as close as possible to the moment of transfer. The operation of the protocol is described for each phase and synchronisation of the parallel activities in the virtual rings is discussed. The schedule length performance of the distributed algorithm is examined in a series of closed-system tests.
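The difference between a queue-length load index and one that also accounts for job size can be shown with a few lines of code. The sketch below is deliberately simplified: it ignores the virtual ring structure and the two-phase protocol described in the abstract, and the function name `least_loaded`, the node names, and the size values are all hypothetical.

```python
def least_loaded(nodes, use_job_sizes):
    """Pick the name of the node with the smallest load index.

    `nodes` maps a node name to the list of sizes of its queued jobs.
    With use_job_sizes=False the load index is the queue length;
    with use_job_sizes=True it is the total size of the queued jobs.
    """
    def load_index(jobs):
        return sum(jobs) if use_job_sizes else len(jobs)

    return min(nodes, key=lambda name: load_index(nodes[name]))


if __name__ == "__main__":
    nodes = {"a": [10], "b": [1, 1, 1]}     # one large job versus three small ones
    print(least_loaded(nodes, use_job_sizes=False))
    print(least_loaded(nodes, use_job_sizes=True))
```

Here the queue-length index prefers node "a" because it holds a single job, while the size-aware index prefers node "b" because its total outstanding work is smaller.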


Citations: 4

Calendar91

Microprocessing and Microprogramming

Pub Date : 1995-05-01 DOI: 10.1016/0165-6074(95)90001-2

Citations: 0
