Proceedings IEEE International Conference on Application-Specific Systems, Architectures and Processors: Latest Publications

Heterogeneous multiprocessor scheduling and allocation using evolutionary algorithms

Proceedings IEEE International Conference on Application-Specific Systems, Architectures and Processors

Pub Date : 1997-07-14 DOI: 10.1109/ASAP.1997.606835

Carsten Reuter, M. Schwiegershausen, P. Pirsch

We propose a novel stochastic approach to the problem of multiprocessor scheduling and allocation under timing and resource constraints using an evolutionary algorithm (EA). For composite schemes of DSP algorithms, a compact problem encoding has been developed with emphasis on the allocation/binding part of the problem, together with an efficient problem transformation-decoding scheme that avoids infeasible solutions and therefore time-consuming repair mechanisms. Thus, the algorithm is able to handle even large problems within moderate computation time. Simulation results comparing the proposed EA with optimal results provided by mixed integer linear programming (MILP) show that the EA achieves the same or similar results in much less time as problem size increases.
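
The abstract does not give the encoding itself, so the following is only a minimal sketch of the general idea, under stated assumptions: the chromosome holds one processor index per task (the allocation/binding part), and a constructive decoder turns any chromosome into a feasible schedule, so no repair step is needed. The task graph `PREDS`, the cost table `COST`, and the functions `decode` and `evolve` are hypothetical and are not taken from the paper.

```python
import random

# Hypothetical task graph: task -> predecessors (task ids are in topological order).
PREDS = {0: [], 1: [0], 2: [0], 3: [1, 2], 4: [3]}
# Hypothetical execution times: COST[task][processor] on two heterogeneous processors.
COST = {0: [2, 3], 1: [4, 2], 2: [3, 3], 3: [2, 5], 4: [1, 1]}
N_PROCS = 2


def decode(binding):
    """Turn a task-to-processor binding into a feasible schedule; return its makespan."""
    proc_free = [0] * N_PROCS   # time at which each processor becomes idle
    finish = {}
    for task in sorted(PREDS):  # ascending ids are a topological order here
        p = binding[task]
        ready = max((finish[q] for q in PREDS[task]), default=0)
        start = max(ready, proc_free[p])
        finish[task] = start + COST[task][p]
        proc_free[p] = finish[task]
    return max(finish.values())


def evolve(generations=50, pop_size=20, seed=0):
    """Simple elitist EA: truncation selection plus point mutation on the binding."""
    rng = random.Random(seed)
    n = len(PREDS)
    pop = [[rng.randrange(N_PROCS) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=decode)
        parents = pop[: pop_size // 2]          # keep the better half
        children = []
        for parent in parents:
            child = parent[:]
            child[rng.randrange(n)] = rng.randrange(N_PROCS)  # rebind one task
            children.append(child)
        pop = parents + children
    best = min(pop, key=decode)
    return best, decode(best)
```

Because `decode` builds the schedule constructively, every chromosome maps to a schedule that respects precedence and processor exclusivity. For the toy data above, the binding `[0, 1, 0, 0, 0]` decodes to a makespan of 8, which equals the critical-path lower bound for this graph.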

Citations: 13

A strategy for determining a Jacobi specific dataflow processor

Proceedings IEEE International Conference on Application-Specific Systems, Architectures and Processors

Pub Date : 1997-07-14 DOI: 10.1109/ASAP.1997.606812

E. Rijpkema, G. Hekstra, E. Deprettere, Jun Ma

In this paper we present a strategy for determining a dataflow processor intended for the execution of Jacobi algorithms, which are found in the application domain of array processing and other real-time adaptive signal processing applications. Our strategy to determine a processor for their execution is to exploit the quasi-regularity property in their dependence graph representations in search of what we call the Jacobi processor. This processor emerges from an exploration iteration which starts from a processor template and a set of Jacobi algorithms. Based on qualitative and quantitative performance analysis, both the algorithms and the processor template are restructured towards improved execution performance. To ensure the mapper is part of the emerging processor specification, the algorithm-to-processor mapping method is included in the iterative and hierarchical exploration method. The processor's hierarchy exploits properties related to regularity in the algorithm's structure, allows gentle transitions from regular to irregular levels in the algorithm hierarchy, and offers different control models for the irregular structures that appear at deeper levels of the hierarchy. Transformations aiming at reducing critical paths, increasing throughput, improving mapping efficiency, and minimizing control and flow overheads are essential. They include retiming, pipelining, and lookahead techniques.

Citations: 23

Determination of the processor functionality in the design of processor arrays

Proceedings IEEE International Conference on Application-Specific Systems, Architectures and Processors

Pub Date : 1997-07-14 DOI: 10.1109/ASAP.1997.606826

D. Fimmel, R. Merker

In this paper the inclusion of hardware constraints into the design of massively parallel processor arrays is considered. We propose an algorithm which determines an optimal scheduling function as well as the selection of components which have to be implemented in one processor of a processor array. The arising optimization problem is formulated as an integer linear program which also takes the necessary chip area of a hardware implementation into consideration. Thereby we assume that an allocation function is given and that a partitioning of the processor array is required to match a limited chip area in silicon.

Citations: 10

A flexible data-interlacing architecture for full-search block-matching algorithm

Proceedings IEEE International Conference on Application-Specific Systems, Architectures and Processors

Pub Date : 1997-07-14 DOI: 10.1109/ASAP.1997.606816

Yeong-Kang Lai, Liang-Gee Chen, Yung-Pin Lee

This paper describes a data-interlacing architecture with two-dimensional (2-D) data reuse for the full-search block-matching algorithm. Based on some cascading strategies, the same chips can be flexibly cascaded for different block sizes, search ranges, and pixel rates. In addition, the cascaded chips can efficiently reuse data to decrease external memory accesses and achieve a high throughput rate. Our results demonstrate that the architecture with 2-D data reuse is a flexible, low-pin-count, high-throughput, and cascadable solution for the full-search block-matching algorithm.
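
For readers unfamiliar with the algorithm the architecture targets, here is a plain software reference of full-search block matching. It is a sketch of the algorithm only, not of the data-interlacing hardware; the sum of absolute differences (SAD) criterion and the names `sad` and `full_search` are assumptions, since the abstract does not name the matching criterion.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between an n x n block and its displaced candidate."""
    return sum(
        abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
        for y in range(n)
        for x in range(n)
    )


def full_search(cur, ref, bx, by, n, p):
    """Return (dx, dy, sad) of the best match within a +/-p search range."""
    h, w = len(ref), len(ref[0])
    best = None
    for dy in range(-p, p + 1):
        for dx in range(-p, p + 1):
            # Skip candidates that fall outside the reference frame.
            if not (0 <= bx + dx and bx + dx + n <= w
                    and 0 <= by + dy and by + dy + n <= h):
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best is None or cost < best[2]:
                best = (dx, dy, cost)
    return best
```

Neighbouring candidates in the two nested loops overlap in all but one row or column of reference pixels, which is the redundancy that a 2-D data-reuse scheme can exploit to cut external memory accesses.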

Citations: 4

Automatic data mapping of signal processing applications

Proceedings IEEE International Conference on Application-Specific Systems, Architectures and Processors

Pub Date : 1997-07-14 DOI: 10.1109/ASAP.1997.606840

Corinne Ancourt, Denis Barthou, C. Guettier, F. Irigoin, Bertrand Jeannet, J. Jourdan, J. Mattioli

This paper presents a technique to automatically map a complete digital signal processing (DSP) application onto a parallel machine with distributed memory. Unlike other applications where coarse- or medium-grain scheduling techniques can be used, DSP applications integrate several thousand tasks and hence necessitate fine-grain considerations. Moreover, finding an effective mapping requires taking into account both architectural resource constraints and real-time constraints. The main contribution of this paper is to show how data partitioning and fine-grain scheduling can be handled and solved under the above operational constraints using concurrent constraint logic programming (CCLP) languages. Our concurrent resolution technique, which handles linear and nonlinear constraints, takes advantage of the special features of signal processing applications and provides a solution equivalent to a manual solution for the representative panoramic analysis (PA) application.

Citations: 18

Three-dimensional orthogonal tile sizing problem: mathematical programming approach

Proceedings IEEE International Conference on Application-Specific Systems, Architectures and Processors

Pub Date : 1997-07-14 DOI: 10.1109/ASAP.1997.606827

R. Andonov, N. Yanev, H. Bourzoufi

In this paper we discuss the problem of finding the optimal tiling transformation of three-dimensional uniform recurrences on a two-dimensional torus/grid of distributed-memory general-purpose machines. We show that even for the simplest case of recurrences which allows for such a transformation, the corresponding problem of minimizing the total running time is a non-trivial non-linear integer programming problem. For the latter we derive an O(1) algorithm for finding a good approximate solution. The theoretical evaluations and the experimental results show that the obtained solution approximates the original minimum sufficiently well in the context of the considered problem. Such analytical results are of obvious interest and can be successfully used in parallelizing compilers as well as in performance tuning of parallel codes.

Citations: 4

Accurate function approximations by symmetric table lookup and addition

Proceedings IEEE International Conference on Application-Specific Systems, Architectures and Processors

Pub Date : 1997-07-14 DOI: 10.1109/ASAP.1997.606821

M. Schulte, J. Stine

This paper presents a high-speed method for accurate function approximations. This method employs parallel table lookups followed by multi-operand addition. It takes advantage of leading zeros and symmetry in the table entries to reduce the table sizes. By increasing the number of tables and the number of operands in the multi-operand addition, the amount of memory is significantly reduced. This method provides a closed-form solution for the table entries and can be applied to a variety of elementary functions. Compared to conventional table lookups, it requires two to three orders of magnitude less memory. Designs of elementary function generators that use this method are presented and compared to similar methods for elementary function generation.
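
The abstract does not list the table formulas, so the sketch below illustrates only the simplest two-table case under stated assumptions: the input is split into three 4-bit fields, one table holds function values, and a second table holds a first-order slope correction whose lower half is the negation of its upper half. The field widths, the use of floating point instead of fixed-point entries, and the names `build_tables` and `approx` are assumptions made for illustration.

```python
import math

N0 = N1 = N2 = 4                      # assumed bit widths of the three input fields


def build_tables(f, df):
    """Build a two-table approximation of f on [0, 1) for a 12-bit input."""
    d1 = 2.0 ** -(N0 + 1) - 2.0 ** -(N0 + N1 + 1)            # midpoint of the x1 field
    d2 = 2.0 ** -(N0 + N1 + 1) - 2.0 ** -(N0 + N1 + N2 + 1)  # midpoint of the x2 field
    # a0 is indexed by (x0, x1): function value taken at the middle of the x2 range.
    a0 = [f(i / 2 ** (N0 + N1) + d2) for i in range(2 ** (N0 + N1))]
    # a1 is indexed by (x0, x2): slope correction, stored only for the upper half
    # of the x2 range because the lower half is its exact negation.
    half = 2 ** (N2 - 1)
    a1 = [
        [df(i0 / 2 ** N0 + d1 + d2) * ((half + k) / 2 ** (N0 + N1 + N2) - d2)
         for k in range(half)]
        for i0 in range(2 ** N0)
    ]
    return a0, a1


def approx(x_bits, a0, a1):
    """Approximate f(x_bits / 2**12) with two lookups and one addition."""
    i01 = x_bits >> N2                  # concatenated x0 and x1 fields
    i0 = x_bits >> (N1 + N2)            # x0 field
    k = x_bits & (2 ** N2 - 1)          # x2 field
    half = 2 ** (N2 - 1)
    if k >= half:
        corr = a1[i0][k - half]
    else:
        corr = -a1[i0][half - 1 - k]    # symmetry: mirror the index and negate
    return a0[i01] + corr
```

With these widths the two tables hold 256 + 128 = 384 entries, against 4096 for a single direct table over the same 12-bit input. A quick check is `a0, a1 = build_tables(math.sin, math.cos)` followed by comparing `approx(b, a0, a1)` with `math.sin(b / 4096)` over all `b`.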

Citations: 21

Architectural approaches for video compression

Proceedings IEEE International Conference on Application-Specific Systems, Architectures and Processors

Pub Date : 1997-07-14 DOI: 10.1109/ASAP.1997.606824

P. Pirsch, H. Stolberg

An overview of architectures for implementations of current video compression schemes is given. Dedicated as well as programmable approaches are discussed. Examples of dedicated function-specific implementations include architectures for DCT and block matching. For programmable video signal processors, a number of architectural measures to increase video compression performance are reviewed. Actual implementations of video compression schemes typically employ a variety of different architectural approaches. The detailed mix of approaches depends on the targeted application spectrum.

Citations: 13

Libraries of schedule-free operators in Alpha

Proceedings IEEE International Conference on Application-Specific Systems, Architectures and Processors

Pub Date : 1900-01-01 DOI: 10.1109/ASAP.1997.606830

F. de Dinechin

This paper presents a method, based on the formalism of affine recurrence equations, for the synthesis of digital circuits exploiting parallelism at the bit-level. In the initial specification of a numerical algorithm, the arithmetic operators are replaced with their yet unscheduled (schedule-free) binary implementation as recurrence equations. This allows a bit-level dependency analysis yielding a bit-parallel array. The method is demonstrated on the example of the matrix-vector product, and discussed.
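
As a loose illustration of what replacing an arithmetic operator with a bit-level recurrence means, the sketch below writes unsigned addition as a recurrence over the bit index. It is written in Python rather than Alpha, imposes a sequential loop order that a schedule-free specification would leave open, and is not taken from the paper's operator library; the name `add_bits` is hypothetical.

```python
def add_bits(a, b, width):
    """Add two unsigned integers via the bit-level recurrence of a ripple-carry adder."""
    carry = 0                      # c[0] = 0
    total = 0
    for i in range(width):
        ai = (a >> i) & 1
        bi = (b >> i) & 1
        s = ai ^ bi ^ carry                              # s[i] = a[i] xor b[i] xor c[i]
        carry = (ai & bi) | (ai & carry) | (bi & carry)  # c[i+1] = majority(a[i], b[i], c[i])
        total |= s << i
    return total | (carry << width)                      # final carry is the top bit
```

Each bit of the result depends only on the same-position input bits and the carry from position `i - 1`, which is the kind of dependency a bit-level analysis can expose once the operator is expanded.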

Citations: 6
