A linear array parallel image processor: SliM-II

Hyunman Chang, S. Ong, M. Sunwoo
Proceedings IEEE International Conference on Application-Specific Systems, Architectures and Processors, 1997-07-14
DOI: 10.1109/ASAP.1997.606810 (https://doi.org/10.1109/ASAP.1997.606810)
Citations: 4

Abstract

This paper describes the architecture and design of a general-purpose parallel image processor chip called the SliM-II Image Processor. The chip has a linear array of 64 processing elements (PEs), operates at 30 MHz in worst-case simulation, and delivers 1.92 GIPS. SliM-II can greatly reduce inter-PE communication overhead through sliding, which overlaps inter-PE communication with computation. In contrast to existing array processors, each PE has a multiplier, which is quite effective for convolution, template matching, and similar operations. A single instruction can perform an ALU operation, data I/O, and inter-PE communication simultaneously in one instruction cycle. In addition, during an ALU/multiplier operation, SliM-II provides parallel load/store between the register file and on-chip memory, as in DSP chips. Bit-parallel paths increase the bandwidth of data I/O and inter-PE communication. We developed VHDL models and performed logic synthesis using the COMPASS™ CAD tool with the COMPASS™ 3.3 V, 0.6 μm standard cell library (v8r4.9.1). The total number of transistors is about 1.5 million. The SliM-II chip is being fabricated at LG Semiconductor Co., Ltd. Performance estimates show a significant improvement over existing array processors for algorithms requiring multiplications.
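The throughput figure and the role of the per-PE multiplier can be illustrated with a small software model. The sketch below is not the SliM-II datapath or instruction set, which the abstract does not detail. It assumes one instruction per PE per cycle for the peak rate, one pixel per PE, and a one-position shift as the inter-PE transfer. The names `peak_gips`, `shift_left`, and `correlate_row` are hypothetical.

```python
NUM_PES = 64             # number of processing elements, from the abstract
CLOCK_HZ = 30_000_000    # worst-case simulated clock, from the abstract


def peak_gips(num_pes=NUM_PES, clock_hz=CLOCK_HZ):
    """Peak rate in GIPS, assuming one instruction per PE per cycle (assumption)."""
    return num_pes * clock_hz / 1e9


def shift_left(values, fill=0):
    """Model one inter-PE transfer: each PE receives its right neighbour's value."""
    return values[1:] + [fill]


def correlate_row(pixels, kernel):
    """Sliding-window multiply-accumulate over a linear array, one pixel per PE.

    At step k every PE multiplies the value it currently holds by kernel[k]
    and adds it to its accumulator, then the held values move one PE over.
    Hardware that overlaps communication with computation would perform the
    shift alongside the multiply-accumulate; this model runs them in sequence.
    """
    acc = [0] * len(pixels)
    window = list(pixels)
    for coeff in kernel:
        acc = [a + coeff * w for a, w in zip(acc, window)]
        window = shift_left(window)
    return acc


if __name__ == "__main__":
    print(peak_gips())
    print(correlate_row([1, 2, 3, 4], [1, 1, 1]))
```

In this model each PE ends with the sum of `kernel[k] * pixels[i + k]`, with zeros shifted in past the end of the row. With a multiplier in every PE, each kernel tap costs one multiply-accumulate step, and if the shift is hidden behind that step the communication adds no extra cycles, which is the benefit the abstract attributes to sliding.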