
1999 IEEE Workshop on Signal Processing Systems. SiPS 99. Design and Implementation (Cat. No.99TH8461): Latest Papers

A gray-based block-matching algorithm and its VLSI architecture
Yeu-Horng Shiau, Pei-Yin Chen, J. Jou
In this paper, we propose an efficient gray-based block-matching algorithm (GBMA) and its VLSI architecture. Based on gray system theory, the GBMA can quickly determine better motion vectors for image blocks. The experimental results show that the proposed algorithm performs better than other search algorithms, such as TSS, CS, PHODS, FSS, and SES, in terms of four measures: 1) average MSE per pixel, 2) average PSNR, 3) average prediction error per pixel, and 4) average search points per frame. The VLSI architecture of the algorithm has been designed and implemented, and it yields a search rate of 680 K blocks/sec at a clock rate of 66 MHz.
DOI: https://doi.org/10.1109/SIPS.1999.822310 | Published: 1999-10-20
Citations: 0
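As a rough illustration of the measures named in the block-matching abstract above, the sketch below computes MSE per pixel and PSNR and counts search points for a plain exhaustive (full) search. It is not the GBMA itself, whose gray-system-based search is not described on this page; the function names, the 8-bit peak value of 255, and the square search window are assumptions made for the example.

```python
import math

def mse(block_a, block_b):
    """Mean squared error per pixel between two equally sized 2-D blocks."""
    total = 0
    count = 0
    for row_a, row_b in zip(block_a, block_b):
        for a, b in zip(row_a, row_b):
            total += (a - b) ** 2
            count += 1
    return total / count

def psnr(mse_value, peak=255):
    """PSNR in dB (assumes 8-bit samples); infinite for identical blocks."""
    if mse_value == 0:
        return math.inf
    return 10 * math.log10(peak ** 2 / mse_value)

def get_block(frame, top, left, size):
    """Extract a size x size block whose top-left corner is (top, left)."""
    return [row[left:left + size] for row in frame[top:top + size]]

def full_search(cur, ref, top, left, size, radius):
    """Exhaustive baseline: returns (best (dy, dx), best MSE, points searched)."""
    target = get_block(cur, top, left, size)
    height, width = len(ref), len(ref[0])
    best_mv, best_err, points = None, math.inf, 0
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            y, x = top + dy, left + dx
            # Skip candidates that fall outside the reference frame.
            if y < 0 or x < 0 or y + size > height or x + size > width:
                continue
            err = mse(target, get_block(ref, y, x, size))
            points += 1
            if err < best_err:
                best_mv, best_err = (dy, dx), err
    return best_mv, best_err, points
```

Fast search algorithms such as those compared in the abstract aim to approach the error of this exhaustive baseline while visiting far fewer search points per block.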
Design of an area efficient Reed-Solomon decoder ASIC chip
Hyunman Chang, M. Sunwoo
We describe an area-efficient pipelined Reed-Solomon (RS) decoder. We propose two simple basic cell architectures which evaluate the error locator and the error magnitude polynomials in the general Euclid's algorithm. This evaluation involves high computational complexity and thus affects the speed and hardware complexity of RS decoders. The proposed architectures can reduce the hardware complexity by more than 16% compared with existing RS decoder architectures. The proposed RS decoder can be programmed to decode four RS codes defined over the Galois field GF(2^8), i.e., (200, 188), (120, 108), (60, 48), and (40, 28), and can correct up to six errors. The fabricated FEC (Forward Error Correction) chip, including the RS and Viterbi decoders, operates at 40 MHz. The RS decoder has about 31,000 gates and the FEC chip contains about 76,000 gates.
DOI: https://doi.org/10.1109/SIPS.1999.822364 | Published: 1999-10-20
Citations: 6
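The "up to six errors" figure in the Reed-Solomon abstract above follows from the code parameters: an RS (n, k) code corrects up to floor((n - k) / 2) symbol errors. The short check below is an illustrative sketch; the helper names are hypothetical, and the length check assumes the conventional block length limit of 2^m - 1 symbols over GF(2^m), so the four listed codes would be shortened codes.

```python
def rs_correctable_errors(n, k):
    """Maximum symbol errors an RS (n, k) code can correct: floor((n - k) / 2)."""
    if not 0 < k < n:
        raise ValueError("need 0 < k < n")
    return (n - k) // 2

def fits_field(n, m=8):
    """True if block length n fits the conventional limit 2^m - 1 over GF(2^m)."""
    return n <= 2 ** m - 1

# The four code configurations listed in the abstract.
codes = [(200, 188), (120, 108), (60, 48), (40, 28)]
capabilities = [rs_correctable_errors(n, k) for n, k in codes]
# Each code has n - k = 12 parity symbols, so each corrects up to 6 errors.
```

Because all four configurations share the same number of parity symbols, a single programmable decoder datapath sized for six errors can serve every one of them.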
JPEG transcompressor and Internet applications
Tae-Hua Lan, A.H. Tewfik, Po-Chin Hu
In this paper we design a novel transcoding algorithm to transcode and compress a JPEG or MPEG bitstream into a fully embedded one. Owing to the efficiency of the multigrid embedded (MGE) coding used to code non-zero quantized DCT coefficients, it compresses JPEG or MPEG bitstreams by a further 9% to 30%. This "transcompression" method has many applications for the Internet and multimedia networks. Since most "original" images or video files stored on computers around the world are in JPEG or MPEG formats, our algorithm provides a simple and efficient way to redistribute pictures/video files across the Internet.
DOI: https://doi.org/10.1109/SIPS.1999.822345 | Published: 1999-10-20
Citations: 1
Optimal systolic block size for low power high speed digital allpass filters based on the 3-port adaptor
P. Israsena, S. Summerfield
Allpass digital filters are major building blocks in many digital filter architectures. In this paper an optimal pipelined architecture for a 2nd order allpass section based on the 3-port adaptor is proposed. Optimal pipelining improves the filter's overall performance in terms of power-delay-area by 4.8 times using a 1 µm CMOS standard cell design and by 10 times using custom cells. Given the same clock speed and without the use of supply voltage scaling, the architecture consumes 58% less power than the non-pipelined equivalent using a custom cell implementation and 50% less using standard cells. With a maximum throughput of 277 MHz, the adaptor's power consumption is 5.44 mW/MHz, representing a 64% improvement in power efficiency relative to the non-pipelined standard cell adaptor.
DOI: https://doi.org/10.1109/SIPS.1999.822377 | Published: 1999-10-20
Citations: 0
Architecture and implementation of a single-chip programmable digital television and media processor
S. Dutta, D. Singh, V. Mehra
This paper describes the architecture, functionality and design of TM-2700, a digital television and media processor chip from Philips Semiconductors. The chip not only supports all eighteen digital television picture formats prescribed by the United States Advanced Television Systems Committee (ATSC), from standard-definition to wide-angle high-definition video, but also has the power to handle High-Definition Television (HDTV) video and audio source decoding (high-level MPEG-2 video, AC-3 and ProLogic audio, closed captioning, etc.) as well as the flexibility to process advanced interactive services. TM-2700 is a programmable processor with a very powerful, general-purpose Very Long Instruction Word (VLIW) Central Processing Unit (CPU) core that implements many non-trivial multimedia algorithms, coordinates all on-chip activities, and runs a small real-time operating system. Aided by an array of peripheral devices and high-performance buses, the CPU core facilitates concurrent processing of audio, video, graphics, and communication data.
DOI: https://doi.org/10.1109/SIPS.1999.822337 | Published: 1999-10-20
Citations: 8
Analysis of the intrinsic transient fault tolerance of a signal and image processing algorithm implemented on a DSP
C. Lecordier, O. Ingremeau, E. Martin
In this article we present a method to analyse the intrinsic transient fault tolerance of signal and image processing algorithms for DSP applications. This method makes it possible to protect only the non-tolerant parts and consequently to reduce time and memory overheads. Results are given for an embedded satellite image compression algorithm.
DOI: https://doi.org/10.1109/SIPS.1999.822352 | Published: 1999-10-20
Citations: 0
A scalable low-complexity digit-serial VLSI architecture for RSA cryptosystem
Jye-Jong Leu, A. Wu
The Booth-encoded Montgomery modular multiplication algorithm is proposed to reduce the number of iterations to about n/2 in each Montgomery operation. In addition, we apply folding and unfolding techniques to shorten the critical path. Finally, we propose a 2-bit digit-serial pipelined architecture to process RSA encryption and decryption more efficiently. By applying the proposed algorithm in RSA design, the hardware complexity can be reduced by 15% compared with most RSA VLSI designs using the Montgomery modular multiplication algorithm.
DOI: https://doi.org/10.1109/SIPS.1999.822365 | Published: 1999-10-20
Citations: 10
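For context on the iteration count mentioned in the RSA abstract above, the sketch below shows the textbook radix-2 (bit-serial) Montgomery multiplication, which needs n iterations for an n-bit modulus. It is not the paper's Booth-encoded variant, which handles two bits per iteration to reach roughly n/2; the function and parameter names are assumptions chosen for the example.

```python
def montgomery_multiply(a, b, modulus, n_bits):
    """Return a * b * 2^(-n_bits) mod modulus using n_bits shift-and-add steps.

    Assumes an odd modulus with modulus < 2^n_bits and 0 <= a, b < modulus.
    """
    if modulus % 2 == 0:
        raise ValueError("modulus must be odd")
    acc = 0
    for i in range(n_bits):
        acc += ((a >> i) & 1) * b   # add b when bit i of a is set
        if acc & 1:                 # make acc even by adding the odd modulus
            acc += modulus
        acc >>= 1                   # exact division by 2
    if acc >= modulus:              # acc < 2 * modulus, so one subtraction suffices
        acc -= modulus
    return acc

def modmul_via_montgomery(a, b, modulus, n_bits):
    """Ordinary a * b mod modulus, computed through the Montgomery domain."""
    r2 = pow(2, 2 * n_bits, modulus)                          # R^2 mod modulus, R = 2^n_bits
    a_bar = montgomery_multiply(a, r2, modulus, n_bits)       # a * R
    b_bar = montgomery_multiply(b, r2, modulus, n_bits)       # b * R
    prod_bar = montgomery_multiply(a_bar, b_bar, modulus, n_bits)  # a * b * R
    return montgomery_multiply(prod_bar, 1, modulus, n_bits)  # a * b
```

Each loop iteration uses only additions and a one-bit shift, which is why the operation maps well to hardware and why halving the iteration count, as the abstract describes, directly shortens each modular multiplication.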
Parametrizable behavioral IP module for a data-localized low-power FFT
E. Brockmeyer, C. Ghez, J. D'Eer, F. Catthoor, H. de Man
FFTs are important modules in embedded telecom systems, many of which require low-power real-time implementations. This paper describes a technique for aggressively localizing data accesses in an (inverse) fast Fourier transform at the source code level. Neither the global I/O functionality nor the bit-true arithmetic behavior is modified. Typically 20 to 50% of the background memory accesses can be saved. A heavily parametrizable solution is proposed which leads to a family of power-optimized algorithm codes. Moreover, efficient coding details for specific instances are shown.
DOI: https://doi.org/10.1109/SIPS.1999.822370
Citations: 8
Rapid VLSI design of biorthogonal wavelet transform cores
S. Masud, J. McCanny
A rapid design methodology for biorthogonal wavelet transform cores has been developed. This methodology is based on a generic, scalable architecture for the wavelet filters. The architecture offers efficient hardware utilization by combining the linear phase property of biorthogonal filters with decimation in a MAC-based implementation. The design has been captured in VHDL and parameterised in terms of wavelet type, data word length and coefficient word length. The control circuit is embedded within the cores and allows them to be cascaded without any interface glue logic for any desired level of decomposition. The design time to produce the silicon layout of a biorthogonal wavelet based system is typically less than a day. The resulting silicon cores are comparable in area and performance to hand-crafted designs. The designs are portable across a range of foundries and are also applicable to FPGA and PLD implementations.
DOI: https://doi.org/10.1109/SIPS.1999.822334
Citations: 5
A performance-oriented use methodology of power optimizing code transformations for multimedia applications realized on programmable multimedia processors
K. Masselos, F. Catthoor, C. Goutis, H. De Man
The power consumption related to data storage and transfers, which forms an important part of the total system power consumption, should be reduced in realizations of multimedia applications on programmable multimedia processors. In earlier work, we formalized a methodology to achieve this. This methodology is based on applying a number of power-optimizing code transformations to a system-level description of the target application. Our script has recently been extended to take explicit account of the required performance (in number of execution cycles), which is the overriding constraint in real-time multimedia applications. In this paper, we focus on a systematic use methodology that allows this script to be applied manually in practice to a complex application, including the necessary local design iterations. Experimental results from a real-life data-dominated application demonstrate that the proposed systematic approach leads to significant power and performance gains compared to reference designs.
DOI: https://doi.org/10.1109/SIPS.1999.822331
Citations: 6