Pub Date: 2002-09-04; DOI: 10.1109/DSD.2002.1115392
Mariusz Chyzy, W. Kosinski
The paper proposes an evolutionary algorithm (EA) for the state assignment problem (SAP). Two original crossover operators are presented. They are experimentally compared with other known crossovers for SAP using a set of benchmark finite state machines. Solutions generated by the EA (using different crossover operators) are compared with random assignments and with the state assignments generated by the MAX+PLUS II system. Experimental results show that the solutions found by the EA are significantly better (by up to 55%) than those from MAX+PLUS II. Moreover, the EA equipped with the proposed crossover operators found better results than those obtained with the other crossovers compared.
Title: Evolutionary algorithm for state assignment of finite state machines
Published in: Proceedings Euromicro Symposium on Digital System Design. Architectures, Methods and Tools
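As a rough illustration of how an evolutionary algorithm can search for a state assignment, the following minimal sketch may help. It is not the paper's method: the cost function (sum of Hamming distances between the codes of the source and destination state of each transition), the order-style crossover, the swap mutation, and all function names and parameters are assumptions chosen for brevity.

```python
import random

def hamming(a, b):
    """Number of bit positions in which two integer codes differ."""
    return bin(a ^ b).count("1")

def cost(assignment, transitions):
    """Assumed proxy cost: assignment[s] is the binary code (as int) of state s."""
    return sum(hamming(assignment[s], assignment[t]) for s, t in transitions)

def order_crossover(p1, p2, rng):
    """Keep a slice of p1 in place, fill the rest in p2's order (codes stay unique)."""
    n = len(p1)
    i, j = sorted(rng.sample(range(n + 1), 2))
    kept = set(p1[i:j])
    filler = [c for c in p2 if c not in kept]
    return filler[:i] + p1[i:j] + filler[i:]

def mutate(assignment, rng):
    """Swap the codes of two randomly chosen positions."""
    a = assignment[:]
    i, j = rng.sample(range(len(a)), 2)
    a[i], a[j] = a[j], a[i]
    return a

def evolve(n_states, transitions, pop_size=30, generations=200, seed=0):
    rng = random.Random(seed)
    n_bits = max(1, (n_states - 1).bit_length())
    codes = list(range(2 ** n_bits))
    # Each individual is a permutation of all codes; state s uses entry s.
    pop = [rng.sample(codes, len(codes)) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda a: cost(a, transitions))
        parents = pop[: pop_size // 2]          # truncation selection with elitism
        children = []
        while len(parents) + len(children) < pop_size:
            p1, p2 = rng.sample(parents, 2)
            children.append(mutate(order_crossover(p1, p2, rng), rng))
        pop = parents + children
    best = min(pop, key=lambda a: cost(a, transitions))
    return best[:n_states], cost(best, transitions)

if __name__ == "__main__":
    # Hypothetical 4-state machine whose transitions form a ring.
    ring = [(0, 1), (1, 2), (2, 3), (3, 0)]
    print(evolve(4, ring))
```

Because the better half of the population is carried over unchanged, the best cost never gets worse from one generation to the next. A real SAP cost would be derived from the synthesized logic rather than from code distance alone.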
Pub Date: 2002-09-04; DOI: 10.1109/DSD.2002.1115363
M. L. Anido, A. Paar, N. Bagherzadeh
This paper presents a novel method for improving the operation autonomy of the processing elements (PEs) of SIMD-like machines. By combining guarded instructions and pseudo branches it is possible to achieve higher operation autonomy and higher instruction-level parallelism than in previous SIMD/ASIMD architectures. The paper shows that it is feasible to avoid most branches and that conditional execution can be emulated on the processing elements, either by using guarded instructions or by using pseudo branches, thus avoiding unnecessary intervention by the array control unit in data-dependent computations. Pseudo branches are used when it is not possible to use guarded instructions. They also support the implementation of complex nested if-then-else constructs, improving the execution of irregular data-parallel applications. The paper also shows that the simplicity of the method allows it to be implemented in both fine-grain and coarse-grain SIMD/ASIMD architectures, because it does not require significant additional silicon area. Finally, it is shown that pseudo branches can be used to control the power saving of those processing elements that have instructions nullified.
Title: Improving the operation autonomy of SIMD processing elements by using guarded instructions and pseudo branches
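The idea of guarded (predicated) execution on SIMD processing elements can be pictured with a small software model. This is only a sketch under assumptions: the helper names `guarded` and `abs_or_double` are hypothetical, each list entry stands for one PE, and the pseudo-branch mechanism of the paper is not modelled.

```python
def guarded(op, guards, values):
    """Broadcast one instruction to every PE; PEs whose guard is False nullify it."""
    return [op(v) if g else v for v, g in zip(values, guards)]

def abs_or_double(values):
    """Emulate `if x < 0: x = -x else: x = 2 * x` without any branch."""
    guards = [v < 0 for v in values]  # each PE evaluates the condition locally
    # "then" path: executed only where the guard is set
    values = guarded(lambda v: -v, guards, values)
    # "else" path: executed only where the guard is clear
    values = guarded(lambda v: 2 * v, [not g for g in guards], values)
    return values

print(abs_or_double([-3, 4, 0, -1]))  # [3, 8, 0, 1]
```

Every PE receives both instructions in the same order, so the control unit never has to test the data itself; each PE simply nullifies the instruction that does not apply to it.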
Pub Date: 2002-09-04; DOI: 10.1109/DSD.2002.1115371
S. Goto, Takashi Yamada, Norihisa Takayarna, Yoshifumi Matsushita, Y. Harada, H. Yasuura
This paper presents a design for a low-power digital matched filter (DMF) applicable to Wideband Code Division Multiple Access (W-CDMA), which is a Direct-Sequence Spread-Spectrum (DS-SS) communication system. The proposed architectural approach to reducing the power consumption focuses on the reception registers and the correlation-calculating unit (CCU), which dissipate the majority of the power in a DMF. The main features are asynchronous latch clock generation for the reception registers, parallelism of the correlation calculation operations, and bit manipulation for chip-correlation operations. A DMF is designed in compliance with the W-CDMA specifications incorporating the proposed techniques, and its properties are evaluated by computer simulations at the gate level using 0.18-μm CMOS standard cell array technology. The results of the simulations show a power consumption of 9.3 mW (at 15.6 MHz, 1.6 V), which is only about 30% of the power consumption of conventional DMFs.
Title: A design for a low-power digital matched filter applicable to W-CDMA
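For readers unfamiliar with matched filtering in a direct-sequence system, the correlation that a DMF computes can be sketched in a few lines. This is a toy model under assumptions: chips are real-valued ±1, the 7-chip Barker sequence stands in for a spreading code, and the function name `matched_filter` is hypothetical. It says nothing about the paper's register or CCU architecture.

```python
def matched_filter(received, code):
    """Slide the code over the received chips and correlate at every offset."""
    n = len(code)
    return [
        sum(r * c for r, c in zip(received[k:k + n], code))
        for k in range(len(received) - n + 1)
    ]

# Assumed example: a 7-chip Barker code embedded at offset 2 in a chip stream.
code = [1, 1, 1, -1, -1, 1, -1]
received = [-1, 1] + code + [1, -1]

out = matched_filter(received, code)
peak = max(range(len(out)), key=lambda k: out[k])
print(out, "peak at offset", peak)
```

The correlation reaches its maximum (equal to the code length) at the offset where the received chips line up with the code, which is how the filter detects timing. Each output sample needs one multiply-accumulate per chip, which is why the correlation unit and the registers holding the received chips account for most of the power.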
Pub Date: 2002-09-04; DOI: 10.1109/DSD.2002.1115362
Ari Wahyudi, A. Omondi
This paper reports on our experiments on using the Infineon TriCore as a building block for a multimedia processor. The experiments aim to obtain a high-performance processor using two strategies: integrating multimedia units into the TriCore CPU and arranging TriCores in a multiprocessor configuration. The design and implementation of the multimedia units for video, audio, and text compression are discussed. Two hardware architectures for the IMA ADPCM audio compression multimedia unit were designed: a direct architecture and a sequential architecture. The multimedia unit for text compression is a modification of another design; our design uses a more efficient timing operation and has better hardware utilization than the original design. Two algorithms for parallel motion estimation were implemented on the multiple-TriCore system. The results show that the TriCore is a good building block for a multiprocessor system.
Title: Parallel multimedia processor using customised Infineon TriCores
Pub Date: 2002-09-04; DOI: 10.1109/DSD.2002.1115375
J. Rice, J. Muzio
Four operations on switching functions are used to define a classification technique based on the autocorrelation function. The relationship between these classes and the well-known spectral classes is investigated, and a canonical representative for each class is proposed. It is thought that these classes will be of use in logic synthesis employing decision diagram representations for intermediate steps.
Title: Use of the autocorrelation function in the classification of switching functions
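The autocorrelation function of a switching function is commonly defined as B(τ) = Σ_x f(x)·f(x ⊕ τ), summed over all input assignments x. The sketch below computes it from a truth table; it assumes the {0, 1} output encoding and this common definition, which may differ in detail from the one used in the paper, and the function name is hypothetical.

```python
def autocorrelation(truth_table):
    """Return [B(0), B(1), ...] for a function given as a 0/1 truth table.

    Index x of the table is the input assignment read as a binary number.
    """
    size = len(truth_table)
    assert size > 0 and size & (size - 1) == 0, "table length must be a power of two"
    return [
        sum(truth_table[x] * truth_table[x ^ tau] for x in range(size))
        for tau in range(size)
    ]

print(autocorrelation([0, 0, 0, 1]))  # two-input AND: [1, 0, 0, 0]
print(autocorrelation([0, 1, 1, 0]))  # two-input XOR: [2, 0, 0, 2]
```

B(0) always equals the number of input assignments for which the function is 1. A large B(τ) for some τ ≠ 0 indicates that the function is largely unchanged when the inputs selected by τ are complemented, as for XOR with τ = 3.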
Pub Date: 2002-09-04; DOI: 10.1109/DSD.2002.1115353
M. Garrido, C. Sanz, Marcos Jiménez, J. Meneses
In this paper a very flexible and efficient architecture that implements the core of a video coder according to Rec. H.263 is presented. It consists of a RISC processor that controls the scheduling of a set of specialized processors for the transforms (DCT and IDCT), quantizers (DQ and IQ), motion estimation and motion compensation (ME/MC). The architecture also includes preprocessing modules for the input video signal from the camera and interfaces for the external video memory and the H.263 bit-stream generation. The architecture has been written in synthesizable Verilog and tested using standard video sequences. It has also been prototyped into a development system based on an FPGA and a RISC.
Title: A flexible architecture for H.263 video coding
Pub Date: 2002-09-04; DOI: 10.1109/DSD.2002.1115374
D. Jankovic, R. Stankovic, R. Drechsler
In this paper, we propose an approach to the reduction of sizes of multi-terminal binary decision diagrams (MTBDDs) by using the copy properties of discrete functions. The underlying principles come from copy theory of discrete signals considered previously. We propose two modifications of MTBDDs, called copy DDs (CDDs) and half copy DDs (HCDDs), using the corresponding copy operations from copy theory. Functions having different types of copy properties can be efficiently represented by the proposed Copy DDs. Examples are Walsh and Reed-Muller functions as well as different binary codes.
Title: Decision diagram optimization using copy properties
Pub Date: 2002-09-04; DOI: 10.1109/DSD.2002.1115344
Ulf Schlichtmann
The progress of silicon process technology relentlessly marches on. Moore's law still holds: the number of transistors that can be integrated on an IC doubles approximately every 18 months. The inability of system designs to keep up with this ever-increasing number of available transistors has been debated for a long time, and many solutions have been proposed. Now, as 130 nm processes enter volume production, 90 nm processes yield first engineering samples, and 65 nm processes are being developed, the design productivity crisis is exacerbated by the very difficult design challenges inherent in Ultra-Deep Submicron (UDSM) technologies. These challenges threaten the approach of abstracting technological features away at higher levels, thus endangering design productivity even more. This paper outlines current challenges, presents approaches to address them, and proposes further areas for research.
Title: Systems are made from transistors: UDSM technology creates new challenges for library and IC development
Pub Date: 2002-09-04; DOI: 10.1109/DSD.2002.1115349
R. Drechsler, Wolfgang Günther, T. Eschbach, Lothar Linhard, Gerhard Angst
In many applications in VLSI CAD, a given netlist has to be partitioned into smaller sub-designs that can be handled much more easily. In this paper we present a new recursive bi-partitioning algorithm that is especially applicable if a large number of final partitions, e.g. more than 1000, has to be computed. The algorithm consists of two steps. In the first step, the problem is divided into several sub-problems by recursive splits, with more run time invested as the recursion depth increases. In this way an initial solution is determined very quickly. The core of the method is the second step, in which a very powerful greedy algorithm is applied to refine the partitions. Experimental results are given that compare the new approach to state-of-the-art tools. The experiments show that the new approach outperforms the standard techniques with respect to run time and quality. Furthermore, the memory usage is very low: it is lower than that of other methods by more than a factor of four.
Title: Recursive bi-partitioning of netlists for large number of partitions
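The two-step structure described in this abstract (a fast recursive split followed by greedy refinement) can be sketched as follows. This is a simplified stand-in rather than the authors' algorithm: it works on a plain graph instead of a netlist hypergraph, splits by position without any cut-aware effort, and uses a basic single-node move heuristic. All names and the `max_size` balance bound are assumptions.

```python
def cut_size(edges, part):
    """Number of edges whose endpoints lie in different partitions."""
    return sum(1 for u, v in edges if part[u] != part[v])

def recursive_bisect(nodes, k):
    """Step 1: split nodes into k groups by repeated halving (fast initial solution)."""
    if k == 1:
        return [list(nodes)]
    k_left = k // 2
    mid = len(nodes) * k_left // k
    return recursive_bisect(nodes[:mid], k_left) + recursive_bisect(nodes[mid:], k - k_left)

def greedy_refine(edges, part, k, max_size):
    """Step 2: move single nodes to the partition holding most of their neighbours."""
    neighbors = {}
    for u, v in edges:  # assumes no self-loops
        neighbors.setdefault(u, []).append(v)
        neighbors.setdefault(v, []).append(u)
    sizes = [0] * k
    for p in part.values():
        sizes[p] += 1
    improved = True
    while improved:  # every accepted move lowers the cut, so this terminates
        improved = False
        for node in part:
            counts = [0] * k
            for nb in neighbors.get(node, []):
                counts[part[nb]] += 1
            cur = part[node]
            best = max(range(k), key=lambda p: counts[p])
            if counts[best] > counts[cur] and sizes[best] < max_size and sizes[cur] > 1:
                sizes[cur] -= 1
                sizes[best] += 1
                part[node] = best
                improved = True
    return part

def partition(nodes, edges, k, max_size):
    groups = recursive_bisect(nodes, k)
    part = {n: i for i, group in enumerate(groups) for n in group}
    before = cut_size(edges, part)
    greedy_refine(edges, part, k, max_size)
    return part, before, cut_size(edges, part)

# Hypothetical example: two triangles joined by one edge, listed in a poor order.
nodes = [0, 3, 1, 4, 2, 5]
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
part, before, after = partition(nodes, edges, k=2, max_size=4)
print(before, after, part)
```

In this example the positional split cuts five edges, and the greedy step recovers the two triangles so that only the joining edge is cut. A move is accepted only when it strictly reduces the cut and respects the size bound, so the refinement cannot cycle.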
Pub Date: 2002-09-04; DOI: 10.1109/DSD.2002.1115368
A. Schneider, K. Diener, E. Ivask, R. Ubar, E. Gramatová, T. Hollstein, W. Kuzmicz, Zebo Peng
This paper describes an environment for internet-based collaboration in the field of design and test of digital systems. Automatic Test Pattern Generation (ATPG) and fault simulation tools at the behavioral, logical and hierarchical levels are presented; the tools are located at geographically different places and run in a virtual environment provided by the MOSCITO system. Interfaces between the integrated tools, and to commercial design tools, were developed. The tools can be used separately or in multiple applications in different design and test flows. The functionality of the integrated design and test system was verified in several collaborative experiments over the internet by partners located at geographically different sites.
Title: Integrated design and test generation under internet based environment MOSCITO