Pub Date: 2003-07-22, DOI: 10.1109/IWSOC.2003.1213012
Kuo-Hsing Cheng, Y. Lo, W. Yu, Shu-Yin Hung
This paper describes a mixed-mode delay-locked loop (DLL) for wide-range operation with multiphase outputs within just one clock cycle. The proposed architecture uses a mixed-mode time-to-digital converter (TDC) scheme as a phase range selector to shorten the locking time. A multi-controlled delay cell in the voltage-controlled delay line (VCDL) provides a wide locking range and low jitter. The proposed DLL avoids the false-locking problem associated with conventional DLLs. The circuit design and HSPICE simulation are based on the TSMC 0.25 μm 1P5M N-well CMOS process with a 2.5 V supply voltage. Post-layout simulation results show that the proposed DLL has a wide locking range of 50 to 280 MHz. Moreover, the total delay of all delay stages is precisely one period of the input reference signal, so the DLL can generate ten equally spaced clock phases.
Title: A mixed-mode delay-locked loop for wide-range operation and multiphase clock generation. Published in: The 3rd IEEE International Workshop on System-on-Chip for Real-Time Applications, 2003. Proceedings.
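The ten-phase claim in the abstract above reduces to simple arithmetic: if the whole delay line spans exactly one reference period, each tap sits one tenth of a period after the previous one. The sketch below only illustrates that relation; the function name `phase_delays_ns` is hypothetical and nothing here models the circuit itself.

```python
def phase_delays_ns(f_ref_mhz, n_phases=10):
    """Delay in ns of each tap when the delay line spans one reference period."""
    period_ns = 1e3 / f_ref_mhz        # one reference period in nanoseconds
    step_ns = period_ns / n_phases     # equal spacing between adjacent taps
    return [k * step_ns for k in range(1, n_phases + 1)]


# The two ends of the reported 50 to 280 MHz locking range.
for f_mhz in (50.0, 280.0):
    taps = phase_delays_ns(f_mhz)
    print(f"{f_mhz:.0f} MHz: step = {taps[0]:.3f} ns, total = {taps[-1]:.3f} ns")
```

At 50 MHz the spacing is 2 ns per phase; at 280 MHz it shrinks to about 0.357 ns, which gives a feel for the delay range each cell has to cover.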
Pub Date: 2003-07-22, DOI: 10.1109/IWSOC.2003.1213005
T. W. Fox, A. Carreira, L. Turner
This paper presents a method for generating low-power finite-duration impulse response (FIR) lowpass differentiator intellectual property (IP) blocks. The design is formulated as a discrete constrained optimization problem in which the total squared frequency response approximation error is minimized subject to constraints on the power consumption and on the frequency response approximation error. It is demonstrated that the power consumption can be reduced while still satisfying the frequency response specifications.
Title: The design of low-power fixed-point FIR differentiator IP blocks
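To make the optimization in the abstract above concrete, here is a minimal sketch under several assumptions that are not from the paper: the filter is an antisymmetric FIR with integer taps scaled by 2**bits, the target response is D(w) = w in the passband and 0 in the stopband, power is approximated by the count of nonzero coefficient bits (a hypothetical proxy), and the search is a plain one-LSB local search that can stall in a local optimum. It omits the separate constraint on approximation error.

```python
import math


def amplitude(q, bits, w):
    """Zero-phase amplitude of an antisymmetric FIR with taps q[k-1] / 2**bits."""
    return sum(qk / 2**bits * math.sin(k * w) for k, qk in enumerate(q, start=1))


def squared_error(q, bits, w_pass, w_stop, grid=64):
    """Total squared error against D(w) = w (passband) and 0 (stopband)."""
    err = 0.0
    for i in range(grid + 1):
        w = math.pi * i / grid
        if w <= w_pass:
            err += (amplitude(q, bits, w) - w) ** 2
        elif w >= w_stop:
            err += amplitude(q, bits, w) ** 2
    return err


def power_proxy(q):
    """Hypothetical power measure: nonzero bits summed over all coefficients."""
    return sum(bin(abs(qk)).count("1") for qk in q)


def local_search(n_taps, bits, w_pass, w_stop, budget):
    """Greedy one-LSB moves that lower the error while staying within budget."""
    q = [0] * n_taps                       # the all-zero start is always feasible
    best = squared_error(q, bits, w_pass, w_stop)
    improved = True
    while improved:
        improved = False
        for k in range(n_taps):
            for step in (1, -1):
                trial = q.copy()
                trial[k] += step
                if power_proxy(trial) > budget:
                    continue               # violates the power constraint
                e = squared_error(trial, bits, w_pass, w_stop)
                if e < best:
                    q, best, improved = trial, e, True
    return q, best


q, err = local_search(4, 6, 0.4 * math.pi, 0.6 * math.pi, budget=8)
print(q, power_proxy(q), err)
```

Tightening `budget` trades approximation error for fewer nonzero bits, which is the kind of trade-off the abstract describes.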
Pub Date: 2003-07-22, DOI: 10.1109/IWSOC.2003.1213009
N. Patel, G. Coghill, S. Nguang
Real-time execution of an algorithm can be achieved with a fast serial processor or with a parallel machine. Usually, both methods use multi-bit binary words that are processed by an arithmetic unit. This paper demonstrates an alternative approach to the parallel solution. Here, instead of a multi-bit word, a single bit-stream is processed by digital logic to satisfy the required algorithm. Using elementary digital gates, classical elements such as integrators and differentiators can be constructed. These elements operate on bit-streams and also produce bit-streams, so complex systems can be built by interconnecting them. The inherently parallel nature of this technique makes it possible to implement complex algorithms in real time. The technique has been successfully applied to implement a PID controller on an FPGA for an experimental thermal plant.
Title: Digital realization of analogue computing elements using bit streams
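The abstract above does not say how values are encoded as bit-streams, so the following is only a sketch under an assumed encoding: a first-order sigma-delta style accumulator whose density of 1s tracks a value in [0, 1], and an up/down counter standing in for an integrator. Both function names are hypothetical.

```python
def encode(x, n):
    """Assumed encoding: n bits whose density of 1s tracks x in [0, 1]."""
    acc, bits = 0.0, []
    for _ in range(n):
        acc += x
        if acc >= 1.0:          # emit a 1 and remove one unit from the accumulator
            bits.append(1)
            acc -= 1.0
        else:
            bits.append(0)
    return bits


def integrate(bits):
    """Up/down counter: +1 for a 1 bit, -1 for a 0 bit (bipolar reading)."""
    count, trace = 0, []
    for b in bits:
        count += 1 if b else -1
        trace.append(count)
    return trace


stream = encode(0.75, 8)
print(stream)                   # six of the eight bits are 1
print(integrate(stream)[-1])    # running sum of the bipolar stream
```

Read in bipolar form, a density of 0.75 corresponds to the value 0.5, and the counter accumulates that value over eight steps.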
Pub Date: 2003-07-22, DOI: 10.1109/IWSOC.2003.1213067
Shih-Chang Hsia
The MPEG-2 coder has become a standard core for video compression, and the variable length code (VLC) module is a key component of an MPEG-2 system. In this study, a real-time VLC encoder is developed using a discrete logic architecture rather than a memory-based one. To improve chip efficiency, the codeword bank is built from tri-state buffers arranged in codeword order. The three main VLC codeword tables of the MPEG-2 system, for the coded block pattern, motion vectors, and DCT coefficients, are all efficiently realized in this work. The prototype circuit is implemented in the Verilog hardware description language and then fitted into an FPGA chip. The total gate count is reduced by about 30% compared with conventional VLC designs.
Title: Prototyping implementation for low-complexity real-time MPEG-2 variable length encoder
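For readers unfamiliar with variable length coding, the sketch below shows the table lookup such an encoder performs, in software rather than logic. The table `TOY_TABLE` is a hypothetical prefix-free code invented for illustration; it is not one of the MPEG-2 tables mentioned in the abstract above.

```python
# Hypothetical prefix-free table for illustration only; NOT an MPEG-2 table.
TOY_TABLE = {"A": "1", "B": "01", "C": "001", "D": "000"}


def vlc_encode(symbols, table=TOY_TABLE):
    """Concatenate the codeword of each symbol into one bit string."""
    return "".join(table[s] for s in symbols)


def vlc_decode(bits, table=TOY_TABLE):
    """Decode by growing a candidate until it matches a codeword."""
    inverse = {code: sym for sym, code in table.items()}
    out, current = [], ""
    for b in bits:
        current += b
        if current in inverse:      # prefix-free, so the first match is the symbol
            out.append(inverse[current])
            current = ""
    if current:
        raise ValueError("trailing bits do not form a codeword")
    return out


encoded = vlc_encode("ABACD")
print(encoded)
print(vlc_decode(encoded))
```

Frequent symbols get short codewords, so the output is shorter than a fixed two-bit code whenever "A" dominates the input.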
Pub Date: 2003-07-22, DOI: 10.1109/IWSOC.2003.1213052
N. Bambha, S. Bhattacharyya, G. Euliss
This paper addresses some fundamental issues relating to the design of systems on chip that utilize optical interconnects. We present an information theoretical model for assessing trade-offs between global and local partitions in these systems, and evaluate interconnect topology synthesis and application mapping techniques for digital signal processing (DSP) applications in these systems.
Title: Design considerations for optically connected systems on chip
Pub Date: 2003-07-22, DOI: 10.1109/IWSOC.2003.1213020
H. Emam, M. Ashour, H. Fekry, A. Wahdan
Genetic algorithms (GAs) are among the most advanced optimization techniques. The main objective of this paper is to introduce an FPGA-based implementation of a genetic algorithm and to apply it, as an adaptive algorithm, to nonlinear adaptive filters for blind signal separation. In this case, a nonlinear estimator is used to predict the error filter, and the GA optimizes the filter coefficients by searching for a near-optimum solution. The proposed hardware genetic algorithm (HGA) is tested first with different sine wave signals and then with audio signals to judge the separation capability of the design. The implementation results show that the HGA approach significantly enhances system performance, as a step toward real-time operation.
Title: Introducing an FPGA based genetic algorithms in the applications of blind signals separation
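The abstract above uses a GA to search for filter coefficients. As a software-only sketch of that idea, the code below evolves the taps of a small linear FIR filter to match a known target output. It is not the paper's nonlinear estimator or its blind separation setup; the signal, the target taps, the population size, and the operators (elitism, uniform crossover, Gaussian mutation) are all assumptions chosen for illustration.

```python
import math
import random


def fir(coeffs, x):
    """Causal FIR filter output for input sequence x."""
    return [sum(c * x[n - k] for k, c in enumerate(coeffs) if n - k >= 0)
            for n in range(len(x))]


def cost(coeffs, x, desired):
    """Squared error between the filter output and the desired output."""
    return sum((y - d) ** 2 for y, d in zip(fir(coeffs, x), desired))


def evolve(x, desired, n_coeffs=3, pop_size=20, generations=40, seed=0):
    rng = random.Random(seed)              # seeded so runs are repeatable
    pop = [[rng.uniform(-1, 1) for _ in range(n_coeffs)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=lambda c: cost(c, x, desired))
        history.append(cost(pop[0], x, desired))
        elite = pop[: pop_size // 2]       # better half survives unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            child = [rng.choice(pair) for pair in zip(a, b)]    # uniform crossover
            child = [g + rng.gauss(0, 0.05) for g in child]     # Gaussian mutation
            children.append(child)
        pop = elite + children
    pop.sort(key=lambda c: cost(c, x, desired))
    return pop[0], history


# Assumed test signal and target taps, for illustration only.
x = [math.sin(0.7 * n) + 0.5 * math.sin(1.9 * n) for n in range(32)]
desired = fir([0.5, -0.25, 0.125], x)
best, history = evolve(x, desired)
print(best, history[0], history[-1])
```

Because the better half of each generation is carried over unchanged, the best cost never increases from one generation to the next.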
Pub Date: 2003-07-22, DOI: 10.1109/IWSOC.2003.1213002
P. Guitton-Ouhamou, C. Belleudy, M. Auguin
Minimizing power consumption in a system on chip is a crucial task, so power consumption has to be taken into account in HW/SW tools. This paper describes how our HW/SW codesign tool, CODEF, is extended with power estimation and optimization capabilities. Several consumption optimization strategies are presented. First, we show how to build the library of consumption models for hardware and software modules, which take frequency and supply voltage into account. Then, we describe the algorithm that computes the peak power and the energy. To reduce the energy, we describe a strategy applied during the allocation step. To this end, the partitioning algorithm has been modified, and we present architecture optimization results with significant gains of 50%.
Title: Energy optimization in a HW/SW tool: design of low power architecture system
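The abstract above mentions consumption models that depend on frequency and supply voltage, and an algorithm that computes peak power and energy. CODEF's actual models are not given there, so the sketch below assumes the standard CMOS dynamic power approximation P = C_eff · Vdd² · f and a schedule of tasks given as hypothetical (start, end, power) tuples.

```python
def dynamic_power_mw(c_eff_nf, vdd, f_mhz):
    """Standard approximation P = C_eff * Vdd^2 * f; nF * V^2 * MHz yields mW."""
    return c_eff_nf * vdd ** 2 * f_mhz


def energy_nj(tasks):
    """Total energy of (start_us, end_us, power_mw) tasks; mW * us yields nJ."""
    return sum(p * (end - start) for start, end, p in tasks)


def peak_power_mw(tasks):
    """Highest summed power of overlapping tasks, found by sweeping events."""
    events = []
    for start, end, p in tasks:
        events.append((start, p))      # power rises when a task starts
        events.append((end, -p))       # and falls when it ends
    events.sort()                      # at equal times, the drops sort first
    level = peak = 0.0
    for _, delta in events:
        level += delta
        peak = max(peak, level)
    return peak


schedule = [(0, 10, 5.0), (5, 15, 3.0), (15, 20, 4.0)]
print(energy_nj(schedule), peak_power_mw(schedule))
print(dynamic_power_mw(1.0, 2.0, 50.0), dynamic_power_mw(1.0, 1.0, 50.0))
```

The last line shows why supply voltage matters in such models: halving Vdd at a fixed frequency cuts the dynamic power to a quarter.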
Pub Date: 2003-07-22, DOI: 10.1109/IWSOC.2003.1213031
S. Aly, A. Salem
In this paper, an architecture of a Java based co-verification environment is proposed. A clock, memory, and bus have been modeled using the model-view paradigm. Furthermore, mobile device collaboration has been modeled using communicating threads. The proposed model consists of four components: a Java Virtual Machine, a Java based bus functional model (JBFM), a collaboration protocol model and an API interface. A simple image processing operation has been used to demonstrate the applicability of our approach.
Title: Java based co-verification of expedited mobile device collaboration using observability
Pub Date: 2003-07-22, DOI: 10.1109/IWSOC.2003.1213004
L. Weng, Xiaojun Wang, B. Liu
One of the most important considerations in current VLSI/SOC design is power, a topic that can be divided into power analysis and power optimization. This survey introduces the main concepts of power optimization, including the sources of power consumption and the policies. Among the various approaches, dynamic power management (DPM), which changes the states of devices when they are not working at the highest speed or at full capacity, is the most efficient. Explanations accompanying the figures make the abstract concepts of DPM concrete. The paper briefly surveys both heuristic and stochastic policies and discusses their advantages and disadvantages.
Title: A survey of dynamic power optimization techniques
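A fixed timeout is one of the simplest heuristic DPM policies of the kind the survey above covers: the device stays idle for a set time and then drops to a sleep state, paying a wake-up cost when work returns. The sketch below is an illustration with made-up power numbers, not a policy taken from the paper; all parameter names are hypothetical.

```python
def timeout_policy_energy(idle_periods, timeout, p_idle, p_sleep, e_wake):
    """Energy over idle periods when the device sleeps after `timeout` idle time."""
    total = 0.0
    for t in idle_periods:
        if t <= timeout:
            total += p_idle * t                      # never reached the sleep state
        else:
            total += p_idle * timeout                # waiting for the timeout
            total += p_sleep * (t - timeout)         # remainder spent asleep
            total += e_wake                          # cost of waking up again
    return total


def break_even_time(p_idle, p_sleep, e_wake):
    """Shortest idle period for which sleeping immediately saves energy."""
    return e_wake / (p_idle - p_sleep)


idle = [1.0, 4.0, 10.0]                              # seconds, assumed workload
always_on = sum(2.0 * t for t in idle)
with_timeout = timeout_policy_energy(idle, 2.0, p_idle=2.0, p_sleep=0.5, e_wake=3.0)
print(break_even_time(2.0, 0.5, 3.0), always_on, with_timeout)
```

With these numbers the break-even time is 2 s, and the timeout policy spends 21 J over the three idle periods against 30 J for staying on. Stochastic policies aim to do better by predicting idle lengths instead of waiting out a fixed timeout.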
Pub Date: 2003-07-22, DOI: 10.1109/IWSOC.2003.1213028
Shih-Chang Hsia
Various video formats, such as QCIF, CIF, CCIR601 and HDTV, are currently in wide use. Because their resolutions differ, the processing speed required for motion estimation differs as well, and a specific hardware architecture would normally have to be designed for each format. In this study, we propose a flexible motion estimator that meets the processing speed of all formats with a common architecture, in which four search algorithms are built in to satisfy the various processing-time requirements. For low-power systems, the computational kernel of the chip employs four processing elements. With timing mode control, the throughput of the proposed motion estimator ranges from 3k to 180k blocks to suit different applications while the chip runs at 50 MHz. The total gate count is less than 5k and the power dissipation is no more than 0.1 mW in the worst case. Hence this very low-power motion estimator is appropriate for portable systems.
Title: VLSI implementation of very low-power motion estimator for scalable coding systems
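The abstract above does not name its four search algorithms. As a baseline for what any block-matching motion estimator computes, the sketch below performs a full search with the sum of absolute differences (SAD) as the matching cost; this choice of algorithm and cost is an assumption, and the frames are small synthetic arrays.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between an n x n block and its displaced match."""
    return sum(abs(cur[by + j][bx + i] - ref[by + dy + j][bx + dx + i])
               for j in range(n) for i in range(n))


def full_search(cur, ref, bx, by, n, search):
    """Try every displacement within +/- search; return (cost, dx, dy) of the best."""
    h, w = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            inside = (0 <= bx + dx and bx + dx + n <= w and
                      0 <= by + dy and by + dy + n <= h)
            if not inside:
                continue                   # candidate block falls outside the frame
            c = sad(cur, ref, bx, by, dx, dy, n)
            if best is None or c < best[0]:
                best = (c, dx, dy)
    return best


# Synthetic 8 x 8 reference frame with distinct pixel values.
ref = [[10 * y + x for x in range(8)] for y in range(8)]
# Current frame: the reference shifted by 2 pixels in x and 1 pixel in y.
cur = [[ref[(y + 1) % 8][(x + 2) % 8] for x in range(8)] for y in range(8)]
print(full_search(cur, ref, bx=2, by=2, n=2, search=2))
```

Full search evaluates every candidate, so its cost grows with the square of the search range; faster search algorithms trade some accuracy for fewer SAD evaluations, which is one way a single architecture can cover formats from QCIF to HDTV.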