The availability of multiple metal layers in modern IC processes raises the possibility of using non-Manhattan routing on some of the layers in order to reduce the average interconnect length, and thus improve performance and routability. In this paper, we present novel algorithms for both Manhattan and non-Manhattan multi-layer maze routing. The algorithms in principle can be extended to an arbitrary number of layers, but the paper focuses on four-layer routing, two in horizontal and two in vertical directions for Manhattan, and one layer each in horizontal, vertical, 45-degree and 135-degree directions for non-Manhattan routing. The non-Manhattan algorithms show an improvement of up to 12.2% in average wire length compared to Manhattan routing for two general MCNC benchmarks.
{"title":"Non-Manhattan maze routing","authors":"M. Stan, F. Hamzaoglu, David Garrett","doi":"10.1145/1016568.1016637","DOIUrl":"https://doi.org/10.1145/1016568.1016637","url":null,"abstract":"The availability of multiple metal layers in modern IC processes raises the possibility of using non-Manhattan routing on some of the layers in order to reduce the average interconnect length, and thus improve performance and routability. In this paper, we present novel algorithms for both Manhattan and non-Manhattan multi-layer maze routing. The algorithms in principle can be extended to an arbitrary number of layers, but the paper focuses on four-layer routing, two in horizontal and two in vertical directions for Manhattan, and one layer each in horizontal, vertical, 45-degree and 135-degree directions for non-Manhattan routing. The non-Manhattan algorithms show an improvement of up to 12.2% in average wire length compared to Manhattan routing for two general MCNC benchmarks.","PeriodicalId":275811,"journal":{"name":"Proceedings. SBCCI 2004. 17th Symposium on Integrated Circuits and Systems Design (IEEE Cat. No.04TH8784)","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-09-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122338587","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
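The wire-length benefit of diagonal tracks described in the abstract above can be illustrated with a minimal sketch. This is not the paper's multi-layer algorithm: it is a single-layer shortest-path search (Dijkstra on a grid) that ignores layer assignment and vias, and the names `route_length`, `MANHATTAN` and `DIAGONAL` are hypothetical. It only compares the length of one route with and without 45/135-degree moves.

```python
import heapq
import math

# Move sets: Manhattan (4 neighbours) and the extra 45/135-degree diagonals.
MANHATTAN = [(1, 0), (-1, 0), (0, 1), (0, -1)]
DIAGONAL = [(1, 1), (1, -1), (-1, 1), (-1, -1)]


def route_length(grid, src, dst, allow_diagonal=False):
    """Return the shortest wire length from src to dst, or None if unroutable.

    grid[y][x] is True where a cell is blocked; cells are (x, y) tuples.
    """
    moves = MANHATTAN + (DIAGONAL if allow_diagonal else [])
    height, width = len(grid), len(grid[0])
    best = {src: 0.0}
    frontier = [(0.0, src)]
    while frontier:
        dist, (x, y) = heapq.heappop(frontier)
        if (x, y) == dst:
            return dist
        if dist > best[(x, y)]:
            continue  # stale queue entry, a shorter path was already found
        for dx, dy in moves:
            nx, ny = x + dx, y + dy
            if not (0 <= nx < width and 0 <= ny < height) or grid[ny][nx]:
                continue
            # Step cost is 1 for straight moves and sqrt(2) for diagonals.
            new_dist = dist + math.hypot(dx, dy)
            if new_dist < best.get((nx, ny), math.inf):
                best[(nx, ny)] = new_dist
                heapq.heappush(frontier, (new_dist, (nx, ny)))
    return None


empty = [[False] * 4 for _ in range(4)]
manhattan_len = route_length(empty, (0, 0), (3, 3))
octilinear_len = route_length(empty, (0, 0), (3, 3), allow_diagonal=True)
print(manhattan_len, octilinear_len)
```

On an empty grid the corner-to-corner route is the best case for diagonal moves; real nets with obstacles and mixed directions would gain less.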
This paper evaluates how distinct real-time task scheduling algorithms impact power consumption and timing performance of embedded systems. A design space exploration methodology is proposed in order to adjust the system's power consumption by tuning the CPU frequency according to the scheduling algorithm and to the temporal requirements of the embedded application. The goal is to find an optimized configuration, selecting the right combination of a scheduling policy with a CPU frequency, so as to consume less power without missing any deadline in the application. Experiments based on a synthetic workload that simulates realistic applications demonstrate that considerable power savings can be obtained. Moreover, the paper defines guidelines to be used by system designers in order to find a configuration that best matches the design constraints and requirements.
{"title":"Power and performance tuning in the synthesis of real-time scheduling algorithms for embedded applications","authors":"L. Becker, M. A. Wehrmeister, C. Pereira","doi":"10.1145/1016568.1016616","DOIUrl":"https://doi.org/10.1145/1016568.1016616","url":null,"abstract":"This paper evaluates how distinct real-time task scheduling algorithms impact power consumption and timing performance of embedded systems. A design space exploration methodology is proposed in order to adjust the system's power consumption by tuning the CPU frequency according to the scheduling algorithm and to the temporal requirements of the embedded application. The goal is to find an optimized configuration, selecting the right combination of a scheduling policy with a CPU frequency, so as to consume less power without missing any deadline in the application. Experiments based on a synthetic workload that simulates realistic applications demonstrate that considerable power savings can be obtained. Moreover, the paper defines guidelines to be used by system designers in order to find a configuration that best matches the design constraints and requirements.","PeriodicalId":275811,"journal":{"name":"Proceedings. SBCCI 2004. 17th Symposium on Integrated Circuits and Systems Design (IEEE Cat. No.04TH8784)","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-09-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127046849","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
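The idea of pairing a scheduling policy with the lowest CPU frequency that still meets every deadline can be sketched analytically. The abstract above describes an experimental exploration, so the following is only a stand-in under stated assumptions: periodic tasks with deadlines equal to periods, worst-case execution times that scale inversely with frequency, the exact utilization test for EDF, and the sufficient Liu-Layland bound for rate-monotonic (RM) scheduling. All function names and the task set are hypothetical.

```python
def utilization(tasks, freq, f_max):
    """Total CPU utilization at `freq`, with WCETs measured at `f_max`."""
    scale = f_max / freq  # assumed: execution time grows as frequency drops
    return sum(wcet * scale / period for wcet, period in tasks)


def schedulable(tasks, freq, f_max, policy):
    """Utilization-based schedulability test for the given policy."""
    u = utilization(tasks, freq, f_max)
    if policy == "EDF":
        return u <= 1.0  # exact for implicit-deadline periodic tasks
    if policy == "RM":
        n = len(tasks)
        return u <= n * (2 ** (1 / n) - 1)  # sufficient, not necessary
    raise ValueError(f"unknown policy: {policy}")


def lowest_feasible_frequency(tasks, frequencies, policy):
    """Return the lowest available frequency that passes the test, or None."""
    f_max = max(frequencies)
    for freq in sorted(frequencies):
        if schedulable(tasks, freq, f_max, policy):
            return freq
    return None


tasks = [(1.0, 4.0), (2.0, 8.0)]  # (WCET at f_max in ms, period in ms)
frequencies = [100, 150, 200]     # available CPU frequencies in MHz
edf_freq = lowest_feasible_frequency(tasks, frequencies, "EDF")
rm_freq = lowest_feasible_frequency(tasks, frequencies, "RM")
print(edf_freq, rm_freq)
```

For this task set the EDF test admits a lower frequency than the RM bound, which shows how the choice of policy can change the achievable power setting.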
R. Barreto, Marília Neves, Meuse N. Oliveira, P. Maciel, E. Tavares, R. Lima
Software synthesis is defined as the task of translating a specification into a software program, in a general purpose language, in such a way that this software can be compiled by conventional compilers. In general, complex real-time systems rely on specialized operating system kernels. However, operating system usage may introduce significant overheads in both execution time and memory requirements. In order to eliminate such overheads, automatic software synthesis methods should be implemented. Such methods comprise real-time operating system services (scheduling, resource management, communication, synchronization) and code generation. Formal methods are a very promising alternative for dealing with the complexity of embedded systems and for improving the degree of confidence in critical systems. We present a formal approach for automatic embedded hard real-time software synthesis based on time Petri nets. In order to illustrate the practical usability of the proposed method, it is shown how to synthesize a C code implementation using a heated-humidifier case study.
{"title":"A formal software synthesis approach for embedded hard real-time systems","authors":"R. Barreto, Marília Neves, Meuse N. Oliveira, P. Maciel, E. Tavares, R. Lima","doi":"10.1145/1016568.1016615","DOIUrl":"https://doi.org/10.1145/1016568.1016615","url":null,"abstract":"Software synthesis is defined as the task of translating a specification into a software program, in a general purpose language, in such a way that this software can be compiled by conventional compilers. In general, complex real-time systems rely on specialized operating system kernels. However, the operating system usage may introduce significant overheads as in execution time as in memory requirement. In order to eliminate such overheads, automatic software synthesis methods should be implemented. Such methods comprise real-time operating system services (scheduling, resource management, communication, synchronization), and code generation. Formal methods are a very promising alternative to deal with the complexity of embedded systems, and for improving the degree of confidence in critical systems. We present a formal approach for automatic embedded hard real-time software synthesis based on time Petri nets. In order to illustrate the practical usability of the proposed method, it is shown how to synthesize a C code implementation using a heated-humidifier case study.","PeriodicalId":275811,"journal":{"name":"Proceedings. SBCCI 2004. 17th Symposium on Integrated Circuits and Systems Design (IEEE Cat. No.04TH8784)","volume":"60 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-09-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114183437","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
C. E. Savioli, Claudio C. Czendrodi, J. Calvano, A. C. M. Filho
This paper proposes a method for automated test pattern generation for fault diagnosis on continuous-time analog electrical networks based on evolutionary techniques. The paper describes a method for coding a generic algorithm, based on a given heuristic, that is able to generate a set of optimum frequencies capable of disclosing parametric faults. The method itself is generic and does not rely on specific or ad hoc features.
{"title":"ATPG for fault diagnosis on analog electrical networks using evolutionary techniques","authors":"C. E. Savioli, Claudio C. Czendrodi, J. Calvano, A. C. M. Filho","doi":"10.1145/1016568.1016600","DOIUrl":"https://doi.org/10.1145/1016568.1016600","url":null,"abstract":"This paper proposes a method for automated test pattern generation for fault diagnosis on continuous-time analog electrical networks based on evolutionary techniques. The paper states a method for coding a generic algorithm, based on a given heuristic, that are able to generate a set of optimum frequencies capable of disclosing parametric faults. The method itself is generic, and not based on specific or ad hoc features at all.","PeriodicalId":275811,"journal":{"name":"Proceedings. SBCCI 2004. 17th Symposium on Integrated Circuits and Systems Design (IEEE Cat. No.04TH8784)","volume":"59 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-09-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121371756","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This paper analyzes the performance of the conventional CMOS inverter, NAND-2 and NOR-2 static logic gates operating in the subthreshold region. The dependence of the drain currents on the process parameters can give rise to drive currents of NMOS and PMOS transistors that differ by an order of magnitude or even more. To compensate for this difference in currents, we propose three bias circuits in single-well processes that adjust the body voltage. Computer simulations using the AMS 0.8 μm technology and the BSIM3v3 model were carried out to assess the compensation technique. A test chip was fabricated in both the AMIS 1.5 μm and TSMC 0.35 μm technologies to further validate the proposal.
{"title":"Body-bias compensation technique for subthreshold CMOS static logic gates","authors":"L. A. P. Melek, M. C. Schneider, C. Galup-Montoro","doi":"10.1145/1016568.1016639","DOIUrl":"https://doi.org/10.1145/1016568.1016639","url":null,"abstract":"This paper analyzes the performance of the conventional CMOS inverter, NAND-2 and NOR-2 static logic gates operating in the subthreshold region. The dependence of the drain currents on the process parameters can give rise to drive currents of NMOS and PMOS transistors that differ by an order of magnitude or even more. To compensate for this difference in currents, we propose three bias circuits in single-well processes that adjust the body voltage. Computer simulations using the AMS 0.8 /spl mu/m technology and the BSIM3v3 model were carried out to assess the compensation technique. A test chip was fabricated in both AMIS 1.5 /spl mu/m and TSMC0.35 /spl mu/m to further validate the proposal.","PeriodicalId":275811,"journal":{"name":"Proceedings. SBCCI 2004. 17th Symposium on Integrated Circuits and Systems Design (IEEE Cat. No.04TH8784)","volume":"77 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-09-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123253619","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Accurate modeling of coupling effects via the substrate is an increasingly important concern in the design of mixed-signal systems such as communication, biomedical and analog signal processing circuits. Fast-switching digital blocks inject noise into the common substrate, hindering the performance of high-precision, sensitive analog circuitry. The effects of miniaturization on IC complexity inevitably raise the accuracy requirements for substrate coupling simulation. Due in part to the global nature of such couplings, model extraction and analysis is a computation-intensive task that requires fast and accurate substrate model extraction and analysis tools. One way to deal with this problem is to take further advantage of available computational technologies, and distributed computing emerges as an interesting solution. In this paper we discuss several issues related to the parallelization of a multigrid-based substrate model extraction and analysis tool. This tool is used as a proxy for generic computations on a 3D discretized volume. The results presented indicate potential avenues for successfully exploiting parallelism as well as pitfalls to avoid in such a quest.
{"title":"Issues in parallelizing multigrid-based substrate model extraction and analysis","authors":"João M. S. Silva, L. M. Silveira","doi":"10.1145/1016568.1016605","DOIUrl":"https://doi.org/10.1145/1016568.1016605","url":null,"abstract":"Accurate modeling of coupling effects via the substrate is an increasingly important concern in the design of mixed-signal systems such as communication, biomedical and analog signal processing circuits. Fast-switching digital blocks inject noise into the common substrate hindering the performance of high-precision sensible analog circuitry. Miniaturization effects on ICs complexity inevitably make the accuracy requirements for substrate coupling simulation increase. Due in part to the global nature of such couplings, model extraction and analysis is a computation-intensive task requiring the availability of fast and accurate substrate model extraction and analysis tools. One way to deal with this problem is to take further advantage of available computational technologies and distributed computing emerges as an interesting solution. In this paper we discuss several issues related to the parallelization of a multigrid-based substrate model extraction and analysis tool. This tool is used as a proxy for generic computations on a 3D discretized volume. The results presented indicate potential avenues for successfully exploiting parallelism as well as pitfalls to avoid in such a quest.","PeriodicalId":275811,"journal":{"name":"Proceedings. SBCCI 2004. 17th Symposium on Integrated Circuits and Systems Design (IEEE Cat. No.04TH8784)","volume":"330 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-09-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134332565","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Mário C. B. Osorio, Carlos A. Sampaio, A. Reis, R. Ribas
This paper presents an enhanced 32-bit carry look-ahead (CLA) adder implemented using the multi-output enable/disable CMOS differential logic (MOECDL) style. The proposed MOECDL structure represents a promising technique for iterative networks and self-timed circuits. The recursive property of the CLA algorithm has been efficiently exploited to demonstrate the advantages of multiple-output structures. The 32-bit MOECDL CLA circuit has been designed in a standard 0.5 μm CMOS technology. A comparison with the known DCVS style is presented through electrical simulation.
{"title":"Enhanced 32-bit carry look-ahead adder using multiple output enable-disable CMOS differential logic","authors":"Mário C. B. Osorio, Carlos A. Sampaio, A. Reis, R. Ribas","doi":"10.1145/1016568.1016619","DOIUrl":"https://doi.org/10.1145/1016568.1016619","url":null,"abstract":"This paper presents an enhanced 32-bit carry look-ahead (CLA) adder implemented using the multi-output enable/disable CMOS differential logic (MOECDL) style. The MOECDL structure proposed represents a promising technique for iterative networks and self-timed circuits. The recursive property of CLA algorithm has been efficiently exploited to demonstrate the advantages of multiple-output structures. The 32-bit MOECDL CLA circuit has been designed into a standard 0.5 /spl mu/m CMOS technology. Comparison to the known DCVS style is presented through electrical simulation.","PeriodicalId":275811,"journal":{"name":"Proceedings. SBCCI 2004. 17th Symposium on Integrated Circuits and Systems Design (IEEE Cat. No.04TH8784)","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-09-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133021969","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
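The recursive property of the CLA algorithm mentioned in the abstract above is the standard carry recurrence c_{i+1} = g_i OR (p_i AND c_i), with g_i = a_i AND b_i and p_i = a_i XOR b_i. The sketch below is a bit-level software model of that recurrence only; it says nothing about the MOECDL circuit style, and the function name `cla_add` is hypothetical. Hardware evaluates the expanded recurrence in parallel, whereas this model walks the bits in a loop.

```python
def cla_add(a, b, carry_in=0, width=32):
    """Add two `width`-bit integers using generate/propagate carry logic.

    Returns (sum modulo 2**width, carry_out).
    """
    mask = (1 << width) - 1
    a &= mask
    b &= mask
    generate = a & b    # g_i = a_i AND b_i
    propagate = a ^ b   # p_i = a_i XOR b_i
    carry = carry_in & 1
    carries = carry     # bit i of `carries` holds c_i
    for i in range(width):
        g = (generate >> i) & 1
        p = (propagate >> i) & 1
        carry = g | (p & carry)  # c_{i+1} = g_i OR (p_i AND c_i)
        carries |= carry << (i + 1)
    total = (propagate ^ carries) & mask  # s_i = p_i XOR c_i
    carry_out = (carries >> width) & 1
    return total, carry_out


print(cla_add(0xFFFFFFFF, 1))
print(cla_add(123456789, 987654321))
```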
Michel Leong, Pedro Vasconcelos, J. Fernandes, L. Sousa
In this paper, we propose and develop a fully programmable cellular neural network (CNN) circuit. The CNN coefficients are digitally programmable using a digital-to-analog converter (DAC), resulting in added flexibility. CNNs with 4×4 and 16×16 cells are designed and tested, exhibiting good accuracy when compared with Matlab and Java applications for computing CNNs. All circuits are designed and implemented in a 0.35 μm CMOS technology. The layout of a full 4×4 CNN was designed using Cadence Design Framework II. The circuits are simulated with PSpice/Spectre.
{"title":"A programmable cellular neural network circuit","authors":"Michel Leong, Pedro Vasconcelos, J. Fernandes, L. Sousa","doi":"10.1145/1016568.1016620","DOIUrl":"https://doi.org/10.1145/1016568.1016620","url":null,"abstract":"In this paper, we propose and develop a fully programmable CNN circuit. The CNN coefficients are digitally programmable using a digital to analog converter (DAC), resulting in added flexibility. CNNs with 4/spl times/4 and 16/spl times/16 cells are designed and tested, exhibiting good accuracy when compared with Matlab and Java applications for computing CNNs. All circuits are designed and implemented with a 0.35 /spl mu/m CMOS technology. The layout of a full 4/spl times/4 CNN was designed using cadence design framework II. The circuits are simulated with Pspice/Spectre.","PeriodicalId":275811,"journal":{"name":"Proceedings. SBCCI 2004. 17th Symposium on Integrated Circuits and Systems Design (IEEE Cat. No.04TH8784)","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-09-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125154655","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This paper analyzes the utilization of a network on chip (NoC) as the communication sub-system of a reconfigurable/parallel architecture. A router was designed and implemented in SystemC to analyze the NoC. With these routers, the NoCX4 was created and simulated using coarse-grained reconfigurable microprocessors as processing nodes. Two approaches were used to perform the simulation. The first uses a load generator program with communication loads between 5% and 25%. The second is the calculation of 2D-DCT coefficients.
{"title":"When reconfigurable architecture meets network-on-chip","authors":"R. Soares, Ivan Saraiva Silva, A. Azevedo","doi":"10.1145/1016568.1016626","DOIUrl":"https://doi.org/10.1145/1016568.1016626","url":null,"abstract":"This paper analyzes the utilization of a network on chip (NoC) as the communication sub-system of a reconfigurable/parallel architecture. A router was designed and implemented in SystemC to analyze the NoC. With this routers the NoCX4 was created and simulated using coarse-grained reconfigurable microprocessor as processing nodes. To perform the simulation two approaches were used. The first one uses a load generator program and communication loads between 5% and 25%. The second is the calculation of 2D-DCT coefficients.","PeriodicalId":275811,"journal":{"name":"Proceedings. SBCCI 2004. 17th Symposium on Integrated Circuits and Systems Design (IEEE Cat. No.04TH8784)","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-09-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121298078","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A. Ahmadinia, C. Bobda, Dirk Koch, Mateusz Majer, J. Teich
We consider the problem of executing a dynamically changing set of tasks on a reconfigurable system made up of a processor and a reconfigurable device. Task execution on such a platform is managed by a scheduler that can allocate tasks either to the processor or to the reconfigurable device. The scheduler can be seen as part of an operating system running in software or as a core in the reconfigurable device. For each task to be executed on the reconfigurable device, an equivalent implementation exists as a rectangular block in a database. This block has to be placed on the device at run-time. A placer is responsible for placing the tasks received from the scheduler on the reconfigurable device. However, a task cannot be placed if the device does not have enough free space to hold it. In this case, the scheduler receives an acknowledgment from the placer and decides either to preempt a running task or to run the task in software. We present in this work an implementation of a placer module as well as investigations on task preemption. The two modules are part of an operating system for reconfigurable systems currently under development.
{"title":"Task scheduling for heterogeneous reconfigurable computers","authors":"A. Ahmadinia, C. Bobda, Dirk Koch, Mateusz Majer, J. Teich","doi":"10.1145/1016568.1016582","DOIUrl":"https://doi.org/10.1145/1016568.1016582","url":null,"abstract":"We consider the problem of executing a dynamically changing set of tasks on a reconfigurable system, made upon a processor and a reconfigurable device. Task execution on such a platform is managed by a scheduler that can allocate tasks either to the processor or to the reconfigurable device. The scheduler can be seen as part of an operating system running on the software or as core in the reconfigurable device. For each tasks to be executed on reconfigurable device, an equivalent implementation exists as rectangular block in a database. This block has to be placed on the device at run-time. A placer is responsible for the placement of tasks received from the scheduler on the reconfigurable device. However, the placement of tasks on the reconfigurable device cannot be successful if enough space is not available on the device to hold the task. In this case, the scheduler receive an acknowledgment from the placer and decide either to preempt a running task or to run the task on software. We present in this work an implementation of a placer module as well as investigations on task preemption. The two modules are part of an operating system for reconfigurable system currently under development.","PeriodicalId":275811,"journal":{"name":"Proceedings. SBCCI 2004. 17th Symposium on Integrated Circuits and Systems Design (IEEE Cat. No.04TH8784)","volume":"225 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-09-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124078276","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
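The scheduler and placer interaction described in the abstract above can be sketched as follows. This is an illustration only: the abstract does not state which placement strategy the authors use, so a simple first-fit scan over a cell grid is assumed here, preemption is left out, and the names `place_first_fit` and `dispatch` are hypothetical.

```python
def place_first_fit(occupied, task_w, task_h):
    """Place a task_w x task_h block at the first free spot.

    `occupied` is a grid of booleans (rows of cells) that is updated in place.
    Returns the (x, y) of the block's top-left cell, or None if nothing fits.
    """
    rows, cols = len(occupied), len(occupied[0])
    for y in range(rows - task_h + 1):
        for x in range(cols - task_w + 1):
            cells = [(x + dx, y + dy)
                     for dy in range(task_h) for dx in range(task_w)]
            if all(not occupied[cy][cx] for cx, cy in cells):
                for cx, cy in cells:
                    occupied[cy][cx] = True  # mark the block as in use
                return (x, y)
    return None


def dispatch(occupied, task_w, task_h):
    """Scheduler decision: run in hardware if the placer succeeds, else software."""
    spot = place_first_fit(occupied, task_w, task_h)
    if spot is None:
        return ("software", None)  # a fuller model could preempt a task instead
    return ("hardware", spot)


device = [[False] * 4 for _ in range(4)]  # assumed 4 x 4 reconfigurable area
first = dispatch(device, 3, 3)
second = dispatch(device, 2, 2)
third = dispatch(device, 1, 4)
print(first, second, third)
```

In this run the second task finds no 2 x 2 free region after the first placement and falls back to software, while the narrow third task still fits in the remaining column.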