Ewerson Carvalho, Ney Laert Vilar Calazans, E. Brião, F. Moraes
Dynamically and partially reconfigurable systems (DRS) are those where any portion of the hardware behavior can be altered at application execution time. These systems have the potential to provide hardware with flexibility similar to that of software, while leading to better performance and smaller system size. However, the widespread acceptance of DRSs depends on adequate support to design and implement them. This work proposes a framework for DRS design and implementation named PADReH. The approach is compared to other propositions available in the literature. The first steps of the framework implementation are described, involving methods and tools to control the hardware reconfiguration process and the generation of partial bitstreams. The main contribution of the work is to provide means to systematically reduce the lack of support currently hampering the adoption of DRSs as a mainstream technology.
{"title":"PADReH - a framework for the design and implementation of dynamically and partially reconfigurable systems","authors":"Ewerson Carvalho, Ney Laert Vilar Calazans, E. Brião, F. Moraes","doi":"10.1145/1016568.1016580","DOIUrl":"https://doi.org/10.1145/1016568.1016580","url":null,"abstract":"Dynamically and partially reconfigurable systems (DRS) are those where any portion of the hardware behavior can be altered at application execution time. These systems have the potential to provide hardware with flexibility similar to that of software, while leading to better performance and smaller system size. However, the widespread acceptance of DRSs depends on adequate support to design and implement them. This work proposes a framework for DRS design and implementation named PADReH. The approach is compared to other propositions available in the literature. The first steps of the framework implementation are described, involving methods and tools to control the hardware reconfiguration process and the generation of partial bitstreams. The main contribution of the work is to provide means to systematically reduce the lack of support currently hampering the adoption of DRSs as a mainstream technology.","PeriodicalId":275811,"journal":{"name":"Proceedings. SBCCI 2004. 17th Symposium on Integrated Circuits and Systems Design (IEEE Cat. No.04TH8784)","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-09-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121760969","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Increasing power consumption and growing design effort are considered limiting factors in chip-wide synchronous system-on-chip designs. Attempts to overcome these problems have led to a closer examination of asynchronous communication solutions, sometimes based on networks-on-chip. Despite this basically asynchronous approach, most current research does not support a genuinely globally asynchronous solution. We present a modular switch for a truly globally asynchronous interconnect network. An independent clock generator in each switch maintains a local clock, thus avoiding a global clock at the level of the interconnect network. The general switch architecture is described, and the integration of the synchronization technique used to resolve metastability is discussed in detail. First synthesis results of a prototypical VLSI implementation are presented.
{"title":"A switch architecture and signal synchronization for GALS system-on-chips","authors":"P. Zipf, H. Hinkelmann, Adeela Ashraf, M. Glesner","doi":"10.1145/1016568.1016625","DOIUrl":"https://doi.org/10.1145/1016568.1016625","url":null,"abstract":"Increasing power consumption and growing design effort are considered limiting factors in the design of chip-wide synchronous system-on-chip designs. The attempt to get over these problems lead to an intensified look at asynchronous communication solutions, sometimes based on network-on-chips. Despite this basically asynchronous approach, most of the actual research work is not supporting a globally genuinely-asynchronous solution. We present a modular switch for a true globally asynchronous interconnect network. Independent clock generators in each switch maintain a local clock thus avoiding a global clock at the level of the interconnect network. The general switch architecture is described and the integration of the synchronization technique used to resolve metastability is discussed in detail. First synthesis results of a prototypical VLSI implementation are presented.","PeriodicalId":275811,"journal":{"name":"Proceedings. SBCCI 2004. 17th Symposium on Integrated Circuits and Systems Design (IEEE Cat. No.04TH8784)","volume":"60 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-09-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114761697","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This paper analyzes the performance of the conventional CMOS inverter, NAND-2 and NOR-2 static logic gates operating in the subthreshold region. The dependence of the drain currents on the process parameters can give rise to drive currents of NMOS and PMOS transistors that differ by an order of magnitude or even more. To compensate for this difference in currents, we propose three bias circuits in single-well processes that adjust the body voltage. Computer simulations using the AMS 0.8 μm technology and the BSIM3v3 model were carried out to assess the compensation technique. A test chip was fabricated in both AMIS 1.5 μm and TSMC 0.35 μm technologies to further validate the proposal.
{"title":"Body-bias compensation technique for subthreshold CMOS static logic gates","authors":"L. A. P. Melek, M. C. Schneider, C. Galup-Montoro","doi":"10.1145/1016568.1016639","DOIUrl":"https://doi.org/10.1145/1016568.1016639","url":null,"abstract":"This paper analyzes the performance of the conventional CMOS inverter, NAND-2 and NOR-2 static logic gates operating in the subthreshold region. The dependence of the drain currents on the process parameters can give rise to drive currents of NMOS and PMOS transistors that differ by an order of magnitude or even more. To compensate for this difference in currents, we propose three bias circuits in single-well processes that adjust the body voltage. Computer simulations using the AMS 0.8 /spl mu/m technology and the BSIM3v3 model were carried out to assess the compensation technique. A test chip was fabricated in both AMIS 1.5 /spl mu/m and TSMC0.35 /spl mu/m to further validate the proposal.","PeriodicalId":275811,"journal":{"name":"Proceedings. SBCCI 2004. 17th Symposium on Integrated Circuits and Systems Design (IEEE Cat. No.04TH8784)","volume":"77 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-09-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123253619","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
R. Barreto, Marília Neves, Meuse N. Oliveira, P. Maciel, E. Tavares, R. Lima
Software synthesis is defined as the task of translating a specification into a software program, in a general-purpose language, in such a way that this software can be compiled by conventional compilers. In general, complex real-time systems rely on specialized operating system kernels. However, using an operating system may introduce significant overheads, both in execution time and in memory requirements. In order to eliminate such overheads, automatic software synthesis methods should be implemented. Such methods comprise real-time operating system services (scheduling, resource management, communication, synchronization) and code generation. Formal methods are a very promising alternative for dealing with the complexity of embedded systems and for improving the degree of confidence in critical systems. We present a formal approach for automatic embedded hard real-time software synthesis based on time Petri nets. To illustrate the practical usability of the proposed method, we show how to synthesize a C code implementation using a heated-humidifier case study.
{"title":"A formal software synthesis approach for embedded hard real-time systems","authors":"R. Barreto, Marília Neves, Meuse N. Oliveira, P. Maciel, E. Tavares, R. Lima","doi":"10.1145/1016568.1016615","DOIUrl":"https://doi.org/10.1145/1016568.1016615","url":null,"abstract":"Software synthesis is defined as the task of translating a specification into a software program, in a general purpose language, in such a way that this software can be compiled by conventional compilers. In general, complex real-time systems rely on specialized operating system kernels. However, the operating system usage may introduce significant overheads as in execution time as in memory requirement. In order to eliminate such overheads, automatic software synthesis methods should be implemented. Such methods comprise real-time operating system services (scheduling, resource management, communication, synchronization), and code generation. Formal methods are a very promising alternative to deal with the complexity of embedded systems, and for improving the degree of confidence in critical systems. We present a formal approach for automatic embedded hard real-time software synthesis based on time Petri nets. In order to illustrate the practical usability of the proposed method, it is shown how to synthesize a C code implementation using a heated-humidifier case study.","PeriodicalId":275811,"journal":{"name":"Proceedings. SBCCI 2004. 17th Symposium on Integrated Circuits and Systems Design (IEEE Cat. No.04TH8784)","volume":"60 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-09-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114183437","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
C. E. Savioli, Claudio C. Czendrodi, J. Calvano, A. C. M. Filho
This paper proposes a method for automated test pattern generation (ATPG) for fault diagnosis in continuous-time analog electrical networks, based on evolutionary techniques. It describes how to code a generic algorithm, based on a given heuristic, that is able to generate a set of optimum frequencies capable of disclosing parametric faults. The method itself is generic and does not rely on specific or ad hoc features.
{"title":"ATPG for fault diagnosis on analog electrical networks using evolutionary techniques","authors":"C. E. Savioli, Claudio C. Czendrodi, J. Calvano, A. C. M. Filho","doi":"10.1145/1016568.1016600","DOIUrl":"https://doi.org/10.1145/1016568.1016600","url":null,"abstract":"This paper proposes a method for automated test pattern generation for fault diagnosis on continuous-time analog electrical networks based on evolutionary techniques. The paper states a method for coding a generic algorithm, based on a given heuristic, that are able to generate a set of optimum frequencies capable of disclosing parametric faults. The method itself is generic, and not based on specific or ad hoc features at all.","PeriodicalId":275811,"journal":{"name":"Proceedings. SBCCI 2004. 17th Symposium on Integrated Circuits and Systems Design (IEEE Cat. No.04TH8784)","volume":"59 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-09-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121371756","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Mário C. B. Osorio, Carlos A. Sampaio, A. Reis, R. Ribas
This paper presents an enhanced 32-bit carry look-ahead (CLA) adder implemented using the multi-output enable/disable CMOS differential logic (MOECDL) style. The proposed MOECDL structure represents a promising technique for iterative networks and self-timed circuits. The recursive property of the CLA algorithm has been efficiently exploited to demonstrate the advantages of multiple-output structures. The 32-bit MOECDL CLA circuit has been designed in a standard 0.5 μm CMOS technology. A comparison with the known DCVS style is presented through electrical simulation.
{"title":"Enhanced 32-bit carry look-ahead adder using multiple output enable-disable CMOS differential logic","authors":"Mário C. B. Osorio, Carlos A. Sampaio, A. Reis, R. Ribas","doi":"10.1145/1016568.1016619","DOIUrl":"https://doi.org/10.1145/1016568.1016619","url":null,"abstract":"This paper presents an enhanced 32-bit carry look-ahead (CLA) adder implemented using the multi-output enable/disable CMOS differential logic (MOECDL) style. The MOECDL structure proposed represents a promising technique for iterative networks and self-timed circuits. The recursive property of CLA algorithm has been efficiently exploited to demonstrate the advantages of multiple-output structures. The 32-bit MOECDL CLA circuit has been designed into a standard 0.5 /spl mu/m CMOS technology. Comparison to the known DCVS style is presented through electrical simulation.","PeriodicalId":275811,"journal":{"name":"Proceedings. SBCCI 2004. 17th Symposium on Integrated Circuits and Systems Design (IEEE Cat. No.04TH8784)","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-09-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133021969","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
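The recursive property of carry look-ahead mentioned in the abstract can be illustrated in software. The following is a minimal sketch, not the MOECDL circuit itself: it models only the arithmetic, using the standard generate/propagate formulation. The function names (`combine`, `block_gp`, `cla_add`) are hypothetical and chosen for illustration.

```python
def combine(hi, lo):
    """Merge the (G, P) pair of a more significant block with a less significant one."""
    g_hi, p_hi = hi
    g_lo, p_lo = lo
    # The merged block generates a carry if the upper block generates one,
    # or if it propagates a carry generated by the lower block.
    return (g_hi | (p_hi & g_lo), p_hi & p_lo)


def block_gp(g, p, lo, hi):
    """Group generate/propagate for bit positions lo..hi-1, computed recursively."""
    if hi - lo == 1:
        return (g[lo], p[lo])
    mid = (lo + hi) // 2
    return combine(block_gp(g, p, mid, hi), block_gp(g, p, lo, mid))


def cla_add(a, b, carry_in=0, width=32):
    """Return (sum, carry_out) of two unsigned `width`-bit integers."""
    mask = (1 << width) - 1
    a &= mask
    b &= mask
    carry_in &= 1
    # Per-bit generate (a AND b) and propagate (a XOR b) signals.
    g = [(a >> i) & (b >> i) & 1 for i in range(width)]
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(width)]
    total = 0
    for i in range(width):
        if i == 0:
            carry = carry_in
        else:
            # Carry into bit i depends only on the group signals of bits 0..i-1.
            grp_g, grp_p = block_gp(g, p, 0, i)
            carry = grp_g | (grp_p & carry_in)
        total |= (p[i] ^ carry) << i
    grp_g, grp_p = block_gp(g, p, 0, width)
    return total, grp_g | (grp_p & carry_in)


if __name__ == "__main__":
    print(cla_add(0xFFFFFFFF, 1))
```

Each carry is derived from group signals rather than from the previous carry, which is the property that hardware look-ahead networks exploit. The sketch recomputes the group signals for every bit for clarity; a hardware implementation would share them across outputs.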
Accurate modeling of coupling effects via the substrate is an increasingly important concern in the design of mixed-signal systems such as communication, biomedical and analog signal processing circuits. Fast-switching digital blocks inject noise into the common substrate, hindering the performance of sensitive high-precision analog circuitry. As miniaturization increases IC complexity, the accuracy requirements for substrate coupling simulation inevitably increase. Due in part to the global nature of such couplings, model extraction and analysis is a computation-intensive task that requires fast and accurate substrate model extraction and analysis tools. One way to deal with this problem is to take further advantage of available computational technologies, and distributed computing emerges as an interesting solution. In this paper we discuss several issues related to the parallelization of a multigrid-based substrate model extraction and analysis tool. This tool is used as a proxy for generic computations on a 3D discretized volume. The results presented indicate potential avenues for successfully exploiting parallelism, as well as pitfalls to avoid in such a quest.
{"title":"Issues in parallelizing multigrid-based substrate model extraction and analysis","authors":"João M. S. Silva, L. M. Silveira","doi":"10.1145/1016568.1016605","DOIUrl":"https://doi.org/10.1145/1016568.1016605","url":null,"abstract":"Accurate modeling of coupling effects via the substrate is an increasingly important concern in the design of mixed-signal systems such as communication, biomedical and analog signal processing circuits. Fast-switching digital blocks inject noise into the common substrate hindering the performance of high-precision sensible analog circuitry. Miniaturization effects on ICs complexity inevitably make the accuracy requirements for substrate coupling simulation increase. Due in part to the global nature of such couplings, model extraction and analysis is a computation-intensive task requiring the availability of fast and accurate substrate model extraction and analysis tools. One way to deal with this problem is to take further advantage of available computational technologies and distributed computing emerges as an interesting solution. In this paper we discuss several issues related to the parallelization of a multigrid-based substrate model extraction and analysis tool. This tool is used as a proxy for generic computations on a 3D discretized volume. The results presented indicate potential avenues for successfully exploiting parallelism as well as pitfalls to avoid in such a quest.","PeriodicalId":275811,"journal":{"name":"Proceedings. SBCCI 2004. 17th Symposium on Integrated Circuits and Systems Design (IEEE Cat. No.04TH8784)","volume":"330 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-09-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134332565","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Michel Leong, Pedro Vasconcelos, J. Fernandes, L. Sousa
In this paper, we propose and develop a fully programmable cellular neural network (CNN) circuit. The CNN coefficients are digitally programmable using a digital-to-analog converter (DAC), resulting in added flexibility. CNNs with 4×4 and 16×16 cells are designed and tested, exhibiting good accuracy when compared with Matlab and Java applications for computing CNNs. All circuits are designed and implemented in a 0.35 μm CMOS technology. The layout of a full 4×4 CNN was designed using Cadence Design Framework II. The circuits are simulated with PSpice/Spectre.
{"title":"A programmable cellular neural network circuit","authors":"Michel Leong, Pedro Vasconcelos, J. Fernandes, L. Sousa","doi":"10.1145/1016568.1016620","DOIUrl":"https://doi.org/10.1145/1016568.1016620","url":null,"abstract":"In this paper, we propose and develop a fully programmable CNN circuit. The CNN coefficients are digitally programmable using a digital to analog converter (DAC), resulting in added flexibility. CNNs with 4/spl times/4 and 16/spl times/16 cells are designed and tested, exhibiting good accuracy when compared with Matlab and Java applications for computing CNNs. All circuits are designed and implemented with a 0.35 /spl mu/m CMOS technology. The layout of a full 4/spl times/4 CNN was designed using cadence design framework II. The circuits are simulated with Pspice/Spectre.","PeriodicalId":275811,"journal":{"name":"Proceedings. SBCCI 2004. 17th Symposium on Integrated Circuits and Systems Design (IEEE Cat. No.04TH8784)","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-09-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125154655","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
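For readers unfamiliar with cellular neural networks, the sketch below simulates a small 4×4 array in software. It assumes the standard Chua-Yang cell model with 3×3 feedback (`A`) and control (`B`) templates and a bias `z`, integrated with forward Euler; the abstract does not state which model or templates the circuit uses, so every name, template value, and step size here is an illustrative assumption. In the circuit described above, the template coefficients would be the values programmed through the DAC.

```python
def saturate(x):
    """Piecewise-linear cell output, limited to the range [-1, 1]."""
    return 0.5 * (abs(x + 1.0) - abs(x - 1.0))


def cnn_step(state, inp, A, B, z, dt):
    """Advance every cell by one forward-Euler step of size dt."""
    rows, cols = len(state), len(state[0])
    out = [[saturate(v) for v in row] for row in state]
    new_state = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            acc = -state[i][j] + z
            # Sum template-weighted outputs and inputs over the 3x3 neighbourhood;
            # cells outside the array are treated as absent (zero contribution).
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    r, c = i + di, j + dj
                    if 0 <= r < rows and 0 <= c < cols:
                        acc += A[di + 1][dj + 1] * out[r][c]
                        acc += B[di + 1][dj + 1] * inp[r][c]
            new_state[i][j] = state[i][j] + dt * acc
    return new_state


def run_cnn(inp, A, B, z, steps=200, dt=0.05):
    """Start from a zero state, iterate, and return the saturated outputs."""
    state = [[0.0] * len(inp[0]) for _ in inp]
    for _ in range(steps):
        state = cnn_step(state, inp, A, B, z, dt)
    return [[saturate(v) for v in row] for row in state]


if __name__ == "__main__":
    # Hypothetical templates: self-feedback of 2 and direct input coupling of 1.
    A = [[0.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 0.0]]
    B = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
    checker = [[1.0 if (i + j) % 2 == 0 else -1.0 for j in range(4)] for i in range(4)]
    for row in run_cnn(checker, A, B, 0.0):
        print(row)
```

With these uncoupled templates each cell simply latches the sign of its input, which makes the result easy to check by hand; neighbour-coupled templates would use the off-centre entries of `A` and `B`.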
This paper analyzes the use of a network-on-chip (NoC) as the communication subsystem of a reconfigurable/parallel architecture. A router was designed and implemented in SystemC to analyze the NoC. With these routers, the NoCX4 was created and simulated using coarse-grained reconfigurable microprocessors as processing nodes. Two approaches were used to perform the simulation. The first uses a load generator program with communication loads between 5% and 25%. The second is the calculation of 2D-DCT coefficients.
{"title":"When reconfigurable architecture meets network-on-chip","authors":"R. Soares, Ivan Saraiva Silva, A. Azevedo","doi":"10.1145/1016568.1016626","DOIUrl":"https://doi.org/10.1145/1016568.1016626","url":null,"abstract":"This paper analyzes the utilization of a network on chip (NoC) as the communication sub-system of a reconfigurable/parallel architecture. A router was designed and implemented in SystemC to analyze the NoC. With this routers the NoCX4 was created and simulated using coarse-grained reconfigurable microprocessor as processing nodes. To perform the simulation two approaches were used. The first one uses a load generator program and communication loads between 5% and 25%. The second is the calculation of 2D-DCT coefficients.","PeriodicalId":275811,"journal":{"name":"Proceedings. SBCCI 2004. 17th Symposium on Integrated Circuits and Systems Design (IEEE Cat. No.04TH8784)","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-09-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121298078","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
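The second workload named in the abstract, the calculation of 2D-DCT coefficients, can be sketched as a separable transform: a 1D DCT over each row, followed by a 1D DCT over each column. The code below is a plain reference computation under assumed conventions (DCT-II with orthonormal scaling, an 8×8 block in the usage example); the abstract does not specify the block size, the scaling, or how the work is partitioned across the NoC nodes.

```python
import math


def dct_1d(x):
    """Orthonormal DCT-II of a sequence of numbers."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        acc = sum(x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n)) for i in range(n))
        out.append(scale * acc)
    return out


def dct_2d(block):
    """2D DCT-II computed as row transforms followed by column transforms."""
    rows = [dct_1d(row) for row in block]
    # zip(*rows) iterates over columns; transform each, then transpose back.
    cols = [dct_1d(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]


if __name__ == "__main__":
    flat = [[1.0] * 8 for _ in range(8)]
    coeffs = dct_2d(flat)
    # A constant block has energy only in the DC coefficient.
    print(round(coeffs[0][0], 6))
```

Because the row transforms are independent of one another, as are the column transforms, this decomposition is a natural candidate for distribution over several processing nodes, with the transpose step corresponding to inter-node communication.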
A. Ahmadinia, C. Bobda, Dirk Koch, Mateusz Majer, J. Teich
We consider the problem of executing a dynamically changing set of tasks on a reconfigurable system made up of a processor and a reconfigurable device. Task execution on such a platform is managed by a scheduler that can allocate tasks either to the processor or to the reconfigurable device. The scheduler can be seen as part of an operating system running in software or as a core in the reconfigurable device. For each task to be executed on the reconfigurable device, an equivalent implementation exists as a rectangular block in a database. This block has to be placed on the device at run-time. A placer is responsible for placing tasks received from the scheduler on the reconfigurable device. However, a task cannot be placed if there is not enough free space on the device to hold it. In this case, the scheduler receives an acknowledgment from the placer and decides either to preempt a running task or to run the task in software. In this work we present an implementation of a placer module as well as investigations on task preemption. The two modules are part of an operating system for reconfigurable systems that is currently under development.
{"title":"Task scheduling for heterogeneous reconfigurable computers","authors":"A. Ahmadinia, C. Bobda, Dirk Koch, Mateusz Majer, J. Teich","doi":"10.1145/1016568.1016582","DOIUrl":"https://doi.org/10.1145/1016568.1016582","url":null,"abstract":"We consider the problem of executing a dynamically changing set of tasks on a reconfigurable system, made upon a processor and a reconfigurable device. Task execution on such a platform is managed by a scheduler that can allocate tasks either to the processor or to the reconfigurable device. The scheduler can be seen as part of an operating system running on the software or as core in the reconfigurable device. For each tasks to be executed on reconfigurable device, an equivalent implementation exists as rectangular block in a database. This block has to be placed on the device at run-time. A placer is responsible for the placement of tasks received from the scheduler on the reconfigurable device. However, the placement of tasks on the reconfigurable device cannot be successful if enough space is not available on the device to hold the task. In this case, the scheduler receive an acknowledgment from the placer and decide either to preempt a running task or to run the task on software. We present in this work an implementation of a placer module as well as investigations on task preemption. The two modules are part of an operating system for reconfigurable system currently under development.","PeriodicalId":275811,"journal":{"name":"Proceedings. SBCCI 2004. 17th Symposium on Integrated Circuits and Systems Design (IEEE Cat. No.04TH8784)","volume":"225 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-09-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124078276","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
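The interaction between scheduler and placer described in the abstract can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the device is modelled as a grid of cells, the placer uses a simple first-fit scan, and the scheduler falls back to software whenever placement fails. Preemption, which the abstract also investigates, is not modelled. The names `Placer` and `schedule` are hypothetical.

```python
from typing import List, Optional, Tuple


class Placer:
    """First-fit placement of rectangular tasks on a grid-shaped device."""

    def __init__(self, width: int, height: int) -> None:
        self.width = width
        self.height = height
        # Each cell holds the id of the task occupying it, or None if free.
        self.grid: List[List[Optional[str]]] = [[None] * width for _ in range(height)]

    def _fits(self, x: int, y: int, w: int, h: int) -> bool:
        return all(
            self.grid[r][c] is None
            for r in range(y, y + h)
            for c in range(x, x + w)
        )

    def place(self, task_id: str, w: int, h: int) -> Optional[Tuple[int, int]]:
        """Return the (x, y) origin of the placed block, or None if no space is free."""
        for y in range(self.height - h + 1):
            for x in range(self.width - w + 1):
                if self._fits(x, y, w, h):
                    for r in range(y, y + h):
                        for c in range(x, x + w):
                            self.grid[r][c] = task_id
                    return (x, y)
        return None

    def remove(self, task_id: str) -> None:
        """Free every cell held by a finished task."""
        for row in self.grid:
            for c, owner in enumerate(row):
                if owner == task_id:
                    row[c] = None


def schedule(placer: Placer, task_id: str, w: int, h: int):
    """Try the reconfigurable device first; otherwise run the task in software."""
    origin = placer.place(task_id, w, h)
    if origin is not None:
        return ("hardware", origin)
    return ("software", None)


if __name__ == "__main__":
    device = Placer(4, 4)
    print(schedule(device, "A", 3, 3))
    print(schedule(device, "B", 2, 2))
```

In the usage example, task A occupies a 3×3 corner of the 4×4 device, so the 2×2 task B finds no free rectangle and is sent to software even though seven cells remain free. This kind of fragmentation is one reason a scheduler might prefer to preempt a running task instead.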