Pub Date: 1998-03-30 DOI: 10.1109/ASYNC.1998.666507
Bruce W. Hunt, K. Stevens, B. Suter, D. Gelosh
A fully asynchronous fixed-point FFT processor is introduced for low-power space applications. The architecture is based on an algorithm developed by Suter and Stevens specifically for a low-power implementation. The novelty of this architecture lies in its high localization of components and pipelining with no need to share a global memory. High throughput is attained using large numbers of small, local components working in parallel. A derivation of the algorithm from the discrete Fourier transform is presented, followed by a discussion of circuit design parameters, specifically those relevant to space applications. A survey of this application-specific architecture is included, with a detailed look at the design of the complex-valued Booth multiplier to demonstrate the design methodology of this project. Finally, simulation results based on layout extractions are presented and an outline for future work is given.
Title: A single chip low power asynchronous implementation of an FFT algorithm for space applications
Published in: Proceedings Fourth International Symposium on Advanced Research in Asynchronous Circuits and Systems
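The abstract above mentions deriving an FFT algorithm from the discrete Fourier transform. As a rough illustration of that kind of derivation, the sketch below compares a direct DFT with the textbook radix-2 decimation-in-time FFT. This is an assumption-laden stand-in: it is not the Suter and Stevens algorithm, it uses floating-point rather than fixed-point arithmetic, and the function names `dft` and `fft` are chosen here for illustration only.

```python
import cmath


def dft(x):
    """Direct O(n^2) evaluation of the discrete Fourier transform."""
    n = len(x)
    return [
        sum(x[k] * cmath.exp(-2j * cmath.pi * j * k / n) for k in range(n))
        for j in range(n)
    ]


def fft(x):
    """Textbook radix-2 decimation-in-time FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    # Split the DFT sum into even-indexed and odd-indexed samples.
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for j in range(n // 2):
        # Twiddle factor that recombines the two half-size transforms.
        t = cmath.exp(-2j * cmath.pi * j / n) * odd[j]
        out[j] = even[j] + t
        out[j + n // 2] = even[j] - t
    return out


samples = [1, 2, 3, 4, 0, 0, 0, 0]
max_error = max(abs(a - b) for a, b in zip(dft(samples), fft(samples)))
print(max_error < 1e-9)
```

Each butterfly in the loop touches only two values, which hints at why FFT structures lend themselves to many small local units, although the paper's own decomposition may differ.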
Pub Date: 1998-03-30 DOI: 10.1109/ASYNC.1998.666491
M. Renaudin, P. Vivet, F. Robin
The design of a CMOS standard-cell Quasi-Delay-Insensitive (QDI) 16-bit asynchronous microprocessor is presented. ASPRO-216 is being developed for embedded applications. It is a scalar processor which issues instructions in-order and completes their execution out-of-order, and it can be customized both at the hardware and software levels to fit specific application requirements. Its architecture extensively uses an overlapping pipelined execution scheme involving desynchronized units. The design flow and circuit style are an original application of A. Martin's method. The expected performance is 200 peak MIPS, 0.5 W using a 0.25 µm technology.
Title: ASPRO-216: a standard-cell Q.D.I. 16-bit RISC asynchronous microprocessor
Pub Date: 1998-03-30 DOI: 10.1109/ASYNC.1998.666505
S. Piestrak
Delay-insensitive (unordered) codes have been used to encode data in various asynchronous systems such as asynchronous circuits and buses. In this paper, a new general approach to designing completion-detection circuits (completion checkers) for asynchronous circuits and systems using delay-insensitive codes is presented. It is shown that a completion-detection circuit for many delay-insensitive codes can be easily and efficiently built in a systematic way by using multi-output threshold circuits. The results presented here stand in sharp contrast to the conclusions reached by Akella et al. (1996), where similar designs, called enumeration-based decoders, were found impractical due to excessive complexity.
Title: Membership test logic for delay-insensitive codes
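To make the idea of threshold-based completion detection in the abstract above more concrete, here is a minimal functional sketch for one well-known family of unordered codes, the constant-weight (m-of-n) codes. It assumes the usual definition of a threshold function T_k (output 1 when at least k inputs are 1). The names `threshold` and `is_codeword` are hypothetical, and the sketch models only input/output behaviour, not the circuits proposed in the paper.

```python
def threshold(bits, k):
    """Threshold function T_k: true when at least k of the inputs are 1."""
    return sum(bits) >= k


def is_codeword(bits, m):
    """Membership test for an m-of-n code: exactly m ones.

    Expressed with two threshold outputs, T_m AND NOT T_(m+1), which a
    multi-output threshold circuit could provide together.
    """
    return threshold(bits, m) and not threshold(bits, m + 1)


# Bits rise one at a time from the all-zero spacer toward the codeword 0110.
arrivals = [(0, 0, 0, 0), (0, 1, 0, 0), (0, 1, 1, 0)]
done = [is_codeword(word, 2) for word in arrivals]
print(done)
```

Because no codeword of an unordered code covers another, a word seen while bits are still rising from the spacer is never mistaken for a complete codeword, so the membership test doubles as a completion signal.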
Pub Date: 1998-03-30 DOI: 10.1109/ASYNC.1998.666511
Y. Kameda, S. Polonsky, M. Maezawa, T. Nanya
We present a primitive-level pipelining method in rapid single-flux-quantum (RSFQ) technology. In RSFQ circuits, binary information is represented by discrete voltage pulses, unlike the voltage levels used in CMOS and related circuits. The method utilizes the inherent storage capability of RSFQ primitives as pipeline registers. We propose a new RSFQ primitive that carries out a binary operation, holds the result, and controls the output. As the three tasks are performed in one primitive, it is expected to eliminate interconnect delays that are inevitable if three separate primitives are used. Data is transferred following a request-acknowledgment protocol in a delay-insensitive (DI) fashion. Due to delay insensitivity, high modularity is achieved. As examples, several adders and an array multiplier are designed on the DI model. We confirm the correctness of the circuit designs using a verification tool.
Title: Primitive-level pipelining method on delay-insensitive model for RSFQ pulse-driven logic
Pub Date: 1998-03-30 DOI: 10.1109/ASYNC.1998.666497
H. V. Gageldonk, K. V. Berkel, A. Peeters, Daniel Baumann, D. Gloor, G. Stegmann
This paper presents a low-power asynchronous implementation of the 80C51 microcontroller. It was realized in a 0.5 µm CMOS process and it shows a power advantage of a factor of 4 compared to a recent synchronous implementation in the same technology. The chip is fully bit compatible with the synchronous implementation, and timing compatible for external memory access. The circuit is a compiled VLSI-program, using Tangram as the VLSI-programming language and the Tangram tool-set to compile the design automatically to a standard-cell netlist. This design approach proves to be powerful enough to describe the microcontroller and derive an efficient implementation. Further, it offers the designer the possibility to explore various alternatives in the design space.
Title: An asynchronous low-power 80C51 microcontroller
Pub Date: 1998-03-30 DOI: 10.1109/ASYNC.1998.666494
Michael Theobald, S. Nowick
None of the available minimizers for exact 2-level hazard-free logic minimization can synthesize very large circuits. This limitation has forced researchers to resort to heuristic minimization, or to manual and automated circuit partitioning techniques. This paper introduces a new implicit 2-level logic minimizer, IMPYMIN, which is able to solve very large multi-output hazard-free minimization problems exactly. The minimizer is based on a novel theoretical approach: it incorporates hazard-freedom constraints within a synchronous function by adding new variables. In particular, the generation of dynamic-hazard-free prime implicants is cast as a synchronous prime implicant generation problem. The minimizer can exactly solve all currently available examples, which range up to 32 inputs and 33 outputs, in less than 813 seconds. These include examples that have never been exactly solved before.
Title: An implicit method for hazard-free two-level logic minimization
Pub Date: 1998-03-30 DOI: 10.1109/ASYNC.1998.666502
R. Negulescu, A. Peeters
A way to reduce the cost (area) or increase the performance of asynchronous circuits is to make timing assumptions that go beyond the isochronic fork. This, however, results in circuits that are not speed-independent. Such timing assumptions often boil down to imposing that, of two circuit paths that start at the same point, one path is faster than the other. We call speed-dependences of this form chain constraints, and we handle them as processes in a metric-free formalism. This paper applies chain constraints to verify single-rail handshake circuits in the context of their timing assumptions, and to evaluate safety margins for delay fluctuations. We discuss the lessons learned, including decomposition and weakening of extended isochronic fork assumptions, usage of CMOS cell models in the presence of hazards, and correlations between our discrete-state results and analog simulations.
Title: Verification of speed-dependences in single-rail handshake circuits
Pub Date: 1998-03-30 DOI: 10.1109/ASYNC.1998.666506
D. Kinniment, A. Yakovlev, Fei Xia, B. Gao
Analog-to-digital (A-D) converters with a fixed conversion time are subject to errors due to metastability. These errors will occur in all converter designs with a bounded time for decisions, and are potentially severe. We estimate the frequency of these errors in a successive approximation converter, and compare the results with asynchronous designs using both a fully speed-independent and a bundled-data approach. It is shown that an asynchronous converter is more reliable than its synchronous counterpart, and that the bundled-data design is also faster, on average, than the synchronous design. We also demonstrate trade-offs involved in asynchronous converter designs, such as speed, robustness to delay variations, circuit size and design scalability.
Title: Towards asynchronous A-D conversion
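The claim above that any bounded decision time leaves a residual error rate can be illustrated with the standard exponential model of metastability resolution, in which the chance that a decision is still unresolved after time t falls off as exp(-t/tau). The sketch below is not taken from the paper: the function name `p_unresolved`, the scale factor `p0`, and the numeric values of `tau` and the time budgets are all hypothetical.

```python
import math


def p_unresolved(t_allowed, tau, p0=1.0):
    """Probability that a metastable decision is still unresolved after t_allowed.

    Assumes the common exponential-decay model; tau is the resolution time
    constant of the deciding element and p0 is a hypothetical scale factor.
    """
    return p0 * math.exp(-t_allowed / tau)


tau = 0.1e-9  # hypothetical time constant of 0.1 ns
for budget in (1e-9, 2e-9, 4e-9):  # hypothetical fixed decision-time budgets
    print(f"budget {budget:.0e} s -> p_unresolved {p_unresolved(budget, tau):.3e}")
```

Under this model the probability never reaches zero for any finite budget, whereas a design that waits for the decision to complete trades that residual error for a variable conversion time.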
Pub Date: 1998-03-30 DOI: 10.1109/ASYNC.1998.666503
T. Verhoeff
We present the XDI Model for specifying delay-insensitive circuits, that is, reactive systems that correctly exchange signals with their environment in spite of unknown delays incurred by the interface. XDI specifications capture restrictions on the communication between circuit and environment, treating both parties equally. They can be visualized as state graphs where each arrow is labeled by a communication terminal and each state by a safety/progress label. We investigate various properties that can be extracted from XDI specifications: automorphisms, environment partitions, autocomparison matrix, and classifications of choice, order dependence, and nondeterminism. We introduce a distinction between static and dynamic output nondeterminism, capturing the difference between design freedom and arbitration. Determining specification properties is useful for validation and design.
Title: Analyzing specifications for delay-insensitive circuits
Pub Date: 1998-03-30 DOI: 10.1109/ASYNC.1998.666510
A. Xie, P. Beerel
This paper presents a methodology to speed up the stationary analysis of large Markov chains that model asynchronous systems. Instead of directly working on the original Markov chain, we propose to analyze a smaller Markov chain obtained via a novel technique called string-based state compression. Once the smaller chain is solved, the solution to the original chain is obtained via a process called expansion. The method is especially powerful when the Markov chain has a small feedback vertex set, which often happens in asynchronous systems. Experimental results show that the method can yield reductions of more than an order of magnitude in run time and facilitate the analysis of larger systems than possible using traditional techniques.
Title: Accelerating Markovian analysis of asynchronous systems using string-based state compression
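For readers unfamiliar with the stationary analysis mentioned in the abstract above, the sketch below shows the direct baseline on a tiny chain: power iteration until the distribution stops changing. It does not implement string-based state compression or expansion, whose details are not given here. The function name `stationary` and the two-state transition matrix are illustrative assumptions.

```python
def stationary(P, tol=1e-12, max_iter=100000):
    """Approximate the stationary distribution pi (pi = pi P) by power iteration.

    P is a row-stochastic transition matrix given as a list of lists.
    Assumes the chain is irreducible and aperiodic so the iteration converges.
    """
    n = len(P)
    pi = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(max_iter):
        nxt = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    return pi


# Hypothetical two-state chain: balance gives 0.1 * pi[0] = 0.5 * pi[1].
P = [[0.9, 0.1],
     [0.5, 0.5]]
pi = stationary(P)
print(pi)
```

Each iteration costs time proportional to the number of transitions, which is why shrinking the chain before solving it, as the paper proposes, can pay off for large state spaces.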