Pub Date: 1998-03-30 DOI: 10.1109/ASYNC.1998.666508
Ross Smith, K. Fant, D. Parker, Rick Stephani, Ching-Yi Wang
This paper describes a fully asynchronous two-dimensional discrete cosine transform chip. The chip has a fixed block size of 8×8 pixels and uses bit-serial arithmetic. The chip was fabricated through MOSIS using a 0.8 μm double-metal CMOS process. The 49.5 mm² core uses ~162,000 transistors. The chip operates from 0.65 V to 7.0 V, but its pixel rate at 5.0 V, 17 MHz, is significantly below the simulated 27 MHz because none of the signal capacitances were back-extracted. To keep the chip completely asynchronous, a FIFO-based transposition memory was used, even though it takes more area than a RAM-based memory. The most interesting aspects of the design are presented here: the memory control structure, the pipelining structures, the use of Xilinx FPGAs and a Quickturn system for emulation, and a comparison with other synchronous and asynchronous designs.
Title: An asynchronous 2-D discrete cosine transform chip. Published in: Proceedings Fourth International Symposium on Advanced Research in Asynchronous Circuits and Systems.
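The role of a transposition memory in a 2-D DCT can be illustrated with the usual row-column decomposition: a 1-D DCT is applied to every row, the intermediate block is transposed, and the 1-D DCT is applied again. The sketch below is a software illustration only, not the chip's bit-serial datapath; the function names `dct_1d`, `transpose`, and `dct_2d` and the orthonormal DCT-II scaling are assumptions made for this example.

```python
import math

N = 8  # block size; the abstract states a fixed 8x8 block


def dct_1d(x):
    """Orthonormal DCT-II of a sequence, direct O(n^2) form (assumed scaling)."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        acc = sum(
            x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
            for i in range(n)
        )
        out.append(scale * acc)
    return out


def transpose(m):
    """Swap rows and columns; stands in for the transposition memory."""
    return [list(col) for col in zip(*m)]


def dct_2d(block):
    """2-D DCT by row-column decomposition."""
    row_pass = [dct_1d(row) for row in block]             # 1-D DCT on each row
    col_pass = [dct_1d(col) for col in transpose(row_pass)]  # then on each column
    return transpose(col_pass)                             # restore orientation


if __name__ == "__main__":
    flat = [[1.0] * N for _ in range(N)]
    coeffs = dct_2d(flat)
    # A constant block should put all of its energy in the DC coefficient.
    print(round(coeffs[0][0], 6))
```

In this software form the transpose is a free reindexing; in hardware the intermediate results have to be stored and read back in the other order, which is the job of the transposition memory mentioned in the abstract.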
Pub Date: 1998-03-30 DOI: 10.1109/ASYNC.1998.666509
J. Ebergen, Scott M. Fairbanks, I. Sutherland
A technique is presented to predict the performance behavior of control circuits for a linear FIFO. The control circuit consists of a linear chain of RendezVous elements, also called JOINs, preceded by a source and followed by a sink. The technique predicts how the cycle time, or throughput, of the FIFO depends on the sink delay, the source delay, and the length of the FIFO. It also predicts how the delays in each RendezVous element depend on the same set of parameters. The pipelines can be divided into three cases: source-limited, sink-limited, and self-limited pipelines. The technique is based on the assumption that the delays through a RendezVous element can be described as a function of the separation in arrival times of the inputs. Such descriptions are conveniently represented by the so-called Charlie diagram.
Title: Predicting performance of micropipelines using Charlie diagrams. Published in: Proceedings Fourth International Symposium on Advanced Research in Asynchronous Circuits and Systems.
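The idea that a RendezVous element's delay depends on the separation in arrival times of its inputs can be made concrete with a small sketch. The hyperbolic curve used below, and the parameters `d_min` and `c`, are assumptions chosen for illustration; the abstract does not give the functional form, and this should not be read as the authors' model.

```python
import math


def charlie(half_sep, d_min=1.0, c=0.5):
    """Hypothetical Charlie curve.

    Returns the delay from the mean input arrival time to the output event,
    as a function of half the separation between the two input arrivals.
    The hyperbolic shape and both parameters are illustrative assumptions.
    """
    return d_min + math.sqrt(c * c + half_sep * half_sep)


def rendezvous_output_time(t_a, t_b, d_min=1.0, c=0.5):
    """Output time of a two-input RendezVous (JOIN) under the assumed curve."""
    mean_arrival = (t_a + t_b) / 2.0
    half_sep = (t_a - t_b) / 2.0
    return mean_arrival + charlie(half_sep, d_min, c)


if __name__ == "__main__":
    # Simultaneous inputs versus widely separated inputs.
    print(rendezvous_output_time(0.0, 0.0))
    print(rendezvous_output_time(100.0, 0.0))
```

With this assumed curve, widely separated inputs give an output time close to the later arrival plus `d_min`, while near-simultaneous inputs incur the extra term `c`. The result is symmetric in the two inputs.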
Pub Date: 1900-01-01 DOI: 10.1109/ASYNC.1998.666496
W. Chou, P. Beerel, R. Ginosar, Rakefet Kol, C. Myers, Shai Rotem, K. Stevens, K. Yun
This paper presents a technology mapping technique for optimizing the average-case delay of asynchronous combinational circuits implemented using domino logic and one-hot encoded outputs. The technique minimizes the critical path for common input patterns at the possible expense of making less common critical paths longer. To demonstrate the application of this technique, we present a case study of a combinational length decoding block, an integral component of an Asynchronous Instruction Length Decoder (AILD) which can be used in Pentium® processors. The experimental results demonstrate that the average-case delay of our mapped circuits can be dramatically lower than the worst-case delay of the circuits obtained using conventional worst-case mapping techniques.
Title: Average-case optimized technology mapping of one-hot domino circuits. Published in: Proceedings Fourth International Symposium on Advanced Research in Asynchronous Circuits and Systems.
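The trade-off described in the abstract, shortening the critical path for common input patterns while allowing rare patterns to become slower, comes down to comparing a probability-weighted delay against a maximum delay. The following sketch shows that arithmetic only; the pattern probabilities and delay values are invented for illustration and are not results from the paper.

```python
def average_delay(patterns):
    """Probability-weighted mean delay over (probability, delay) pairs."""
    total_p = sum(p for p, _ in patterns)
    return sum(p * d for p, d in patterns) / total_p


def worst_delay(patterns):
    """Largest delay over all input patterns."""
    return max(d for _, d in patterns)


# Hypothetical numbers: (probability of input pattern, delay in arbitrary units).
worst_case_mapping = [(0.90, 4.0), (0.10, 4.0)]    # balanced paths
average_case_mapping = [(0.90, 2.0), (0.10, 6.0)]  # common path made faster

if __name__ == "__main__":
    for name, mapping in [
        ("worst-case mapping", worst_case_mapping),
        ("average-case mapping", average_case_mapping),
    ]:
        print(name, average_delay(mapping), worst_delay(mapping))
```

With these made-up figures the average-case mapping has a longer worst-case path but a lower expected delay, which only pays off in a circuit that signals completion instead of waiting for a fixed worst-case clock period.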