R. Tadros, Weizhe Hua, Matheus Gibiluka, Matheus T. Moreira, Ney Laert Vilar Calazans, P. Beerel
Dynamic voltage scaling of bundled-data asynchronous designs promises far more energy-efficient systems than traditional clocked alternatives. However, this approach relies on energy-efficient delay lines whose delay must track that of the combinational datapath over a wide range of voltages. This paper presents a thorough analysis of the design of such delay lines and describes how sizing affects their delay across different voltages. It proposes a design methodology for minimizing energy consumption subject to delay-matching constraints. It then applies this methodology to delay lines that consist of four different delay elements in two different technologies, exploring the underlying trade-offs they present.
{"title":"Analysis and Design of Delay Lines for Dynamic Voltage Scaling Applications","authors":"R. Tadros, Weizhe Hua, Matheus Gibiluka, Matheus T. Moreira, Ney Laert Vilar Calazans, P. Beerel","doi":"10.1109/ASYNC.2016.16","DOIUrl":"https://doi.org/10.1109/ASYNC.2016.16","PeriodicalId":314538,"journal":{"name":"2016 22nd IEEE International Symposium on Asynchronous Circuits and Systems (ASYNC)"},"publicationDate":"2016-05-08","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"131363326"}
This paper proposes a methodology for reducing substrate noise in mixed-signal integrated circuits (ICs) by using a globally-asynchronous locally-synchronous (GALS) approach for digital system integration. For this purpose, the harmonic balanced partitioning strategy is proposed. It is shown that converting a synchronous design into a plesiochronous GALS design with M locally-synchronous modules (LSMs) yields a theoretical limit of spectral peak attenuation of 20 log M. This model is evaluated by numerical simulations in MATLAB. Based on the proposed partitioning scheme, a methodology for GALS partitioning for optimal substrate noise reduction is developed. Finally, the corresponding low-noise GALS design flow is proposed, based on a custom noise optimization tool named EMIAS. The flow is evaluated on a realistic design example.
{"title":"GALS Partitioning Methodology for Substrate Noise Reduction in Mixed-Signal Integrated Circuits","authors":"M. Babić, Steffen Zeidler, M. Krstic","doi":"10.1109/ASYNC.2016.15","DOIUrl":"https://doi.org/10.1109/ASYNC.2016.15","PeriodicalId":314538,"journal":{"name":"2016 22nd IEEE International Symposium on Asynchronous Circuits and Systems (ASYNC)"},"publicationDate":"2016-05-08","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"128922042"}
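As a quick numeric reading of the 20 log M limit quoted in the abstract above, the sketch below tabulates the bound for a few module counts. It assumes the logarithm is base 10 and the result is expressed in decibels, which is the usual convention for amplitude ratios but is not stated in the abstract; the function name is hypothetical.

```python
import math


def peak_attenuation_limit_db(num_modules: int) -> float:
    """Theoretical spectral-peak attenuation limit 20*log10(M).

    Assumption: base-10 logarithm, result in dB.
    """
    if num_modules < 1:
        raise ValueError("num_modules must be >= 1")
    return 20.0 * math.log10(num_modules)


# Tabulate the bound for a few partition sizes M.
for m in (1, 2, 4, 8):
    print(f"M = {m}: {peak_attenuation_limit_db(m):.2f} dB")
```

Under this reading, each doubling of M raises the limit by about 6 dB, so the bound grows slowly compared with the number of locally-synchronous modules.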
Hardware design abstraction has significantly improved productivity in recent years. The clock is known to be the beating heart of every digital design, coordinating communication and computation. Because of this signal's critical role, managing it properly is essential. Newly emerged high-level synthesis and hardware construction tools either leave this responsibility to the designer at a high level or make general assumptions based on critical paths, which may force the designer to re-architect the design when those assumptions fail. This can have a profound impact on designer productivity. We propose the AutoCLK technique to handle the clock automatically; it requires specific properties in the target architecture, such as 'slack elasticity' and distributed control flow. Our experiments demonstrate that both low-level and high-level factors have to be taken into account for efficient clock management.
{"title":"Automatic Clock: A Promising Approach toward GALSification","authors":"M. Mamaghani, M. Krstic, J. Garside","doi":"10.1109/ASYNC.2016.20","DOIUrl":"https://doi.org/10.1109/ASYNC.2016.20","PeriodicalId":314538,"journal":{"name":"2016 22nd IEEE International Symposium on Asynchronous Circuits and Systems (ASYNC)"},"publicationDate":"2016-05-08","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"114036111"}
The increasing scale and complexity of integrated circuits lead to many departures from a purely synchronous design methodology. Clock-domain crossings, multi-cycle paths, and circuits for test with long combinational logic delays introduce vulnerabilities for glitch-related failures. Conventional simulation techniques can miss glitches because of the large number of value and timing scenarios. We have tried several commercially available tools but have not found a comprehensive solution. This paper presents a concise statement of what it means for a logic circuit to be "glitch free". This property can be verified using satisfiability solvers. We present our implementation using the ACL2 theorem proving system and some experimental results.
{"title":"Finding Glitches Using Formal Methods","authors":"Yan Peng, I. W. Jones, M. Greenstreet","doi":"10.1109/ASYNC.2016.12","DOIUrl":"https://doi.org/10.1109/ASYNC.2016.12","PeriodicalId":314538,"journal":{"name":"2016 22nd IEEE International Symposium on Asynchronous Circuits and Systems (ASYNC)"},"publicationDate":"2016-05-08","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"123312776"}
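To make the idea of checking a logic function for glitches concrete, here is a minimal brute-force sketch. It is not the definition or the ACL2/satisfiability-based implementation from the paper above: it assumes the textbook notion of a function hazard, where several inputs change in an arbitrary order and the ideal (zero-delay) function output toggles more often than the start and end values require. All names are hypothetical, and enumeration only scales to a handful of inputs; a solver-based approach would pose the same kind of question symbolically.

```python
from itertools import permutations
from typing import Callable, Tuple

Bits = Tuple[int, ...]


def has_function_hazard(f: Callable[[Bits], int], start: Bits, end: Bits) -> bool:
    """Return True if some ordering of the input changes from `start` to `end`
    makes f toggle more often than strictly necessary (assumed glitch notion)."""
    changing = [i for i, (s, e) in enumerate(zip(start, end)) if s != e]
    # A clean transition toggles once if the output value changes, else never.
    expected = 0 if f(start) == f(end) else 1
    for order in permutations(changing):
        current = list(start)
        prev = f(tuple(current))
        toggles = 0
        for i in order:  # apply the input changes one at a time
            current[i] = end[i]
            out = f(tuple(current))
            if out != prev:
                toggles += 1
                prev = out
        if toggles > expected:
            return True
    return False


def xor2(bits: Bits) -> int:
    return bits[0] ^ bits[1]


def and2(bits: Bits) -> int:
    return bits[0] & bits[1]


print(has_function_hazard(xor2, (0, 0), (1, 1)))  # True: output pulses 0 -> 1 -> 0
print(has_function_hazard(and2, (0, 0), (1, 1)))  # False: output rises exactly once
```

This sketch looks only at the Boolean function, so it says nothing about hazards introduced by a particular gate-level implementation or by real timing.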
A frequency-locked loop (FLL) system typically employs synchronous digital counters to estimate the frequency discrepancy between the output of a local oscillator and an external reference clock. We present a novel FIFO-based frequency detector as an alternative to such counters. Our FIFO-based frequency detector consists of an asP* control unit and data flip-flops at either end, with the reference clock and the divided-down oscillator output as inputs. Using this frequency detector, we construct an asynchronously controlled FLL and compare its performance against a synchronously controlled FLL. The asynchronously controlled design is shown to generate more corrective events, allowing it to frequency lock in a time comparable to that of traditional FLLs. Moreover, the asynchronously controlled FLL provides a simpler design whose counter bit width requirements do not increase with finer frequency resolution specifications. Finally, we also propose a slightly modified FLL design that uses the FIFO-based frequency detector to achieve frequency locking in 80% less time than traditional FLLs.
{"title":"Asynchronously Controlled Frequency Locked Loop","authors":"Suwen Yang, Frankie Y. Liu, V. C. Lee","doi":"10.1109/ASYNC.2016.8","DOIUrl":"https://doi.org/10.1109/ASYNC.2016.8","PeriodicalId":314538,"journal":{"name":"2016 22nd IEEE International Symposium on Asynchronous Circuits and Systems (ASYNC)"},"publicationDate":"2016-05-08","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"122276925"}
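The remark above about counter bit width can be illustrated with a generic gated-counter model. This sketch is an assumption for illustration, not the detector or FLL from the paper: it assumes a synchronous counter that counts oscillator cycles over a gate window of 1/resolution seconds, so that one count corresponds to one resolution step in frequency. The function name, parameters, and the 1 GHz example value are hypothetical.

```python
import math


def counter_bits(f_osc_max_hz: float, resolution_hz: float) -> int:
    """Bits needed by a gated counter (assumed model) to resolve `resolution_hz`.

    Counting for 1/resolution_hz seconds yields at most
    f_osc_max_hz / resolution_hz cycles, which must fit in the counter.
    """
    if f_osc_max_hz <= 0 or resolution_hz <= 0:
        raise ValueError("frequencies must be positive")
    max_count = math.ceil(f_osc_max_hz / resolution_hz)
    return max(1, max_count.bit_length())


# Illustrative values only: a 1 GHz oscillator at three resolution targets.
for res in (1e6, 1e5, 1e4):
    print(f"resolution {res:.0e} Hz -> {counter_bits(1e9, res)} bits")
# The three cases need 10, 14, and 17 bits respectively.
```

In this model every tenfold improvement in resolution adds three to four counter bits, which is the kind of growth the abstract says the FIFO-based detector avoids.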
System-on-Chip (SoC) designs using multiple clock domains are gaining importance due to clock distribution difficulties and increasing in-die process variations. For the same reasons, more emerging SoC designs use clockless domains for parts of the system. Both clock domain crossing and clocked/clockless domain crossing require a mechanism for inter-domain data transfer that re-synchronizes data to the clock domain of the receiver and avoids metastability. These synchronizers add latency and reduce throughput. This paper proposes merging synchronization with computation in order to reduce latency while keeping throughput high. The method, called Gradual Synchronization (GSync), can reduce synchronization latency at maximum operating frequency by up to 37 percent, with even greater benefit at slower frequencies. We show the benefits of this approach in the scenario of an asynchronous NoC with synchronous end-points.
{"title":"Gradual Synchronization","authors":"Sandra J. Jackson, R. Manohar","doi":"10.1109/ASYNC.2016.21","DOIUrl":"https://doi.org/10.1109/ASYNC.2016.21","PeriodicalId":314538,"journal":{"name":"2016 22nd IEEE International Symposium on Asynchronous Circuits and Systems (ASYNC)"},"publicationDate":"2016-05-01","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"114348921"}