Pub Date : 2007-10-01 DOI: 10.1109/ICCD.2007.4601942
Xiang Xiao, J. Lee
This paper introduces a novel O(1) parallel deadlock detection approach for multi-unit resource systems-on-a-chip (SoCs), inspired by Kim's method for O(1) detection and Shiu's method for parallel processing. Our contributions are (i) the first O(1) hardware deadlock detection and (ii) O(min(m, n)) preparation, both for multi-unit resource systems, where m and n are the numbers of processes and resources, respectively. The O(min(m, n)) preparation time, previously O(m × n), is achieved by performing the searches for sink nodes for every resource in parallel in hardware over a matrix representing resource allocations, together with other auxiliary matrices. Our experiments demonstrate that deadlock detection always takes two clock cycles.
Title: A novel O(1) parallel deadlock detection algorithm and architecture for multi-unit resource systems
Venue: 2007 25th International Conference on Computer Design, pp. 480-487
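For readers unfamiliar with the multi-unit setting, the following is a minimal software sketch of the classical matrix-based detection procedure: repeatedly retire any process whose outstanding requests can be met from the currently free units, and report whatever is left. It is not the authors' hardware algorithm, which according to the abstract searches for sink nodes in parallel; the function name and the three input structures are illustrative assumptions.

```python
def find_deadlocked(available, allocation, request):
    """Return the sorted indices of deadlocked processes.

    available[j]     -- free units of resource j
    allocation[i][j] -- units of resource j currently held by process i
    request[i][j]    -- units of resource j that process i is waiting for
    (All three layouts are assumptions made for this sketch.)
    """
    work = list(available)
    unfinished = set(range(len(allocation)))
    progress = True
    while progress:
        progress = False
        for i in sorted(unfinished):
            if all(r <= w for r, w in zip(request[i], work)):
                # Process i can be satisfied, so it can finish and
                # release everything it currently holds.
                work = [w + a for w, a in zip(work, allocation[i])]
                unfinished.remove(i)
                progress = True
    return sorted(unfinished)


# Two processes, two single-unit resources, each waiting for the other's unit.
print(find_deadlocked([0, 0], [[1, 0], [0, 1]], [[0, 1], [1, 0]]))
```

This sequential loop makes up to m passes over m processes and n resources, which is the kind of cost a parallel hardware search is meant to avoid.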
Pub Date : 2007-10-01 DOI: 10.1109/ICCD.2007.4601887
M. Cummings, T. Cooklev
Software-defined radio (SDR) is one of the most important emerging disruptive technologies shaping the wireless communication and mobile computing industries. The "ideal" software radio consists of a wideband antenna, a wideband ADC and DAC, and a programmable processor. This paper discusses the development of software radios along with their applications in different fields of telecommunications.
Title: Tutorial: Software-defined radio technology
Venue: 2007 25th International Conference on Computer Design, pp. 103-104
Pub Date : 2007-10-01 DOI: 10.1109/ICCD.2007.4601901
P. Jacob, A. Zia, Okan Erdogan, P. Belemjian, Peng Jin, Jin Woo Kim, M. Chu, R. Kraft, J. McDonald
Forty years ago Gene Amdahl published a figure of merit for parallel computation, which proved extremely controversial. The controversy still rages today, although those who have looked closely at this figure of merit conclude that it is correct, but perhaps misinterpreted. In this paper we look at a small variation on that law which suggests that computer designers should take a closer look at two emerging technologies, SiGe HBT BiCMOS and 3D chip stacking. We may be overlooking a way to continue the clock race, and in so doing accomplish better parallelism.
Title: Amdahl’s figure of merit, SiGe HBT BiCMOS, and 3D chip stacking
Venue: 2007 25th International Conference on Computer Design, pp. 202-207
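As background, the classic form of Amdahl's law can be written in a few lines. This sketch shows only the textbook formula, not the variation the paper proposes (which the abstract does not spell out); the function name and argument names are assumptions.

```python
def amdahl_speedup(parallel_fraction, processors):
    """Textbook Amdahl speedup: 1 / ((1 - p) + p / n)."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must lie in [0, 1]")
    if processors < 1:
        raise ValueError("processors must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / processors)


# With 90% of the work parallelizable, ten processors give well under 10x,
# and no processor count can exceed 1 / (1 - 0.9) = 10x.
print(amdahl_speedup(0.9, 10))
```

The serial term is untouched by adding processors, whereas a faster clock shortens both terms, which is one way to read the abstract's interest in continuing the clock race.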
Pub Date : 2007-10-01 DOI: 10.1109/ICCD.2007.4601931
Sheng Sun, C. Sechen
Our objective was to determine the most energy-efficient 64 b static CMOS adder architecture for a range of high-performance delay targets. We extensively examine carry-lookahead (CLA) and carry-select adders with a wide range of tradeoffs in logic levels, fanouts and wiring complexity. We propose sparse CLA adder architectures based on buffering techniques to reduce logic redundancy and improve energy efficiency. All the designs were implemented using an energy-delay layout optimization flow with full RC extraction. Our new 64 b adder designs have a relative delay as low as 9.9 FO4 (fanout-of-four inverter) delays and promise better scaling for smaller technology nodes. They yield the best energy efficiency for a wide range of delay targets and are 30%, 15% and 7% more energy efficient than full Kogge-Stone, sparse-2 Kogge-Stone and Han-Carlson, respectively, at the fastest points. They consume only about one third of the energy of dynamic adders.
Title: Post-layout comparison of high performance 64b static adders in energy-delay space
Venue: 2007 25th International Conference on Computer Design, pp. 401-408
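To make the baseline concrete, here is a bit-level software model of a full Kogge-Stone parallel-prefix adder, one of the architectures the abstract compares against. It is only a functional sketch of the prefix network (six levels for 64 bits) and says nothing about the proposed sparse CLA designs, their buffering, or their energy; the function name and the `width` parameter are assumptions.

```python
def kogge_stone_add(a, b, width=64):
    """Add two unsigned integers modulo 2**width with a Kogge-Stone prefix network."""
    mask = (1 << width) - 1
    a &= mask
    b &= mask
    generate = a & b      # bit i creates a carry on its own
    propagate = a ^ b     # bit i passes an incoming carry along
    half_sum = propagate
    distance = 1
    while distance < width:
        # Merge every position with the group `distance` bits below it.
        # The right-hand side uses the previous level's propagate bits.
        generate |= propagate & (generate << distance)
        propagate &= propagate << distance
        distance *= 2
    # The carry into bit i is the group generate of bits 0..i-1.
    carries = (generate << 1) & mask
    return (half_sum ^ carries) & mask


print(hex(kogge_stone_add(0x0123456789ABCDEF, 0xFEDCBA9876543210)))
```

Each loop iteration corresponds to one logic level of the prefix tree; sparse variants compute only a subset of these carries in the tree and recover the rest locally.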
Pub Date : 2007-10-01 DOI: 10.1109/ICCD.2007.4601899
Hai Lin, Xuan Guan, Yunsi Fei, Z. Shi
As application-specific instruction set processors (ASIPs) are increasingly used in mobile embedded systems, ubiquitous networking connections have exposed these systems to various malicious security attacks, which may alter the program code running on the systems. In addition, soft errors in microprocessors can also change program code and result in system malfunction. At the instruction level, all code modifications are manifested as bit flips. In this work, we present a generalized methodology for monitoring code integrity at run-time in ASIPs, where both the instruction set architecture (ISA) and the underlying microarchitecture can be customized for a particular application domain. Based on the microoperation-based monitoring architecture that we have presented in previous work, we propose a compiler-assisted and application-controlled management approach for the monitoring architecture. Experimental results show that, compared with the OS-managed scheme and other compiler-assisted schemes, our approach can detect program code integrity compromises with much less performance degradation.
Title: Compiler-assisted architectural support for program code integrity monitoring in application-specific instruction set processors
Venue: 2007 25th International Conference on Computer Design, pp. 187-193
Pub Date : 2007-10-01 DOI: 10.1109/ICCD.2007.4601926
Siavash Bayat Sarmadi, M. A. Hasan
This paper investigates the concurrent detection of multiple-bit errors in polynomial basis (PB) multipliers over binary extension fields. To this end, multiple parity bits are considered for both inputs of the multiplier. For the multiplier architecture considered here, the two inputs go through considerably different sets of circuits, and this allows us to use different numbers of parity bits for the two inputs. In a bit-parallel implementation of a GF(2^163) PB multiplier with eight parity bits for the first input and three parity bits for the second input, the area overhead and the probability of error detection are approximately 55.59% and 0.997, respectively. Additionally, the average time overhead of the scheme implemented in a bit-parallel fashion is approximately 25%.
Title: Detecting errors in a polynomial basis multiplier using multiple parity bits for both inputs
Venue: 2007 25th International Conference on Computer Design, pp. 368-375
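The two ingredients named in the abstract, polynomial basis multiplication over GF(2^m) and multiple parity bits per operand, can be illustrated in software. The sketch below is an assumption-laden toy: it uses a shift-and-add multiplier, a caller-supplied reduction polynomial, and an interleaved grouping of bits into parity groups. The abstract does not say which polynomial, which grouping, or which parity prediction logic the authors use.

```python
def gf2m_multiply(a, b, modulus, m):
    """Multiply a and b in GF(2^m), polynomial basis, bit i = coefficient of x^i.

    `modulus` is the degree-m reduction polynomial including its x^m term.
    """
    result = 0
    for i in range(m):
        if (b >> i) & 1:
            result ^= a          # add a * x^i (addition in GF(2) is XOR)
        a <<= 1                  # a becomes a * x
        if (a >> m) & 1:
            a ^= modulus         # reduce as soon as degree m appears
    return result


def parity_bits(value, m, k):
    """k parity bits; bit j covers positions i with i % k == j (hypothetical grouping)."""
    bits = [0] * k
    for i in range(m):
        bits[i % k] ^= (value >> i) & 1
    return bits


# GF(2^4) with x^4 + x + 1: x * x^3 = x^4 = x + 1.
print(bin(gf2m_multiply(0b0010, 0b1000, 0b10011, 4)))
print(parity_bits(0b1011, 4, 2))
```

A single flipped bit changes exactly one parity group, and using more groups lets more multi-bit error patterns be caught, at the cost of extra prediction logic.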
Pub Date : 2007-10-01 DOI: 10.1109/ICCD.2007.4601923
H. Cheung, S. Gupta
Many recent studies show that a resistive bridging fault may cause intermediate voltages at the bridging fault site. Since the gates in the fanout of the fault site may have distinct and multiple logic threshold voltages, namely VIL and VIH, these gates may interpret the intermediate voltage as logic '1', logic '0', or logically indeterminate. Such fault behavior is described as the bridging fault Byzantine general problem (T. Nanya et al., Nov. 1989). None of the existing models of bridging faults used by bridging fault simulators accurately captures the indeterminate logic behavior of such bridges. We present a resistive bridging fault model that accurately yet efficiently captures indeterminate logic values. We also describe an efficient PPSFP bridging fault simulator and show that all previous approaches seriously overestimate bridging fault coverage.
Title: Accurate modeling and fault simulation of Byzantine resistive bridges
Venue: 2007 25th International Conference on Computer Design, pp. 347-353
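The threshold behavior described in the abstract can be shown with a small sketch: each fanout gate compares the same bridge voltage against its own VIL and VIH and may reach a different conclusion. This is only an illustration of the interpretation step, not the authors' fault model or simulator; the gate names and threshold values are hypothetical.

```python
def interpret(voltage, v_il, v_ih):
    """Logic value one gate input reads: '0', '1', or 'X' (indeterminate)."""
    if voltage <= v_il:
        return "0"
    if voltage >= v_ih:
        return "1"
    return "X"


def fanout_view(voltage, thresholds):
    """Map each fanout gate to its reading of the bridged node's voltage."""
    return {gate: interpret(voltage, v_il, v_ih)
            for gate, (v_il, v_ih) in thresholds.items()}


# Hypothetical (VIL, VIH) pairs in volts for three gates driven by the bridged node.
thresholds = {"nand2": (0.40, 0.55), "nor2": (0.65, 0.80), "inv": (0.50, 0.70)}
print(fanout_view(0.60, thresholds))
# -> {'nand2': '1', 'nor2': '0', 'inv': 'X'}
```

A model that forces every gate to read either '0' or '1' would hide the 'X' case, which is the abstract's stated reason that earlier approaches overestimate coverage.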
Pub Date : 2007-10-01 DOI: 10.1109/ICCD.2007.4601916
Brian J. Hickmann, A. Krioukov, M. Schulte, M. A. Erle
Decimal floating-point multiplication is important in many commercial applications including banking, tax calculation, currency conversion, and other financial areas. This paper presents a fully parallel decimal floating-point multiplier compliant with the recent draft of the IEEE P754 Standard for Floating-point Arithmetic (IEEE P754). The novelty of the design is that it is the first parallel decimal floating-point multiplier offering low latency and high throughput. This design is based on a previously published parallel fixed-point decimal multiplier which uses alternate decimal digit encodings to reduce area and delay. The fixed-point design is extended to support floating-point multiplication by adding several components including exponent generation, rounding, shifting, and exception handling. Area and delay estimates are presented that show a significant latency and throughput improvement with a substantial increase in area as compared to the only published IEEE P754 compliant sequential floating-point multiplier. To the best of our knowledge, this is the first publication to present a fully parallel decimal floating-point multiplier that complies with IEEE P754.
Title: A parallel IEEE P754 decimal floating-point multiplier
Venue: 2007 25th International Conference on Computer Design, pp. 296-303
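The arithmetic steps a decimal floating-point multiplier has to perform (sign, coefficient product, exponent sum, rounding) can be sketched with Python's standard `decimal` module. This is a software illustration under assumed decimal64-like parameters (16 digits, exponents from -383 to 384, round-half-even), not a model of the paper's hardware; `dfp_multiply` and `DECIMAL64` are names invented here.

```python
from decimal import Context, Decimal, ROUND_HALF_EVEN

# Assumed decimal64-like format: 16-digit coefficient, round-half-even.
DECIMAL64 = Context(prec=16, Emin=-383, Emax=384, rounding=ROUND_HALF_EVEN)


def dfp_multiply(x, y, ctx=DECIMAL64):
    """Multiply two finite Decimals step by step.

    Special values (NaN, infinity) and signed-zero handling are left out.
    """
    sign_x, digits_x, exp_x = x.as_tuple()
    sign_y, digits_y, exp_y = y.as_tuple()
    sign = sign_x ^ sign_y                              # sign generation
    coefficient = (int("".join(map(str, digits_x)))
                   * int("".join(map(str, digits_y))))  # fixed-point product
    exponent = exp_x + exp_y                            # exponent generation
    exact = Decimal((sign, tuple(int(c) for c in str(coefficient)), exponent))
    return ctx.plus(exact)                              # round to the format's precision


print(dfp_multiply(Decimal("1.10"), Decimal("2.5")))
```

The result keeps the trailing zero of the exact product, which is one of the ways decimal formats differ from binary ones in financial use.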
Pub Date : 2007-10-01 DOI: 10.1109/ICCD.2007.4601950
Julian J. H. Pontes, R. Soares, Ewerson Carvalho, F. Moraes, Ney Laert Vilar Calazans
Building fully synchronous VLSI circuits is becoming less viable as circuit geometries evolve. However, before the adoption of purely asynchronous strategies in VLSI design, globally asynchronous, locally synchronous (GALS) design approaches should take over. The design of circuits using complex field-programmable components like state-of-the-art FPGAs follows this same trend. In GALS design, a critical step is the definition of asynchronous interfaces between synchronous regions. This paper proposes SCAFFI, a new asynchronous interface to interconnect modules inside FPGAs. The interface is based on clock stretching techniques to avoid metastability. Unlike other interfaces, it can use both logic levels for stretching and does not require the use of arbiters. Also, compactness of the implementation is enhanced by the use of dedicated FPGA hard macros. A GALS implementation of an RSA cryptography core demonstrates the use of SCAFFI.
Title: SCAFFI: An intrachip FPGA asynchronous interface based on hard macros
Venue: 2007 25th International Conference on Computer Design, pp. 541-546
Pub Date : 2007-10-01 DOI: 10.1109/ICCD.2007.4601939
Wanping Zhang, Ling Zhang, Rui Shi, He Peng, Zhi Zhu, L. Chua-Eoan, R. Murgai, Toshiyuki Shibuya, N. Ito, Chung-Kuan Cheng
This paper proposes an efficient analysis flow and an algorithm to identify the worst-case noise for power networks with multiple clock domains. First, we apply the Laplace transform to the input current sources to derive the analytical formula. Then, we calculate the circuit frequency response with logarithmic-scale frequency components. The frequency domain response is approximated by a rational function using vector fitting modeling. The rational function is used to derive the natural frequency of the power ground networks, and can be converted back into the time domain easily. Based on the analysis results, we then present the worst-case clock gating pattern algorithm to analyze the power networks with multiple clock domains. The most expensive part of the proposed algorithm is the matrix solving: O(F(N) · log f · D). Function F is the complexity of iteratively solving a complex matrix of dimension N. We assume that there are D clock domains and that the frequency spans from 0 to f Hz. Experimental results show that our method is up to 60X faster than HSPICE, and can analyze large circuits that are too costly for HSPICE.
Title: Fast power network analysis with multiple clock domains
Venue: 2007 25th International Conference on Computer Design, pp. 456-463
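The reason a rational approximation converts back to the time domain easily can be seen from its pole-residue form: each term r / (s - p) has the closed-form inverse Laplace transform r · exp(p · t). The sketch below evaluates such a model in both domains. It does not perform vector fitting or any part of the authors' flow, and the poles and residues are made-up values chosen for illustration.

```python
import cmath


def frequency_response(poles, residues, omega):
    """Evaluate H(s) = sum(r / (s - p)) at s = j * omega."""
    s = 1j * omega
    return sum(r / (s - p) for p, r in zip(poles, residues))


def impulse_response(poles, residues, t):
    """Time-domain counterpart h(t) = sum(r * exp(p * t)) for t >= 0.

    Complex poles are assumed to come in conjugate pairs, so the sum is real.
    """
    return sum(r * cmath.exp(p * t) for p, r in zip(poles, residues)).real


# Hypothetical conjugate pole pair: h(t) = exp(-t) * cos(10 t),
# i.e. a damped oscillation at 10 rad/s.
poles = [-1 + 10j, -1 - 10j]
residues = [0.5, 0.5]
print(impulse_response(poles, residues, 0.0))
print(abs(frequency_response(poles, residues, 10.0)))
```

The imaginary part of each pole gives an oscillation frequency of the fitted network, which is presumably what the abstract means by deriving the natural frequency from the rational function.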