Pub Date: 2002-09-16 DOI: 10.1109/ICCD.2002.1106770
M. Inoue, Chikateru Jinno, H. Fujiwara
We introduce a class of sequential circuits with internally switched balanced structure, which allows test generation with combinational test generation complexity. The proposed class includes all other known classes with this property. The paper also considers faults in hold registers and switches, which are treated as macros; no related work considers faults in such macros. Experimental results show the effectiveness of combinational test generation for circuits with internally switched balanced structure.
{"title":"An extended class of sequential circuits with combinational test generation complexity","authors":"M. Inoue, Chikateru Jinno, H. Fujiwara","doi":"10.1109/ICCD.2002.1106770","DOIUrl":"https://doi.org/10.1109/ICCD.2002.1106770","url":null,"abstract":"We introduce a class of sequential circuits with internally switched balanced structure which allows test generation with combinational test generation complexity. The proposed class includes any other known classes with this feature. This paper also considers faults in hold registers and switches regarded as macros, while any related work does not consider faults in such macros. Experimental results show the effectiveness of using combinational test generation for the circuits with internally switched balanced structure.","PeriodicalId":164768,"journal":{"name":"Proceedings. IEEE International Conference on Computer Design: VLSI in Computers and Processors","volume":"40 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124192587","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2002-09-16 DOI: 10.1109/ICCD.2002.1106788
S. Augsburger, B. Nikolić
Multiple supply voltages, multiple transistor thresholds, and transistor sizing can be used to reduce the power dissipation of digital blocks. This paper presents a framework for evaluating the effectiveness of each of these approaches independently and in conjunction with each other. Results show that the advantages of multiple supplies, transistor sizing, and multiple thresholds can be compounded to maximize power reduction. The order in which these techniques are applied determines the final savings in active and leakage power.
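As background for why a lower supply voltage saves active power, the sketch below uses the standard first-order switching-power relation P = a * C * Vdd^2 * f. It is not the evaluation framework from the paper. The function name and all numeric values (activity factor, capacitance, voltages, frequency) are hypothetical and chosen only for illustration.

```python
def dynamic_power(activity: float, cap_farads: float,
                  vdd_volts: float, freq_hz: float) -> float:
    """First-order switching power in watts: activity * C * Vdd^2 * f."""
    return activity * cap_farads * vdd_volts ** 2 * freq_hz


# Hypothetical block moved from a 1.8 V supply to a 1.2 V supply.
# These numbers are illustrative assumptions, not results from the paper.
high = dynamic_power(activity=0.1, cap_farads=1e-9, vdd_volts=1.8, freq_hz=500e6)
low = dynamic_power(activity=0.1, cap_farads=1e-9, vdd_volts=1.2, freq_hz=500e6)

# Fractional saving depends only on the voltage ratio squared.
saving = 1.0 - low / high
print(f"active power saving: {saving:.1%}")
```

The quadratic dependence on Vdd is why supply scaling dominates active-power savings in this first-order model. Leakage power, which the abstract also mentions, is not modelled here.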
{"title":"Combining dual-supply, dual-threshold and transistor sizing for power reduction","authors":"S. Augsburger, B. Nikolić","doi":"10.1109/ICCD.2002.1106788","DOIUrl":"https://doi.org/10.1109/ICCD.2002.1106788","url":null,"abstract":"Multiple supply voltages, multiple transistor thresholds and transistor sizing could be used to reduce the power dissipation of digital blocks. This paper presents a framework for evaluating the effectiveness of each of these approaches independently and in conjunction with each other. Results show the advantages of multiple supply, transistor sizing, and multiple threshold can be compounded to maximize power reduction. The order of application of these techniques determines the final savings in active and leakage power.","PeriodicalId":164768,"journal":{"name":"Proceedings. IEEE International Conference on Computer Design: VLSI in Computers and Processors","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125822756","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2002-09-16 DOI: 10.1109/ICCD.2002.1106757
O. Ergin, K. Ghose, Gürhan Küçük, D. Ponomarev
Datapath components in modern high-performance superscalar processors employ a significant amount of associative addressing logic based on comparators that dissipate energy on a mismatch. These comparators are used to detect a full match, but because mismatches are much more common than full matches in some components of the CPU, considerable energy inefficiencies occur within the associative logic. We propose two new comparator circuits that predominantly dissipate energy on a match, resulting in significant savings in comparator power dissipation. The proposed designs are evaluated using SPICE simulations of actual VLSI layouts of the comparators in a 0.18 micron, 6-metal-layer process, together with microarchitectural-level statistics.
{"title":"A circuit-level implementation of fast, energy-efficient CMOS comparators for high-performance microprocessors","authors":"O. Ergin, K. Ghose, Gürhan Küçük, D. Ponomarev","doi":"10.1109/ICCD.2002.1106757","DOIUrl":"https://doi.org/10.1109/ICCD.2002.1106757","url":null,"abstract":"Datapath components in modem high performance superscalar processors employ a significant amount of associative addressing logic based on the use of comparators that dissipate energy on a mismatch. These comparators are used to detect a full match, but as mismatches are much more common than full matches in some components of the CPU, considerable energy-inefficiencies occur within the associative logic. We propose the design of two new comparator circuits that predominantly dissipate energy on a match, thus resulting in very significant savings in comparator power dissipation. The proposed designs are evaluated using SPICE simulations of actual VLSI layouts of the comparators in 0.18 micron 6-metal layer process and micro-architectural level statistics.","PeriodicalId":164768,"journal":{"name":"Proceedings. IEEE International Conference on Computer Design: VLSI in Computers and Processors","volume":"60 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128519338","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2002-09-16 DOI: 10.1109/ICCD.2002.1106784
Brucek Khailany, W. Dally, Andrew Chang, U. Kapasi, Jinyung Namkoong, Brian Towles
The Imagine stream processor is a 21-million-transistor chip implemented by a collaboration between Stanford University and Texas Instruments in a 1.5 V, 0.15 μm process with five layers of aluminum metal. The VLSI design, clocking, and verification methodologies for the Imagine processor are presented. These methodologies enabled a small team of graduate students with limited resources to design a high-performance media processor in a modern ASIC flow.
{"title":"VLSI design and verification of the Imagine processor","authors":"Brucek Khailany, W. Dally, Andrew Chang, U. Kapasi, Jinyung Namkoong, Brian Towles","doi":"10.1109/ICCD.2002.1106784","DOIUrl":"https://doi.org/10.1109/ICCD.2002.1106784","url":null,"abstract":"The Imagine stream processor is a 21 million transistor chip implemented by a collaboration between Stanford University and Texas Instruments in a 1.5 V 0.15 μm process with five layers of aluminum metal. The VLSI design, clocking, and verification methodologies for the Imagine processor are presented. These methodologies enabled a small team of graduate students with limited resources to design a high-performance media processor in a modern ASIC flow.","PeriodicalId":164768,"journal":{"name":"Proceedings. IEEE International Conference on Computer Design: VLSI in Computers and Processors","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121629007","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2002-09-16 DOI: 10.1109/ICCD.2002.1106739
R. Kakerow
The rapid development of multimedia applications and the Internet creates demand for mobile access to these services. New wireless standards support high data rates and additional services, but they require complex realizations in both the front end and the baseband of a mobile system. The achievable performance of such a system is often limited by the power consumption of the implementation, because long stand-by and talk times remain key parameters of a mobile terminal. In addition, the thermal problem caused by insufficient heat removal from highly integrated high-performance circuits in narrow-spaced terminals calls for power optimization. This paper discusses the problem of power consumption in system-on-chip (SoC) design for mobile applications and presents methodologies for power-optimized design.
{"title":"Low power design methodologies for mobile communication","authors":"R. Kakerow","doi":"10.1109/ICCD.2002.1106739","DOIUrl":"https://doi.org/10.1109/ICCD.2002.1106739","url":null,"abstract":"The rapid development of multimedia applications and the Internet leads to the demand of mobility for these services. New wireless standards are supporting high data rates and additional services, but they require complex realizations in both frontend and baseband of a mobile system. The obtainable performance of such a system is often limited by the power consumption of the implementation, as long stand-by and talk times are still key parameters of a mobile terminal. Also the thermal problem, given by insufficient heat removal with highly integrated high-performance circuits in narrow-spaced terminals, calls for optimizations concerning power consumption. This paper discusses the problem of power consumption in system on chip (SoC) design for mobile applications and presents methodologies for power optimized design.","PeriodicalId":164768,"journal":{"name":"Proceedings. IEEE International Conference on Computer Design: VLSI in Computers and Processors","volume":"C-23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126475765","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2002-09-16 DOI: 10.1109/ICCD.2002.1106782
A. Steininger, Johann Vilanek
Fault-tolerant distributed real-time systems face many new challenges. Although many techniques provide effective masking of node failures at the architectural level, several trends are aggravating the reliability demands at the node level. Starting with a brief presentation of the fault tolerance properties of the time-triggered architecture (TTA), we discuss the corresponding support provided by the time-triggered protocol controller (TTPC-C). We propose a strategy for improving these properties with respect to the anticipated new fault scenarios. Applying BIST during node startup and before node reintegration turns out to improve system fault tolerance. Additionally, a combined strategy of online BIST and error correction can efficiently protect memory. We illustrate the implementation of the proposed mechanisms. Our implementation experience on an FPGA platform shows that the overheads involved are moderate.
{"title":"Using offline and online BIST to improve system dependability - the TTPC-C example","authors":"A. Steininger, Johann Vilanek","doi":"10.1109/ICCD.2002.1106782","DOIUrl":"https://doi.org/10.1109/ICCD.2002.1106782","url":null,"abstract":"Fault-tolerant distributed real-time systems are facing many new challenges. Although many techniques provide effective masking of node failures on the architectural level, several trends are aggravating the reliability demands on the node level. Starting with a brief presentation of the fault tolerance properties of the time-triggered architecture TTA the corresponding support by the time-triggered protocol controller (TTPC-C) is discussed. We propose a strategy for improving these properties with respect to the anticipated new fault scenarios. It turns out that the application of BIST during node startup and before node reintegration improves system fault tolerance. Additionally a combined strategy of online BIST and error correction can efficiently protect memory. We illustrate the implementation of the proposed mechanisms. Our implementation experiences on an FPGA platform show that the involved overheads are moderate.","PeriodicalId":164768,"journal":{"name":"Proceedings. IEEE International Conference on Computer Design: VLSI in Computers and Processors","volume":"22 4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130489595","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2002-09-16 DOI: 10.1109/ICCD.2002.1106807
Guoan Zhong, Cheng-Kok Koh
In this paper we propose a new exact closed-form mutual inductance equation for on-chip interconnects. We express the mutual inductance between two parallel rectangular conductors as a weighted sum of self-inductances. We place no restrictions on the alignment of the two parallel rectangular conductors; moreover, they may be coplanar or reside on different layers. Most importantly, experimental results show that our formula is numerically more stable than that derived by Hoer and Love (1965) for long parallel on-chip interconnects.
{"title":"Exact closed form formula for partial mutual inductances of on-chip interconnects","authors":"Guoan Zhong, Cheng-Kok Koh","doi":"10.1109/ICCD.2002.1106807","DOIUrl":"https://doi.org/10.1109/ICCD.2002.1106807","url":null,"abstract":"In this paper we propose a new exact closed form mutual inductance equation for on-chip interconnects. We express the mutual inductance between two parallel rectangular conductors as a weighted sum of self-inductances. We do not place any restrictions on the alignment of the two parallel rectangular conductors. Moreover they could be co-planar or reside on different layers. Most important, experimental results show that our formula is numerically more stable than that derived by Hoer and Love (1965) for long parallel onchip interconnects.","PeriodicalId":164768,"journal":{"name":"Proceedings. IEEE International Conference on Computer Design: VLSI in Computers and Processors","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134600416","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2002-09-16 DOI: 10.1109/ICCD.2002.1106742
C.-S. Seo, A. Chatterjee
A wiring model for systems-on-chip that use flexible free-space optical interconnect is introduced. In this paper, we develop a CAD tool for the physical placement of modules in systems-on-chip manufactured using this optical interconnect technology. The tool also determines which interconnects are routed electrically and which are routed optically, without exceeding the routing capacity of the optical interconnect while minimizing electrical wire length. Using optical interconnect reduces the largest delay of the electrical wires by about 50% (a performance improvement by a factor of 2).
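The link between the two figures in the last sentence (a 50% delay reduction and a factor-of-2 improvement) is simple arithmetic, sketched below. It assumes that performance is taken as inversely proportional to the largest wire delay. The function name is hypothetical.

```python
def speedup_from_delay_reduction(reduction: float) -> float:
    """Speedup factor when the critical delay shrinks by `reduction` (0 <= reduction < 1)."""
    if not 0.0 <= reduction < 1.0:
        raise ValueError("reduction must be in [0, 1)")
    # New delay is (1 - reduction) times the old delay, so speedup is its reciprocal.
    return 1.0 / (1.0 - reduction)


print(speedup_from_delay_reduction(0.5))  # 2.0
```

Under this assumption, halving the largest delay doubles the achievable rate, which matches the factor of 2 quoted in the abstract.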
{"title":"A CAD tool for system-on-chip placement and routing with free-space optical interconnect","authors":"C.-S. Seo, A. Chatterjee","doi":"10.1109/ICCD.2002.1106742","DOIUrl":"https://doi.org/10.1109/ICCD.2002.1106742","url":null,"abstract":"A wiring model for system-on-chips utilizing flexible free space optical interconnects is introduced In this paper, we develop a CAD tool for physical placement of modules in system-on-chips manufactured using the optical interconnect technology. The tool also determines which of the interconnect are routed electrically and which are routed optically without exceeding the routing capacity of the optical interconnect while minimizing electrical wire length. About 50% reduction in largest delay of electrical wires is obtained through the use of optical interconnect (Performance improvement by a factor of 2).","PeriodicalId":164768,"journal":{"name":"Proceedings. IEEE International Conference on Computer Design: VLSI in Computers and Processors","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125635426","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2002-09-16 DOI: 10.1109/ICCD.2002.1106747
P. Roop, A. Sowmya, S. Ramesh
Automatic IP (Intellectual Property) matching is key to the reuse of IP cores. This paper presents an IP matching algorithm that checks whether a given programmable IP block can be adapted to match a given specification. When such adaptation is possible, the algorithm also generates a device driver to adapt the IP block. Although algorithms based on simulation, refinement, and bisimulation exist, they cannot be used to check the adaptability of an IP block, which is the essence of reuse. The IP matching algorithm is based on a formal verification technique called k-time forced simulation, proposed in this paper. K-time forced simulation can be used to identify whether a given IP block (a device D) can be adapted to match a specification (a function F), given that D has a clock that is k times faster than that of F. We demonstrate the applicability of the algorithm by reusing several IP blocks.
{"title":"K-time forced simulation: a formal verification technique for IP reuse","authors":"P. Roop, A. Sowmya, S. Ramesh","doi":"10.1109/ICCD.2002.1106747","DOIUrl":"https://doi.org/10.1109/ICCD.2002.1106747","url":null,"abstract":"Automatic IP (Intellectual Property) matching is a key to reuse of IP cores. This paper presents an IP matching algorithm that can check whether a given programmable IP block can be adapted to match a given specification. When such adaptation is possible, the algorithm also generates a device driver to adapt the IP block. Though simulation, refinement and bisimulation based algorithms exist, they cannot be used to check the adaptability of an IP block, which is the essence of reuse. The IP matching algorithm is based on a formal verification technique called k-time forced simulation proposed in this paper k-time forced simulation may be used for identifying whether a given IP block (a device D) can be adapted to match a specification (a function F), given that D has a clock that is k-times faster than F. We demonstrate the applicability of the algorithm by reusing several IP blocks.","PeriodicalId":164768,"journal":{"name":"Proceedings. IEEE International Conference on Computer Design: VLSI in Computers and Processors","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116970054","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2002-09-16 DOI: 10.1109/ICCD.2002.1106794
T. Lyon, E. Delano, Cameron McNairy, Dean Mulla
The second member of the Itanium Processor Family, the Itanium 2 processor, was designed to meet the challenge of high performance in today's technical and commercial server applications. The Itanium 2 processor's data cache microarchitecture provides abundant memory resources, low memory latencies, and cache organizations tuned for a variety of applications. The data cache design provides four memory ports to support the many performance optimizations available in the EPIC (Explicitly Parallel Instruction Computing) design concepts, such as predication, speculation, and explicit prefetching. The three-level cache hierarchy provides a 16KB, 1-cycle first-level cache to support the moderate bandwidths needed by integer applications. The second-level cache is 256KB, with a relatively low latency and FP-balanced bandwidth to support technical applications. The on-chip third-level cache is 3MB and is designed to provide the low latency and the large size needed by commercial and technical applications.
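The cache figures quoted above can be collected into a small table-like structure, sketched below. Only the sizes and the 1-cycle first-level latency come from the abstract. The class name, the level labels (`L1D`, `L2`, `L3`), and the helper function are hypothetical, and latencies the abstract does not state are left as `None` rather than guessed.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class CacheLevel:
    name: str                      # label chosen for this sketch
    size_kb: int                   # capacity as stated in the abstract
    latency_cycles: Optional[int]  # None where the abstract gives no figure


# Three-level data cache hierarchy as described in the abstract.
HIERARCHY: List[CacheLevel] = [
    CacheLevel("L1D", 16, 1),
    CacheLevel("L2", 256, None),
    CacheLevel("L3", 3 * 1024, None),
]


def size_ratios(levels: List[CacheLevel]) -> List[float]:
    """Capacity growth factor from each level to the next."""
    return [nxt.size_kb / cur.size_kb for cur, nxt in zip(levels, levels[1:])]


print(size_ratios(HIERARCHY))  # [16.0, 12.0]
```

Each level is more than an order of magnitude larger than the one before it, which is consistent with the abstract's description of a small, fast first level backed by progressively larger caches.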
{"title":"Data Cache design considerations for the Itanium® 2 Processor","authors":"T. Lyon, E. Delano, Cameron McNairy, Dean Mulla","doi":"10.1109/ICCD.2002.1106794","DOIUrl":"https://doi.org/10.1109/ICCD.2002.1106794","url":null,"abstract":"The second member in the Itanium Processor Family, the Itanium 2 processor, was designed to meet the challenge for high performance in today's technical and commercial server applications. The Itanium 2 processor's data cache microarchitecture provides abundant memory resources, low memory latencies and cache organizations tuned to for a variety of applications. The data cache design provides four memory ports to support the many performance optimizations available in the EPIC (Explicitly Parallel Instruction Computing) design concepts, such as predication, speculation and explicit prefetching. The three-level cache hierarchy provides a 16KB 1-cycle first level cache to support the moderate bandwidths needed by integer applications. The second level cache is 256KB with a relatively low latency and FP balanced bandwidth to support technical applications. The onchip third level cache is 3MB and is designed to provide the low latency and the large size needed by commercial and technical applications.","PeriodicalId":164768,"journal":{"name":"Proceedings. IEEE International Conference on Computer Design: VLSI in Computers and Processors","volume":"221 ","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120940848","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}