Pub Date: 2009-09-01 DOI: 10.1109/SOCCON.2009.5398097
Chorng-Sii Hwang, Chun-Yung Cho, Chung-Chun Chen, H. Tsao
This paper describes a dual-band clock and data recovery circuit using a new half-rate linear phase detector. With the proposed sampling scheme, the phase detector produces UP/DN signals of equal pulse width and thus eliminates the need for current scaling in the charge pump. The test chip, fabricated in a 0.18 μm 1P6M CMOS process, operates at 2.7 and 1.62 Gbps, which satisfies the DisplayPort standard. It can recover the NRZ data of a (2⁷−1) PRBS with a bit error rate below 10⁻¹². The chip core occupies an area of 0.36 mm². The power consumption is 50 mW at 2.7 Gbps with a 1.8 V supply voltage.
Title: Dual-band CDR using a half-rate linear phase detector. Venue: 2009 IEEE International SOC Conference (SOCC).
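The PRBS test pattern of length 2^7 − 1 = 127 bits mentioned in the abstract above can be illustrated with a 7-bit linear feedback shift register. This is a minimal sketch only: the abstract does not state which generator polynomial was used, so the common choice x^7 + x^6 + 1 is an assumption, and the function name `prbs7` is hypothetical.

```python
def prbs7(nbits, seed=0x7F):
    """Generate nbits of a length-127 PRBS from a 7-bit Fibonacci LFSR.

    Assumed polynomial: x^7 + x^6 + 1 (not stated in the abstract).
    """
    state = seed & 0x7F
    if state == 0:
        raise ValueError("seed must be non-zero; the all-zero state never changes")
    bits = []
    for _ in range(nbits):
        # Feedback is the XOR of the two most significant register stages.
        feedback = ((state >> 6) ^ (state >> 5)) & 1
        bits.append(feedback)
        # Shift left by one stage and insert the feedback bit.
        state = ((state << 1) | feedback) & 0x7F
    return bits


two_periods = prbs7(2 * 127)
period = two_periods[:127]
print("repeats after 127 bits:", period == two_periods[127:])
print("ones per period:", sum(period))
```

A maximal-length sequence of this kind is nearly balanced between ones and zeros, which is one reason such patterns are used to exercise clock and data recovery circuits.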
Pub Date: 2009-09-01 DOI: 10.1109/SOCCON.2009.5398035
F. Dielacher, Christian Vogel, P. Singerl, S. Mendel, A. Wiesbauer
We exemplify the possibilities of a holistic design approach for systems on chip. After recapitulating basic observations for next-generation systems, we outline the advantages and challenges of a holistic design approach. The discussion is supported by real-world examples.
Title: A holistic design approach for systems on chip. Venue: 2009 IEEE International SOC Conference (SOCC).
Pub Date: 2009-09-01 DOI: 10.1109/SOCCON.2009.5398046
A. Sanusi, M. Bayoumi
Shrinking device sizes increase the effects of noise sources and process variations, leading to more faults and lower chip yields in deep-submicron systems. We propose a new fault-tolerant scheme called smart-flooding to combat both transient and permanent faults in networks-on-chip (NoCs). Smart-flooding floods messages where permanent faults have occurred and uses end-to-end retransmission for transient errors. Our experiments show that the proposed scheme achieves high performance while maintaining the level of fault tolerance of the regular flooding algorithm.
Title: Smart-flooding: A novel scheme for fault-tolerant NoCs. Venue: 2009 IEEE International SOC Conference (SOCC).
Pub Date: 2009-09-01 DOI: 10.1109/SOCCON.2009.5398088
V. Chouliaras, K. Manolopoulos, D. Reisis
The efficiency of Fused Multiply-Add (FMA) units plays a key role in processor performance for a variety of applications. This work presents a design that keeps the latency and hardware-utilization advantages of the FMA while improving the accuracy of the result for both normalized and denormalized numbers. The FMA unit has configurable latency and is integrated in a VLIW processor. The TSMC 0.13 μm VLSI implementation achieved an operating frequency of 232.6 MHz and a final post-routed area of 121900.478 μm².
Title: A configurable length, Fused Multiply-Add floating point unit for a VLIW processor. Venue: 2009 IEEE International SOC Conference (SOCC).
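The accuracy benefit of a fused multiply-add, which the abstract above refers to, comes from rounding a·b + c once instead of twice. The sketch below illustrates that general property in double precision; it is not the authors' hardware design. Exact rational arithmetic from the standard library stands in for the fused operation, and the operand values are chosen purely for illustration.

```python
from fractions import Fraction

# Operands chosen so that the exact product a*b = 1 - 2**-60 is not
# representable in double precision and rounds to exactly 1.0.
a = 1.0 + 2.0**-30
b = 1.0 - 2.0**-30
c = -1.0

# Unfused: the product is rounded first, then the sum is rounded again.
unfused = (a * b) + c

# Fused (emulated): compute a*b + c exactly, then round once to double.
fused = float(Fraction(a) * Fraction(b) + Fraction(c))

print("unfused:", unfused)
print("fused:  ", fused)
```

The unfused result loses the small residual entirely, while the single-rounding result retains it.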
Pub Date: 2009-09-01 DOI: 10.1109/SOCCON.2009.5398033
J. Marku, J. Poikonen, A. Paasio
The temperature behaviour of combination-selection-based mismatch calibration is discussed. The functionality of the calibration structure has already been presented. Clear benefits in implementation area and accuracy can be reached when using mismatch calibration based on combination selection of fine-tuning transistors. However, with high accuracy requirements, the effects of temperature must be taken into account. Temperature compensation circuitry for combination-selection-based mismatch calibration is developed, designed and simulated in digital 65 nm CMOS technology. The new temperature-compensated and mismatch-calibrated current source achieves 99% accuracy with 4σ confidence over a temperature range of 40 °C. This range can be extended further by recalibrating the current source at intervals of 20 °C.
Title: Temperature behavior of combination selection based mismatch calibration with 65 nm CMOS technology. Venue: 2009 IEEE International SOC Conference (SOCC).
Pub Date: 2009-09-01 DOI: 10.1109/SOCCON.2009.5398048
Alexander Fell, P. Biswas, Jugantor Chetia, S. Nandy, R. Narayan
RECONNECT is a Network-on-Chip using a honeycomb topology. In this paper we focus on the properties of general rules, applicable to a variety of routing algorithms for the NoC, which take into account the links that the honeycomb topology lacks compared with a mesh. We also extend the original proposal [5] and show a method to insert data into and extract data from the network. Access Routers at the boundary of the execution fabric establish connections to multiple periphery modules and create a torus to decrease the node distances. Our approach is scalable and ensures homogeneity among the compute elements in the NoC. We synthesized and evaluated the proposed enhancement in terms of power dissipation and area. Our results indicate that the impact of the necessary alterations to the fabric is negligible and affects the data transfer between the fabric and the periphery only marginally.
Title: Generic routing rules and a scalable access enhancement for the Network-on-Chip RECONNECT. Venue: 2009 IEEE International SOC Conference (SOCC).
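The "missing links" of a honeycomb relative to a mesh, as mentioned in the abstract above, can be visualized with a brick-wall embedding in which every node keeps its horizontal links but has only one vertical link. This is a sketch under stated assumptions: the coordinate scheme, the parity rule and the function names are hypothetical and are not taken from the RECONNECT paper, and the torus wrap-around created by the Access Routers is not modeled.

```python
def mesh_neighbors(x, y, width, height):
    """Neighbors of (x, y) in a plain 2D mesh (up to 4 links)."""
    candidates = [(x - 1, y), (x + 1, y), (x, y - 1), (x, y + 1)]
    return [(nx, ny) for nx, ny in candidates
            if 0 <= nx < width and 0 <= ny < height]


def honeycomb_neighbors(x, y, width, height):
    """Neighbors of (x, y) in a brick-wall honeycomb (up to 3 links).

    Assumed rule: nodes with even x + y link upward, nodes with odd
    x + y link downward, so every vertical link has two matching ends.
    """
    candidates = [(x - 1, y), (x + 1, y)]
    candidates.append((x, y + 1) if (x + y) % 2 == 0 else (x, y - 1))
    return [(nx, ny) for nx, ny in candidates
            if 0 <= nx < width and 0 <= ny < height]


node = (2, 2)
print("mesh:     ", sorted(mesh_neighbors(*node, 5, 5)))
print("honeycomb:", sorted(honeycomb_neighbors(*node, 5, 5)))
```

A routing algorithm written for a mesh would have to detour around the absent vertical link at each node, which is the kind of constraint generic routing rules for a honeycomb need to capture.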
Pub Date: 2009-09-01 DOI: 10.1109/SOCCON.2009.5398093
Mahtab Niknahad, M. Hübner, J. Becker
Online routing is the method of connecting hardware resources on reconfigurable hardware at run time. In this paper we show how to use the bipartite graph representation of nano architectures to improve their performance during online routing. We define the performance optimization problem in online routing and then, by defining a cost function based on the graph representation, apply a semi-simulated annealing to solve this optimization problem. The cost-function computation algorithm runs in linear time and is readily applicable at run time.
Title: Method for improving performance in online routing of reconfigurable nano architectures. Venue: 2009 IEEE International SOC Conference (SOCC).
Pub Date: 2009-09-01 DOI: 10.1109/SOCCON.2009.5398013
N. Li, N. V. D. Meijs
This paper presents a novel parallel pipeline FFT processor tailored for the Multiband Orthogonal Frequency Division Multiplexing (MB-OFDM) Ultra Wideband (UWB) system defined by ECMA International. The proposed Radix-2² Parallel Pipeline processor, which employs a Radix-2² algorithm with two parallel data paths and a single-path delay feedback (SDF) pipeline architecture, is a small-area, low-power solution for MB-OFDM UWB systems. Synthesis results are presented for both a Xilinx Virtex-4 FPGA and a 90 nm ASIC technology with a 1 V supply voltage. The results show that, owing to the revised algorithm and novel architecture, a clock frequency of 264 MHz is sufficient to meet the ECMA requirement. The design requires 39000 gates without the testing block, and the corresponding area is 181140 μm².
Title: A Radix-2² based parallel pipeline FFT processor for MB-OFDM UWB system. Venue: 2009 IEEE International SOC Conference (SOCC).
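The 264 MHz clock figure in the abstract above is consistent with splitting the MB-OFDM sample stream across two data paths. The short calculation below is a sketch, assuming the 528 Msample/s baseband rate and 128-point FFT of the ECMA-368 physical layer; neither number appears in the abstract, and the function name is hypothetical.

```python
# Assumptions (from the ECMA-368 PHY, not from the abstract):
SAMPLE_RATE_MSPS = 528.0   # baseband sample rate in Msample/s
FFT_SIZE = 128             # FFT points per OFDM symbol

# From the abstract: two parallel data paths.
PARALLEL_PATHS = 2


def required_clock_mhz(sample_rate_msps, parallel_paths):
    """Clock needed when each path consumes one sample per cycle."""
    return sample_rate_msps / parallel_paths


clock_mhz = required_clock_mhz(SAMPLE_RATE_MSPS, PARALLEL_PATHS)
# Time to stream one FFT frame through the paths, in nanoseconds.
frame_time_ns = FFT_SIZE / PARALLEL_PATHS / clock_mhz * 1e3

print("required clock (MHz):", clock_mhz)
print("time per 128-point frame (ns):", frame_time_ns)
```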
Pub Date: 2009-09-01 DOI: 10.1109/SOCCON.2009.5398060
C. Wang, V. Fusco
A 56–66 GHz MMIC quadrupler was developed for application in V-band wireless communications transceivers. The quadrupler MMIC has low return loss (better than −15 dB on both input and output), −15 dBm conversion loss for a 5 dBm input drive signal, and excellent output spectral purity, with all unwanted harmonic components suppressed by more than 20 dB (below −20 dBc). The quadrupler has two output ports isolated by a Wilkinson power divider. This arrangement allows it to provide receive and transmit local oscillator drives simultaneously for a V-band transceiver. With the solution provided here, one quadrupler covers the entire worldwide 57–66 GHz unlicensed V-band Gigabit/s radio bandwidth allocation without the need for individual sub-band components.
Title: High-purity 56–66GHz quadrupler for V-band radio homodyne and heterodyne transceiver applications. Venue: 2009 IEEE International SOC Conference (SOCC).
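Because a quadrupler outputs the fourth harmonic of its drive signal, the 56–66 GHz output band in the abstract above implies a particular input band, and the neighbouring harmonics are the ones that must be suppressed. The sketch below works through that arithmetic; the choice of harmonics 1 to 5 and of the band centre as the example drive frequency are illustrative assumptions.

```python
MULTIPLICATION_FACTOR = 4
OUTPUT_BAND_GHZ = (56.0, 66.0)   # from the abstract

# Input (drive) band that maps onto the stated output band.
input_band_ghz = tuple(f / MULTIPLICATION_FACTOR for f in OUTPUT_BAND_GHZ)

# Harmonics of a drive tone at the centre of the input band.
drive_ghz = sum(input_band_ghz) / 2
harmonics_ghz = {n: n * drive_ghz for n in range(1, 6)}
unwanted_ghz = {n: f for n, f in harmonics_ghz.items()
                if n != MULTIPLICATION_FACTOR}

print("input band (GHz):", input_band_ghz)
print("wanted output (GHz):", harmonics_ghz[MULTIPLICATION_FACTOR])
print("unwanted harmonics (GHz):", unwanted_ghz)
```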
Pub Date: 2009-09-01 DOI: 10.1109/SOCCON.2009.5398044
Yang Sun, Joseph R. Cavallaro, Tai Ly
This paper presents a scalable, low-power low-density parity-check (LDPC) decoder design for the next-generation wireless handset SoC. The methodology is based on high-level synthesis: the PICO (program-in chip-out) tool was used to produce efficient RTL directly from a sequential untimed C algorithm. We propose two parallel LDPC decoder architectures: (1) a per-layer decoding architecture with scalable parallelism, and (2) a multi-layer pipelined decoding architecture to achieve higher throughput. Based on the PICO technology, we have implemented a two-layer pipelined decoder in a TSMC 65 nm, 0.9 V, 8-metal-layer CMOS technology with a core area of 1.2 mm². The maximum achievable throughput is 415 Mbps at a 400 MHz clock frequency, and the estimated peak power consumption is 180 mW.
Title: Scalable and low power LDPC decoder design using high level algorithmic synthesis. Venue: 2009 IEEE International SOC Conference (SOCC).
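The throughput, clock and power figures quoted in the abstract above can be combined into two derived metrics that are often used to compare decoders. This is only a back-of-the-envelope sketch: the energy figure uses the estimated peak power, so it is an upper bound rather than a measured average, and the variable names are hypothetical.

```python
# Figures taken from the abstract.
THROUGHPUT_BPS = 415e6    # maximum achievable throughput
CLOCK_HZ = 400e6          # clock frequency
PEAK_POWER_W = 0.180      # estimated peak power consumption

# Decoded bits delivered per clock cycle.
bits_per_cycle = THROUGHPUT_BPS / CLOCK_HZ

# Energy per decoded bit, in nanojoules (upper bound, from peak power).
energy_per_bit_nj = PEAK_POWER_W / THROUGHPUT_BPS * 1e9

print("bits per cycle:", round(bits_per_cycle, 4))
print("energy per bit (nJ):", round(energy_per_bit_nj, 3))
```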