Pub Date : 2006-04-26 DOI: 10.1109/VDAT.2006.258123
P. Hallschmid, R. Saleh
Recent research on application-specific instruction set processors (ASIPs) has focused on automatically selecting a custom instruction set from a high-level description of the application. Existing methods perform instruction selection under the assumption that data hazards can be ignored because of functional unit forwarding. This paper addresses data hazards in the ASIP flow when functional-unit-to-functional-unit forwarding is too expensive. This is accomplished by devising a "hazard-aware" predictor for measuring the impact of custom instructions on performance. Results show that our predictor reduces prediction error from 50% to 15% compared with the existing simple predictor, at a fraction of the run time of rescheduling. When incorporated into an instruction enumeration and selection algorithm, our predictor reduces the total schedule length by as much as 8.4%.
Title : Hazard-Aware Performance Prediction for Automatic Instruction-Set Selection
Venue : 2006 International Symposium on VLSI Design, Automation and Test
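To illustrate why ignoring data hazards can understate schedule length, here is a toy model. It is not the predictor from the paper: the single-issue in-order pipeline, the `result_delay` parameter, and the register names are all assumptions made for illustration.

```python
def schedule_length(instrs, result_delay):
    """Cycles needed to issue every instruction in order, one per cycle.

    instrs       : list of (dest, [sources]) tuples (hypothetical register names)
    result_delay : cycles after issue before a dependent instruction may issue
                   (1 models full forwarding, larger values model no forwarding)
    """
    ready = {}   # register -> earliest cycle at which a consumer may issue
    cycle = 0    # next free issue slot
    for dest, srcs in instrs:
        # Stall until the issue slot is free and every source value is available.
        issue = max([cycle] + [ready.get(s, 0) for s in srcs])
        ready[dest] = issue + result_delay
        cycle = issue + 1
    return cycle


# A short dependency chain: r2 needs r1, and r4 needs r2 and r3.
program = [
    ("r1", ["a", "b"]),
    ("r2", ["r1", "c"]),
    ("r3", ["d", "e"]),
    ("r4", ["r2", "r3"]),
]

hazard_free = schedule_length(program, result_delay=1)   # 4 cycles
no_forwarding = schedule_length(program, result_delay=3)  # 8 cycles
print(hazard_free, no_forwarding)
```

In this toy case, an estimate that assumes forwarding predicts half the schedule length that the stalled pipeline actually needs, which is the kind of gap a hazard-aware estimate is meant to close.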
Pub Date : 2006-04-26 DOI: 10.1109/VDAT.2006.258158
Bo-Jiun Chen, Shao-Ku Kao, Shen-Iuan Liu
An all-digital 50% duty cycle corrector (DCC) is presented. The features of the proposed DCC include a wide operating frequency range, a wide input duty cycle range, and a short lock time to restore a 50% duty cycle. This digital DCC has been implemented in a 0.35 μm 2P4M CMOS process. The acceptable duty cycle and frequency ranges of the input clock are 25%-75% and 250 MHz-600 MHz, respectively. The measured peak-to-peak jitter is 17.3 ps at 600 MHz. In addition, the DCC saves power by turning off a half delay line. Its power consumption is 16 mW at 600 MHz.
Title : An All-Digital Duty Cycle Corrector
Venue : 2006 International Symposium on VLSI Design, Automation and Test
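The reported figures can be put in proportion with a little arithmetic. The sketch below uses only the numbers quoted in the abstract. The derived quantities (jitter as a fraction of the period, energy per clock cycle, shortest input high time) are illustrative calculations, not results reported by the authors.

```python
# Figures quoted in the abstract.
f_max_hz = 600e6          # highest input clock frequency
jitter_pp_ps = 17.3       # measured peak-to-peak jitter at 600 MHz
power_w = 16e-3           # power consumption at 600 MHz
min_duty = 0.25           # narrowest acceptable input duty cycle

# Derived quantities (illustrative only).
period_ps = 1e12 / f_max_hz                      # about 1666.7 ps
jitter_fraction = jitter_pp_ps / period_ps       # about 1.04% of the period
energy_per_cycle_pj = power_w / f_max_hz * 1e12  # about 26.7 pJ per cycle
min_high_time_ps = min_duty * period_ps          # about 416.7 ps

print(f"period            : {period_ps:.1f} ps")
print(f"jitter / period   : {jitter_fraction:.2%}")
print(f"energy per cycle  : {energy_per_cycle_pj:.1f} pJ")
print(f"shortest high time: {min_high_time_ps:.1f} ps")
```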
Pub Date : 2006-04-26 DOI: 10.1109/VDAT.2006.258128
Donald Y. C. Lie, J. Popp, P. Lee, A. Yang, Jason Rowlando, Feipeng Wang, Donald Kimball
This paper discusses and compares the design of monolithic RF broadband class E SiGe power amplifiers (PAs) centered at 900 MHz that are highly efficient and linear. High power-added efficiency (PAE, ~65%) can be achieved with PAs designed using either high-breakdown or high-fT SiGe transistors. The PAs designed with high-breakdown devices provide ~3% better efficiency at higher supply voltages, but with worse bias sensitivity, inferior broadband frequency response, and slightly lower gain than those designed with high-fT devices. However, the class E PAs designed using high-breakdown devices can be successfully linearized using an open-loop envelope tracking (ET) technique: their output spectra pass the stringent EDGE transmit mask with margin, achieving an overall system PAE of 44.4% that surpasses the ~30% PAE obtainable using commercial GaAs class AB PAs. These promising results indicate the feasibility of realizing true single-chip wireless transceivers with on-chip RF SiGe PAs for spectrally efficient non-constant-envelope modulation schemes.
Title : Monolithic Class E SiGe Power Amplifier Design with Wideband High-Efficiency and Linearity
Venue : 2006 International Symposium on VLSI Design, Automation and Test
Pub Date : 2006-04-26 DOI: 10.1109/VDAT.2006.258159
Ching-Che Chung, Pao-Lung Chen, Chen-Yi Lee
This paper presents an all-digital delay-locked loop (DLL) for DDR SDRAM controller applications. The presented all-digital, cell-based, DLL-based five-phase clock generator can generate the fixed timing delay (tSD) that a DDR SDRAM controller requires to capture the output data (DQ) correctly. The proposed architecture can lock to a harmonic of the input clock period and still produce a correct multi-phase clock output. This eases the design challenge of building a high-resolution delay line with minimum intrinsic delay. Simulation and chip measurement results show that the proposed DLL can generate the desired tSD delay with an error below 7.6%. The power consumption of the proposed DLL is 4.1 mW at DDR-200 and 9.0 mW at DDR-400.
Title : An All-Digital Delay-Locked Loop for DDR SDRAM Controller Applications
Venue : 2006 International Symposium on VLSI Design, Automation and Test
Pub Date : 2006-04-26 DOI: 10.1109/VDAT.2006.258149
Lu-Yen Ko, Shi-Yu Huang, Jia-Liang Chiou, H. Cheng
This paper addresses the modeling and testing of intra-cell bridging defects from the layout perspective. For defect modeling, we incorporate a butterfly structure to resolve the potential non-logical effects a bridging defect may cause. By doing so, a realistic Boolean fault model at the gate level can be generated for each defect under consideration. Furthermore, the test vectors can be generated by a formulation on top of existing ATPG tools. Experimental results indicate that a simple stuck-at test set achieves only 85% coverage of intra-cell bridging defects on the ISCAS85 circuits. The proposed systematic flow boosts this to 99%.
Title : Modeling and Testing of Intra-Cell Bridging Defects Using Butterfly Structure
Venue : 2006 International Symposium on VLSI Design, Automation and Test
Pub Date : 2006-04-26 DOI: 10.1109/VDAT.2006.258153
W. Lo, Yu-Liang Wu
Redundancy addition and removal, also known as rewiring, is a building block of a wide range of circuit optimization applications. In this paper, we propose a novel improvement on the FIRE redundancy identification technique and use it to augment the state-of-the-art rewiring scheme RAMFIRE. Our method increases the number of alternative wires identified by 10% and improves the runtime by nearly 20%. Optimization applications based on rewiring can take advantage of this speedup and the enhanced rewiring power.
Title : Improving Single-Pass Redundancy Addition and Removal with Inconsistent Assignments
Venue : 2006 International Symposium on VLSI Design, Automation and Test
Pub Date : 2006-04-26 DOI: 10.1109/VDAT.2006.258113
C. Chien, A. Wang, Weber Chien
Conventional displays often use a striped RGB pattern, but this has side effects such as sawtooth patterns, a lower yield rate, and lower resolution per unit area. Sitronix Technology Co. Ltd. is exploring an alternative pattern, called SPRD (shared pixel rendering display), that could offer new benefits. It combines a unique color filter with an algorithm to achieve the best display quality.
Title : New LCD Display Technology for High Performance with Low Cost-Shared Pixel Rendering Display
Venue : 2006 International Symposium on VLSI Design, Automation and Test
Pub Date : 2006-04-26 DOI: 10.1109/VDAT.2006.258175
A. Katyal, N. Bansal
Power-on reset (POR) circuits are available as discrete devices as well as on-chip solutions and are indispensable for initializing critical nodes of analog and digital designs during power-on. In this paper, we present a power-on reset circuit specifically designed for on-chip applications. Such a POR circuit must meet certain design requirements to be integrated on-chip, including area efficiency, power efficiency, insensitivity to supply rise time, and insensitivity to ambient temperature. The circuit is implemented within a small area (60 μm × 35 μm) using the 2.5 V tolerant MOSFETs of a 0.28 μm CMOS technology. It has a maximum quiescent current consumption of 40 μA and works over an infinite range of supply rise times and an ambient temperature range of -40 °C to 150 °C.
Title : A Self-Biased Current Source Based Power-On Reset Circuit for On-Chip Applications
Venue : 2006 International Symposium on VLSI Design, Automation and Test
Pub Date : 2006-04-26 DOI: 10.1109/VDAT.2006.258115
Zhih-Siou Cheng, J. Bor
A variable gain amplifier (VGA) with a 64 dB gain range and a DC offset calibration loop is proposed in this work. The VGA adopts a degeneration-type amplifier to vary the voltage gain and uses a super-source-follower input stage to enhance linearity. A digital DC offset calibration loop is also designed to solve the DC offset problem. An experimental chip is fabricated in a 0.18 μm process. With a 2 dB step, the gain error is less than 0.8 dB and the output DC offset is less than 100 mV at the maximum gain setting. The total power consumption is 11 mW.
Title : A CMOS Variable Gain Amplifier with DC Offset Calibration Loop for Wireless Communications
Venue : 2006 International Symposium on VLSI Design, Automation and Test
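The decibel figures above are easier to judge as linear ratios. The sketch below converts them using the standard voltage-gain relation, ratio = 10^(dB/20). The count of gain settings assumes both ends of the 64 dB range are selectable, which the abstract does not state.

```python
def db_to_voltage_ratio(db):
    """Convert a voltage gain in decibels to a linear voltage ratio."""
    return 10 ** (db / 20)


gain_range_db = 64   # from the abstract
step_db = 2          # from the abstract
max_error_db = 0.8   # from the abstract

# Assumption: both endpoints of the range are selectable gain settings.
num_settings = gain_range_db // step_db + 1             # 33

range_ratio = db_to_voltage_ratio(gain_range_db)        # about 1585x
step_ratio = db_to_voltage_ratio(step_db)               # about 1.26x per step
worst_error = db_to_voltage_ratio(max_error_db) - 1     # about 9.6% in voltage

print(num_settings, round(range_ratio), round(step_ratio, 3), f"{worst_error:.1%}")
```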
Pub Date : 2006-04-26 DOI: 10.1109/VDAT.2006.258134
T. Ishikawa, K. Shimizu, T. Ikenaga, S. Goto
We have designed and implemented an LDPC decoder with a memory-reduction method to achieve high throughput and a practical hardware size for long code lengths. The decoder decodes (3,6)-regular, 11520-bit LDPC codes using a modified min-sum algorithm. It achieves a throughput of 312 Mb/s at an operating frequency of 69 MHz with 20 decoding iterations. The gate count is 2M gates.
Title : High-Throughput LDPC Decoder for Long Code-Length
Venue : 2006 International Symposium on VLSI Design, Automation and Test
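For readers unfamiliar with min-sum decoding, here is a minimal sketch of a check-node update for a degree-6 check, as in a (3,6)-regular code. It shows the plain textbook min-sum rule with an optional `scale` factor, which is a hypothetical parameter. The abstract does not say how the authors modified the algorithm, so this should not be read as their variant. The cycle-budget arithmetic at the end assumes that the 312 Mb/s figure counts all 11520 code bits and that one codeword is decoded at a time.

```python
def min_sum_check_update(llrs, scale=1.0):
    """Check-to-variable messages for one check node (plain min-sum).

    llrs  : incoming variable-to-check log-likelihood ratios
    scale : hypothetical normalization factor (1.0 gives unscaled min-sum)
    """
    out = []
    for i in range(len(llrs)):
        others = llrs[:i] + llrs[i + 1:]
        # Sign is the product of the signs of all other incoming messages.
        sign = 1.0
        for v in others:
            if v < 0:
                sign = -sign
        # Magnitude is the smallest absolute value among the other messages.
        magnitude = min(abs(v) for v in others)
        out.append(scale * sign * magnitude)
    return out


messages = min_sum_check_update([2.0, -0.5, 1.5, 3.0, -4.0, 1.0])
print(messages)  # [0.5, -1.0, 0.5, 0.5, -0.5, 0.5]

# Rough cycle budget implied by the quoted figures (see assumptions above).
code_bits, clock_hz, throughput_bps, iterations = 11520, 69e6, 312e6, 20
cycles_per_codeword = code_bits * clock_hz / throughput_bps   # about 2548
cycles_per_iteration = cycles_per_codeword / iterations       # about 127
print(round(cycles_per_codeword), round(cycles_per_iteration))
```

Under those assumptions, each iteration has a budget of roughly 127 clock cycles for the whole 11520-bit codeword, which suggests a highly parallel datapath.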