Reliability of systems used in space, avionic and biomedical applications is highly critical. Such systems consist of an analog front-end to collect data, an ADC to convert the collected data to digital form and a digital unit to process it. It is important to analyze the fault sensitivities of each of these to effectively gauge and improve the reliability of the system. This paper addresses the issue of fault sensitivity of ADCs. A generic methodology for analyzing the fault sensitivity of ADCs is presented. A novel concept of "node weights" specific to α-particle induced transient faults is introduced to increase the accuracy of such an analysis.
{"title":"Transient fault sensitivity analysis of analog-to-digital converters (ADCs)","authors":"Mandeep Singh, R. Rachala, I. Koren","doi":"10.1109/IWV.2001.923153","DOIUrl":"https://doi.org/10.1109/IWV.2001.923153","url":null,"abstract":"Reliability of systems used in space, avionic and biomedical applications is highly critical. Such systems consist of an analog front-end to collect data, an ADC to convert the collected data to digital form and a digital unit to process it. It is important to analyze the fault sensitivities of each of these to effectively gauge and improve the reliability of the system. This paper addresses the issue of fault sensitivity of ADCs. A generic methodology for analyzing the fault sensitivity of ADCs is presented. A novel concept of \"node weights\" specific to /spl alpha/-particle induced transient faults is introduced to increase the accuracy of such an analysis.","PeriodicalId":114059,"journal":{"name":"Proceedings IEEE Computer Society Workshop on VLSI 2001. Emerging Technologies for VLSI Systems","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117290927","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
O. Paker, Jens Sparsø, Niels Haandbæk, Mogens Isager, L. S. Nielsen
This paper describes a low-power programmable DSP architecture that targets audio signal processing. The architecture can be characterized as a heterogeneous multiprocessor consisting of small and simple instruction set processors called mini-cores that communicate using message passing. The processors are tailored for different classes of filtering algorithms (FIR, IIR, N-LMS etc.), and in a typical system the communication among processors occurs at the sampling rate only. The processors are parameterized in word-size, memory-size, etc. and can be instantiated according to the needs of the application at hand using a normal synthesis-based ASIC design flow. To give an impression of the size of a processor we mention that one of the FIR processors in a prototype design has 16 instructions, a 32-word × 16-bit program memory, a 64-word × 16-bit data memory and a 25-word × 16-bit coefficient memory. Early results obtained from the design of a prototype chip containing filter processors for a hearing aid application indicate a power consumption that is an order of magnitude better than current state-of-the-art low-power audio DSPs implemented using full-custom techniques. This is due to: (1) the small size of the processors and (2) a smaller instruction count for a given task.
{"title":"A heterogeneous multiprocessor architecture for low-power audio signal processing applications","authors":"O. Paker, Jens Sparsø, Niels Haandbæk, Mogens Isager, L. S. Nielsen","doi":"10.1109/IWV.2001.923139","DOIUrl":"https://doi.org/10.1109/IWV.2001.923139","url":null,"abstract":"This paper describes a low-power programmable DSP architecture that targets audio signal processing. The architecture can be characterized as a heterogeneous multiprocessor consisting of small and simple instruction set processors called mini-cores that communicate using message passing. The processors are tailored for different classes of filtering algorithms (FIR, IIR, N-LMS etc.), and in a typical system the communication among processors occurs at the sampling rate only. The processors are parameterized in word-size, memory-size, etc. and can be instantiated according to the needs of the application at hand using a normal synthesis based ASIC design flow. To give an impression of the size of a processor we mention that one of the FIR processors in a prototype design has 16 instructions, a 32 word/spl times/16 bit program memory, a 64 word/spl times/16 bit data memory and a 25 word/spl times/16 bit coefficient memory. Early results obtained from the design of a prototype chip containing filter processors for a hearing aid application, indicate a power consumption that is an order of magnitude better than current state of the art low-power audio DSPs implemented using full-custom techniques. This is due to: (1) the small size of the processors and (2) a smaller instruction count for a given task.","PeriodicalId":114059,"journal":{"name":"Proceedings IEEE Computer Society Workshop on VLSI 2001. Emerging Technologies for VLSI Systems","volume":"8 Suppl 10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130341961","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
M. Dasigenis, N. Kroupis, A. Argyriou, K. Tatas, D. Soudris, A. Thanailakis, N. Zervas
A methodology for power optimization of the data memory hierarchy and instruction memory is introduced. The impact of the methodology on a set of widely used multimedia application kernels, namely Full Search (FS), Hierarchical Search (HS), Parallel Hierarchical One Dimension Search (PHODS), and Three Step Logarithmic Search (3SLS), is demonstrated. We find the power-optimal data memory hierarchy by applying the appropriate data-reuse transformation, while the instruction power optimization is done using a suitable cache memory. Using data-reuse transformations, performance optimization techniques, and instruction-level transformations, we perform an exhaustive exploration of all the possible alternatives to reach power-efficient solutions. For the embedded ARM processor, the experimental results demonstrate the efficiency of the methodology in terms of power for all the multimedia kernels.
{"title":"A memory management approach for efficient implementation of multimedia kernels on programmable architectures","authors":"M. Dasigenis, N. Kroupis, A. Argyriou, K. Tatas, D. Soudris, A. Thanailakis, N. Zervas","doi":"10.1109/IWV.2001.923157","DOIUrl":"https://doi.org/10.1109/IWV.2001.923157","url":null,"abstract":"A methodology for power optimization of the data memory hierarchy and instruction memory, is introduced. The impact of the methodology on a set of widely used multimedia application kernels, namely Full Search (FS), Hierarchical Search (HS), Parallel Hierarchical One Dimension Search (PHODS), and Three Step Logarithmic Search (3SLS), is demonstrated. We find the power optimal data memory hierarchy applying the appropriate data-use transformation, while the instruction power optimization is done using suitable cache memory. Using data-reuse transformations, performance optimizations techniques, and instruction-level transformations, we perform exhaustive exploration of an the possible alternatives to reach power efficient solutions. Concerning the embedded processor ARM, the experimental results prove the efficiency of the methodology in terms of power for all the multimedia kernels.","PeriodicalId":114059,"journal":{"name":"Proceedings IEEE Computer Society Workshop on VLSI 2001. Emerging Technologies for VLSI Systems","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131176731","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Chemically assembled electronic nanotechnology (CAEN) is a promising alternative to CMOS for constructing circuits with feature sizes in the tens of nanometers range. In this paper we describe some of the recent advances in CAEN and how they influence the design of digital circuits. We show how reconfigurability supports inexpensive manufacturing. Finally, we describe a molecular latch that overcomes the lack of a viable CAEN-based transistor.
{"title":"Electronic nanotechnology and reconfigurable computing","authors":"S. Goldstein","doi":"10.1109/IWV.2001.923133","DOIUrl":"https://doi.org/10.1109/IWV.2001.923133","url":null,"abstract":"Chemically assembled electronic nanotechnology (CAEN) is a promising alternative to CMOS for constructing circuits with feature sizes in the tens of nanometers range. In this paper we describe some of the recent advances in CAEN and how they influence the design of digital circuits. We show how reconfigurability supports inexpensive manufacturing. Finally, we describe a molecular latch that overcomes the lack of a viable CAEN-based transistor.","PeriodicalId":114059,"journal":{"name":"Proceedings IEEE Computer Society Workshop on VLSI 2001. Emerging Technologies for VLSI Systems","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133180161","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In this paper a novel hybrid wave-pipelined bit-pattern associative router is presented. A router is an important component in communication network systems. The bit-pattern associative router (BPAR) allows for flexibility and can accommodate a large number of routing algorithms. Wave-pipelining is a high performance approach which implements pipelining in logic without using intermediate registers. In this study a hybrid wave-pipelined approach has been proposed and implemented. Hybrid wave-pipelining allows for the reduction of the delay difference between the maximum and minimum delays by narrowing the gap between each stage of the system. This approach yields narrow "computing cones" that allow faster clocks to be run. This is the first study in wave-pipelining that deals with a system that has a substantially different set of pipeline stages. The bit-pattern associative router has three stages: condition match, selection function, and port assignment. Each stage's data delay paths are tightly controlled to optimize the proper propagation of signals. The simulation results show that using hybrid wave-pipelining significantly reduces the clock period and circuit delays become the limiting factor, preventing further clock cycle time reduction.
{"title":"A hybrid wave-pipelined network router","authors":"J. Delgado-Frías, J. Nyathi","doi":"10.1109/IWV.2001.923156","DOIUrl":"https://doi.org/10.1109/IWV.2001.923156","url":null,"abstract":"In this paper a novel hybrid wave-pipelined bit-pattern associative router is presented. A router is an important component in communication network systems. The bit-pattern associative router (BPAR) allows for flexibility and can accommodate a large number of routing algorithms. Wave-pipelining is a high performance approach which implements pipelining in logic without using intermediate registers. In this study a hybrid wave-pipelined approach has been proposed and implemented. Hybrid wave-pipelining allows for the reduction of the delay difference between the maximum and minimum delays by narrowing the gap between each stage of the system. This approach yields narrow \"computing cones\" that allow faster clocks to be run. This is the first study in wave-pipelining that deals with a system that has a substantially different set of pipeline stages. The bit-pattern associative router has three stages: condition match, selection function, and port assignment. Each stage's data delay paths are tightly controlled to optimize the proper propagation of signals. The simulation results show that using hybrid wave-pipelining significantly reduces the clock period and circuit delays become the limiting factor, preventing further clock cycle time reduction.","PeriodicalId":114059,"journal":{"name":"Proceedings IEEE Computer Society Workshop on VLSI 2001. Emerging Technologies for VLSI Systems","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122384327","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
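The hybrid wave-pipelined router abstract above argues that narrowing the gap between maximum and minimum delays permits a faster clock. The sketch below illustrates that argument with a simplified textbook-style bound, in which the clock period of a wave-pipelined block is limited by its delay spread plus a fixed overhead rather than by its longest path. The function names, the single lumped `overhead` term, and all delay figures are hypothetical and are not taken from the paper.

```python
def wave_clock_bound(d_min, d_max, overhead=0):
    """Simplified lower bound on the clock period of one wave-pipelined block.

    Assumption: the period must cover the spread between the slowest and
    fastest path, plus a lumped overhead (setup, hold, skew).
    """
    return (d_max - d_min) + overhead


def hybrid_clock_bound(stages, overhead=0):
    """Bound for a pipeline whose stages are each wave-pipelined internally.

    stages: list of (d_min, d_max) tuples, one per stage.
    The stage with the widest delay spread limits the clock.
    """
    return max(wave_clock_bound(lo, hi, overhead) for lo, hi in stages)


def conventional_clock_bound(stages, overhead=0):
    """Conventional pipelining: the longest path of any stage sets the period."""
    return max(hi for _, hi in stages) + overhead


# Hypothetical delays in picoseconds for three stages
# (e.g. condition match, selection function, port assignment).
stages_ps = [(2000, 2600), (1500, 2400), (1000, 1300)]
print(hybrid_clock_bound(stages_ps, overhead=300))
print(conventional_clock_bound(stages_ps, overhead=300))
```

In this model, balancing the paths of the stage with the widest spread is what lowers the bound, which matches the abstract's emphasis on tightly controlled delay paths.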
The paper describes a system-level design approach to the wearable computers and wireless networks project at Carnegie Mellon University (CMU). Over almost the last ten years we have designed and fabricated twenty new generations of wearable computers, with most of them using wireless network infrastructure. We emphasize the importance of wireless communication and the amount of energy it requires. A system-level approach to power/performance optimization is going to be a crucial catalyst for making wearable computers an everyday tool for the general public.
{"title":"System design of low-energy wearable computers with wireless networking","authors":"A. Smailagic, D. Siewiorek, M. Ettus","doi":"10.1109/IWV.2001.923135","DOIUrl":"https://doi.org/10.1109/IWV.2001.923135","url":null,"abstract":"The paper describes a system level design approach to the wearable computers and wireless networks project at Carnegie Mellon University (CMU). Over the last almost ten years we have designed and fabricated twenty new generations of wearable computers, with most of them using wireless network infrastructure. We emphasize the importance of wireless communication and the amount of energy it requires. A system-level approach to power/performance optimization is going to be a crucial catalyst for making wearable computers an everyday tool for the general public.","PeriodicalId":114059,"journal":{"name":"Proceedings IEEE Computer Society Workshop on VLSI 2001. Emerging Technologies for VLSI Systems","volume":"359 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122763190","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This paper presents a method for evaluating the metastability of a flip-flop circuit for random number generation applications. It is well known that digital circuits can exhibit metastable behavior when the input to a flip-flop is asynchronous to the system clock. In the past, extensive research has been focused on eliminating metastability in digital systems. Here, we present some preliminary results of our research to exploit metastable behavior in sequential logic circuits to produce random bit streams for random number generation. In particular, we explore the idea of tapping the electronic noise present in D-type flip-flops to produce random bit streams for use as a one-time cryptographic key-pad for encryption algorithms. This research will serve as a basis for further research into the very-large-scale-integration (VLSI) of random number generators (RNGs).
{"title":"Evaluating metastability in electronic circuits for random number generation","authors":"S. Walker, S. Foo","doi":"10.1109/IWV.2001.923146","DOIUrl":"https://doi.org/10.1109/IWV.2001.923146","url":null,"abstract":"This paper presents a method for evaluating the metastability of a flip-flop circuit for random number generation applications. It is well known that digital circuits can exhibit metastable behavior when the input to a flip-flop is asynchronous to the system clock. In the past, extensive research has been focused on eliminating metastability in digital systems. Here, we present some preliminary results of our research to exploit metastable behavior in sequential logic circuits to produce random bit streams for random number generation. In particular, we explore the idea of tapping the electronic noise present in D-type flip-flops to produce random bit streams for use as a one-time cryptographic key-pad for encryption algorithms. This research will serve as a basis for further research into the very-large-scale-integration (VLSI) of random number generators (RNGs).","PeriodicalId":114059,"journal":{"name":"Proceedings IEEE Computer Society Workshop on VLSI 2001. Emerging Technologies for VLSI Systems","volume":"63 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122897288","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In this paper we focus on the design of threshold logic functions in Single Electron Tunneling (SET) technology, using the tunnel junction's specific behavior, i.e., the ability to control the transport of individual electrons. We introduce a novel design of an n-input linear threshold gate which can accommodate both positive and negative weights and built-in signal amplification, using 1 tunnel junction and n+2 true capacitors. As an example we present a 4-input threshold gate with both positive and negative weights.
{"title":"A linear threshold gate implementation in single electron technology","authors":"C. Lageweg, S. Cotofana, S. Vassiliadis","doi":"10.1109/IWV.2001.923145","DOIUrl":"https://doi.org/10.1109/IWV.2001.923145","url":null,"abstract":"In this paper we focus on the design of threshold logic functions in Single Electron Tunneling (SET) technology, using the tunnel junction's specific behavior i.e., the ability to control the transport of individual electrons. We introduce a novel design of an n-input linear threshold gate which can accommodate both positive and negative weights and built-in signal amplification, using 1 tunnel junction and n+2 true capacitors. As an example we present a 4-input threshold gate with both positive and negative weights.","PeriodicalId":114059,"journal":{"name":"Proceedings IEEE Computer Society Workshop on VLSI 2001. Emerging Technologies for VLSI Systems","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127772785","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
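For readers unfamiliar with threshold logic, the following is a minimal behavioral sketch of the function a linear threshold gate computes: the output is 1 when the weighted sum of the Boolean inputs reaches the threshold, and 0 otherwise. It models only the logic function, not the SET circuit. The weights and threshold in the 4-input example are hypothetical and are not the values used in the paper.

```python
def threshold_gate(inputs, weights, threshold):
    """Return 1 if sum(w_i * x_i) >= threshold, else 0.

    inputs:    sequence of 0/1 values
    weights:   sequence of integer weights (positive or negative)
    threshold: integer threshold
    """
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    weighted_sum = sum(w * x for w, x in zip(weights, inputs))
    return 1 if weighted_sum >= threshold else 0


# Hypothetical 4-input gate with two positive and two negative weights.
WEIGHTS = (2, 1, -1, -2)
THRESHOLD = 1

for x in [(1, 0, 0, 0), (0, 1, 0, 0), (1, 0, 0, 1), (1, 1, 1, 1)]:
    print(x, threshold_gate(x, WEIGHTS, THRESHOLD))
```

Negative weights let a single gate express inhibiting inputs: in the example, asserting the fourth input cancels the contribution of the first.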
In this paper, we study the technology mapping problem for LUT-based FPGAs targeting power minimization. We present the PowerMap algorithm to generate a mapping solution to minimize power consumption while keeping the delay optimal. We compute min-height K-feasible cuts for critical nodes to optimize the depth and compute min-weight K-feasible cuts for noncritical nodes to minimize the power consumption of the mapping solution. We have implemented PowerMap in C and tested it on a number of MCNC benchmark circuits. Compared to FlowMap, a delay-optimal mapper, our algorithm reduces the power consumption by 17.8% and uses 9.4% fewer LUTs without any depth penalty.
{"title":"LUT-based FPGA technology mapping for power minimization with optimal depth","authors":"Hao Li, Wai-Kei Mak, S. Katkoori","doi":"10.1109/IWV.2001.923150","DOIUrl":"https://doi.org/10.1109/IWV.2001.923150","url":null,"abstract":"In this paper, we study the technology mapping problem for LUT-based FPGAs targeting power minimization. We present the PowerMap algorithm to generate a mapping solution to minimize power consumption while keeping the delay optimal. We compute min-height K-feasible cuts for critical nodes to optimize the depth and compute min-weight K-feasible cuts for noncritical nodes to minimize the power consumption of the mapping solution. We have implemented PowerMap in C and tested it on a number of MCNC benchmark circuits. Compared to FlowMap, a delay-optimal mapper, our algorithm reduces the power consumption by 17.8% and uses 9.4% less LUTs without any depth penalty.","PeriodicalId":114059,"journal":{"name":"Proceedings IEEE Computer Society Workshop on VLSI 2001. Emerging Technologies for VLSI Systems","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128602148","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
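The PowerMap abstract above relies on K-feasible cuts: sets of at most K nodes that separate a node from the primary inputs, so that the logic between the cut and the node fits in one K-input LUT. The sketch below enumerates such cuts bottom-up by merging fanin cuts. This is a generic enumeration technique shown for illustration only; it is not the PowerMap or FlowMap procedure and does not compute min-height or min-weight cuts. The function name and the tiny example netlist are hypothetical.

```python
from itertools import product


def enumerate_k_feasible_cuts(fanins, k):
    """Enumerate K-feasible cuts for every node of a combinational DAG.

    fanins: dict mapping node -> tuple of fanin nodes, listed in
            topological order (primary inputs map to an empty tuple).
    k:      maximum cut size (number of LUT inputs).
    Returns a dict mapping node -> set of frozensets (the cuts).
    """
    cuts = {}
    for node, ins in fanins.items():
        node_cuts = {frozenset([node])}  # trivial cut: the node itself
        if ins:
            # Pick one cut per fanin and merge them; keep merges of size <= k.
            for combo in product(*(cuts[i] for i in ins)):
                merged = frozenset().union(*combo)
                if len(merged) <= k:
                    node_cuts.add(merged)
        cuts[node] = node_cuts
    return cuts


# Hypothetical netlist: out = f(g1, g2), g1 = f(a, b), g2 = f(c, d).
netlist = {
    "a": (), "b": (), "c": (), "d": (),
    "g1": ("a", "b"),
    "g2": ("c", "d"),
    "out": ("g1", "g2"),
}

for cut in sorted(enumerate_k_feasible_cuts(netlist, k=3)["out"], key=sorted):
    print(sorted(cut))
```

A mapper would then choose one cut per node according to its cost function, for example depth on critical nodes and a power-related weight on noncritical ones, as the abstract describes.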
A method based on a modulo scheduling algorithm for software pipelining has been recently proposed to optimize clocked circuits. The resulting circuits are multi-phase clocked circuits, where all clocks have the same period. To preserve the functionality of the original circuit, registers must be placed after minimizing the clock period. The placement of these registers is derived from an arbitrary schedule determined during a clock period minimization step. A good schedule may allow one to decrease the number of registers and the number of phases needed in the final circuit. Decreasing the number of registers contributes to minimizing the area occupied by the circuit and reduces its power consumption, while decreasing the number of phases reduces the complexity of the clock generation and distribution task. In this paper, we propose polynomial-time-solvable methods to choose a good schedule once the clock period is minimized. The methods have been tested on a subset of the ISCAS89 benchmarks. Experimental results show that the number of registers which must be inserted in the final circuit, and the number of phases, have been significantly decreased compared to the case where an arbitrary schedule is chosen.
{"title":"Reducing register and phase requirements for synchronous circuits derived using software pipelining techniques","authors":"N. Chabini, E. Aboulhamid, Y. Savaria","doi":"10.1109/IWV.2001.923142","DOIUrl":"https://doi.org/10.1109/IWV.2001.923142","url":null,"abstract":"A method based on a modulo scheduling algorithm for software pipelining has been recently proposed to optimize clocked circuits. The resulting circuits are multi-phase clocked circuits, where all clocks have the same period. To preserve the functionality of the original circuit, registers must be placed after minimizing the clock period. The placement of these registers is derived from an arbitrary schedule determined during a clock period minimization step. A good schedule may allow one to decrease the number of registers and the number of phases needed in the final circuit. Decreasing the number of registers contributes to minimizing the area occupied by the circuit and reduces its power consumption; while decreasing the number of phases reduces the complexity of the clock generation and distribution task. In this paper, we propose polynomial-time-solvable methods to choose a good schedule once the clock period is minimized. The methods have been tested on a subject of the ISCAS89 benchmarks. Experimental results show that the number of registers which must be inserted in the final circuit, and the number of phases, have been significantly decreased compared to the case where an arbitrary schedule is chosen.","PeriodicalId":114059,"journal":{"name":"Proceedings IEEE Computer Society Workshop on VLSI 2001. Emerging Technologies for VLSI Systems","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124358563","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}