Pub Date: 1998-01-04 DOI: 10.1109/ICVD.1998.646587
M. Mehendale, S. B. Roy, S. Sherlekar, G. Venkatesh
Techniques based on common sub-computation extraction can be used to minimize the number of additions in multiplier-less implementations of Finite Impulse Response (FIR) filters. We present two types of coefficient transforms which, used in conjunction with these techniques, enable area-efficient realization of multiplier-less FIR filters: (i) number-theoretic transforms that use redundant binary representations such as Canonical Signed Digit (CSD), and (ii) Signal Flow Graph transformations that modify the coefficient values while retaining the output functionality. We demonstrate this with results for 6 different coefficient transforms applied to 14 low-pass FIR filters with the number of taps ranging from 16 to 128.
Title: Coefficient transformations for area-efficient implementation of multiplier-less FIR filters. In: Proceedings Eleventh International Conference on VLSI Design. https://doi.org/10.1109/ICVD.1998.646587
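The CSD representation named in the abstract above can be illustrated with a minimal sketch. It shows only the standard recoding of an integer coefficient into digits from {-1, 0, +1} with no two adjacent nonzero digits, and why that matters for shift-and-add multiplication (fewer nonzero digits means fewer adders). The function names are hypothetical, and the sketch does not reproduce the paper's coefficient transforms or its common sub-computation extraction.

```python
def to_csd(n: int) -> list[int]:
    """Return the CSD digits of n, least significant first, each in {-1, 0, 1}."""
    digits = []
    while n != 0:
        if n & 1:
            # n mod 4 == 1 -> digit +1, n mod 4 == 3 -> digit -1.
            # Either choice makes the next digit zero.
            d = 2 - (n & 3)
            n -= d
        else:
            d = 0
        digits.append(d)
        n >>= 1
    return digits


def csd_value(digits: list[int]) -> int:
    """Reconstruct the integer from its signed digits."""
    return sum(d * (1 << i) for i, d in enumerate(digits))


def nonzero_digits(digits: list[int]) -> int:
    """Number of shifted terms; a shift-add multiplier needs one fewer adder."""
    return sum(1 for d in digits if d != 0)


if __name__ == "__main__":
    for coeff in (7, 23, 119):
        csd = to_csd(coeff)
        print(coeff, csd, "binary ones:", bin(coeff).count("1"),
              "CSD nonzeros:", nonzero_digits(csd))
```

For the coefficient 7, plain binary needs three shifted terms (4 + 2 + 1) while CSD needs two (8 - 1), which is the kind of saving a multiplier-less filter can exploit per tap.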
Pub Date: 1998-01-04 DOI: 10.1109/ICVD.1998.646593
C. Srinivasan
A new technique to improve the capture range of a Phase Locked Loop (PLL) in the context of partial response signalling is presented. A known preamble is transmitted at the beginning to aid phase and frequency locking. Previous timing recovery techniques have a false locking problem for large initial frequency errors. The new technique eliminates this problem by using information available in the sampled preamble sequence. The improvement obtained is demonstrated using computer simulations.
Title: A technique to improve capture range of a PLL in PRML read channel. In: Proceedings Eleventh International Conference on VLSI Design. https://doi.org/10.1109/ICVD.1998.646593
Pub Date: 1998-01-04 DOI: 10.1109/ICVD.1998.646635
Amey Karkare, M. Singla, Ajai Jain
Multilevel logic optimization transformations for DFT (design for testability) used in existing logic systems are characterized with respect to their testability-preserving and testability-enhancing properties. In this paper, we propose three new transformations which preserve or improve path delay testability while reducing circuitry. The paper also includes a theorem showing the condition under which a testability preserving transformation (TPT) will be a testability enhancing transformation (TET).
Title: Testability preserving and enhancing transformations for robust delay fault testability. In: Proceedings Eleventh International Conference on VLSI Design. https://doi.org/10.1109/ICVD.1998.646635
Pub Date: 1998-01-04 DOI: 10.1109/ICVD.1998.646577
Z. Apanovich, A. Marchuk
The approach to technology migration presented in this paper is based on a compaction and rerouting strategy. It takes as input the full-chip mask layout hierarchical description (CIF format) and produces as output the mask layout in the target design rules. The applicability of the compaction and rerouting facilities, and the flexibility of the routing layers redistribution between different levels of mask layout hierarchy, are provided by a procedure for mask layout decomposition. The decomposition procedure takes as input any node of the mask layout hierarchical description and extracts the fragments which should be transformed by means of compaction. The size of the extracted fragments is controlled by decomposition parameters. Each extracted fragment is processed by a symbolisation procedure which provides resizing and regeneration of elementary objects such as transistors, contacts and wires. The target mask layout for each fragment is generated by a compaction procedure which is controlled by the constraints extracted during the symbolisation step. The resulting chip mask layout is generated by a routing procedure which is controlled by the data structures (netlist and floorplan) extracted during the decomposition step.
Title: Top-down approach to technology migration for full-custom mask layouts. In: Proceedings Eleventh International Conference on VLSI Design. https://doi.org/10.1109/ICVD.1998.646577
Pub Date: 1998-01-04 DOI: 10.1109/ICVD.1998.646590
S. Balakrishnan, S. Nandy
Current general-purpose processors have been enhanced with a so-called "media instruction set" to achieve performance gains in media-processing-intensive applications. The added instructions exploit two facts: the native datatypes of media applications are small, with widths much less than those supported by commercial processors, and such applications offer abundant data parallelism. Current processors enhanced with the "media instruction set" support arithmetic on sub-datatypes of only 8-bit, 16-bit, 32-bit and 64-bit precision. In this paper we motivate the need for arbitrary-precision packed arithmetic, in which the width of the sub-datatypes is programmable by the user, and propose an implementation for arithmetic on such packed datatypes. The proposed scheme has marginal hardware overhead over conventional implementations of arithmetic on processors incorporating a multimedia extended instruction set.
Title: Arbitrary precision arithmetic-SIMD style. In: Proceedings Eleventh International Conference on VLSI Design. https://doi.org/10.1109/ICVD.1998.646590
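To make the idea of packed arithmetic with a user-chosen sub-datatype width concrete, here is a small software emulation. It is a sketch under stated assumptions: it uses the well-known "SIMD within a register" masking trick to stop carries at field boundaries, whereas the paper proposes a hardware implementation whose details the abstract does not give. The names `pack`, `unpack` and `packed_add`, and the 64-bit default word width, are hypothetical.

```python
def pack(values, field_width):
    """Pack small unsigned integers into one word, field 0 in the low bits."""
    mask = (1 << field_width) - 1
    word = 0
    for i, v in enumerate(values):
        word |= (v & mask) << (i * field_width)
    return word


def unpack(word, field_width, count):
    """Extract `count` fields of `field_width` bits each."""
    mask = (1 << field_width) - 1
    return [(word >> (i * field_width)) & mask for i in range(count)]


def packed_add(a, b, field_width, word_width=64):
    """Add all fields of a and b in parallel, modulo 2**field_width per field."""
    n_fields = word_width // field_width
    # One bit set at the most significant position of every field.
    msb = 0
    for i in range(n_fields):
        msb |= 1 << (i * field_width + field_width - 1)
    used = (1 << (n_fields * field_width)) - 1
    low = used & ~msb
    # Adding only the low bits cannot carry across a field boundary;
    # the top bit of each field is then fixed up with XOR.
    partial = (a & low) + (b & low)
    return (partial ^ ((a ^ b) & msb)) & used


if __name__ == "__main__":
    width = 5  # a width that byte-oriented media instructions do not offer
    x = pack([31, 1, 17, 0], width)
    y = pack([1, 2, 20, 7], width)
    print(unpack(packed_add(x, y, width), width, 4))
```

With 5-bit fields, a 64-bit word holds twelve independent values instead of the eight that 8-bit packing would allow, which is the density argument the abstract makes for programmable widths.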
Pub Date: 1998-01-04 DOI: 10.1109/ICVD.1998.646642
P. Rao, C.S. Jayathirtha, C.S. Raghavendraprasad
Spectral approaches for partitioning netlists that use the eigenvectors of a matrix derived from a weighted graph model of the netlist (hypergraph) have been attracting considerable attention. There are several known ways in which a weighted graph can be derived from the netlist. However, the effectiveness of these alternative net models for netlist partitioning has remained unexplored. In this paper we first evaluate the relative performance of these approaches and establish that the quality of the partition is sensitive to the choice of the model. We also propose and investigate a number of new approaches for deriving a weighted graph model for a netlist. We show through test results on benchmark partitioning problems that one of the new models proposed here performs consistently better than all the other models.
Title: New net models for spectral netlist partitioning. In: Proceedings Eleventh International Conference on VLSI Design. https://doi.org/10.1109/ICVD.1998.646642
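The pipeline the abstract describes (netlist, then weighted graph, then eigenvector, then partition) can be sketched as follows. This is an illustration only: it assumes one commonly used clique-style net model in which a k-pin net adds weight 1/(k-1) between every pair of its pins, and it bipartitions by the sign of the second-smallest Laplacian eigenvector, found by plain power iteration. The abstract does not say which net models the paper evaluates or proposes, and all function names here are hypothetical.

```python
from itertools import combinations


def clique_model(num_cells, nets):
    """Weighted adjacency matrix: each k-pin net adds 1/(k-1) to every pin pair."""
    w = [[0.0] * num_cells for _ in range(num_cells)]
    for net in nets:
        k = len(net)
        if k < 2:
            continue
        for i, j in combinations(net, 2):
            w[i][j] += 1.0 / (k - 1)
            w[j][i] += 1.0 / (k - 1)
    return w


def fiedler_vector(w, iterations=500):
    """Approximate the eigenvector of the second-smallest Laplacian eigenvalue.

    Power iteration on (c*I - L), with the constant vector projected out,
    converges to that eigenvector when c bounds the largest eigenvalue of L.
    """
    n = len(w)
    deg = [sum(row) for row in w]
    c = 2.0 * max(deg)  # Gershgorin bound on the Laplacian spectrum
    v = [float(i) for i in range(n)]  # deterministic start vector
    for _ in range(iterations):
        mean = sum(v) / n
        v = [x - mean for x in v]
        # (c*I - L) v = (c - deg_i) * v_i + sum_j w_ij * v_j
        v = [(c - deg[i]) * v[i] + sum(w[i][j] * v[j] for j in range(n))
             for i in range(n)]
        norm = sum(x * x for x in v) ** 0.5
        v = [x / norm for x in v]
    return v


def spectral_bipartition(num_cells, nets):
    """Split cells by the sign of their Fiedler-vector entry."""
    v = fiedler_vector(clique_model(num_cells, nets))
    return ({i for i in range(num_cells) if v[i] > 0},
            {i for i in range(num_cells) if v[i] <= 0})


if __name__ == "__main__":
    # Two 3-pin nets joined by a single 2-pin net between cells 2 and 3.
    nets = [[0, 1, 2], [3, 4, 5], [2, 3]]
    print(spectral_bipartition(6, nets))
```

Swapping `clique_model` for a different weighting changes the Laplacian and therefore the eigenvector, which is where the sensitivity to the net model reported in the abstract enters.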
Pub Date: 1998-01-04 DOI: 10.1109/ICVD.1998.646607
Abhijit Das, Samrat Sen, Mohan Rangan, Rupesh Nayak, G. Nandakumar
The prevalent concern over static timing analysis is that it might produce a very pessimistic result in the presence of false paths in the circuit. It is therefore essential to detect and avoid false paths during timing analysis in order to better estimate the timing characteristics of the design. In this paper a framework is presented to automatically detect false paths at the transistor level. The false path detection involves logic extraction from a transistor-level netlist followed by detection of false paths, with more accurate estimation of the path delays.
Title: False path detection at transistor level. In: Proceedings Eleventh International Conference on VLSI Design. https://doi.org/10.1109/ICVD.1998.646607
Pub Date: 1998-01-04 DOI: 10.1109/ICVD.1998.646656
P. Mazumder, S. Kulkarni, M. Bhattacharya, Alejandro F. González
Picosecond switching speeds and folded current-voltage characteristics have made quantum tunneling devices promising alternatives for high-speed and compact VLSI circuit design. This paper describes new bistable digital logic circuit topologies that use resonant tunneling diodes (RTDs) in conjunction with heterojunction bipolar transistors (HBTs) and modulation-doped field effect transistors (MODFETs). The designed circuits include a single-gate, self-latching MAJORITY function in addition to basic NAND, NOR and inverter gates. The application of these circuits in the design of high-performance adders and parallel correlators is discussed. We also review multiple-valued logic (MVL) applications of RTDs that achieve significant compaction in terms of device count over comparable binary logic implementations in conventional technologies. These include a four-valued 4:1 multiplexer using 13 resonant tunneling bipolar transistors (RTBTs) and HBTs, a mask-programmable four-valued, single-input gate using 4 RTDs and 14 HBTs, and a four-step countdown circuit using 1 RTD and 3 HBTs.
Title: Circuit design using resonant tunneling diodes. In: Proceedings Eleventh International Conference on VLSI Design. https://doi.org/10.1109/ICVD.1998.646656
Pub Date: 1998-01-04 DOI: 10.1109/ICVD.1998.646641
S. Jain, M. Balakrishnan, Anshul Kumar, Shashi Kumar
This paper describes a co-design environment which follows a new approach for speeding up compute-intensive applications. The environment consists of three major components: first, a target architecture consisting of a uniprocessor host and a board with dynamically reconfigurable FPGAs and memory modules; second, a library of functions pre-synthesized for hardware or software implementation; and third, a tool which takes as input an application described in C and partitions it into hardware and software parts at functional granularity using information obtained by profiling the application. An important feature of the partitioning tool is a new, efficient heuristic specifically suited to the architecture with reconfigurable hardware.
Title: Speeding up program execution using reconfigurable hardware and a hardware function library. In: Proceedings Eleventh International Conference on VLSI Design. https://doi.org/10.1109/ICVD.1998.646641
Pub Date: 1998-01-04 DOI: 10.1109/ICVD.1998.646640
M. B. Sherigar, A. Mahadevan, K. S. Kumar, D. S. Sumam
The paper presents a pipelined parallel processor architecture that implements the MD4 message digest algorithm, which computes a fixed-length 128-bit message digest (fingerprint) for an input message of arbitrary length. The processor implements the arithmetic, logic and circular shift operations as a pipelined parallel process. The architecture is designed to suit the design flexibility of Xilinx Field Programmable Gate Arrays (FPGAs). The processor reads the message from an external RAM 16 bits at a time, and internal operations are performed on 32-bit data. The major advantages of the design are increased speed of computation and minimal hardware. The processor computes the digest approximately three times faster than the software version implemented on DSP processors.
Title: A pipelined parallel processor to implement MD4 message digest algorithm on Xilinx FPGA. In: Proceedings Eleventh International Conference on VLSI Design. https://doi.org/10.1109/ICVD.1998.646640
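For reference, the arithmetic, logic and circular shift operations mentioned in the abstract are those of MD4 as specified in RFC 1320: 32-bit modular additions, three bitwise round functions, and left rotations. The following is a plain sequential software sketch of that algorithm, written to show which operations a hardware datapath has to provide. It is not the paper's pipelined architecture, and the table layout (`ROUNDS`) is an organisational choice made here.

```python
import struct

MASK32 = 0xFFFFFFFF


def rotl32(x, s):
    """Circular left shift of a 32-bit word."""
    return ((x << s) | (x >> (32 - s))) & MASK32


# (round function, additive constant, shift amounts, message word order)
ROUNDS = (
    (lambda b, c, d: (b & c) | (~b & d), 0x00000000, (3, 7, 11, 19),
     tuple(range(16))),
    (lambda b, c, d: (b & c) | (b & d) | (c & d), 0x5A827999, (3, 5, 9, 13),
     (0, 4, 8, 12, 1, 5, 9, 13, 2, 6, 10, 14, 3, 7, 11, 15)),
    (lambda b, c, d: b ^ c ^ d, 0x6ED9EBA1, (3, 9, 11, 15),
     (0, 8, 4, 12, 2, 10, 6, 14, 1, 9, 5, 13, 3, 11, 7, 15)),
)


def md4(message: bytes) -> str:
    """Return the 128-bit MD4 digest of `message` as a hex string."""
    state = [0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476]
    # Pad with 0x80, then zeros up to 56 mod 64, then the 64-bit bit length.
    padded = message + b"\x80" + b"\x00" * ((55 - len(message)) % 64)
    padded += struct.pack("<Q", (8 * len(message)) & 0xFFFFFFFFFFFFFFFF)
    for off in range(0, len(padded), 64):
        x = struct.unpack("<16I", padded[off:off + 64])  # sixteen 32-bit words
        a, b, c, d = state
        for func, const, shifts, order in ROUNDS:
            for i, k in enumerate(order):
                t = (a + func(b, c, d) + x[k] + const) & MASK32
                # Rotate the register roles: the updated word becomes "b".
                a, b, c, d = d, rotl32(t, shifts[i % 4]), b, c
        state = [(s + v) & MASK32 for s, v in zip(state, (a, b, c, d))]
    return struct.pack("<4I", *state).hex()


if __name__ == "__main__":
    print(md4(b"abc"))
```

Each of the 48 steps per 64-byte block depends on the result of the previous step, which is why the choice of how to pipeline and parallelise the additions, round function and rotation is the interesting part of a hardware design.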