K. Goossens, Dongrui She, Aleksandar Milutinovic, A. Molnos
Composability means that the behaviour of an application, including its timing, is not affected by the absence or presence of other applications. It is required to be able to design, test, and verify applications independently. In this paper we define composable dynamic voltage and frequency scaling (DVFS) hardware, and composable power management. We ensure that the functional and temporal behaviours of an application are not affected by other applications, even when they are power managed. For dataflow applications with worst-case execution times per task, our power management is also predictable, i.e. guarantees end-to-end real-time requirements, even when the application is mapped on multiple processors that are power managed independently. Our method can be used with various DVFS architectures, such as on-chip and off-chip VF regulators. Our FPGA implementation models a system with multiple tiles, each containing a processor with local memory running a real-time operating system (RTOS) and power management. Tiles are interconnected by a network on chip, and communicate using shared memories. Experiments indicate energy savings of 68% w.r.t. no power management, and 40% w.r.t. power gating only. We also demonstrate composability and predictability on the platform in the presence of power management.
{"title":"Composable Dynamic Voltage and Frequency Scaling and Power Management for Dataflow Applications","authors":"K. Goossens, Dongrui She, Aleksandar Milutinovic, A. Molnos","doi":"10.1109/DSD.2010.61","DOIUrl":"https://doi.org/10.1109/DSD.2010.61","url":null,"abstract":"Composability means that the behaviour of an application, including its timing, is not affected by the absence or presence of other applications. It is required to be able to design, test, and verify applications independently. In this paper we define composable dynamic voltage and frequency scaling (DVFS) hardware, and composable power management. We ensure that the functional and temporal behaviours of an application are not affected by other applications, even when they are power managed. For dataflow applications with worst-case execution times per task, our power management is also predictable, i.e. guarantees end-to-end real-time requirements, even when the application is mapped on multiple processors that are power managed independently. Our method can be used with various DVFS architectures, such as on-chip and off-chip VF regulators. Our FPGA implementation models a system with multiple tiles, each containing a processor with local memory running a real-time operating system (RTOS) and power management. Tiles are interconnected by a network on chip, and communicate using shared memories. Experiments indicate energy savings of 68% w.r.t. no power management, and 40% w.r.t. power gating only. We also demonstrate composability and predictability on the platform in the presence of power management.","PeriodicalId":356885,"journal":{"name":"2010 13th Euromicro Conference on Digital System Design: Architectures, Methods and Tools","volume":"36 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124937127","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Real-time face recognition by computer systems is required in many commercial and security applications since it is the only way to protect privacy and security. On the other hand, face recognition generates huge amounts of data in real-time. Filtering out meaningful data from this raw data with high accuracy is a complex task. Most of the existing techniques primarily focus on the accuracy aspect using extensive matrix-oriented computations. Efficient realizations primarily reduce the computational space using eigenvalues. On the other hand, an eigenvalues oriented evaluation has minimum time complexity of O (n3), where n is the rank of the covariance matrix, the computation cost for co-variance generation is extra. Our frequency distribution curve (FDC) technique avoids matrix decomposition and other high computationally intensive matrix operations. FDC is formulated with a bias towards efficient hardware realization and high accuracy by using simple vector operations. FDC requires pattern vector (PV) extraction from an image within O (n2) time. Our enhanced FDC-based architecture proposed in this paper further shifts a computationally expensive component of FDC to the offline layer of the system, thus resulting in very fast online evaluation of the input data. Furthermore, efficient online testing is pursued as well using an adaptive controller (AC) for PV classification utilizing the Euclidian vector norm length. The pipelined AC architecture adapts to the availability of resources in the target silicon device. Our implementation on an XC5VSX50t FPGA demonstrates a high accuracy of 99% in face recognition for 400 images in the ORL database, generally requiring less than 200 nsec per image.
{"title":"Hardware-Based Speed Up of Face Recognition Towards Real-Time Performance","authors":"I. Sajid, Sotirios G. Ziavras, M. M. Ahmed","doi":"10.1109/DSD.2010.45","DOIUrl":"https://doi.org/10.1109/DSD.2010.45","url":null,"abstract":"Real-time face recognition by computer systems is required in many commercial and security applications since it is the only way to protect privacy and security. On the other hand, face recognition generates huge amounts of data in real-time. Filtering out meaningful data from this raw data with high accuracy is a complex task. Most of the existing techniques primarily focus on the accuracy aspect using extensive matrix-oriented computations. Efficient realizations primarily reduce the computational space using eigenvalues. On the other hand, an eigenvalues oriented evaluation has minimum time complexity of O (n3), where n is the rank of the covariance matrix, the computation cost for co-variance generation is extra. Our frequency distribution curve (FDC) technique avoids matrix decomposition and other high computationally intensive matrix operations. FDC is formulated with a bias towards efficient hardware realization and high accuracy by using simple vector operations. FDC requires pattern vector (PV) extraction from an image within O (n2) time. Our enhanced FDC-based architecture proposed in this paper further shifts a computationally expensive component of FDC to the offline layer of the system, thus resulting in very fast online evaluation of the input data. Furthermore, efficient online testing is pursued as well using an adaptive controller (AC) for PV classification utilizing the Euclidian vector norm length. The pipelined AC architecture adapts to the availability of resources in the target silicon device. Our implementation on an XC5VSX50t FPGA demonstrates a high accuracy of 99% in face recognition for 400 images in the ORL database, generally requiring less than 200 nsec per image.","PeriodicalId":356885,"journal":{"name":"2010 13th Euromicro Conference on Digital System Design: Architectures, Methods and Tools","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126184990","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Due to the various devices composing a smart camera system, various languages have to be known by the designer (like HDL and C/C++). Most of vision applications designers are software programmers and do not have a good knowledge of HDLs (VHDL). This paper presents a new high-level methodology for implementing vision applications on smart camera platforms. This methodology is based on a soft-core approach to manage the whole system and a dataflow (actor-oriented) language to design the processing elements. We discuss in particular interfacing constraints.
{"title":"A New High-Level Methodology for Programming FPGA-Based Smart Camera","authors":"Nicolas Roudel, F. Berry, J. Sérot, L. Eck","doi":"10.1109/DSD.2010.68","DOIUrl":"https://doi.org/10.1109/DSD.2010.68","url":null,"abstract":"Due to the various devices composing a smart camera system, various languages have to be known by the designer (like HDL and C/C++). Most of vision applications designers are software programmers and do not have a good knowledge of HDLs (VHDL). This paper presents a new high-level methodology for implementing vision applications on smart camera platforms. This methodology is based on a soft-core approach to manage the whole system and a dataflow (actor-oriented) language to design the processing elements. We discuss in particular interfacing constraints.","PeriodicalId":356885,"journal":{"name":"2010 13th Euromicro Conference on Digital System Design: Architectures, Methods and Tools","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121720070","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Since many embedded systems execute a predefined set of programs, tuning system components to application programs and data is the approach chosen by many design techniques to optimize performance and power consumption. In this paper, we propose a method based on the analysis of accesses to vector, arrays, and other complex data structures to design a size-constrained two-partition array cache. This method reorganizes the ways of set-associative arrays caches into partitions with different line sizes and defines array-partition mappings so as to minimize the average memory access energy-delay product. Experimental results have shown that these split array caches have lower average energy-delay product for memory accesses as compared with unified set-associative array caches of the same size. For an MPEG-2 decoder, even with no parallel accesses to cache partitions, the average memory access energy-delay product of an 8K-byte trace-based split array cache is reduced by 50% as compared to that of the unified set-associative array cache with the lowest energy-delay product. If 25% of the accesses occur in pairs, there is an additional reduction of 9%.
{"title":"Design of Trace-Based Split Array Caches for Embedded Applications","authors":"A. Tokarnia, Marina Tachibana","doi":"10.1109/DSD.2010.33","DOIUrl":"https://doi.org/10.1109/DSD.2010.33","url":null,"abstract":"Since many embedded systems execute a predefined set of programs, tuning system components to application programs and data is the approach chosen by many design techniques to optimize performance and power consumption. In this paper, we propose a method based on the analysis of accesses to vector, arrays, and other complex data structures to design a size-constrained two-partition array cache. This method reorganizes the ways of set-associative arrays caches into partitions with different line sizes and defines array-partition mappings so as to minimize the average memory access energy-delay product. Experimental results have shown that these split array caches have lower average energy-delay product for memory accesses as compared with unified set-associative array caches of the same size. For an MPEG-2 decoder, even with no parallel accesses to cache partitions, the average memory access energy-delay product of an 8K-byte trace-based split array cache is reduced by 50% as compared to that of the unified set-associative array cache with the lowest energy-delay product. If 25% of the accesses occur in pairs, there is an additional reduction of 9%.","PeriodicalId":356885,"journal":{"name":"2010 13th Euromicro Conference on Digital System Design: Architectures, Methods and Tools","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128021811","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Ismael Gómez Miguelez, Massimo Camatel, J. Bracke, V. Marojevic, A. Gelonch, F. Vacca, G. Masera
Radio communications terminals and infrastructure tend to support an increasing range of algorithms and radio access technologies. Flexible processing platforms are therefore needed for supporting multi-standard or heterogeneous radios. Channel decoding is one of the most computing demanding digital signal processing blocks of a radio transceiver. At the same time, it provides a high degree of implementation flexibility as well as facilitates dynamic parameter adjustments. This paper presents a flexible LDPC decoder implemented on an FPGA device following the ALOE middleware design paradigm. We analyse the middleware efficiency in terms of flexibility versus resource requirements. The results show a relative middleware area overhead of 32 %.
{"title":"ALOE-Based Flexible LDPC Decoder","authors":"Ismael Gómez Miguelez, Massimo Camatel, J. Bracke, V. Marojevic, A. Gelonch, F. Vacca, G. Masera","doi":"10.1109/DSD.2010.107","DOIUrl":"https://doi.org/10.1109/DSD.2010.107","url":null,"abstract":"Radio communications terminals and infrastructure tend to support an increasing range of algorithms and radio access technologies. Flexible processing platforms are therefore needed for supporting multi-standard or heterogeneous radios. Channel decoding is one of the most computing demanding digital signal processing blocks of a radio transceiver. At the same time, it provides a high degree of implementation flexibility as well as facilitates dynamic parameter adjustments. This paper presents a flexible LDPC decoder implemented on an FPGA device following the ALOE middleware design paradigm. We analyse the middleware efficiency in terms of flexibility versus resource requirements. The results show a relative middleware area overhead of 32 %.","PeriodicalId":356885,"journal":{"name":"2010 13th Euromicro Conference on Digital System Design: Architectures, Methods and Tools","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133345681","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
F. Berthelot, François Charot, Charles Wagner, C. Wolinski
The new Digital Video Broadcasting Satellite (DVB-S2) standard is able to provide capacity gains of about30% over the previous standard by using a powerfull Forward Error Correction (FEC) scheme based on very large LDPC code words and BCH codes. The implementation of the DVBS2FEC decoder is a big challenge. The designer must deal with the overall design complexity and the decoding throughput in order to obtain a high decoding performance in terms of bit error rate (BER). We present in detail a complete design flow allowing a better understanding of the algorithm in terms of complexity, performance and its hardware implementation. We focus on complexity-performance trade-offs due to message quantizations and we compare its effects on several algorithm corrections used to check nodes for DVB-S2 decoding. The simulation results show that the best compromise between complexity and performance is obtained for the FOMS algorithm approximation.
{"title":"Design Methodology for a High Performance Robust DVB-S2 Decoder Implementation","authors":"F. Berthelot, François Charot, Charles Wagner, C. Wolinski","doi":"10.1109/DSD.2010.40","DOIUrl":"https://doi.org/10.1109/DSD.2010.40","url":null,"abstract":"The new Digital Video Broadcasting Satellite (DVB-S2) standard is able to provide capacity gains of about30% over the previous standard by using a powerfull Forward Error Correction (FEC) scheme based on very large LDPC code words and BCH codes. The implementation of the DVBS2FEC decoder is a big challenge. The designer must deal with the overall design complexity and the decoding throughput in order to obtain a high decoding performance in terms of bit error rate (BER). We present in detail a complete design flow allowing a better understanding of the algorithm in terms of complexity, performance and its hardware implementation. We focus on complexity-performance trade-offs due to message quantizations and we compare its effects on several algorithm corrections used to check nodes for DVB-S2 decoding. The simulation results show that the best compromise between complexity and performance is obtained for the FOMS algorithm approximation.","PeriodicalId":356885,"journal":{"name":"2010 13th Euromicro Conference on Digital System Design: Architectures, Methods and Tools","volume":"63 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130576842","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Future VLSI technologies will allow for multiple clusters each of a number of processing nodes to be put on a single chip. Although we may then be able to select a network topology matching an application assigned to each cluster, it may be difficult to decide the topologies of connections between the (intra)cluster networks for effective parallel processing by the cooperation of clusters. To alleviate the problem, this paper proposes a class of recursive networks, RNs, of which constituent networks can have different topologies and sizes in different recursive levels but also in the same level. In RN, the last-level networks define the cluster networks, and the level-i network associated with a cluster network defines the i-th intercluster network between the cluster and another cluster. The cluster and intercluster networks can be any kinds of standard networks, such as the mesh and bus. The paper presents a partition-based method of generating RN and its routing and layout methods.
{"title":"A Class of Recursive Networks on a Chip for Enhancing Intercluster Parallelism","authors":"Masaru Takesue","doi":"10.1109/DSD.2010.46","DOIUrl":"https://doi.org/10.1109/DSD.2010.46","url":null,"abstract":"Future VLSI technologies will allow for multiple clusters each of a number of processing nodes to be put on a single chip. Although we may then be able to select a network topology matching an application assigned to each cluster, it may be difficult to decide the topologies of connections between the (intra)cluster networks for effective parallel processing by the cooperation of clusters. To alleviate the problem, this paper proposes a class of recursive networks, RNs, of which constituent networks can have different topologies and sizes in different recursive levels but also in the same level. In RN, the last-level networks define the cluster networks, and the level-i network associated with a cluster network defines the i-th intercluster network between the cluster and another cluster. The cluster and intercluster networks can be any kinds of standard networks, such as the mesh and bus. The paper presents a partition-based method of generating RN and its routing and layout methods.","PeriodicalId":356885,"journal":{"name":"2010 13th Euromicro Conference on Digital System Design: Architectures, Methods and Tools","volume":"60 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132157979","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This paper provides a detailed performance analysis of low power and high speed Look up Table (LUT) by using a circuit technique. Proper sizing of all the sleep transistors are done in the LUT to achieve an optimum power –delay relationship so that it can be used for fast growing low power applications. Also, we have implemented a benchmark circuit (8 × 10) encoder in Virtex-4, 90nm FPGA. As compared to the traditional 4-input LUT design, proposed design saves 12.8% of average power in high speed mode and 56.7% in low power mode with a little compromise in its speed.
{"title":"Performance Analysis of 90nm Look Up Table (LUT) for Low Power Application","authors":"Deepak Kumar, Pankaj Kumar, M. Pattanaik","doi":"10.1109/DSD.2010.72","DOIUrl":"https://doi.org/10.1109/DSD.2010.72","url":null,"abstract":"This paper provides a detailed performance analysis of low power and high speed Look up Table (LUT) by using a circuit technique. Proper sizing of all the sleep transistors are done in the LUT to achieve an optimum power –delay relationship so that it can be used for fast growing low power applications. Also, we have implemented a benchmark circuit (8 × 10) encoder in Virtex-4, 90nm FPGA. As compared to the traditional 4-input LUT design, proposed design saves 12.8% of average power in high speed mode and 56.7% in low power mode with a little compromise in its speed.","PeriodicalId":356885,"journal":{"name":"2010 13th Euromicro Conference on Digital System Design: Architectures, Methods and Tools","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133606865","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Engineering hardware platform for a Wireless Sensor Network (WSN) node is known to be a tough challenge, as the design must enforce many severe constraints, among which energy dissipation is by far the most challenging one. Today, most of the WSN node platforms are based on low cost and low-power programmable micro controllers, even if it is acknowledged that their energy efficiency remains limited and hinders the wide-spreading of WSN to new applications. In this paper, we propose a complete system level flow for an alternative approach based on the concept of hardware micro-tasks, which relies on hardware specialization and power gating to dramatically improve the energy efficiency of the computational part of the node. Early estimates show power saving by more than one order of magnitude over MCU-based implementations.
{"title":"System Level Synthesis for Ultra Low-Power Wireless Sensor Nodes","authors":"Muhammad Adeel Pasha, Steven Derrien, O. Sentieys","doi":"10.1109/DSD.2010.88","DOIUrl":"https://doi.org/10.1109/DSD.2010.88","url":null,"abstract":"Engineering hardware platform for a Wireless Sensor Network (WSN) node is known to be a tough challenge, as the design must enforce many severe constraints, among which energy dissipation is by far the most challenging one. Today, most of the WSN node platforms are based on low cost and low-power programmable micro controllers, even if it is acknowledged that their energy efficiency remains limited and hinders the wide-spreading of WSN to new applications. In this paper, we propose a complete system level flow for an alternative approach based on the concept of hardware micro-tasks, which relies on hardware specialization and power gating to dramatically improve the energy efficiency of the computational part of the node. Early estimates show power saving by more than one order of magnitude over MCU-based implementations.","PeriodicalId":356885,"journal":{"name":"2010 13th Euromicro Conference on Digital System Design: Architectures, Methods and Tools","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126735253","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A branching program machine (BM) is a special purpose processor that uses only two kinds of instructions: Branch and output instructions. Thus, the architecture for the BM is much simpler than that for a general purpose processor (MPU). Since the BM uses the dedicated instructions for a special purpose application, it is faster than the MPU. This paper presents a packet classifier using a parallel branching program machine (PBM). To reduce computation time and code size, first, a set of rules for the packet classifier is partitioned into groups. Then, they are evaluated by the PBM in parallel. Also, this paper shows a method to estimate the number of necessary BMs to realize the packet classifier. The PBM32 consisting of 32 BMs has been implemented on an FPGA, and compared with the Intel's Core2Duo@1.2GHz. The PBM32 is 8.1-11.1 times faster than the Core2Duo, and the PBM32 requires only 0.2-10.3 percent of the memory for the Core2Duo.
{"title":"A Packet Classifier Using a Parallel Branching Program Machine","authors":"Hiroki Nakahara, Tsutomu Sasao, M. Matsuura","doi":"10.1109/DSD.2010.18","DOIUrl":"https://doi.org/10.1109/DSD.2010.18","url":null,"abstract":"A branching program machine (BM) is a special purpose processor that uses only two kinds of instructions: Branch and output instructions. Thus, the architecture for the BM is much simpler than that for a general purpose processor (MPU). Since the BM uses the dedicated instructions for a special purpose application, it is faster than the MPU. This paper presents a packet classifier using a parallel branching program machine (PBM). To reduce computation time and code size, first, a set of rules for the packet classifier is partitioned into groups. Then, they are evaluated by the PBM in parallel. Also, this paper shows a method to estimate the number of necessary BMs to realize the packet classifier. The PBM32 consisting of 32 BMs has been implemented on an FPGA, and compared with the Intel's Core2Duo@1.2GHz. The PBM32 is 8.1-11.1 times faster than the Core2Duo, and the PBM32 requires only 0.2-10.3 percent of the memory for the Core2Duo.","PeriodicalId":356885,"journal":{"name":"2010 13th Euromicro Conference on Digital System Design: Architectures, Methods and Tools","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115796301","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}