Pub Date: 2010-03-08 DOI: 10.1109/DATE.2010.5457031
Teo Cupaiuolo, Massimiliano Siti, A. Tomasoni
This paper presents a VLSI architecture for a high-throughput, high-performance soft-output (SO) MIMO detector, the recently presented Layered ORthogonal Lattice Detector (LORD). The baseline implementation includes optimal SO generation, i.e. maximum-likelihood (ML) in the max-log sense. A reduced-complexity variant of the SO generation stage is also described. To the best of the authors' knowledge, the proposed architecture is the first VLSI implementation of a max-log ML MIMO detector that includes both QR decomposition and SO generation. The latter has a deterministic, very high throughput thanks to a fully parallelizable structure, and is parameterizable in terms of both the number of transmit and receive antennas and the supported modulation orders. The two designs achieve a very high throughput, making them particularly suitable for MIMO-OFDM systems such as IEEE 802.11n WLANs: the most demanding requirements are satisfied at a reasonable cost in area and power consumption.
Title: Low-complexity high throughput VLSI architecture of soft-output ML MIMO detector
Published in: 2010 Design, Automation & Test in Europe Conference & Exhibition (DATE 2010)
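As a point of reference for what "ML in the max-log sense" means for soft outputs, the following is a minimal brute-force sketch. It is not the LORD search and says nothing about the VLSI architecture; the QPSK bit mapping, the sign convention (positive LLR favours bit 0) and the function name `max_log_llrs` are assumptions made for illustration.

```python
from itertools import product

# Hypothetical Gray-mapped QPSK: bit pair -> complex symbol (an assumption).
QPSK = {(0, 0): 1 + 1j, (0, 1): 1 - 1j, (1, 0): -1 + 1j, (1, 1): -1 - 1j}


def max_log_llrs(y, H, noise_var=1.0):
    """Exhaustive max-log LLRs for the model y = H x + n.

    For every bit, keep the smallest squared distance ||y - H x||^2 over all
    candidate vectors x with that bit equal to 0 and equal to 1, then take
    the difference. Assumed convention: LLR > 0 means bit 0 is more likely.
    """
    n_rx, n_tx = len(H), len(H[0])
    n_bits = 2 * n_tx  # two bits per QPSK symbol
    best = [[float("inf"), float("inf")] for _ in range(n_bits)]
    for bits in product((0, 1), repeat=n_bits):
        x = [QPSK[(bits[2 * t], bits[2 * t + 1])] for t in range(n_tx)]
        dist = sum(
            abs(y[r] - sum(H[r][t] * x[t] for t in range(n_tx))) ** 2
            for r in range(n_rx)
        )
        for k, b in enumerate(bits):
            if dist < best[k][b]:
                best[k][b] = dist
    return [(best[k][1] - best[k][0]) / noise_var for k in range(n_bits)]


# Noiseless 2x2 example: transmitted bits are (0, 0, 1, 1).
llrs = max_log_llrs([1 + 1j, -1 - 1j], [[1, 0], [0, 1]])
print(llrs)
```

The exhaustive loop grows exponentially with the number of antennas and bits per symbol, which is why a practical detector needs a reduced search that still reaches the same minima.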
Pub Date: 2010-03-08 DOI: 10.1109/DATE.2010.5457132
Z. Mahmood, B. Bond, T. Moselhy, A. Megretski, L. Daniel
In this paper we present a passive reduced order modeling algorithm for linear multiport interconnect structures. The proposed technique uses rational fitting via semidefinite programming to identify a passive transfer matrix from given frequency domain data samples. Numerical results are presented for a power distribution grid and an array of inductors, and the proposed approach is compared to two existing rational fitting techniques.
Title: Passive reduced order modeling of multiport interconnects via semidefinite programming
Published in: 2010 Design, Automation & Test in Europe Conference & Exhibition (DATE 2010)
Pub Date: 2010-03-08 DOI: 10.1109/DATE.2010.5457075
Alvaro Gómez, R. Sanahuja, L. Balado, J. Figueras
Production verification of analog circuit specifications is a challenging task requiring expensive test equipment and time-consuming procedures. This paper presents a method for low-cost on-chip parameter verification based on the analysis of a digital signature. A 65 nm CMOS on-chip monitor is proposed and validated in practice. The monitor composes two signals (x(t), y(t)) and divides the X-Y plane with nonlinear boundaries in order to generate a digital code for every analog (x, y) location. A digital signature is obtained from the digital code and its time duration. A metric defining a discrepancy factor is used to verify circuit parameters. The method is applied to detect possible deviations in the natural frequency of a Biquad filter. Simulated and experimental results show the possibilities of the proposal.
Title: Analog circuit test based on a digital signature
Published in: 2010 Design, Automation & Test in Europe Conference & Exhibition (DATE 2010)
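The signature idea described above can be sketched in software as follows. Everything specific here is hypothetical: the two boundary shapes, the 2-bit code, the use of sample counts as durations, and the mismatch-fraction metric are illustrative assumptions, not the monitor or the discrepancy factor defined in the paper. A phase offset between x and y stands in for a parameter deviation.

```python
import math


def zone_code(x, y):
    """Hypothetical 2-bit zone code from two assumed nonlinear boundaries."""
    b0 = 1 if y > x * x - 0.5 else 0       # parabolic boundary (assumption)
    b1 = 1 if x * x + y * y > 0.64 else 0  # circular boundary (assumption)
    return (b1 << 1) | b0


def signature(xs, ys):
    """Run-length encode zone codes into [(code, duration_in_samples), ...]."""
    sig = []
    for x, y in zip(xs, ys):
        code = zone_code(x, y)
        if sig and sig[-1][0] == code:
            sig[-1] = (code, sig[-1][1] + 1)
        else:
            sig.append((code, 1))
    return sig


def discrepancy(sig_a, sig_b):
    """Fraction of samples at which the two signatures show different codes."""
    a = [code for code, n in sig_a for _ in range(n)]
    b = [code for code, n in sig_b for _ in range(n)]
    if len(a) != len(b):
        raise ValueError("signatures must cover the same number of samples")
    return sum(ca != cb for ca, cb in zip(a, b)) / len(a)


def lissajous(phase, n=1000):
    """One period of x = sin(t), y = sin(t + phase), sampled n times."""
    xs = [math.sin(2 * math.pi * k / n) for k in range(n)]
    ys = [math.sin(2 * math.pi * k / n + phase) for k in range(n)]
    return xs, ys


golden = signature(*lissajous(math.pi / 2))   # nominal circuit (assumed)
shifted = signature(*lissajous(math.pi / 3))  # deviated circuit (assumed)
print(discrepancy(golden, golden), discrepancy(golden, shifted))
```

A pass/fail decision would then compare the discrepancy against a threshold chosen from the tolerated parameter range.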
Pub Date: 2010-03-08 DOI: 10.1109/DATE.2010.5457187
Prateek Mishra, N. Jha
FinFETs with channel surface along the <110> plane can be easily fabricated by rotating the fins by 45° from the <100> plane. By designing logic gates that have pFinFETs in the <110> plane and nFinFETs in the <100> plane, the gate delay can be reduced by as much as 14% compared to conventional <100> logic gates. The reduction in delay can be traded off for reduced power in FinFET circuits. In this paper, we propose a low-power FinFET-based circuit synthesis methodology based on surface orientation optimization. We study various logic design styles, which depend on different FinFET channel orientations, for synthesizing low-power circuits. We use BSIM, a process/physics-based double-gate model in HSPICE, to derive accurate delay and power estimates. We design layouts of standard library cells containing FinFETs in different orientations to obtain an accurate area estimate for the low-power synthesized netlists after place-and-route. We use a linear-programming-based optimization methodology that gives power-optimized netlists, consisting of oriented gates, under tight delay constraints. Experimental results demonstrate the efficacy of our scheme.
Title: Low-power FinFET circuit synthesis using surface orientation optimization
Published in: 2010 Design, Automation & Test in Europe Conference & Exhibition (DATE 2010)
Pub Date: 2010-03-08 DOI: 10.1109/DATE.2010.5457126
Timo Kerstan, Markus Oertel
Virtualization has become a key technology in the design of embedded systems. Within the scope of virtualization, emulation is a central aspect to overcome the limits induced by the heterogeneity of complex distributed embedded systems. Most of the techniques developed for desktops and servers are not directly applicable to embedded systems due to their strict timing requirements. We show the problems of existing emulation methods when applied to embedded real-time systems and propose a metric to determine the worst-case overhead caused by emulation. Based on this metric, we then propose an emulation method that minimizes the worst-case overhead.
Title: Design of a real-time optimized emulation method
Published in: 2010 Design, Automation & Test in Europe Conference & Exhibition (DATE 2010)
Pub Date: 2010-03-08 DOI: 10.1109/DATE.2010.5457076
Nabeel Iqbal, M. A. Siddique, J. Henkel
This paper addresses the problem of stochastic task execution time estimation agnostic to the process distributions. The proposed method is orthogonal to the application structure and the underlying architecture. We build a time-varying state-space model of the task execution time. In the case of software-pipelined tasks, to refine the estimate quality, the state space is modeled as a Multiple Input Single Output (MISO) system by taking into account the current execution time of the predecessor task. To obtain nearly Bayesian estimates irrespective of the process distribution, the sequential Monte Carlo method is applied; it forms a recursive solution that reduces overhead and comprises time update and correction steps. We experimented on three different platforms, including a multicore one, using a time-parallelized H.264 decoder (a control-dominant, computationally demanding application) and an AES encoder (a pure data-flow application). Results show that the estimates obtained by our method are superior in quality and are up to 68% better in comparison to others.
Title: DAGS: Distribution agnostic sequential Monte Carlo scheme for task execution time estimation
Published in: 2010 Design, Automation & Test in Europe Conference & Exhibition (DATE 2010)
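The recursive time-update and correction structure of a sequential Monte Carlo estimator can be illustrated with a generic bootstrap particle filter. This is a minimal sketch, not the DAGS scheme: the random-walk state model, the Gaussian likelihood (chosen only to keep the example short), the parameter values and the name `smc_estimate` are all assumptions.

```python
import math
import random


def smc_estimate(observations, n_particles=500, process_std=0.5,
                 meas_std=1.0, seed=0):
    """Bootstrap particle filter tracking a slowly drifting execution time.

    Assumed model: x_k = x_{k-1} + w_k (state), z_k = x_k + v_k (observation).
    Returns one weighted-mean estimate per observation.
    """
    rng = random.Random(seed)
    # Initialise the particles around the first observation (assumption).
    particles = [observations[0] + rng.gauss(0.0, meas_std)
                 for _ in range(n_particles)]
    estimates = []
    for z in observations:
        # Time update: propagate every particle through the state model.
        particles = [p + rng.gauss(0.0, process_std) for p in particles]
        # Correction: weight each particle by the likelihood of z.
        weights = [math.exp(-0.5 * ((z - p) / meas_std) ** 2)
                   for p in particles]
        total = sum(weights)
        if total == 0.0:  # all weights underflowed; fall back to uniform
            weights = [1.0] * n_particles
            total = float(n_particles)
        estimates.append(
            sum(w * p for w, p in zip(weights, particles)) / total)
        # Resample so that particles concentrate where the weight is.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return estimates


# Hypothetical measured execution times (arbitrary units).
measured = [10.2, 9.8, 10.1, 10.4, 12.9, 13.1, 12.8, 13.0]
print(smc_estimate(measured))
```

Because each step only needs the previous particle set and the newest measurement, the cost per estimate stays constant, which is the property that makes the recursive form attractive at run time.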
Pub Date: 2010-03-08 DOI: 10.1109/DATE.2010.5457147
Matthias Müller, A. Braun, J. Gerlach, W. Rosenstiel, Dennis Nienhüser, Johann Marius Zöllner, O. Bringmann
This paper describes the design of an automotive traffic sign recognition application. All stages of the design process are shown, starting at system level with an abstract, purely functional model and going down to the final hardware/software implementations on an FPGA. The proposed design flow tackles existing bottlenecks of today's system-level design processes, following an early model-based performance evaluation and analysis strategy that takes into account hardware, software and real-time operating system aspects. The experiments with the traffic sign recognition application show that the developed mechanisms are able to identify appropriate system configurations and to provide a seamless link into the underlying implementation flows.
Title: Design of an automotive traffic sign recognition system targeting a multi-core SoC implementation
Published in: 2010 Design, Automation & Test in Europe Conference & Exhibition (DATE 2010)
Pub Date: 2010-03-08 DOI: 10.1109/DATE.2010.5457212
A. Majid, D. Keezer
This paper describes a multi-gigahertz test module that enhances the performance capabilities of automated test equipment (ATE), such as high-speed signal generation, loopback testing and jitter injection. The test module includes a core logic block consisting of a high-performance FPGA. It is designed to be compatible with existing ATE infrastructure, connecting to the device under test (DUT) via a device interface board (DIB). The core logic block controls the test module's functionality, thereby allowing it to operate independently of the ATE. Exploiting recent advances in FPGA SerDes, the test module is able to generate very high (multi-GHz) data rates at a relatively low cost. In this paper we demonstrate multiplexing logic to generate higher data rates (up to 10 Gbps) and a low-jitter buffered loopback path to carry high-speed signals from the DUT back to the DUT. The test module can generate 10 Gbps signals with ∼32 ps (p-p) jitter, while the loopback path adds ∼20 ps (p-p) jitter to the input signal.
Title: Stretching the limits of FPGA SerDes for enhanced ATE performance
Published in: 2010 Design, Automation & Test in Europe Conference & Exhibition (DATE 2010)
Pub Date: 2010-03-08 DOI: 10.1109/DATE.2010.5456985
Stefan Lämmermann, Jürgen Ruf, T. Kropf, W. Rosenstiel, A. Viehl, Alexander Jesser, L. Hedrich
This paper presents a comprehensive assertion-based verification methodology for the digital, analog and software domains of heterogeneous systems. The proposed methodology combines a novel mixed-signal assertion language and the corresponding automatic verification algorithm. The algorithm translates the heterogeneous temporal properties into observer automata for semi-formal verification. This enables automatic verification of complex heterogeneous properties that cannot be verified by existing approaches. The experimental results show the integration of mixed-signal assertions into a simulation environment and demonstrate the broad applicability and the high value of the evolved solution.
Title: Towards assertion-based verification of heterogeneous system designs
Published in: 2010 Design, Automation & Test in Europe Conference & Exhibition (DATE 2010)
Pub Date: 2010-03-08 DOI: 10.1109/DATE.2010.5457101
E. Kilada, K. Stevens
Creating latency-insensitive or asynchronous designs from clocked designs has the potential benefits of increased modularity and robustness to variations. Several transformations have been suggested in the literature, and each of these requires a handshake control network (examples include synchronous elasticization and desynchronization). Numerous implementations of the control network are possible. This paper reports on an algorithm that has been proven to generate an optimal control network consisting of the minimum number of 2-input join and 2-output fork control components. This can substantially reduce the area and power consumption of a system. The algorithm has been implemented in a CAD tool called CNG. It has been applied to the MiniMIPS processor, showing a 14% reduction in the number of control steering units over a hand-optimized design in a contemporary work.
Title: Control network generator for latency insensitive designs
Published in: 2010 Design, Automation & Test in Europe Conference & Exhibition (DATE 2010)