Modular placement for interposer based multi-FPGA systems
Fubing Mao, Wei Zhang, Bo Feng, Bingsheng He, Yuchun Ma
2016 International Great Lakes Symposium on VLSI (GLSVLSI), May 18, 2016. DOI: https://doi.org/10.1145/2902961.2903025

Novel devices that integrate multiple FPGAs on chip through interposer interconnection have emerged to overcome I/O limits and reduce inter-FPGA communication delay. However, placement on such architectures raises new challenges. First, existing work does not consider detailed models for interposer path wirelength and delay estimation, which can significantly affect placement quality. Second, previous work mostly relies on traditional tile-based placement, which is slow for large designs spanning multiple FPGAs. This paper proposes a fast two-stage modular placement flow for interposer-based multi-FPGA systems that targets delay optimization and incorporates a detailed interposer routing model for wirelength and delay estimation. First, a force-directed method, chosen for its global view, produces an efficient starting solution. Second, simulated annealing (SA), chosen for its efficiency and effectiveness in searching for improved solutions, refines the placement; to speed up refinement, a hierarchical B*-tree (HB*-tree) representation enables fast search and convergence. Experiments demonstrate that the flow achieves efficient solutions in comparable time and scales to designs of different sizes.
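The two-stage idea in the abstract above — a global starting solution refined by simulated annealing — can be illustrated with a generic SA loop over a toy placement problem. This is a minimal sketch that minimizes half-perimeter wirelength (HPWL) on a grid; it is not the paper's HB*-tree flow or its interposer delay model, and all names and parameters below are illustrative.

```python
import math
import random

def hpwl(placement, nets):
    """Half-perimeter wirelength: a standard placement cost proxy."""
    total = 0
    for net in nets:
        xs = [placement[m][0] for m in net]
        ys = [placement[m][1] for m in net]
        total += (max(xs) - min(xs)) + (max(ys) - min(ys))
    return total

def anneal(placement, nets, grid=10, t0=5.0, cooling=0.95, steps=2000):
    """Refine a starting placement: relocate one module per move and
    accept uphill moves with Boltzmann probability exp(-dCost/T)."""
    best = dict(placement)
    best_cost = hpwl(best, nets)
    cur, cur_cost, t = dict(best), best_cost, t0
    for _ in range(steps):
        mod = random.choice(list(cur))
        old = cur[mod]
        cur[mod] = (random.randrange(grid), random.randrange(grid))
        new_cost = hpwl(cur, nets)
        if new_cost <= cur_cost or random.random() < math.exp((cur_cost - new_cost) / t):
            cur_cost = new_cost
            if cur_cost < best_cost:
                best, best_cost = dict(cur), cur_cost
        else:
            cur[mod] = old  # reject: undo the move
        t *= cooling       # geometric cooling schedule
    return best, best_cost
```

In the paper's flow the starting placement would come from the force-directed stage; here any initial dictionary of module coordinates works.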
Modeling and study of two-BDT-nanostructure based sequential logic circuits
P. Marthi, Sheikh Rufsan Reza, N. Hossain, J. Millithaler, M. Margala, I. Íñiguez-de-la-Torre, J. Mateos, T. González
2016 International Great Lakes Symposium on VLSI (GLSVLSI), May 18, 2016. DOI: https://doi.org/10.1145/2902961.2903001

This paper presents a study of digital logic circuits built from a two-BDT ballistic nanostructure and proposes a new D flip-flop (DFF) based on the same nanostructure. The logic structure comprises two ballistic deflection transistors (BDTs), which have been experimentally shown to operate at terahertz frequencies. The nonlinear transfer characteristic of the BDT has been accurately reproduced by Monte Carlo (MC) simulations, with particular attention devoted to surface charges. An analytical model built on the results of these MC simulations has been integrated into a behavioral Verilog-AMS module to confirm the functionality of the circuit design. The module is used to analyze the operating conditions of different combinational circuits and to investigate the feasibility of a DFF design using the BDT nanostructure. Simulation results indicate successful operation of both combinational and sequential circuits built on the two-BDT logic structure under proper biasing of the gate and source terminals. The operating voltages of the proposed DFF are estimated to be ±225 mV.
Computing complex functions using factorization in unipolar stochastic logic
Yin Liu, K. Parhi
2016 International Great Lakes Symposium on VLSI (GLSVLSI), May 18, 2016. DOI: https://doi.org/10.1145/2902961.2902999

This paper addresses computing complex functions using unipolar stochastic logic. Stochastic computing requires only simple logic gates and is inherently fault-tolerant, so such structures are well suited to nanoscale CMOS technologies. Implementations of complex functions incur extremely low hardware complexity compared to traditional two's-complement implementations. The paper proposes an approach based on polynomial factorization: functions are expressed as polynomials derived from Taylor expansion or Lagrange interpolation, and the polynomials are implemented in stochastic logic by factorization. Experimental results on accuracy and hardware complexity compare the proposed designs with previous implementations based on Bernstein polynomials.
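For background on the encoding this paper builds on: in unipolar stochastic logic, a value p in [0, 1] is carried as a bitstream whose bits are 1 with probability p, so a single AND gate multiplies two independent values and a MUX with a 0.5-probability select computes a scaled sum. The sketch below illustrates only these standard primitives, not the paper's factorized polynomial circuits.

```python
import random

def to_stream(p, n, rng):
    """Encode p in [0,1] as a unipolar stochastic bitstream:
    each bit is 1 with probability p."""
    return [1 if rng.random() < p else 0 for _ in range(n)]

def from_stream(bits):
    """Decode: the value is the fraction of 1s in the stream."""
    return sum(bits) / len(bits)

def and_gate(a, b):
    """With independent unipolar streams, AND multiplies the values."""
    return [x & y for x, y in zip(a, b)]

def mux_gate(a, b, s):
    """A MUX driven by a p=0.5 select stream computes (a + b) / 2."""
    return [y if sel else x for x, y, sel in zip(a, b, s)]
```

With long enough streams, multiplying 0.5 by 0.4 through a single AND gate converges to 0.2, which is the fault-tolerance and low-cost appeal the abstract refers to.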
Parameter-importance based Monte-Carlo technique for variation-aware analog yield optimization
Sita Kondamadugula, S. Naidu
2016 International Great Lakes Symposium on VLSI (GLSVLSI), May 18, 2016. DOI: https://doi.org/10.1145/2902961.2903018

The Monte-Carlo method is the method of choice for accurate yield estimation, but standard Monte-Carlo methods, although very accurate, carry a huge computational burden. Recently, a Monte-Carlo method was proposed for parametric yield estimation of digital integrated circuits [13] that achieves significant computational savings at no loss of accuracy by focusing on the statistical variables that have a significant impact on yield. We adapt this technique to analog circuit yield estimation. The inputs to the proposed method are the designable parameters, the uncontrollable statistical variations, and the operating conditions of interest. The technique of [13] operates on a linear model of circuit variations, so we first convexify the nonlinear design constraints to obtain a convex feasible region, and then approximate that region accurately by a polytope, taking tangent hyperplanes at various points on its surface. The hyperplanes give rise to a matrix of design-variable sensitivities, which is used to glean information about the importance of each design variable for yield estimation. Knowing which design variables matter most allows the Monte-Carlo technique to achieve a lower error than standard Monte-Carlo in the same amount of simulation time.
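The core idea — rank statistical variables by sensitivity-derived importance and spend Monte-Carlo effort on the dominant ones — can be sketched on a toy linearized model. This is an illustrative simplification, not the paper's polytope construction; the sensitivity matrix, sigmas, and specs below are made up for the example.

```python
import random

def yield_mc(sens, spec, sigma, n, rng, active=None):
    """Estimate yield = P(sum_j sens[i][j]*dx_j <= spec[i] for all i)
    with independent Gaussian variations dx_j ~ N(0, sigma_j^2).
    If `active` is given, only those variables are sampled; the rest
    are frozen at nominal (zero variation)."""
    nvars = len(sigma)
    if active is None:
        active = range(nvars)
    passed = 0
    for _ in range(n):
        dx = [0.0] * nvars
        for j in active:
            dx[j] = rng.gauss(0.0, sigma[j])
        passed += all(sum(s[j] * dx[j] for j in range(nvars)) <= m
                      for s, m in zip(sens, spec))
    return passed / n

def rank_importance(sens, sigma):
    """Score each variable by its sigma-weighted sensitivity magnitude
    summed over all constraints; high-scoring variables dominate yield."""
    nvars = len(sigma)
    scores = [sigma[j] * sum(abs(s[j]) for s in sens) for j in range(nvars)]
    return sorted(range(nvars), key=lambda j: -scores[j])
```

When one variable dominates the sensitivities, sampling only that variable reproduces the full-model yield almost exactly while cutting the sampled dimensionality, which is the spirit of the savings the abstract describes.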
Approximate differential encoding for energy-efficient serial communication
D. J. Pagliari, E. Macii, M. Poncino
2016 International Great Lakes Symposium on VLSI (GLSVLSI), May 18, 2016. DOI: https://doi.org/10.1145/2902961.2902974

Embedded computing systems include several off-chip serial links, typically used to interface processing elements with peripherals such as sensors, actuators, and I/O controllers. Because of their long physical lines, these connections can contribute significantly to the total energy consumption. On the other hand, many embedded applications are error-resilient: they can tolerate intermediate approximations without a significant impact on the final quality of results. This feature can be exploited in serial buses to explore the trade-off between data approximation and energy consumption. We propose a simple yet very effective approximate encoding for reducing dynamic energy in serial buses. Our approach uses differential encoding as a baseline scheme and extends it with bounded approximations to overcome the intrinsic limitations of differential encoding for data with low temporal correlation. We show that the encoder and decoder for this algorithm can be implemented in hardware with no throughput loss and truly marginal power overheads. Moreover, our approach outperforms state-of-the-art approximate encodings, and for realistic inputs it reaches up to 95% power reduction with less than 1% average error on decoded data.
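One way to realize "differential encoding with bounded approximations" is to transmit deltas clamped to a small two's-complement word, with the encoder mirroring the decoder's reconstruction so the approximation error never accumulates for slowly varying data. The sketch below is a plausible toy version of that idea, not necessarily the paper's exact scheme; the word width and behavior on large jumps are assumptions.

```python
def encode(samples, bits=4):
    """Differential encoding with bounded approximation: each symbol is
    the delta from the decoder's last reconstructed value, clamped to a
    `bits`-bit two's-complement range."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    prev, out = 0, []
    for x in samples:
        delta = max(lo, min(hi, x - prev))
        prev += delta  # mirror the decoder's state, not the raw input
        out.append(delta)
    return out

def decode(deltas):
    """Accumulate deltas to reconstruct the (possibly approximate) data."""
    prev, out = 0, []
    for d in deltas:
        prev += d
        out.append(prev)
    return out
```

Smooth data (small deltas) decodes exactly; an abrupt jump is approximated by the clamp and then tracked over the following symbols, which is how the error stays bounded for low-correlation inputs.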
Ultra-robust null convention logic circuit with emerging domain wall devices
Yu Bai, Bo Hu, W. Kuang, Mingjie Lin
2016 International Great Lakes Symposium on VLSI (GLSVLSI), May 18, 2016. DOI: https://doi.org/10.1145/2902961.2903019

Despite many attractive advantages, Null Convention Logic (NCL) remains a niche largely because of its high implementation costs. Using emerging spintronic devices, this paper proposes a domain-wall-motion-based NCL circuit design methodology that achieves approximately 30× and 8× improvements in energy efficiency and chip layout area, respectively, over an equivalent CMOS design, while maintaining similar delay performance for a 32-bit full adder. These advantages come mostly from exploiting domain-wall-motion physics to natively realize the hysteresis that NCL critically needs. More interestingly, this design choice achieves ultra-high robustness: spintronic device parameters can vary within a predetermined range while the circuit still operates correctly.
Survey of emerging technology based physical unclonable functions
Ilia A. Bautista Adames, J. Das, S. Bhanja
2016 International Great Lakes Symposium on VLSI (GLSVLSI), May 18, 2016. DOI: https://doi.org/10.1145/2902961.2903044

Authentication of electronic devices has become critical. Hardware authentication is one way to enhance the security of a chip; along with software, it makes it harder for an intruder to access a computer, smartphone, or other device without authorization. One way of authenticating a device through hardware is to use fabrication anomalies, which are random and unclonable. This mechanism is called a physical unclonable function (PUF). PUFs are easy to evaluate but hard to predict. The concept has gained popularity over the past decade, as researchers have taken advantage of the randomness of electrical signals to build unique authentication blocks. This survey presents the state-of-the-art devices currently investigated as PUFs. The different technologies are compared in terms of reproducibility, uniqueness, randomness, area, scalability, and compatibility with CMOS, with emphasis on technologies that are emerging and gaining commercial interest. Through these comparisons, we show their applicability to different environments.
Fast thermal simulation using SystemC-AMS
Yukai Chen, S. Vinco, E. Macii, M. Poncino
2016 International Great Lakes Symposium on VLSI (GLSVLSI), May 18, 2016. DOI: https://doi.org/10.1145/2902961.2902975

Among the many options for thermal simulation of digital electronic systems, those based on solving an RC-equivalent circuit of the thermal network are the most popular choice in the EDA community, as they provide a reasonable trade-off between accuracy and complexity. HotSpot in particular has become the de facto standard, although other simulators are also popular. These tools have many benefits, but they are relatively inefficient for long simulation times because of the large number of redundant computations intrinsic to the underlying models. This work shows how a standard description language, namely SystemC with its analog and mixed-signal (AMS) extension, can be used to simulate the equivalent thermal network with accuracy comparable to existing simulators yet much better performance. Results show that SystemC-AMS thermal simulation can outpace HotSpot by 10× to 90×, with the speedup improving as the thermal network grows and with negligible estimation error. As a further advantage, describing functionality and temperature in the same language allows both dimensions to be simulated simultaneously with no co-simulation overhead, enhancing the overall design flow.
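The RC-equivalent thermal networks these simulators solve can be illustrated with a tiny explicit-Euler solver: each node has a thermal capacitance, a conductance to ambient, and coupling conductances to neighboring nodes. This is a generic sketch of that class of model, not HotSpot's grid construction or the paper's SystemC-AMS implementation; the network values are invented for the example.

```python
def simulate(g, c, power, t_amb, dt, steps):
    """Explicit-Euler simulation of an RC-equivalent thermal network.
    g[i][j] (i != j) is the coupling conductance between nodes i and j,
    g[i][i] the conductance from node i to ambient, c[i] the thermal
    capacitance, and power[i] the dissipated power at node i."""
    n = len(c)
    temp = [t_amb] * n
    for _ in range(steps):
        flow = []
        for i in range(n):
            # net heat flow into node i: dissipated power minus losses
            q = power[i] - g[i][i] * (temp[i] - t_amb)
            for j in range(n):
                if j != i:
                    q -= g[i][j] * (temp[i] - temp[j])
            flow.append(q)
        # C * dT/dt = Q  =>  T += dt * Q / C
        temp = [t + dt * q / ci for t, q, ci in zip(temp, flow, c)]
    return temp
```

For a single node the steady state is T = T_amb + P/g, which gives a quick sanity check on any such solver; the redundant per-step work visible here is also what makes long simulations expensive, the inefficiency the abstract targets.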
Asynchronous high speed serial links analysis using integrated charge for event detection
Aditya Dalakoti, Carrie Segal, Merritt Miller, F. Brewer
2016 International Great Lakes Symposium on VLSI (GLSVLSI), May 18, 2016. DOI: https://doi.org/10.1145/2902961.2902998

We present a metric for event detection targeted at the analysis of CMOS asynchronous serial data links. The metric is used to analyze signaling strategies that allow coincident or nearly coincident detection of both data and event timing. It predicts that the CMOS link signaling mechanism has substantial implicit tolerance to dispersion and intersymbol interference (ISI) compared to conventionally timed links; in fact, it predicts correct link operation in situations where eye-diagram techniques predict link failure. Practical operating margins and metrics are described and evaluated for PCB and cabling solutions, suggesting that 10+ Gb/s low-power asynchronous links could be implemented in 130 nm CMOS technology.
Guiding power/quality exploration for communication-intense stream processing
H. Tabkhi, Majid Sabbagh, G. Schirner
2016 International Great Lakes Symposium on VLSI (GLSVLSI), May 18, 2016. DOI: https://doi.org/10.1145/2902961.2903004

In this paper, we explore the power/quality trade-off for streaming applications, shifting the focus from the computation to the communication aspects of the design. The paper proposes a systematic exploration methodology to formulate and traverse the power/quality trade-off for the class of adaptive streaming applications. The formalization enables a procedural transition from a set of design requirements to architecture goals, which can then be realized through design choices yielding system designs that meet the initial requirements. The reported results are based on an actual implementation of Mixture-of-Gaussians (MoG) background subtraction on a Xilinx Zynq platform.