The paper introduces the foundations of the "Flexible Routing Method", which belongs to the topologically-geometric type. It develops the idea of dividing the routing problem into two successive stages: topological and geometrical. In the first stage, a discrete topological model, such as a Delaunay triangulation and/or Voronoi polygons, is used to describe the topology. Explicit and implicit topology models are proposed that describe the relative topological location of the nets without specifying their geometrical characteristics. In the second stage, nets of arbitrary configuration can be laid out: orthogonal, piecewise linear, or curvilinear, at arbitrary angles and with arbitrary widths.
Topologically-geometric routing. R. Bazylevych, M. Palasinski, L. Bazylevych. 2016 ACM/IEEE International Workshop on System Level Interconnect Prediction (SLIP), 2016-06-04. https://doi.org/10.1145/2947357.2947367
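To make the first-stage model in the abstract above more concrete, here is a minimal Python sketch that builds a Delaunay triangulation by brute force from the empty-circumcircle property. It is not the paper's algorithm: the function names, the sample pin coordinates, and the O(n^4) construction are assumptions chosen for brevity, and a practical router would likely use an incremental or divide-and-conquer method instead.

```python
from itertools import combinations

def orient(a, b, c):
    """Twice the signed area of triangle abc (positive if counter-clockwise)."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])

def in_circumcircle(a, b, c, d):
    """True if point d lies strictly inside the circumcircle of triangle abc."""
    # The determinant test assumes a counter-clockwise triangle, so reorder if needed.
    if orient(a, b, c) < 0:
        b, c = c, b
    ax, ay = a[0] - d[0], a[1] - d[1]
    bx, by = b[0] - d[0], b[1] - d[1]
    cx, cy = c[0] - d[0], c[1] - d[1]
    det = ((ax * ax + ay * ay) * (bx * cy - cx * by)
           - (bx * bx + by * by) * (ax * cy - cx * ay)
           + (cx * cx + cy * cy) * (ax * by - bx * ay))
    return det > 0

def delaunay_triangles(points):
    """Return index triples whose circumcircle contains no other point (brute force)."""
    triangles = []
    for i, j, k in combinations(range(len(points)), 3):
        a, b, c = points[i], points[j], points[k]
        if orient(a, b, c) == 0:
            continue  # skip degenerate (collinear) triples
        others = (points[m] for m in range(len(points)) if m not in (i, j, k))
        if not any(in_circumcircle(a, b, c, d) for d in others):
            triangles.append((i, j, k))
    return triangles

def triangulation_edges(triangles):
    """Undirected edges of the triangulation as sorted index pairs."""
    edges = set()
    for i, j, k in triangles:
        edges.update({tuple(sorted(e)) for e in ((i, j), (j, k), (i, k))})
    return sorted(edges)

# Hypothetical pin locations: four corners of a square plus its centre.
pins = [(0, 0), (2, 0), (2, 2), (0, 2), (1, 1)]
triangles = delaunay_triangles(pins)
print(triangles)
print(triangulation_edges(triangles))
```

The result holds only adjacency information (which pins share a triangle or an edge). In the spirit of the topological stage, a net could then be recorded as the sequence of triangulation edges it crosses, with no coordinates attached; that representation is an illustration, not something the abstract specifies.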
Ishan G. Thakkar, S. V. R. Chittamuru, S. Pasricha
Photonic devices fabricated with back-end compatible silicon photonic (BCSP) materials can provide independence from the complex CMOS front-end compatible silicon photonic (FCSP) process and significantly enhance photonic network-on-chip (PNoC) architecture performance. In this paper, we present a detailed comparative analysis of a number of design tradeoffs for CMOS front-end and back-end compatible devices for silicon photonic interconnects. A cross-layer optimization of multiple device-level and link-level design parameters is performed to enable the design of energy-efficient on-chip photonic interconnects using BCSP devices. The optimized BCSP on-chip links deliver better energy efficiency and higher aggregate bandwidth than FCSP on-chip links, in spite of the inferior opto-electronic properties of BCSP devices. Our experimental analysis compares the use of BCSP and FCSP links at the architecture level and shows that the optimized BCSP-based Firefly PNoC achieves 1.15x greater throughput and 12.4% less energy-per-bit on average than the optimized FCSP-based Firefly PNoC. Similarly, the optimized BCSP-based Corona PNoC achieves 3.5x greater throughput and 39.5% less energy-per-bit on average than the optimized FCSP-based Corona PNoC.
A comparative analysis of front-end and back-end compatible silicon photonic on-chip interconnects. Ishan G. Thakkar, S. V. R. Chittamuru, S. Pasricha. 2016 ACM/IEEE International Workshop on System Level Interconnect Prediction (SLIP), 2016-06-04. https://doi.org/10.1145/2947357.2947362
Dynamic voltage scaling (DVS) has become one of the most effective approaches to achieving ultra-low-power SoCs. To eliminate timing errors arising from DVS, several error-resilient circuit design techniques have been proposed to detect and/or correct timing violations. The most recently proposed technique, time-borrowing-and-local-boosting (TBLB), has the advantage of lower power consumption and less performance degradation because it requires no pipeline stalls. On the other hand, to make the best use of the TBLB technique, a special timing requirement for TBLB latches must be considered in the physical design process. To address this issue, a novel reliability-aware latch clustering method for low-power TBLB resilient circuits is introduced. Experimental results show that the proposed approach is very effective in reducing the delay of both combinational and error-detection circuits, which indicates better circuit reliability.
Latch clustering for minimizing detection-to-boosting latency toward low-power resilient circuits. Chih-Cheng Hsu, Mark Po-Hung Lin, M. Hashimoto. 2016 ACM/IEEE International Workshop on System Level Interconnect Prediction (SLIP), 2016-06-04. https://doi.org/10.1145/2947357.2947364
Mohammad A. Ahmed, S. Mohapatra, M. Chrzanowska-Jeske
A very important challenge in designing through-silicon via (TSV)-based 3D ICs is to accurately estimate, through all stages of the physical design, the interconnect delay, which is strongly dependent on the layout of the 3D IC. The earlier in the design process and the more accurately this can be done, the better the design decisions that can be made. Incorporating an optimal buffer insertion approach in the early layout design stage can significantly reduce delay and power in 3D circuits. Unlike in 2D ICs, buffer insertion in 3D ICs needs careful consideration of additional design constraints in interconnects spanning multiple device layers. In this paper, we propose a novel buffer insertion scheme for delay optimization during 3D floorplanning. For individual 3D nets, the algorithm efficiently computes the desired distance between consecutive buffers (buffer insertion length), which depends on the non-negligible TSV RC delay contribution of the net. This technique of variable buffer insertion length, used during floorplanning, allows optimizing buffers for individual 3D interconnects and reduces overall buffer count by up to 25% and total power consumption by up to 12%. The proposed approach also includes a method for buffer insertion around a TSV, based on the TSV location and its RC delay. Our experiments suggest that the proposed method of buffer planning around TSVs avoids delay violations and reduces delay across TSVs by up to 11% while minimizing buffer usage. The paper also analyzes the impact of key parameters such as buffer size and TSV contact resistance on the delay and power dissipation in 3D interconnects.
Buffered interconnects in 3D IC layout design. Mohammad A. Ahmed, S. Mohapatra, M. Chrzanowska-Jeske. 2016 ACM/IEEE International Workshop on System Level Interconnect Prediction (SLIP), 2016-06-04. https://doi.org/10.1145/2947357.2947366
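The notion of a buffer insertion length that depends on TSV parasitics can be illustrated with the textbook Elmore delay model. The sketch below is not the authors' algorithm: the `Tech` parameters, the lumped TSV model (a series resistance followed by a capacitance to ground, placed at the end of one segment), the equal-length segments, and the unit-free sample values are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass

@dataclass(frozen=True)
class Tech:
    r: float   # wire resistance per unit length (assumed value)
    c: float   # wire capacitance per unit length (assumed value)
    rb: float  # buffer output resistance
    cb: float  # buffer input capacitance
    tb: float  # buffer intrinsic delay

def segment_delay(tech, length, r_tsv=0.0, c_tsv=0.0):
    """Elmore delay of one buffer driving a wire segment, an optional TSV, and the next buffer."""
    load = tech.cb + c_tsv  # capacitance seen at the far end of the wire
    return (tech.tb
            + tech.rb * (tech.c * length + load)
            + tech.r * length * (tech.c * length / 2 + load)
            + r_tsv * load)

def optimal_segment_length(tech):
    """Classic 2D result: segment length that minimizes delay per unit length (no TSV)."""
    return math.sqrt(2 * (tech.tb + tech.rb * tech.cb) / (tech.r * tech.c))

def buffered_delay(tech, total_length, n_segments, r_tsv=0.0, c_tsv=0.0):
    """Delay of a net cut into equal segments, one of which ends in a TSV."""
    length = total_length / n_segments
    plain = segment_delay(tech, length)
    with_tsv = segment_delay(tech, length, r_tsv, c_tsv)
    return (n_segments - 1) * plain + with_tsv

def best_segment_count(tech, total_length, max_segments, r_tsv=0.0, c_tsv=0.0):
    """Exhaustively pick the segment count (and its delay) with the smallest total delay."""
    candidates = ((n, buffered_delay(tech, total_length, n, r_tsv, c_tsv))
                  for n in range(1, max_segments + 1))
    return min(candidates, key=lambda pair: pair[1])

tech = Tech(r=1.0, c=2.0, rb=4.0, cb=1.0, tb=0.0)
print(optimal_segment_length(tech))
print(best_segment_count(tech, total_length=8.0, max_segments=10))
print(best_segment_count(tech, total_length=8.0, max_segments=10, r_tsv=1.0, c_tsv=1.0))
```

Because the TSV adds both load and series resistance to its segment, the delay of a given buffering changes once `r_tsv` and `c_tsv` are non-zero, which is the effect a per-net insertion length would have to account for.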
Carrie Segal, Aditya Dalakoti, Merritt Miller, F. Brewer
Hardware neuromorphic systems are challenged to achieve biologically realistic levels of interconnectivity. When building a physical implementation of a neural net, the properties of the media immediately impose limits on the number of interconnects and the available timing options. The design of any system must consider the energy and area costs associated with the physical layout of neuron core connectivity: first, by accepting the wiring limits imposed by Rent's rule, and second, by understanding the temporal overhead introduced by routing. The presented results show the energy-area trade-off for a model of a neuromorphic system with event-driven interconnections. The low area overhead of the asynchronous pulse-mode links creates an attractive opportunity for a digital neuromorphic system with a connectivity model closer to the existing software models of neural nets.
Connectivity effects on energy and area for neuromorphic system with high speed asynchronous pulse mode links. Carrie Segal, Aditya Dalakoti, Merritt Miller, F. Brewer. 2016 ACM/IEEE International Workshop on System Level Interconnect Prediction (SLIP), 2016-06-04. https://doi.org/10.1145/2947357.2947365
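The wiring limit mentioned in the abstract above, Rent's rule, relates the number of external terminals T of a block to the number of elements G it contains as T = t * G^p. The following Python sketch contrasts that budget with the link count of an all-to-all network; treating each neuron core as one "element" and the sample values of `t` and `p` are assumptions for illustration, not figures from the paper.

```python
def rent_terminals(elements, t, p):
    """Rent's rule: expected external terminals T = t * G**p for a block of G elements."""
    return t * elements ** p

def all_to_all_links(elements):
    """Directed links needed if every element connects to every other one."""
    return elements * (elements - 1)

# Hypothetical parameters: t terminals per element, Rent exponent p.
t, p = 4, 0.5
for cores in (16, 256, 1024):
    budget = rent_terminals(cores, t, p)
    wanted = all_to_all_links(cores)
    print(f"{cores:5d} cores: Rent budget {budget:8.1f} terminals, all-to-all needs {wanted} links")
```

The gap between the two columns grows quickly with the core count, which is one way to see why biologically realistic connectivity is hard to wire directly.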
Long-distance data communication over multi-hop wireline paths in conventional Networks-on-Chips (NoCs) causes high energy consumption and degradation in bandwidth. Wireless interconnects in the millimeter-wave band have emerged as an energy-efficient interconnection paradigm for multi-core chips interconnected with NoCs. However, spatial variations in traffic distribution and temporal variations in workloads can exert variable bandwidth demands on the NoC fabric. Wireless interconnects, which do not require a physical layout of interconnects, can be utilized to mitigate this issue. To dynamically allocate variable bandwidth to the wireless transceivers depending on demand, a dynamic and efficient Medium Access Control (MAC) mechanism is needed to grant access to the on-chip wireless communication channel. In this paper, we design a history-based predictor that predicts the bandwidth demand of the wireless nodes in the wireless NoC. Based on these predicted demands, we propose two MAC mechanisms that dynamically allocate bandwidth to the wireless transceivers. Through system-level simulations, we show that the demand-aware MAC mechanisms are more energy efficient and capable of sustaining higher data bandwidth in wireless NoCs.
A demand-aware predictive dynamic bandwidth allocation mechanism for wireless network-on-chip. N. Mansoor, Md Shahriar Shamim, A. Ganguly. 2016 ACM/IEEE International Workshop on System Level Interconnect Prediction (SLIP), 2016-06-04. https://doi.org/10.1145/2947357.2947361
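The idea of predicting demand from history and then dividing the wireless channel accordingly can be sketched in a few lines of Python. The abstract does not say which predictor or which allocation rule the authors use, so the exponentially weighted moving average, the largest-remainder slot split, and all names and sample numbers below are stand-in assumptions.

```python
def ewma_predict(history, alpha=0.5):
    """Predict the next demand as an exponentially weighted moving average (oldest sample first)."""
    estimate = float(history[0])
    for demand in history[1:]:
        estimate = alpha * demand + (1 - alpha) * estimate
    return estimate

def allocate_slots(predicted, total_slots):
    """Split a frame of time slots in proportion to predicted demand (largest-remainder rounding)."""
    n = len(predicted)
    total = sum(predicted)
    if total > 0:
        weights = list(predicted)
    else:
        weights, total = [1.0] * n, float(n)  # no demand anywhere: fall back to an equal split
    shares = [w * total_slots / total for w in weights]
    slots = [int(s) for s in shares]
    leftover = total_slots - sum(slots)
    # Hand the remaining slots to the nodes with the largest fractional share.
    by_remainder = sorted(range(n), key=lambda i: shares[i] - slots[i], reverse=True)
    for i in by_remainder[:leftover]:
        slots[i] += 1
    return slots

# Hypothetical per-node demand histories (e.g. flits requested in past frames).
histories = [[4, 4, 8], [2, 2, 2], [0, 0, 0]]
predicted = [ewma_predict(h) for h in histories]
print(predicted)
print(allocate_slots(predicted, total_slots=8))
```

A static MAC would give each of the three nodes the same share regardless of load; here the idle node receives no slots in the frame, which is the kind of demand awareness the abstract argues for.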
3DICs with multiple tiers are expected to achieve large benefits (e.g., in terms of power and area) compared to conventional planar designs. However, few if any previous works study upper bounds on the power and area benefits of 3DIC integration with multiple tiers. In this work, we use the concept of implementation with infinite dimension to estimate the upper bound on the power and area benefits of 3DICs. We observe that the maximum power benefit evaluated with infinite dimension is only 18% for particular designs. Such benefits are further reduced under the assumption of inter-tier variation. In addition, we study the power of designs across various dimensions (e.g., pseudo-1D, 2D, and 3D with two, three, and four tiers). We observe that the sensitivity of design power to implementation with different dimensions correlates well with the placement-based Rent parameter of the netlist. Therefore, the placement-based Rent parameter can possibly serve as a simple indicator of 3D power benefit. Our study also shows that netlist synthesis and optimization should be aware of the target implementation dimension (e.g., 2D versus 3D).
Revisiting 3DIC Benefit with Multiple Tiers. W. Chan, A. Kahng, Jiajia Li. 2016 ACM/IEEE International Workshop on System Level Interconnect Prediction (SLIP), 2016-06-04. https://doi.org/10.1145/2947357.2947363
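A Rent parameter such as the one referred to in the abstract above is commonly estimated by fitting Rent's rule, T = t * G^p, to (block size, terminal count) samples on a log-log scale. The Python sketch below shows only that fitting step; how the samples are extracted from a placed netlist is not described in the abstract, so the input format and the synthetic sample data are assumptions.

```python
import math

def fit_rent(samples):
    """Least-squares fit of log T = log t + p * log G; returns (t, p).

    `samples` is a list of (G, T) pairs: cells in a region and nets crossing its boundary.
    """
    xs = [math.log(g) for g, _ in samples]
    ys = [math.log(terminals) for _, terminals in samples]
    n = len(samples)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    p = sxy / sxx                      # slope on the log-log plot is the Rent exponent
    t = math.exp(mean_y - p * mean_x)  # intercept gives the terminals-per-cell constant
    return t, p

# Synthetic samples generated from t = 3, p = 0.6 (not measured data).
samples = [(g, 3 * g ** 0.6) for g in (4, 16, 64, 256)]
t, p = fit_rent(samples)
print(f"t = {t:.3f}, p = {p:.3f}")
```

With real placement data the points scatter around the line, and the fitted `p` is the single number that could be compared across netlists.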
Enes Eken, Ismail Bayram, Yaojun Zhang, Bonan Yan, Wenqing Wu, Hai Helen Li, Yiran Chen
In recent years, Spin-Transfer Torque Random Access Memory (STT-RAM) has attracted significant attention from both industry and academia due to attractive attributes such as small cell area and non-volatility. However, the long switching time and large programming energy of the Magnetic Tunneling Junction (MTJ) remain major challenges in STT-RAM designs. To overcome this problem, a Spin-Hall Effect (SHE) assisted STT-RAM structure (SHE-RAM) has recently been invented. In this work, we investigate two possible SHE-RAM designs that differ in their write access operations, namely High Density SHE-RAM and Disturbance Free SHE-RAM. In High Density SHE-RAM, the SHE current is shared by the entire bit line. Such a structure removes the SHE control transistor from each SHE-RAM cell and hence substantially reduces the memory cell area. In Disturbance Free SHE-RAM, each memory cell contains two transistors to remove the disturbance to the unselected bits and eliminate possible erroneous flipping of the bits.
Spin-hall assisted STT-RAM design and discussion. Enes Eken, Ismail Bayram, Yaojun Zhang, Bonan Yan, Wenqing Wu, Hai Helen Li, Yiran Chen. 2016 ACM/IEEE International Workshop on System Level Interconnect Prediction (SLIP), 2015-05-11. https://doi.org/10.1145/2947357.2947360
[Front Matter]. A. Fu, A. Halevy. 2016 ACM/IEEE International Workshop on System Level Interconnect Prediction (SLIP). https://doi.org/10.1525/jer.2007.2.1.fm. Presents the front cover of this conference.