Pub Date: 2015-04-13 DOI: 10.1109/CoolChips.2015.7158661
Yusuke Shirota, S. Yoshimura, Tatsunori Kanai
Aggressive use of low-power modes in embedded systems built from emerging non-volatile devices, or from low-power devices that retain compute state, can greatly reduce idle-state power consumption. In general, however, non-volatile devices require comparatively large power to switch between stable states. Therefore, to realize extremely low-power mobile platforms in which a powerful multimedia application processor runs solely on photovoltaic power, reducing active-state power consumption is the next issue. Replacing power-hungry conventional LCDs with non-volatile displays is inevitable in realizing such platforms, but naive replacement is insufficient: low-power control that is cognizant of non-volatile device properties is necessary [2]. We propose a display update request scheduling scheme designed for a promising non-volatile display, the Electronic Paper Display (EPD), and give a detailed analysis of its power consumption. At the device driver level, the proposed scheme dynamically rearranges update requests that are ill-suited to EPDs into localized, collision-free, low-power requests, reducing an EPD-based tablet's energy consumption by up to 49% without requiring application-specific modifications.
Title: Electronic Paper Display update scheduler for extremely low power non-volatile embedded systems
Published in: 2015 IEEE Symposium in Low-Power and High-Speed Chips (COOL CHIPS XVIII)
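The abstract above does not describe the scheduling algorithm itself. As a rough illustration of what making update requests "collision-free" could mean, the following sketch coalesces overlapping update rectangles until no two pending regions share a pixel. The rectangle representation, the function names, and the interpretation of a collision as a region overlap are all assumptions made for this example, not details taken from the paper.

```python
from typing import List, Tuple

# Hypothetical update region: (x0, y0, x1, y1), half-open (x1 and y1 are exclusive).
Rect = Tuple[int, int, int, int]


def overlaps(a: Rect, b: Rect) -> bool:
    """Return True if two half-open rectangles share at least one pixel."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def union(a: Rect, b: Rect) -> Rect:
    """Return the smallest rectangle that covers both a and b."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def coalesce(requests: List[Rect]) -> List[Rect]:
    """Merge overlapping update regions until no two regions collide."""
    pending = list(requests)
    merged: List[Rect] = []
    while pending:
        current = pending.pop()
        changed = True
        # A grown rectangle may start to overlap regions it did not touch
        # before, so repeat until a full pass finds no overlap.
        while changed:
            changed = False
            for other in merged[:]:
                if overlaps(current, other):
                    merged.remove(other)
                    current = union(current, other)
                    changed = True
        merged.append(current)
    return sorted(merged)


if __name__ == "__main__":
    print(coalesce([(0, 0, 10, 10), (5, 5, 15, 15), (100, 100, 110, 110)]))
```

Merging into bounding boxes trades extra redrawn pixels for fewer, non-overlapping requests; whether that trade saves energy on a real EPD depends on device properties this sketch does not model.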
Pub Date: 2015-04-13 DOI: 10.1109/CoolChips.2015.7158654
Shinpei Kato
Autonomous driving is becoming more and more multidisciplinary. It involves not only vehicular technologies but also computing, networking, and data management technologies. Of particular interest is the trade-off between in-vehicle computing and cloud computing in supporting the artificial intelligence of autonomous driving. Perception and planning for autonomy require high-performance computing, while battery-driven vehicles must also consider power constraints. Offloading such computations onto the cloud could be a drastic solution, though the safety and reliability of driving remain major concerns. Data management is also a grand challenge of autonomous driving. In particular, high-precision maps are considered to be the common infrastructure for self-localizing vehicles and efficiently routing them to their destinations. Unfortunately, current navigation systems are not well suited to high-precision maps, and the sustainable management of map data also remains an open problem. These problems are not specific to any single technology but need to be addressed through tight coordination of multiple technologies. This panel gathers experts from multiple areas across vehicles, computing platforms, maps, and consumer electronics.
Title: Panel discussions computing technology for autonomous driving
Pub Date: 2015-04-13 DOI: 10.1109/CoolChips.2015.7158657
A. Deval, Avinash Ananthakrishnan, Craig Forbell
The desire to deliver breakthrough performance in a tablet form factor required several innovations in the 14nm Intel® flagship Core™ processor (Broadwell). Better frequency control algorithms, including duty cycling of the graphics cores, were developed to improve energy efficiency. New power sharing algorithms were developed to maximize the performance of multiple compute domains within tight thermal and power delivery constraints. These innovations resulted in up to a 50% increase in performance and up to a 25% improvement in battery life over a Haswell system thermally constrained to a 4.5W fanless form factor.
Title: Power management on 14 nm Intel® Core™ M processor
Pub Date: 2015-04-13 DOI: 10.1109/CoolChips.2015.7158664
Aiko Iwasaki, Yuichiro Shibata, K. Oguri, Ryuichi Harasawa
This paper proposes an FPGA-based soft-core processor architecture equipped with a configurable accelerator to speed up GF(2^m) arithmetic for elliptic curve cryptography (ECC) systems. Focusing on the fact that the number of operations required for GF(2^m) arithmetic is influenced by the relationship between the irreducible polynomial and the machine word size, we propose an approach in which the word size of the accelerator is tailored to a given irreducible polynomial. The evaluation results reveal that the performance and the energy efficiency of GF(2^m) multiplication, including reduction, can be improved by up to 6.67 times and 5.24 times, respectively.
Title: An energy-efficient FPGA-based soft-core processor with a configurable word size ECC arithmetic accelerator
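For readers unfamiliar with the operation being accelerated, here is a minimal bit-serial sketch of GF(2^m) multiplication followed by reduction modulo an irreducible polynomial. It is a textbook software formulation, not the accelerator from the paper, and it does not model the word-size effects the abstract describes. The function name `gf2m_mul` and the integer encoding (bit i holds the coefficient of x^i) are assumptions of this example.

```python
def gf2m_mul(a: int, b: int, poly: int, m: int) -> int:
    """Multiply a and b in GF(2^m).

    Field elements are integers whose bit i is the coefficient of x^i.
    `poly` is the irreducible polynomial, including its x^m term.
    """
    # Step 1: carry-less (XOR-based) polynomial multiplication.
    prod = 0
    while b:
        if b & 1:
            prod ^= a
        a <<= 1
        b >>= 1

    # Step 2: reduction. Clear every bit at position >= m, from the top down,
    # by XOR-ing in the irreducible polynomial aligned to that bit.
    for bit in range(prod.bit_length() - 1, m - 1, -1):
        if (prod >> bit) & 1:
            prod ^= poly << (bit - m)
    return prod


if __name__ == "__main__":
    # Known test vector from the AES field GF(2^8), x^8 + x^4 + x^3 + x + 1.
    print(hex(gf2m_mul(0x57, 0x83, 0x11B, 8)))  # 0xc1
```

The check uses the small AES field only because its test vectors are widely published; ECC fields are much larger (for example m = 163), which is where the number of machine words spanned by an element, and by the reduction step, starts to matter.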
Pub Date: 2015-04-13 DOI: 10.1109/CoolChips.2015.7158655
Johannes Maximilian Kühn, H. Amano, O. Bringmann, W. Rosenstiel
New SOI processes offer unprecedented flexibility with regard to low power and performance. The use of fine-grained body biasing in STMicro's 28nm UTBB-FDSOI is evaluated for a Dynamically Reconfigurable Processor design in a frequency scaling scenario. Three different strategies are evaluated for the Processing Elements: static, programmable, and dynamic body biasing. Fine-grained body biasing significantly mitigates the increased leakage currents of forward body biasing, by between 42.85% and 64.5% on average. This makes static body biasing a viable low-cost option and makes dynamic body biasing worthwhile even over short time periods.
Title: Fined-grained body biasing for frequency scaling in advanced SOI processes
Pub Date: 2015-04-13 DOI: 10.1109/CoolChips.2015.7158531
Injoon Hong, Dongjoo Shin, Youchang Kim, Kyeongryeol Bong, Seongwook Park, K. Lee, H. Yoo
In this paper, a low-power real-time gaze-activated object recognition processor is proposed for a battery-powered smart glasses system. For high energy efficiency, we propose a keypoint-level pipelined architecture that increases hardware utilization, resulting in a significant power reduction for the real-time recognition processor. In addition, a low-power gaze-activation image sensor with a mixed-mode architecture is proposed for estimating the glasses user's gaze. As a result, only the small image region at which the glasses user is looking needs to be processed by the recognition processor, leading to a further power reduction. The proposed object recognition processor achieves 30fps real-time performance with only 75mW power consumption, which is 3.5x and 4.4x lower than the state-of-the-art works.
Title: A keypoint-level parallel pipelined object recognition processor with gaze activation image sensor for mobile smart glasses system
Pub Date: 2015-04-13 DOI: 10.1109/CoolChips.2015.7158665
Jun Zhou, Huawei Li, Tiancheng Wang, Ying Wang, Xiaowei Li
Deadlock is a common problem in 3D networks-on-chip. In this paper, we propose TURO, a lightweight, deadlock-free, turn-guided routing scheme that requires no virtual channels. TURO is a minimal routing scheme guided by a new 3D turn model, NeoOE. Theoretical analysis and experimental results show that TURO offers improved adaptivity, higher performance, and lower overhead compared with state-of-the-art routing schemes that use other 3D turn models.
Title: TURO: A lightweight turn-guided routing scheme for 3D NoCs
Pub Date: 2015-04-13 DOI: 10.1109/CoolChips.2015.7158656
Hayate Okuhara, K. Usami, H. Amano
A leakage current monitor circuit was developed for dynamic back gate bias control of CMOS LSIs built with Silicon on Thin BOX (SOTB) technology. With SOTB technology, sensors or wearable devices can suppress leakage power by applying a deep reverse body bias when they are not in use. Once an event occurs, they must switch to the operational mode by changing the body bias quickly. According to a real chip evaluation, this takes hundreds of microseconds, and the wake-up time is difficult to estimate. The proposed detector, which uses a leakage current monitor circuit, guarantees that the target module is ready to operate. The target body bias voltage for operation can be controlled by the bias voltage of the detector domain, which is computed in advance from an expression. SPICE simulation validates the formulation and shows that the power overhead is only 42.7-42.9nW at room temperature. Compensation equations for various temperatures are also shown.
Title: A leakage current monitor circuit using silicon on thin BOX MOSFET for dynamic back gate bias control
Pub Date: 2015-04-13 DOI: 10.1109/CoolChips.2015.7158653
A. Zhang, Jun Yao, Y. Nakashima
K-means is a method of vector quantization that is now widely used for clustering analysis in massive data mining. Because it is heavily compute-intensive, iteratively re-computing and sorting distances, executing k-means takes a huge amount of time, especially when processing large graph data such as real-world social networks. This paper studies an alternative method that emulates k-clustering from another view, in which the vertices of a graph are partitioned into the k farthest clusters. This method can be implemented in a breadth-first-search (BFS) form and then becomes easily parallelizable. Our results show that our BFS-based k-clustering is more than 100x faster than the traditional partitioning in the open-source GraphLab project.
Title: Lowering the complexity of k-means clustering by BFS-dijkstra method for graph computing
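The abstract above does not spell out how seeds are chosen or how the BFS and Dijkstra steps are combined. As a loose illustration of why a BFS formulation partitions a graph cheaply, the sketch below runs a multi-source BFS from k given seed vertices and assigns each reachable vertex to the seed whose frontier reaches it first. The seed list, the adjacency-dictionary format, and the name `bfs_partition` are assumptions of this example, not the paper's method.

```python
from collections import deque
from typing import Dict, Hashable, Iterable, List


def bfs_partition(
    adj: Dict[Hashable, List[Hashable]], seeds: Iterable[Hashable]
) -> Dict[Hashable, int]:
    """Label every reachable vertex with the index of its nearest seed.

    Distance is measured in hops. Ties go to the seed whose frontier
    reaches the vertex first in queue order.
    """
    label: Dict[Hashable, int] = {}
    queue = deque()
    for idx, seed in enumerate(seeds):
        if seed not in label:
            label[seed] = idx
            queue.append(seed)

    # All seeds expand level by level from one shared queue, so each
    # vertex and edge is visited once: O(V + E) work in total.
    while queue:
        v = queue.popleft()
        for w in adj.get(v, []):
            if w not in label:
                label[w] = label[v]
                queue.append(w)
    return label


if __name__ == "__main__":
    # Path graph 0-1-2-3-4-5 with seeds at both ends.
    path = {i: [j for j in (i - 1, i + 1) if 0 <= j <= 5] for i in range(6)}
    print(bfs_partition(path, [0, 5]))
```

Unlike k-means, this pass needs no repeated distance recomputation, and each BFS level can in principle be expanded in parallel.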
Pub Date: 2015-04-13 DOI: 10.1109/CoolChips.2015.7158532
S. Rethinagiri, Oscar Palomar, J. Moreno, O. Unsal, A. Cristal
Face recognition applications are now widely used in various industries such as traffic, safety, and medical engineering. In this paper, we propose a power- and energy-efficient heterogeneous platform to accelerate face recognition applications. To achieve this efficiency, we propose a novel hybrid platform that consists of a Xilinx Zynq (ARM+FPGA) and an NVidia Jetson TK1 (ARM+GPU) coupled with a PCIe card. For this application, we optimized local binary pattern and eigenvalue based face detection and recognition, achieving a speedup of 69x over sequential execution on the ARM core, 4.8x over the Zynq platform (ARM+FPGA), and 3.2x over the NVidia platform (ARM+GPU), along with 40% better energy efficiency than sequential execution.
Title: An energy efficient hybrid FPGA-GPU based embedded platform to accelerate face recognition application
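The local binary pattern (LBP) operator mentioned in the abstract above is simple enough to show directly. The sketch below computes the basic 8-neighbour LBP code for one interior pixel of a grayscale image stored as a list of rows. It is a plain software reference, unrelated to the FPGA/GPU implementation in the paper; the clockwise bit order starting at the top-left neighbour and the name `lbp_code` are conventions chosen for this example.

```python
from typing import List, Sequence

# Neighbour offsets (row, col), clockwise from the top-left pixel.
# Bit 0 corresponds to the first offset; this ordering is an assumption.
_OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(img: Sequence[Sequence[int]], r: int, c: int) -> int:
    """Return the 8-bit LBP code of interior pixel (r, c).

    A bit is set when the corresponding neighbour is >= the centre pixel.
    """
    center = img[r][c]
    code = 0
    for bit, (dr, dc) in enumerate(_OFFSETS):
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(img: Sequence[Sequence[int]]) -> List[int]:
    """Histogram of LBP codes over all interior pixels (256 bins)."""
    hist = [0] * 256
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            hist[lbp_code(img, r, c)] += 1
    return hist


if __name__ == "__main__":
    patch = [[5, 9, 1],
             [4, 6, 7],
             [2, 3, 8]]
    print(lbp_code(patch, 1, 1))  # 26: neighbours 9, 7 and 8 are >= 6
```

Each pixel's code depends only on its 3x3 neighbourhood, which is one reason the operator maps well onto parallel hardware.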