In 2007, under the auspices of the Industry/University Cooperative Research Centers (I/UCRC) program of the National Science Foundation, we established the Center for High-performance Reconfigurable Computing (CHREC) to facilitate scientific and engineering research in architectures, algorithms, software, services, applications, and performance optimization and evaluation for the advancement of multi-paradigm reconfigurable computing --- "reconfigurable" in both hardware and software. Each of the university sites in CHREC --- University of Pittsburgh, University of Florida, Brigham Young University, and Virginia Tech --- contributes unique expertise and capabilities for research in this critical field. Reflecting upon our ten-year odyssey with CHREC, we achieved the following successes in collaborative partnership with our CHREC members from industry and other government agencies: (1) established the nation's first multidisciplinary research center in reconfigurable high-performance computing as a basis for long-term partnership and collaboration among industry, academe, and government; (2) directly supported the research needs of our center members in a cost-effective manner with pooled and leveraged resources and maximized synergy; (3) enhanced the educational experience for a diverse set of top-quality graduate and undergraduate students; and (4) advanced the knowledge and technologies in this field and ensured commercial relevance of the research through rapid and effective technology transfer.
W. Feng, A. George, H. Lamm, M. Wirthlin. "Center for High-Performance Reconfigurable Computing (CHREC): A Ten-Year Odyssey." Proceedings of the Computing Frontiers Conference, May 2017. doi:10.1145/3075564.3095082
C. Aprile, J. Wüthrich, Luca Baldassarre, Y. Leblebici, V. Cevher
This work presents an area- and power-efficient encoding system for wireless implantable devices capable of monitoring the electrical activity of the brain. Such devices are becoming an important tool for understanding, real-time monitoring, and potentially treating mental diseases such as epilepsy and depression. Recent advances in compressive sensing (CS) have shown huge potential for sub-Nyquist sampling of neuronal signals. However, its implementation still faces critical issues in delivering sufficient performance and in hardware complexity. In this work, we explore the tradeoffs between area and power requirements by applying a novel DCT Learning-Based Compressive Subsampling approach to a human iEEG dataset. The proposed method achieves compression rates of up to 64x, increasing reconstruction performance and reducing wireless transmission costs with respect to the recent state of the art. This new fully digital architecture handles the data compression of each individual neural acquisition channel with an area of 490 × 650 μm² in 0.18 μm CMOS technology and a power dissipation of only 2 μW.
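As an illustration of the compressive-subsampling idea described in the abstract above, the following is a minimal, hypothetical sketch in which a fixed subset of DCT coefficients is measured and the signal is recovered by zero-filling the rest; the paper's learned coefficient selection and hardware implementation are not reproduced here.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix; rows are the DCT basis vectors."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    d = np.sqrt(2.0 / n) * np.cos(np.pi * k * (2 * i + 1) / (2 * n))
    d[0, :] /= np.sqrt(2.0)  # normalize the DC row
    return d

def compress(signal, support):
    """Measure only the DCT coefficients indexed by `support`."""
    return dct_matrix(len(signal))[support] @ signal

def reconstruct(measurements, support, n):
    """Zero-fill unmeasured coefficients and invert the orthonormal DCT."""
    coeffs = np.zeros(n)
    coeffs[support] = measurements
    return dct_matrix(n).T @ coeffs  # orthonormal: inverse == transpose

n = 256
support = np.arange(4)               # keep 4 of 256 coefficients: 64x compression
basis = dct_matrix(n)
x = 1.0 * basis[1] + 0.5 * basis[3]  # toy signal, sparse within the retained support
x_hat = reconstruct(compress(x, support), support, n)
# x_hat matches x up to floating-point error for this support-sparse signal
```

For signals that are actually sparse in the retained support, recovery is exact; the method in the paper instead learns which coefficients to keep from training data.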
"DCT Learning-Based Hardware Design for Neural Signal Acquisition Systems." Proceedings of the Computing Frontiers Conference, May 2017. doi:10.1145/3075564.3078890
Parallel programming paradigms are commonly characterized by the core metrics of scalability, memory use, ease of use, hardware requirements, and resiliency. Increasingly, support for heterogeneous environments, for example a mix of CPUs and accelerators, is of interest. Analysis of the semantics of different classes of parallel programming paradigms and their cost leads to DYCE (Distributed Yet Common Environment), a shared-memory parallel programming paradigm that is rich yet hardware-friendly, free of races and deadlocks, and provides resiliency without the need for explicit checkpointing code. Pointer-based structures that span the memory of multiple heterogeneous compute devices are possible. Importantly, data exchange is independent of the specific data structures and does not require serialization and deserialization code, even for data structures such as a dynamically linked radix tree of strings. The analysis shows that DYCE does not require coherence from the system and thus can be executed with near-minimal overhead and hardware requirements, including the page-table cost for large unified address spaces that span many devices. We demonstrate efficacy with a prototype.
Ulrich Finkler, H. Franke, David S. Kung. "DYCE: A Resilient Shared Memory Paradigm for Heterogenous Distributed Systems without Memory Coherence." Proceedings of the Computing Frontiers Conference, May 2017. doi:10.1145/3075564.3075579
The inherent resilience of applications enables the design paradigm of approximate computing, which exploits computational inexactness by trading off output quality for runtime system resources. When executing such quality-scalable applications on multiprocessor embedded systems, the goal is not only to achieve the highest possible output quality, but also to handle the critical thermal challenge spurred by vastly increased chip density. While rising temperature causes significant quality distortion at runtime, existing thermal-management techniques, such as dynamic frequency scaling, rarely take into account the trade-off possibilities between output quality and thermal budget. In this paper, we exploit the application-level quality-scaling features of resilient applications to achieve effective temperature control as well as quality maximization. We propose an efficient iterative pseudo-quadratic programming heuristic to decide the optimal frequency and application execution cycles, in order to achieve quality optimization under temperature, timing, and energy constraints. Our approaches are evaluated using realistic benchmarks with known platform thermal parameters. The proposed methods show a 98.5% quality improvement with temperature-violation awareness.
Heng Yu, Y. Ha, Jing Wang. "Quality Optimization of Resilient Applications under Temperature Constraints." Proceedings of the Computing Frontiers Conference, May 2017. doi:10.1145/3075564.3075577
Optimizing the memory access behavior is an important challenge to improve the performance and energy consumption of parallel applications on shared memory architectures. Modern systems contain complex memory hierarchies with multiple memory controllers and several levels of caches. In such machines, analyzing the affinity between threads and data to map them to the hardware hierarchy reduces the cost of memory accesses. In this paper, we introduce a hybrid technique to optimize the memory access behavior of parallel applications. It is based on a compiler optimization that inserts code to predict, at runtime, the memory access behavior of the application and an OS mechanism that uses this information to optimize the mapping of threads and data. In contrast to previous work, our proposal uses a proactive technique to improve the future memory access behavior using predictions instead of the past behavior. Our mechanism achieves substantial performance gains for a variety of parallel applications.
M. Diener, E. Cruz, M. Alves, E. Borin, P. Navaux. "Optimizing memory affinity with a hybrid compiler/OS approach." Proceedings of the Computing Frontiers Conference, May 2017. doi:10.1145/3075564.3075566
Diabetes is becoming an increasingly serious health challenge worldwide, with prevalence rising yearly, especially in developing countries. The vast majority of cases are type 2 diabetes, and it has been indicated that about 80% of type 2 diabetes complications can be prevented or delayed by timely detection. In this paper, we propose an ensemble model to precisely diagnose diabetes on a large-scale, imbalanced dataset. The dataset used in our work covers millions of people from one province in China from 2009 to 2015 and is highly skewed. Results on this real-world dataset show that our method is promising for diabetes diagnosis, with a high sensitivity, F3 score, and G-mean of 91.00%, 58.24%, and 86.69%, respectively.
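The evaluation metrics quoted above (sensitivity, F3, G-mean) are standard measures for imbalanced classification; a minimal sketch of how they are computed from a binary confusion matrix, using toy labels rather than the paper's data:

```python
import numpy as np

def confusion_counts(y_true, y_pred):
    """True/false positives and negatives for binary labels in {0, 1}."""
    tp = int(np.sum((y_true == 1) & (y_pred == 1)))
    fn = int(np.sum((y_true == 1) & (y_pred == 0)))
    fp = int(np.sum((y_true == 0) & (y_pred == 1)))
    tn = int(np.sum((y_true == 0) & (y_pred == 0)))
    return tp, fn, fp, tn

def sensitivity(tp, fn):
    """Recall on the positive (diseased) class."""
    return tp / (tp + fn)

def f_beta(tp, fn, fp, beta=3.0):
    """F-beta score; beta=3 weights recall heavily, as in the F3 metric."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)

def g_mean(tp, fn, fp, tn):
    """Geometric mean of sensitivity and specificity."""
    return ((tp / (tp + fn)) * (tn / (tn + fp))) ** 0.5

# toy labels, not the paper's data
y_true = np.array([1, 1, 1, 1, 0, 0, 0, 0, 0, 0])
y_pred = np.array([1, 1, 1, 0, 1, 0, 0, 0, 0, 0])
tp, fn, fp, tn = confusion_counts(y_true, y_pred)
```

Because these metrics do not reward simply predicting the majority class, they are a better fit than accuracy for a highly skewed screening dataset.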
Xun Wei, Fan Jiang, Feng Wei, Jiekui Zhang, Weiwei Liao, Shaoyin Cheng. "An Ensemble Model for Diabetes Diagnosis in Large-scale and Imbalanced Dataset." Proceedings of the Computing Frontiers Conference, May 2017. doi:10.1145/3075564.3075576
Many-core microprocessor architectures are quickly becoming prevalent in data centers, due to their demonstrated processing power and network flexibility. However, this flexibility comes at a cost: co-mingled data from disparate users must be kept secure, which forces processor cycles to be wasted on cryptographic operations. This paper introduces a novel secure stream processing architecture that supports efficient homomorphic authentication of data and enforces the secrecy of individuals' data. Additionally, this architecture is shown to protect time-series analysis of data from multiple users against both corruption and disclosure. Hardware synthesis shows that security-related circuitry incurs less than 10% overhead, and latency analysis shows an increase of two clock cycles per hop. Despite the increase in latency, the proposed architecture shows an improvement over stream processing systems that use traditional security methods.
Jeff Anderson, T. El-Ghazawi. "Hardware Support for Secure Stream Processing in Cloud Environments." Proceedings of the Computing Frontiers Conference, May 2017. doi:10.1145/3075564.3075592
P. Broekema, Damiaan R. Twelker, Daniel Romao, P. Grosso, Rob V. van Nieuwpoort, H. Bal
Traditional networks are relatively static and rely on a complex stack of interoperating protocols for proper operation. Modern large-scale science instruments, such as radio telescopes, consist of an interconnected collection of sensors generating large quantities of data, transported over high-bandwidth IP over Ethernet networks. The concept of a software-defined network (SDN) has recently gained popularity, moving control over the data flow to a programmable software component, the network controller. In this paper we explore the viability of such an SDN in sensor networks typical of future large-scale radio telescopes, such as the Square Kilometre Array (SKA). Based on experience with the LOw Frequency ARray (LOFAR), a recent radio telescope, we show that the addition of such software control adds to the reliability and flexibility of the instrument. We identify some essential technical SDN requirements for this application, and investigate the level of functional support on three current switches and a virtual software switch. A proof of concept application validates the viability of this concept. While we identify limitations in the SDN implementations and performance of two of our hardware switches, excellent performance is shown on a third.
"Software-defined networks in large-scale radio telescopes." Proceedings of the Computing Frontiers Conference, May 2017. doi:10.1145/3075564.3075594
Achieving high performance in task-parallel runtime systems, especially with high degrees of parallelism and fine-grained tasks, requires tuning a large variety of behavioral parameters according to program characteristics. In the current state of the art, this tuning is generally performed in one of two ways: either by a group of experts who derive a single setup which achieves good -- but not optimal -- performance across a wide variety of use cases, or by monitoring a system's behavior at runtime and responding to it. The former approach invariably fails to achieve optimal performance for programs with highly distinct execution patterns, while the latter induces some overhead and cannot affect parameters which need to be fixed at compile time. In order to mitigate these drawbacks, we propose a set of novel static compiler analyses specifically designed to determine program features which affect the optimal settings for a task-parallel execution environment. These features include the parallel structure of task spawning, the granularity of individual tasks, and an estimate of the stack size required per task. Based on the result of these analyses, various runtime system parameters are then tuned at compile time. We have implemented this approach in the Insieme compiler and runtime system, and evaluated its effectiveness on a set of 12 task parallel benchmarks running with 1 to 64 hardware threads. Across this entire space of use cases, our implementation achieves a geometric mean performance improvement of 39%.
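The 39% figure above is a geometric mean across the benchmark/thread-count space; as a brief illustration of how such a summary is computed, with made-up per-benchmark speedups rather than the paper's measurements:

```python
import math

def geomean_improvement(speedups):
    """Geometric mean of per-benchmark speedups, as a percent improvement."""
    g = math.prod(speedups) ** (1.0 / len(speedups))
    return (g - 1.0) * 100.0

# hypothetical per-benchmark speedups (baseline time / tuned time)
speedups = [1.10, 1.25, 2.00, 1.05, 1.60, 1.33]
improvement = geomean_improvement(speedups)
```

The geometric mean is the usual choice for aggregating speedup ratios, since it is symmetric under inversion: a 2x gain on one benchmark and a 2x loss on another average out to no improvement.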
Peter Thoman, P. Zangerl, T. Fahringer. "Task-parallel Runtime System Optimization Using Static Compiler Analysis." Proceedings of the Computing Frontiers Conference, May 2017. doi:10.1145/3075564.3075574
M. Conti, G. D. Natale, Annelie Heuser, T. Pöppelmann, N. Mentens
In this paper, four cryptography and security experts point out future research directions in Internet-of-Things (IoT) security. Coming from different research domains, the experts address a broad range of issues related to IoT security. In preparation for a panel discussion at the International Workshop on Malicious Software and Hardware in the Internet of Things (MalIoT), they indicate which aspects are important in the design of secure IoT systems, and to what extent we need a holistic approach that integrates security measures at all levels of design abstraction.
"Do we need a holistic approach for the design of secure IoT systems?" Proceedings of the Computing Frontiers Conference, May 2017. doi:10.1145/3075564.3079070