
Proceedings of the Computing Frontiers Conference: latest publications

Center for High-Performance Reconfigurable Computing (CHREC): A Ten-Year Odyssey
Pub Date : 2017-05-15 DOI: 10.1145/3075564.3095082
W. Feng, A. George, H. Lamm, M. Wirthlin
In 2007, under the auspices of the Industry/University Cooperative Research Centers (I/URC) program of the National Science Foundation, we established the Center for High-performance Reconfigurable Computing (CHREC) to facilitate scientific and engineering research in architectures, algorithms, software, services, applications, and performance optimization and evaluation for the advancement of multi-paradigm reconfigurable computing --- "reconfigurable" in both hardware and software. Each of the university sites in CHREC --- University of Pittsburgh, University of Florida, Brigham Young University, and Virginia Tech --- contributes unique expertise and capabilities for research in this critical field. Reflecting upon our ten-year odyssey with CHREC, we achieved the following successes in collaborative partnership with our CHREC members from industry and other government agencies: (1) established the nation's first multidisciplinary research center in reconfigurable high-performance computing as a basis for long-term partnership and collaboration amongst industry, academe, and government; (2) directly supported the research needs of our center members in a cost-effective manner with pooled and leveraged resources and maximized synergy; (3) enhanced the educational experience for a diverse set of top-quality graduate and undergraduate students; and (4) advanced the knowledge and technologies in this field and ensured commercial relevance of the research with rapid and effective technology transfer.
Citations: 0
DCT Learning-Based Hardware Design for Neural Signal Acquisition Systems
Pub Date : 2017-05-15 DOI: 10.1145/3075564.3078890
C. Aprile, J. Wüthrich, Luca Baldassarre, Y. Leblebici, V. Cevher
This work presents an area- and power-efficient encoding system for wireless implantable devices capable of monitoring the electrical activity of the brain. Such devices are becoming an important tool for understanding, real-time monitoring, and potentially treating mental diseases such as epilepsy and depression. Recent advances in compressive sensing (CS) have shown a huge potential for sub-Nyquist sampling of neuronal signals. However, its implementation still faces critical issues in delivering sufficient performance and in hardware complexity. In this work, we explore the tradeoffs between area and power requirements by applying a novel DCT Learning-Based Compressive Subsampling approach on a human iEEG dataset. The proposed method achieves compression rates up to 64x, increasing the reconstruction performance and reducing the wireless transmission costs with respect to the recent state of the art. This new fully digital architecture handles the data compression of each individual neural acquisition channel in an area of 490 x 650 μm in 0.18 μm CMOS technology, with a power dissipation of only 2 μW.
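The core idea of learning-based compressive subsampling can be sketched in a few lines: learn, from training data, which transform coefficients carry the signal energy, then acquire new signals by keeping only those coefficients. The sketch below is an illustration only, with made-up sizes and toy signals standing in for iEEG; the paper's learned design is more involved.

```python
import numpy as np
from scipy.fft import dct, idct

rng = np.random.default_rng(0)
n, keep = 256, 16                       # keep 16 of 256 coefficients -> 16x compression

def smooth_signal(rng):
    # Toy stand-in for an iEEG trace: sparse in the first 16 DCT coefficients.
    c = np.zeros(n)
    c[rng.choice(keep, size=5, replace=False)] = rng.normal(size=5)
    return idct(c, norm='ortho')

train = np.stack([smooth_signal(rng) for _ in range(32)])

# "Learn" the subsampling support: the coefficients with the largest
# average magnitude over the training set.
support = np.argsort(np.abs(dct(train, axis=1, norm='ortho')).mean(0))[-keep:]

# Encode: keep only the learned coefficients. Decode: zero-fill + inverse DCT.
x = smooth_signal(rng)
y = dct(x, norm='ortho')[support]       # the compressed measurements
c_hat = np.zeros(n)
c_hat[support] = y
x_hat = idct(c_hat, norm='ortho')

err = np.linalg.norm(x - x_hat) / np.linalg.norm(x)
```

Because the toy signals are exactly sparse on the learned support, reconstruction here is near-perfect; real neural data is only approximately sparse, which is where the reconstruction-quality tradeoff studied in the paper comes in.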
Citations: 4
DYCE: A Resilient Shared Memory Paradigm for Heterogenous Distributed Systems without Memory Coherence
Pub Date : 2017-05-15 DOI: 10.1145/3075564.3075579
Ulrich Finkler, H. Franke, David S. Kung
Parallel programming paradigms are commonly characterized by the core metrics of scalability, memory use, ease of use, hardware requirements, and resiliency. Increasingly, support for heterogeneous environments, for example a mix of CPUs and accelerators, is of interest. Analysis of the semantics and cost of different classes of parallel programming paradigms leads to DYCE (Distributed Yet Common Environment), a shared-memory parallel programming paradigm that is expressive yet hardware-friendly, free of races and deadlocks, and resilient without the need for explicit check-pointing code. Pointer-based structures that span the memory of multiple heterogeneous compute devices are possible. Importantly, data exchange is independent of the specific data structures and requires no serialization or deserialization code, even for structures such as a dynamic linked radix tree of strings. The analysis shows that DYCE does not require coherence from the system and can thus be executed with near-minimal overhead and hardware requirements, including the page-table cost for large unified address spaces that span many devices. We demonstrate efficacy with a prototype.
Citations: 1
Quality Optimization of Resilient Applications under Temperature Constraints
Pub Date : 2017-05-15 DOI: 10.1145/3075564.3075577
Heng Yu, Y. Ha, Jing Wang
Inherent resilience of applications enables the design paradigm of approximate computing that exploits computation in-exactness by trading off output quality for runtime system resources. When executing such quality-scalable applications on multiprocessor embedded systems, it is expected not only to achieve the highest possible output quality, but also to handle the critical thermal challenge spurred by vastly increased chip density. While the rising temperature causes significant quality distortion at runtime, existing thermal-management techniques, such as dynamic frequency scaling, rarely take into account the trade-off possibilities between output quality and thermal budget. In this paper, we explore the application-level quality-scaling features of resilient applications to achieve effective temperature control as well as quality maximization. We propose an efficient iterative pseudo quadratic programming heuristic to decide the optimal frequency and application execution cycles, in order to achieve quality optimization, under temperature, timing, and energy constraints. Our approaches are evaluated using realistic benchmarks with known platform thermal parameters. The proposed methods show a 98.5% quality improvement with temperature violation awareness.
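The tradeoff the abstract describes, raising frequency buys more execution cycles and thus higher output quality, but drives up temperature, can be illustrated with a toy grid search. All models and constants below are assumptions for illustration; the paper's actual method is an iterative pseudo quadratic programming heuristic, not this brute-force sketch.

```python
import numpy as np

# Toy quality-vs-temperature tradeoff (all numbers made up):
# pick the clock frequency that maximizes output quality within a
# deadline while a simple steady-state thermal model stays under a cap.
freqs = np.linspace(0.4, 2.0, 17)        # candidate frequencies in GHz
deadline = 1e-3                          # s
t_amb, t_max = 45.0, 80.0                # ambient / maximum temperature, deg C
k_thermal = 8.0                          # deg C per GHz^3 (assumed)

cycles = freqs * 1e9 * deadline          # cycles executable before the deadline
quality = 1 - np.exp(-cycles / 5e5)      # diminishing-returns quality model
temp = t_amb + k_thermal * freqs ** 3    # dynamic power ~ f^3 -> steady temp

feasible = temp <= t_max
best = np.argmax(np.where(feasible, quality, -np.inf))
print(f"f = {freqs[best]:.2f} GHz, T = {temp[best]:.1f} C, "
      f"quality = {quality[best]:.3f}")
```

With these assumed constants the search settles on the highest frequency whose steady-state temperature stays under the cap; the paper additionally folds in timing and energy constraints and scales application execution cycles, not just frequency.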
Citations: 7
Optimizing memory affinity with a hybrid compiler/OS approach
Pub Date : 2017-05-15 DOI: 10.1145/3075564.3075566
M. Diener, E. Cruz, M. Alves, E. Borin, P. Navaux
Optimizing the memory access behavior is an important challenge for improving the performance and energy consumption of parallel applications on shared memory architectures. Modern systems contain complex memory hierarchies with multiple memory controllers and several levels of caches. In such machines, analyzing the affinity between threads and data in order to map them to the hardware hierarchy reduces the cost of memory accesses. In this paper, we introduce a hybrid technique to optimize the memory access behavior of parallel applications. It is based on a compiler optimization that inserts code to predict, at runtime, the memory access behavior of the application, and an OS mechanism that uses this information to optimize the mapping of threads and data. In contrast to previous work, our proposal uses a proactive technique that improves future memory access behavior based on predictions rather than on past behavior. Our mechanism achieves substantial performance gains for a variety of parallel applications.
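The affinity analysis the abstract refers to can be illustrated with a minimal sketch: from a record of which memory pages each thread touches, build a thread-to-thread affinity matrix of shared-page counts, then greedily co-locate the most-communicating threads on the same NUMA node. This is an illustration of the general idea only, not the paper's compiler/OS mechanism; all sizes and the toy access trace are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
n_threads, n_pages, node_size = 4, 64, 2

# accesses[t, p] = 1 if thread t touched page p (toy trace)
accesses = (rng.random((n_threads, n_pages)) < 0.3).astype(int)
affinity = accesses @ accesses.T         # pairwise shared-page counts
np.fill_diagonal(affinity, 0)            # ignore self-affinity

# Greedy mapping: seed each node with the most-communicating unplaced
# thread, then fill it with the threads sharing most pages with it.
unplaced = set(range(n_threads))
nodes = []
while unplaced:
    t = max(unplaced, key=lambda i: affinity[i].sum())
    unplaced.discard(t)
    group = [t]
    while len(group) < node_size and unplaced:
        u = max(unplaced, key=lambda j: sum(affinity[v, j] for v in group))
        unplaced.discard(u)
        group.append(u)
    nodes.append(group)
print(nodes)
```

Placing threads that share many pages on the same node keeps their common data in one memory controller's local memory, which is the cost reduction the analysis aims at.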
Citations: 1
An Ensemble Model for Diabetes Diagnosis in Large-scale and Imbalanced Dataset
Pub Date : 2017-05-15 DOI: 10.1145/3075564.3075576
Xun Wei, Fan Jiang, Feng Wei, Jiekui Zhang, Weiwei Liao, Shaoyin Cheng
Diabetes is becoming a more and more serious health challenge worldwide, with prevalence rising yearly, especially in developing countries. The vast majority of cases are type 2 diabetes, and it has been indicated that about 80% of type 2 diabetes complications can be prevented or delayed by timely detection. In this paper, we propose an ensemble model to precisely diagnose diabetes on a large-scale, imbalanced dataset. The dataset used in our work covers millions of people from one province in China from 2009 to 2015 and is highly skewed. Results on the real-world dataset show that our method is promising for diabetes diagnosis, with a high sensitivity, F3, and G-mean of 91.00%, 58.24%, and 86.69%, respectively.
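The reported metrics follow their standard definitions: F-beta with beta = 3 weights recall (sensitivity) much more heavily than precision, and the G-mean is the geometric mean of sensitivity and specificity, both of which are common choices on imbalanced data. A minimal sketch, with toy confusion counts that are not the paper's data:

```python
import math

def imbalance_metrics(tp, fp, tn, fn, beta=3.0):
    """Sensitivity, F-beta, and G-mean from confusion-matrix counts."""
    sens = tp / (tp + fn)                    # sensitivity / recall
    spec = tn / (tn + fp)                    # specificity
    prec = tp / (tp + fp)                    # precision
    f_beta = (1 + beta**2) * prec * sens / (beta**2 * prec + sens)
    g_mean = math.sqrt(sens * spec)          # balances the class-wise accuracies
    return sens, f_beta, g_mean

# Toy counts: 90 of 100 positives found, 8000 of 9900 negatives rejected.
print(imbalance_metrics(tp=90, fp=1900, tn=8000, fn=10))
```

Note how, with many false positives, precision collapses and drags F3 down even though sensitivity and G-mean stay high, which mirrors the gap between the paper's 91.00% sensitivity and 58.24% F3.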
Citations: 23
Hardware Support for Secure Stream Processing in Cloud Environments
Pub Date : 2017-05-15 DOI: 10.1145/3075564.3075592
Jeff Anderson, T. El-Ghazawi
Many-core microprocessor architectures are quickly becoming prevalent in data centers, due to their demonstrated processing power and network flexibility. However, this flexibility comes at a cost: co-mingled data from disparate users must be kept secure, which forces processor cycles to be wasted on cryptographic operations. This paper introduces a novel, secure stream processing architecture that supports efficient homomorphic authentication of data and enforces the secrecy of individuals' data. Additionally, this architecture is shown to protect time-series analysis of data from multiple users against both corruption and disclosure. Hardware synthesis shows that security-related circuitry incurs less than 10% overhead, and latency analysis shows an increase of 2 clocks per hop. Despite the increase in latency, the proposed architecture shows an improvement over stream processing systems that use traditional security methods.
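The property that makes homomorphic authentication attractive for stream processing is that tags combine the same way the data does, so an aggregator can compute on authenticated values without re-tagging them. The sketch below is a deliberately insecure toy (a linear tag over a prime field, with assumed constants) that shows only the additive-homomorphism idea, not the paper's scheme.

```python
# Toy, INSECURE illustration of an additively homomorphic tag:
# tag(a) + tag(b) verifies against a + b without access to the key
# at aggregation time. Real schemes add randomization and message
# identifiers; this sketch shows only the homomorphic property.
P = 2_147_483_647            # prime field modulus (assumed)
KEY = 123_456_789            # secret tagging key (assumed)

def tag(value: int) -> int:
    return (KEY * value) % P

def verify(value: int, t: int) -> bool:
    return tag(value) == t

a, b = 1000, 2345
ta, tb = tag(a), tag(b)
# The aggregator sums tagged readings; the combined tag still checks.
assert verify((a + b) % P, (ta + tb) % P)
```

This linearity is what lets a stream processor sum or window authenticated sensor values in flight while the consumer still verifies integrity at the end.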
Citations: 1
Software-defined networks in large-scale radio telescopes
Pub Date : 2017-05-15 DOI: 10.1145/3075564.3075594
P. Broekema, Damiaan R. Twelker, Daniel Romao, P. Grosso, Rob V. van Nieuwpoort, H. Bal
Traditional networks are relatively static and rely on a complex stack of interoperating protocols for proper operation. Modern large-scale science instruments, such as radio telescopes, consist of an interconnected collection of sensors generating large quantities of data, transported over high-bandwidth IP over Ethernet networks. The concept of a software-defined network (SDN) has recently gained popularity, moving control over the data flow to a programmable software component, the network controller. In this paper we explore the viability of such an SDN in sensor networks typical of future large-scale radio telescopes, such as the Square Kilometre Array (SKA). Based on experience with the LOw Frequency ARray (LOFAR), a recent radio telescope, we show that the addition of such software control adds to the reliability and flexibility of the instrument. We identify some essential technical SDN requirements for this application, and investigate the level of functional support on three current switches and a virtual software switch. A proof of concept application validates the viability of this concept. While we identify limitations in the SDN implementations and performance of two of our hardware switches, excellent performance is shown on a third.
Citations: 0
Task-parallel Runtime System Optimization Using Static Compiler Analysis
Pub Date : 2017-05-15 DOI: 10.1145/3075564.3075574
Peter Thoman, P. Zangerl, T. Fahringer
Achieving high performance in task-parallel runtime systems, especially with high degrees of parallelism and fine-grained tasks, requires tuning a large variety of behavioral parameters according to program characteristics. In the current state of the art, this tuning is generally performed in one of two ways: either by a group of experts who derive a single setup which achieves good -- but not optimal -- performance across a wide variety of use cases, or by monitoring a system's behavior at runtime and responding to it. The former approach invariably fails to achieve optimal performance for programs with highly distinct execution patterns, while the latter induces some overhead and cannot affect parameters which need to be fixed at compile time. In order to mitigate these drawbacks, we propose a set of novel static compiler analyses specifically designed to determine program features which affect the optimal settings for a task-parallel execution environment. These features include the parallel structure of task spawning, the granularity of individual tasks, and an estimate of the stack size required per task. Based on the result of these analyses, various runtime system parameters are then tuned at compile time. We have implemented this approach in the Insieme compiler and runtime system, and evaluated its effectiveness on a set of 12 task parallel benchmarks running with 1 to 64 hardware threads. Across this entire space of use cases, our implementation achieves a geometric mean performance improvement of 39%.
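A "geometric mean performance improvement" over a benchmark suite is the geometric (not arithmetic) mean of the per-benchmark speedups, which keeps one outlier benchmark from dominating the summary. A minimal sketch with illustrative numbers, not the paper's measurements:

```python
import math

# Per-benchmark speedups of the tuned runtime over the default
# (illustrative values; the paper reports a 39% geomean over 12
# benchmarks at 1 to 64 threads).
speedups = [1.05, 1.80, 1.25, 1.40, 1.10, 1.65]
geomean = math.prod(speedups) ** (1 / len(speedups))
print(f"geometric mean speedup: {geomean:.2f}x")
```

A geomean of 1.39x over the whole suite is what the abstract's "39% improvement" denotes.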
Citations: 6
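The abstract above describes deriving runtime-system settings at compile time from three statically estimated task features: the parallel structure of spawning, per-task granularity, and per-task stack size. As a minimal sketch of that idea, the hypothetical code below maps such feature estimates to a runtime configuration; the thresholds, field names, and policies are illustrative assumptions, not the actual analyses or parameters of the Insieme compiler and runtime.

```python
# Hypothetical compile-time tuning pass: map statically estimated task
# features to runtime-system parameters. All thresholds and parameter
# names are illustrative, not Insieme's actual implementation.

from dataclasses import dataclass


@dataclass
class TaskFeatures:
    granularity: int   # estimated instructions per task (static estimate)
    spawn_depth: int   # nesting depth of recursive task spawning
    stack_bytes: int   # estimated stack requirement per task


@dataclass
class RuntimeConfig:
    lazy_task_creation: bool  # defer spawning very fine-grained tasks
    queue_size: int           # per-worker task-queue capacity
    stack_size: int           # stack bytes allocated per task


def tune(f: TaskFeatures) -> RuntimeConfig:
    # Fine-grained tasks: create them lazily so per-task overhead
    # does not dominate the useful work.
    lazy = f.granularity < 10_000
    # Deeply nested spawning can flood the scheduler, so give such
    # programs larger work queues.
    queue = 64 if f.spawn_depth <= 4 else 1024
    # Round the static stack estimate up to a 4 KiB page boundary,
    # with one page as the minimum.
    stack = max(4096, (f.stack_bytes + 4095) // 4096 * 4096)
    return RuntimeConfig(lazy, queue, stack)


cfg = tune(TaskFeatures(granularity=2_500, spawn_depth=8, stack_bytes=6_000))
print(cfg)  # lazy_task_creation=True, queue_size=1024, stack_size=8192
```

The point of the sketch is only that every decision is taken before the program runs, avoiding the monitoring overhead that the abstract attributes to purely dynamic tuning.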
Do we need a holistic approach for the design of secure IoT systems?
Pub Date : 2017-05-15 DOI: 10.1145/3075564.3079070
M. Conti, G. D. Natale, Annelie Heuser, T. Pöppelmann, N. Mentens
In this paper, four cryptography and security experts point out to future research directions in internet-of-things (IoT) security. Coming from different research domains, the experts address a broad range of issues related to IoT security. In preparation to a panel discussion at the International Workshop on Malicious Software and Hardware in the Internet of Things (MalIoT), they indicate which aspects are important in the design of secured IoT systems, and to which extent we need a holistic approach that integrates security measures at all levels of design abstraction.
Citations: 0