
Proceedings of the ACM International Conference on Computing Frontiers: Latest Publications

Security analysis and exploitation of Arduino devices in the Internet of Things
Pub Date : 2016-05-16 DOI: 10.1145/2903150.2911708
Carlos Alberca, S. Pastrana, Guillermo Suarez-Tangil, P. Palmieri
The pervasive presence of interconnected objects enables new communication paradigms where devices can easily reach each other while interacting within their environment. The so-called Internet of Things (IoT) represents the integration of several computing and communications systems aiming at facilitating the interaction between these devices. Arduino is one of the most popular platforms used to prototype new IoT devices due to its open, flexible and easy-to-use architecture. The Arduino Yun is a dual-board microcontroller that runs a Linux distribution and is currently one of the most versatile and powerful Arduino systems. This makes the Arduino Yun a popular platform for developers, but it also introduces unique infection vectors from a security viewpoint. In this work, we present a security analysis of the Arduino Yun. We show that the Arduino Yun is vulnerable to a number of attacks, and we implement a proof of concept capable of exploiting some of them.
Citations: 16
Automated parsing and interpretation of identity leaks
Pub Date : 2016-05-16 DOI: 10.1145/2903150.2903156
Hendrik Graupner, David Jaeger, Feng Cheng, C. Meinel
Identity data leaks on the Internet are more relevant than ever. Almost every month, the news reports the leakage of databases with more than a million users, and smaller but no less dangerous leaks happen multiple times a day. The public availability of such leaked data is a major threat to the victims, but it also creates an opportunity to learn about the security of service providers and about how users choose passwords. Our goal is to analyze this data and generate knowledge that can be used to raise security awareness and improve security. This paper presents a novel approach to the automatic analysis of the vast majority of leaks, large and small. Our contribution is the concept and a prototype implementation of a parser, composed of a syntactic and a semantic module, together with a data analyzer for identity leaks. In this context, we address two major challenges: the huge number of different formats and the recognition of unknown data types in leaks. Based on the data collected, this paper shows how easy it is for criminals to collect large numbers of passwords that are stored in plain text or only weakly hashed.
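The parser above is described as a syntactic module plus a semantic module operating on heterogeneous leak formats. The following minimal Python sketch illustrates that two-stage idea only; it is not the authors' implementation, and the delimiter list, regular expressions, and hash-length heuristics are assumptions chosen for the example.

```python
import re

# Syntactic module: try common delimiters seen in leaked dumps (assumption).
DELIMITERS = [":", ";", ",", "\t", "|"]

def tokenize(line):
    """Split a leak line on the delimiter that yields the most fields."""
    best = [line.rstrip("\n")]
    for d in DELIMITERS:
        parts = line.rstrip("\n").split(d)
        if len(parts) > len(best):
            best = parts
    return [p.strip() for p in best]

# Semantic module: classify each token by simple pattern heuristics (assumption).
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")
MD5_RE = re.compile(r"^[0-9a-fA-F]{32}$")    # unsalted MD5
SHA1_RE = re.compile(r"^[0-9a-fA-F]{40}$")   # unsalted SHA-1

def classify(token):
    if EMAIL_RE.match(token):
        return "email"
    if MD5_RE.match(token):
        return "md5_hash"        # weakly hashed credential
    if SHA1_RE.match(token):
        return "sha1_hash"       # weakly hashed credential
    return "plaintext_or_unknown"

def parse_leak_line(line):
    return [(tok, classify(tok)) for tok in tokenize(line)]

if __name__ == "__main__":
    # The address is classified as an email, the MD5 of "password" as a weak hash.
    print(parse_leak_line("alice@example.com:5f4dcc3b5aa765d61d8327deb882cf99"))
```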
Citations: 9
Partial FPGA bitstream encryption enabling hardware DRM in mobile environments
Pub Date : 2016-05-16 DOI: 10.1145/2903150.2911711
M. Barbareschi, A. Cilardo, A. Mazzeo
The concept of digital rights management (DRM) has become extremely important in current mobile environments. This paper shows how partial bitstream encryption can enable the secure distribution of hardware applications, resembling the mechanisms of traditional software DRM. Building on recent developments towards the secure distribution of hardware cores, the paper demonstrates a prototypical implementation of a user mobile device supporting such distribution mechanisms. The prototype extends the Android operating system with support for hardware reconfigurability and showcases the interplay of the novel security concepts enabled by hardware DRM, the advantages of a design flow based on high-level synthesis, and the opportunities provided by current software-rich reconfigurable Systems-on-Chip. Relying on this prototype, we also collected extensive quantitative results demonstrating the limited overhead incurred by the secure distribution architecture.
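To make the distribution mechanism concrete, here is a minimal sketch of encrypting and decrypting a partial bitstream with AES-GCM using the Python cryptography package. It is not the paper's design flow: the key handling, the core identifier bound as associated data, and the stand-in bitstream bytes are all illustrative assumptions, and a real system would decrypt on the device side just before partial reconfiguration.

```python
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

def encrypt_partial_bitstream(key, bitstream, core_id):
    """Encrypt a partial bitstream; core_id is authenticated as associated data."""
    nonce = os.urandom(12)
    return nonce + AESGCM(key).encrypt(nonce, bitstream, core_id)

def decrypt_partial_bitstream(key, blob, core_id):
    """Decrypt and authenticate before handing the bitstream to the configuration port."""
    nonce, ct = blob[:12], blob[12:]
    return AESGCM(key).decrypt(nonce, ct, core_id)

if __name__ == "__main__":
    key = AESGCM.generate_key(bit_length=256)   # device key (assumption)
    partial = os.urandom(4096)                  # stand-in for a partial .bit fragment
    blob = encrypt_partial_bitstream(key, partial, b"accelerator_core_v1")
    assert decrypt_partial_bitstream(key, blob, b"accelerator_core_v1") == partial
```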
Citations: 5
An in-memory based framework for scientific data analytics
Pub Date : 2016-05-16 DOI: 10.1145/2903150.2911719
D. Elia, S. Fiore, Alessandro D'Anca, Cosimo Palazzo, Ian T Foster, Dean N. Williams, G. Aloisio
This work presents the I/O in-memory server implemented in the context of the Ophidia framework, a big data analytics stack addressing scientific data analysis of n-dimensional datasets. The I/O server is a key component of the Ophidia 2.0 architecture proposed in this paper. It exploits (i) a NoSQL approach to manage scientific data at the storage level, (ii) user-defined functions to perform array-based analytics, (iii) the Ophidia Storage API to manage heterogeneous back-ends through a plugin-based approach, and (iv) an in-memory, parallel analytics engine to deliver high scalability and performance. Preliminary performance results for a statistical analytics kernel benchmark run on an HPC cluster at the CMCC SuperComputing Centre are also reported.
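As a rough illustration of the array-based, user-defined-function style of analytics mentioned above, the NumPy sketch below runs two toy kernels over an in-memory n-dimensional chunk. The function names, chunk shape, and reductions are assumptions for the example and do not reflect the actual Ophidia Storage API.

```python
import numpy as np

def udf_time_mean(chunk, time_axis=0):
    """Reduce an in-memory n-dimensional chunk along its time axis."""
    return chunk.mean(axis=time_axis)

def udf_anomaly(chunk, time_axis=0):
    """Subtract the time mean from every time step (a common climate kernel)."""
    return chunk - chunk.mean(axis=time_axis, keepdims=True)

if __name__ == "__main__":
    # Hypothetical chunk: 365 daily time steps over an 18x36 lat/lon grid.
    rng = np.random.default_rng(0)
    chunk = rng.standard_normal((365, 18, 36))
    print(udf_time_mean(chunk).shape)   # (18, 36)
    print(udf_anomaly(chunk).shape)     # (365, 18, 36)
```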
Citations: 16
Decoding EEG and LFP signals using deep learning: heading TrueNorth
Pub Date : 2016-05-16 DOI: 10.1145/2903150.2903159
E. Nurse, B. Mashford, Antonio Jimeno-Yepes, Isabell Kiral-Kornek, S. Harrer, D. Freestone
Deep learning technology is uniquely suited to analyse neurophysiological signals such as the electroencephalogram (EEG) and local field potentials (LFP) and promises to outperform traditional machine-learning based classification and feature extraction algorithms. Furthermore, novel cognitive computing platforms such as IBM's recently introduced neuromorphic TrueNorth chip allow for deploying deep learning techniques in an ultra-low power environment with a minimum device footprint. Merging deep learning and TrueNorth technologies for real-time analysis of brain-activity data at the point of sensing will create the next generation of wearables at the intersection of neurobionics and artificial intelligence.
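As a hedged illustration of the deep-learning side of this line of work, the PyTorch sketch below defines a small 1D convolutional classifier over fixed-length EEG/LFP windows. It is not the authors' network and says nothing about TrueNorth deployment; the channel count, window length, and class count are placeholder assumptions.

```python
import torch
import torch.nn as nn

class EEGConvNet(nn.Module):
    """Tiny 1D CNN over (batch, channels, samples) EEG/LFP windows."""
    def __init__(self, n_channels=16, n_samples=256, n_classes=2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(n_channels, 32, kernel_size=7, padding=3),
            nn.ReLU(),
            nn.MaxPool1d(4),
            nn.Conv1d(32, 64, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),
        )
        self.classifier = nn.Linear(64, n_classes)

    def forward(self, x):
        x = self.features(x).squeeze(-1)   # (batch, 64)
        return self.classifier(x)

if __name__ == "__main__":
    model = EEGConvNet()
    window = torch.randn(8, 16, 256)       # 8 windows, 16 channels, 256 samples
    logits = model(window)
    print(logits.shape)                     # torch.Size([8, 2])
```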
Citations: 87
Adaptable AES implementation with power-gating support
Pub Date : 2016-05-16 DOI: 10.1145/2903150.2903488
S. Banik, A. Bogdanov, Tiziana Fanni, Carlo Sau, L. Raffo, F. Palumbo, F. Regazzoni
In this paper, we propose a reconfigurable design of the Advanced Encryption Standard capable of adapting at runtime to the requirements of the target application. Reconfiguration is achieved by activating only a specific subset of all the instantiated processing elements. Further, we explore the effectiveness of power gating and clock gating methodologies to minimize the energy consumption of the processing elements not involved in computation.
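The key idea above is that reconfiguration activates only a subset of the instantiated processing elements, and that power gating removes the cost of the inactive ones. The toy Python model below illustrates that accounting only; the per-cycle energy numbers and element names are invented for the example and have no relation to the paper's measured results.

```python
from dataclasses import dataclass

@dataclass
class ProcessingElement:
    name: str
    active: bool = False        # part of the current configuration?
    power_gated: bool = True    # gated off when not in use?

# Hypothetical per-cycle costs (arbitrary units), purely illustrative.
DYNAMIC_ENERGY = 1.0   # switching energy of an active PE
LEAKAGE_ENERGY = 0.2   # leakage of an idle, non-gated PE

def configure(pes, active_names, power_gating=True):
    """Activate only the subset of PEs the target application needs."""
    for pe in pes:
        pe.active = pe.name in active_names
        pe.power_gated = power_gating and not pe.active

def energy_per_cycle(pes):
    total = 0.0
    for pe in pes:
        if pe.active:
            total += DYNAMIC_ENERGY
        elif not pe.power_gated:
            total += LEAKAGE_ENERGY   # idle but still leaking
    return total

if __name__ == "__main__":
    pes = [ProcessingElement(f"sbox_{i}") for i in range(16)]
    configure(pes, {f"sbox_{i}" for i in range(4)}, power_gating=False)
    print(energy_per_cycle(pes))   # 4*1.0 + 12*0.2 = 6.4
    configure(pes, {f"sbox_{i}" for i in range(4)}, power_gating=True)
    print(energy_per_cycle(pes))   # 4*1.0 = 4.0
```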
Citations: 3
Large transfers for data analytics on shared wide-area networks
Pub Date : 2016-05-16 DOI: 10.1145/2903150.2911718
Hamidreza Anvari, P. Lu
One part of large-scale data analytics is the problem of transferring the data across wide-area networks (WANs). Often, the data must be gathered (e.g., from remote sites), processed, possibly transferred again (e.g., for further processing), and then possibly disseminated. If the data-transfer stages are bottlenecks, the overall data-analytics pipeline is affected. Although a variety of tools and protocols have been developed for large data transfers on WANs, most of the related work has been in the context of dedicated or non-shared networks. In practice, however, most networks are likely to be shared. We consider and evaluate the problem of large data transfers on shared networks with the large round-trip times (RTTs) found on many WANs. Using a variety of synthetic background network traffic patterns (e.g., uniform, TCP, UDP, square waveform, bursty), we compare the performance of well-known protocols (e.g., GridFTP, UDT). On our emulated WAN, both GridFTP and UDT perform well in all-TCP situations, but UDT performs better when UDP-based background traffic is prominent.
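One of the synthetic background-traffic patterns mentioned above is a square waveform. The sketch below generates such a pattern with plain UDP sockets as a simplified stand-in for the paper's traffic generator; the destination address, rate, and on/off periods are assumptions, and pacing is only approximated at one-second granularity.

```python
import socket
import time

def square_wave_udp(dest=("192.0.2.10", 5001), on_s=5, off_s=5,
                    rate_mbps=50, cycles=3, payload=1400):
    """Alternate between roughly rate_mbps of UDP traffic and an idle link."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    pkt = b"\x00" * payload
    pkts_per_sec = int(rate_mbps * 1e6 / (payload * 8))
    for _ in range(cycles):
        for _ in range(on_s):                    # "on" half of the square wave
            start = time.time()
            for _ in range(pkts_per_sec):
                sock.sendto(pkt, dest)
            time.sleep(max(0.0, 1.0 - (time.time() - start)))
        time.sleep(off_s)                        # "off" half: background goes silent

if __name__ == "__main__":
    # 192.0.2.10 is a documentation address (RFC 5737); point dest at a real test sink.
    square_wave_udp(cycles=1)
```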
Citations: 6
Integrated measurement and modeling for performance and power
Pub Date : 2016-05-16 DOI: 10.1145/2903150.2903912
A. Hoisie
In this presentation we will describe methodologies for integrated measurement and modeling of power and performance for extreme scale systems and applications.
Citations: 0
Mitigating sync overhead in single-level store systems
Pub Date : 2016-05-16 DOI: 10.1145/2903150.2903161
Yuanchao Xu, Hu Wan, Zeyi Hou, Keni Qiu
Emerging non-volatile memory technologies offer the durability of disk and the byte-addressability of DRAM, which makes it feasible to build single-level store systems. However, because persistent writes to non-volatile memory have extremely low latency, the software stack accounts for the majority of the overall performance overhead, part of which comes from crash-consistency guarantees. To let persistent data structures survive power failures or system crashes, measures such as write-ahead logging or copy-on-write, along with frequent cacheline flushes, must be taken to ensure the consistency of durable data, thereby incurring non-trivial sync overhead. In this paper, we propose two techniques to mitigate the sync overhead. First, we leverage write-optimized non-volatile memory to store log entries on chip instead of off chip, thereby eliminating sync overhead. Second, we present an adaptive caching-mode policy based on data access patterns to eliminate unnecessary sync overhead. Evaluation results indicate that the two techniques improve overall performance by 5.88x to 6.77x compared to conventional transactional persistent memory.
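To see where the sync overhead comes from, the sketch below implements a file-backed undo-logging update in which every step is forced to durable media before the next may proceed. It uses os.fsync on ordinary files as a stand-in for the cacheline flushes and fences of real persistent-memory code; the file names and record layout are illustrative assumptions, not the paper's mechanism.

```python
import os

LOG = "undo.log"
DATA = "store.dat"

def persist(path, offset, payload):
    """Write at an offset and force the data to durable media."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        os.pwrite(fd, payload, offset)
        os.fsync(fd)                 # the sync cost this paper tries to reduce
    finally:
        os.close(fd)

def read(path, offset, length):
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        return os.pread(fd, length, offset)
    finally:
        os.close(fd)

def transactional_update(offset, new_value):
    """Write-ahead (undo) logging: the old value is durable before the update."""
    old = read(DATA, offset, len(new_value))
    record = offset.to_bytes(8, "little") + old.ljust(len(new_value), b"\x00")
    persist(LOG, 0, record)                  # 1st sync: undo record is durable
    persist(DATA, offset, new_value)         # 2nd sync: in-place update is durable
    persist(LOG, 0, b"\x00" * len(record))   # 3rd sync: invalidate the undo record

if __name__ == "__main__":
    transactional_update(0, b"hello-nvm")
    print(read(DATA, 0, 9))
```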
Citations: 0
Libra: an automated code generation and tuning framework for register-limited stencils on GPUs
Pub Date : 2016-05-16 DOI: 10.1145/2903150.2903158
Mengyao Jin, H. Fu, Zihong Lv, Guangwen Yang
Stencils account for a significant part of many scientific computing applications. Besides simple stencils that can be completed with a few arithmetic operations, there are also many register-limited stencils with hundreds or thousands of variables and operations. The massive number of registers required by these stencils largely limits the parallelism of programs on current many-core architectures and consequently degrades overall performance. Based on register usage, the major constraining factor for most register-limited stencils, we propose a DDG (data-dependency-graph) oriented code transformation approach to improve the performance of these stencils. The approach analyzes, reorders and transforms the original program on GPUs, and further explores the best tradeoff between the amount of computation and the degree of parallelism. Based on our graph-oriented code transformation approach, we further design and implement an automated code generation and tuning framework called Libra to improve productivity and performance simultaneously. We apply Libra to 5 widely used stencils, and experimental results show that these stencils achieve a speedup of 1.12~2.16X compared with the original fairly-optimized implementations.
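As a small illustration of the data-dependency-graph (DDG) view that the transformation above is based on, the sketch below builds a DDG for a toy one-dimensional stencil with networkx and prints one legal evaluation order. The stencil, node naming, and use of networkx are assumptions for the example; Libra's actual analysis and code generation are not shown.

```python
import networkx as nx

def stencil_ddg():
    """DDG for out[i] = a*(in[i-1] + in[i+1]) + b*in[i] (a toy 1D stencil)."""
    g = nx.DiGraph()
    # Edges point from a value to the operation that consumes it.
    g.add_edge("in[i-1]", "t0 = in[i-1] + in[i+1]")
    g.add_edge("in[i+1]", "t0 = in[i-1] + in[i+1]")
    g.add_edge("t0 = in[i-1] + in[i+1]", "t1 = a * t0")
    g.add_edge("in[i]", "t2 = b * in[i]")
    g.add_edge("t1 = a * t0", "out[i] = t1 + t2")
    g.add_edge("t2 = b * in[i]", "out[i] = t1 + t2")
    return g

def schedule(g):
    """One legal evaluation order; reordering within this partial order is
    what a register-pressure-aware transformation can exploit."""
    return [n for n in nx.topological_sort(g) if "=" in n]

if __name__ == "__main__":
    for step in schedule(stencil_ddg()):
        print(step)
```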
Citations: 5