
Proceedings of the Computing Frontiers Conference: Latest Publications

ExanaDBT: A Dynamic Compilation System for Transparent Polyhedral Optimizations at Runtime
Pub Date : 2017-05-15 DOI: 10.1145/3075564.3077627
Yukinori Sato, Tomoya Yuki, Toshio Endo
In this paper, we present a dynamic compilation system called ExanaDBT for transparently optimizing and parallelizing binaries at runtime based on the polyhedral model. Starting from hot-spot detection during execution, ExanaDBT dynamically estimates the gains from optimization, translates the target region into highly optimized code, and switches execution from the original code to the optimized version. To realize advanced loop-level optimizations beyond the trace or instruction level, ExanaDBT uses a polyhedral optimizer and performs loop transformations that yield sustainable performance gains on systems with deeper memory hierarchies. In particular, we reveal that a simple conversion from the original binaries to LLVM IR is not sufficient to represent the code in the polyhedral model, and we investigate a feasible way to lift binaries to an IR capable of polyhedral optimization. We implement a proof-of-concept design of ExanaDBT and evaluate it. The evaluation results confirm that ExanaDBT realizes dynamic optimization in a fully automated fashion. They also show that ExanaDBT speeds up execution by 3.2x on average over unoptimized serial code in single-threaded execution, and by 11.9x with 16-thread parallel execution.
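The loop-level transformations the abstract refers to can be illustrated with a classic example. The Python sketch below (illustrative, not the paper's implementation) contrasts a naive matrix multiply with a tiled version: tiling is the kind of locality-improving restructuring a polyhedral optimizer derives automatically for systems with deep memory hierarchies.

```python
def matmul_naive(A, B, n):
    # Straightforward triple loop: poor cache behavior for large n.
    C = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                C[i][j] += A[i][k] * B[k][j]
    return C

def matmul_tiled(A, B, n, t=4):
    # Tiled (blocked) version: iterate over t-by-t tiles so that the
    # working set of each tile fits in a given cache level.
    C = [[0.0] * n for _ in range(n)]
    for ii in range(0, n, t):
        for jj in range(0, n, t):
            for kk in range(0, n, t):
                for i in range(ii, min(ii + t, n)):
                    for j in range(jj, min(jj + t, n)):
                        for k in range(kk, min(kk + t, n)):
                            C[i][j] += A[i][k] * B[k][j]
    return C
```

For fixed (i, j) the tiled loop still accumulates over k in ascending order, so the two versions compute identical sums; only the memory access pattern changes.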
Citations: 10
Evolution of Friendship: a case study of MobiClique
Pub Date : 2017-05-15 DOI: 10.1145/3075564.3075595
Jooyoung Lee, Konstantin Lopatin, Rasheed Hussain, Waqas Nawaz
Understanding the evolution of relationships among users, through generic interactions, is the key motivation for this study. We model the evolution of friendship in the social network of MobiClique using observations of interactions among users. MobiClique is a mobile ad-hoc network setting in which Bluetooth-enabled mobile devices communicate directly with each other as they meet opportunistically. We first apply existing topological methods to predict future friendship in MobiClique and then compare the results with the proposed interaction-based method. Our approach combines four types of user activity information to measure the similarity between users at any specific time. We also define a temporal accuracy evaluation metric and show that interaction data with temporal information is a good indicator for predicting temporal social ties. The experimental evaluation suggests that the well-known static topological metrics do not perform well in the ad-hoc network scenario. The results suggest that to accurately predict the evolution of friendship, or the topology of the network, it is necessary to utilise some interaction information.
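As an illustration of the interaction-based idea, here is a hypothetical sketch of a similarity score that combines several interaction types with a temporal decay. The activity types, weights, and half-life are assumptions for illustration, not the paper's actual model.

```python
def similarity(counts, weights, dt_hours, half_life=24.0):
    """Combine interaction counts between two users into one score.

    counts[a]  -- number of interactions of activity type a (e.g. messages,
                  co-locations) between the pair
    weights[a] -- assumed relative importance of activity type a
    dt_hours   -- hours since the pair's most recent interaction
    """
    decay = 0.5 ** (dt_hours / half_life)  # older contact counts for less
    raw = sum(weights[a] * counts.get(a, 0) for a in weights)
    return raw * decay
```

A pair would then be predicted as (future) friends when the score exceeds some threshold calibrated on the training window.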
Citations: 6
Selective off-loading to Memory: Task Partitioning and Mapping for PIM-enabled Heterogeneous Systems
Pub Date : 2017-05-15 DOI: 10.1145/3075564.3075584
Dawen Xu, Yi Liao, Ying Wang, Huawei Li, Xiaowei Li
Processing-in-Memory (PIM) is returning as a promising solution to the memory-wall problem as computing systems gradually step into the big data era. Researchers have continually proposed PIM architectures combining novel memory devices or 3D integration technology, but a universal task scheduling method for this new heterogeneous platform is still lacking. In this paper, we propose a formalized model to quantify the performance and energy of a PIM+CPU heterogeneous parallel system. In addition, we are the first to build a task partitioning and mapping framework that exploits different PIM engines. In this framework, an application is divided into subtasks that are mapped onto appropriate execution units by the proposed PIM-oriented Earliest-Finish-Time (PEFT) algorithm to maximize the performance gains brought by PIM. Experimental evaluations show that our PIM-aware framework significantly improves system performance compared to conventional processor architectures.
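A minimal sketch of the earliest-finish-time idea behind PEFT, assuming per-unit execution costs are known. The greedy loop below ignores task dependencies and data placement, which the actual PEFT algorithm additionally accounts for; task names and costs are invented.

```python
def eft_map(tasks, units, cost):
    """Greedy earliest-finish-time mapping.

    tasks -- task names in priority order
    units -- execution units (e.g. CPU cores and PIM engines)
    cost[(task, unit)] -- execution time of task on unit
    Returns schedule[task] = (unit, start_time, finish_time).
    """
    ready = {u: 0.0 for u in units}  # time at which each unit becomes free
    schedule = {}
    for t in tasks:
        # Pick the unit on which this task finishes earliest.
        best = min(units, key=lambda u: ready[u] + cost[(t, u)])
        start = ready[best]
        schedule[t] = (best, start, start + cost[(t, best)])
        ready[best] = start + cost[(t, best)]
    return schedule
```

With heterogeneous costs, tasks naturally spread across CPU and PIM units: a memory-bound subtask that is cheap on a PIM engine is claimed by it, while compute-bound subtasks stay on the CPU.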
Citations: 2
Instruction level energy model for the Adapteva Epiphany multi-core processor
Pub Date : 2017-05-15 DOI: 10.1145/3075564.3078892
Gabriel Ortiz, L. Svensson, Erik Alveflo, P. Larsson-Edefors
Processor energy models can be used by developers to estimate the power consumption of software applications without the need for a hardware implementation or additional measurement setups. Furthermore, these energy models can be used for energy-aware compiler optimization. This paper presents a measurement-based instruction-level energy characterization of the Adapteva Epiphany processor, a 16-core shared-memory architecture connected by a 2D network-on-chip. Based on a number of microbenchmarks, the instruction-level characterization was used to build an energy model that includes essential Epiphany instructions such as remote memory loads and stores. To validate the model, an FFT application was developed. This validation showed that the energy estimated by the model is within 0.4% of the measured energy.
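The core of such an instruction-level model can be sketched as a table of per-instruction energy costs summed over an execution trace. The opcode names and picojoule values below are invented placeholders, not the measured Epiphany numbers; note that remote accesses (crossing the network-on-chip) are assumed to cost several times more than local ones.

```python
# Placeholder per-instruction energy costs in picojoules (illustrative only).
ENERGY_PJ = {
    "add": 10.0,
    "mul": 14.0,
    "load_local": 25.0,
    "load_remote": 90.0,   # crosses the 2D network-on-chip
    "store_remote": 85.0,
}

def estimate_energy(trace):
    """trace: iterable of (opcode, execution_count) pairs -> energy in pJ."""
    return sum(ENERGY_PJ[op] * n for op, n in trace)
```

A compiler pass with access to such a table could, for instance, prefer local buffering over repeated remote loads when the energy model predicts a net saving.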
Citations: 2
Optimal On-Line Computation of Stack Distances for MIN and OPT
Pub Date : 2017-05-15 DOI: 10.1145/3075564.3075571
G. Bilardi, K. Ekanadham, P. Pattnaik
The replacement policies known as MIN and OPT are optimal for a two-level memory hierarchy. The computation of the cache content for these policies requires off-line knowledge of the entire address trace. However, the stack distance of a given access, that is, the smallest capacity of a cache for which that access results in a hit, is independent of future accesses and can be computed on-line. Off-line and on-line algorithms to compute the stack distance in time O(V) per access have been known for several decades, where V denotes the number of distinct addresses within the trace. The off-line time bound was recently improved to O(√V log V). This paper introduces the Critical Stack Algorithm for the on-line computation of the stack distances of MIN and OPT, in time O(log V) per access. The result exploits a novel analysis of properties of OPT and data structures based on balanced binary trees. A corresponding Ω(log V) lower bound is derived by a reduction from element distinctness; this bound holds in a variety of models of computation and applies even to the off-line simulation of just one cache capacity.
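For intuition, here is the classical Mattson-style stack simulation for LRU, where the stack distance of an access is its depth in the recency stack and the access hits in every cache of capacity at least that distance. This naive version costs O(V) per access; the paper's contribution is the much harder MIN/OPT case, with an O(log V)-per-access bound that this sketch does not attempt.

```python
def lru_stack_distances(trace):
    """Return the LRU stack distance of each access in the trace.

    A cold miss gets distance infinity; otherwise the distance is the
    1-based depth of the address in the LRU (recency) stack.
    """
    stack, dists = [], []
    for addr in trace:
        if addr in stack:
            d = stack.index(addr) + 1    # depth from the top, 1-based
            stack.remove(addr)
        else:
            d = float("inf")             # never seen before: cold miss
        dists.append(d)
        stack.insert(0, addr)            # move to top (most recently used)
        # Both index() and remove() are O(V); balanced binary trees
        # bring the per-access cost down to O(log V).
    return dists
```

From these distances one can read off hits for every LRU cache size at once: an access with distance d hits in any cache of capacity >= d.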
Citations: 2
RAGuard: A Hardware Based Mechanism for Backward-Edge Control-Flow Integrity
Pub Date : 2017-05-15 DOI: 10.1145/3075564.3075570
Jun Zhang, Rui Hou, Junfeng Fan, KeKe Liu, Lixin Zhang, S. Mckee
Control-flow integrity (CFI) is considered a general and promising method to prevent code-reuse attacks, which exploit benign code sequences to realize arbitrary computation. Current approaches can efficiently protect control-flow transfers caused by indirect jumps and function calls (forward-edge CFI). However, they cannot effectively protect control flow caused by function returns (backward-edge CFI). The reason is that the set of return addresses of frequently called functions can be very large, which makes it possible to bend the backward-edge CFI. We address this problem by proposing a novel hardware-assisted mechanism (RAGuard) that binds a message authentication code (MAC) to each return address and enhances security via a physical unclonable function and a hardware hash function. The MACs can be stored on the program stack together with the return addresses, and RAGuard hardware automatically verifies the integrity of return addresses. Our experiments show that for a subset of the SPEC CPU2006 benchmarks, RAGuard incurs 1.86% runtime overhead on average with no need for OS support.
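A software analogue of the mechanism can be sketched with an HMAC standing in for the PUF-keyed hardware hash: each pushed frame carries a MAC over its return address, and the MAC is re-verified before the return executes. The key handling and frame layout below are illustrative assumptions, not RAGuard's actual hardware design.

```python
import hmac
import hashlib
import os

KEY = os.urandom(16)  # stands in for the per-device PUF-derived key

def mac_of(return_addr: int) -> bytes:
    # MAC over the return address; RAGuard uses a hardware hash instead.
    return hmac.new(KEY, return_addr.to_bytes(8, "little"),
                    hashlib.sha256).digest()

def push_frame(stack, return_addr):
    # Store the return address alongside its MAC on the (simulated) stack.
    stack.append((return_addr, mac_of(return_addr)))

def pop_frame(stack):
    # Verify the MAC before honoring the return address.
    addr, tag = stack.pop()
    if not hmac.compare_digest(tag, mac_of(addr)):
        raise RuntimeError("return address corrupted")
    return addr
```

An attacker who overwrites the stored return address (e.g. via a buffer overflow) cannot forge the matching MAC without the key, so the tampered return is caught at `pop_frame`.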
Citations: 15
The Future of Deep Learning: Challenges & Solutions
Pub Date : 2017-05-15 DOI: 10.1145/3075564.3097267
M. Robins
Mark will begin with a brief overview of deep learning and what has led to its recent popularity. He will provide a few demonstrations and examples of deep learning applications based on recent work at Intel Nervana. He will explain some of the challenges to continued progress in deep learning - such as high compute requirements and lengthy training time - and will discuss some of the solutions (e.g. custom deep learning hardware) that Intel Nervana is developing to usher in a new era of even more powerful AI.
Citations: 1
Understanding the I/O Behavior of Desktop Applications in Virtualization
Pub Date : 2017-05-15 DOI: 10.1145/3075564.3076263
Yan Sui, Chun Yang, Xu Cheng
Input/Output (I/O) performance is very important when running desktop applications in virtualized environments. Previous research has focused on cold execution or installation of desktop applications, where the I/O requests are obvious; in many other scenarios, however, such as warm launch or web-page browsing, I/O behaviors are less clear. In this paper, we analyze the I/O behavior of these desktop scenarios. Our analysis reveals several interesting I/O behaviors of desktop applications; for example, we show that many warm applications send random read requests during their launch, which makes these applications storage-sensitive. We also find that the write requests from web-page browsing generate considerable I/O pressure, even when the user only opens a simple news page and takes no further action. Our results have strong ramifications for the management of storage systems and the deployment of virtual machines in virtualized environments.
Citations: 0
Large-Scale Plant Classification with Deep Neural Networks
Pub Date : 2017-05-15 DOI: 10.1145/3075564.3075590
Ignacio Heredia
This paper discusses the potential of applying deep learning techniques to plant classification and their usage for citizen science in large-scale biodiversity monitoring. We show that plant classification using near state-of-the-art convolutional network architectures such as ResNet50 achieves significant improvements in accuracy compared to the most widespread plant classification application, on test sets composed of thousands of different species labels. We find that the predictions can be confidently used as a baseline classification in citizen science communities like iNaturalist (or its Spanish fork, Natusfera), which in turn can share their data with biodiversity portals like GBIF.
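Downstream of the network itself, turning classifier scores into baseline suggestions for a citizen-science platform might look like the following softmax top-k sketch; this is an assumed workflow for illustration, not code from the paper, and the species labels are invented.

```python
import math

def top_k_species(logits, labels, k=3):
    """Convert raw classifier scores into the k most probable species.

    logits -- raw output scores, one per class
    labels -- species name for each class, same order as logits
    """
    # Numerically stable softmax: subtract the max before exponentiating.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    ranked = sorted(zip(labels, probs), key=lambda p: p[1], reverse=True)
    return ranked[:k]
```

A platform could show these top-k suggestions (with probabilities) as a baseline identification for community members to confirm or correct.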
Citations: 22
Improving Error Resilience Analysis Methodology of Iterative Workloads for Approximate Computing
Pub Date : 2017-05-15 DOI: 10.1145/3075564.3078891
G. Gillani, A. Kokkeler
Assessing the error resilience inherent in digital processing workloads provides application-specific insights for approximate computing strategies that improve power efficiency and/or performance. Using radio astronomy calibration as a case study, our contributions to improving error resilience analysis focus primarily on iterative methods that use a convergence criterion as a quality metric to terminate the iterative computations. We propose an adaptive statistical approximation model for high-level resilience analysis that makes it possible to divide a workload into exact and approximate iterations. This improves the existing error resilience analysis methodology by quantifying the number of approximate iterations (23% of the total iterations in our case study) in addition to the parameters used in state-of-the-art techniques. In this way, heterogeneous architectures comprised of exact and inexact computing cores, as well as adaptive-accuracy architectures, can be exploited efficiently. Moreover, we demonstrate the importance of reconsidering the quality function for convergence-based iterative processes, as the original quality function (the convergence criterion) is not necessarily sufficient in the resilience analysis phase. If such is the case, an additional quality function has to be defined to assess the viability of the approximate techniques.
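The exact/approximate split the abstract describes can be sketched with a toy iterative solver: iterations far from convergence are counted as exact, while iterations after the error crosses a switch-over threshold become candidates for approximation. The solver here (Newton's method for square roots) and both thresholds are illustrative assumptions, not the paper's calibration workload.

```python
def iterate_with_split(target, switch_tol=1e-2, final_tol=1e-10):
    """Newton iteration for sqrt(target), counting exact vs. approximate
    iterations based on how far the residual is from convergence."""
    x = target  # initial guess
    exact_iters = approx_iters = 0
    while abs(x * x - target) > final_tol:
        x = 0.5 * (x + target / x)       # Newton update for x^2 = target
        if abs(x * x - target) > switch_tol:
            exact_iters += 1             # far from converged: keep exact
        else:
            approx_iters += 1            # near convergence: error-tolerant
    return x, exact_iters, approx_iters
```

The approximate tail could then be offloaded to inexact cores or reduced-precision units, since errors introduced there are absorbed by the remaining convergent iterations.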
Citations: 12