Proceedings of the Computing Frontiers Conference: Latest Articles

HELICoiD: interdisciplinary and collaborative project for real-time brain cancer detection: Invited Paper

Proceedings of the Computing Frontiers Conference

Pub Date : 2017-05-15 DOI: 10.1145/3075564.3076262

R. Salvador, S. Ortega, D. Madroñal, H. Fabelo, R. Lazcano, G. Callicó, E. Juárez, R. Sarmiento, C. Sanz

The HELICoiD project is a European project funded under FP7 FET Open. It is an interdisciplinary effort at the edge of the biomedical domain, bringing together neurosurgeons, computer scientists and electronic engineers. The main goal of the project was to provide a working demonstrator of an intraoperative image-guided surgery system for real-time brain cancer detection, in order to assist neurosurgeons during tumour resection procedures. One of the main problems associated with brain tumours is their infiltrative nature, which makes complete tumour resection highly difficult. By combining Hyperspectral Imaging and Machine Learning techniques, the project aimed to demonstrate that tumour boundaries can be determined precisely, thereby helping neurosurgeons to minimize the amount of healthy tissue removed. Besides several universities and companies, the project partners included two hospitals where the demonstrator was tested during surgical procedures. This paper introduces the difficulties of brain tumour resection, states the main objectives of the project and presents the materials, methodologies and platforms used to propose a solution. A brief summary of the main results obtained is also included.

Citations: 7

Self-Sustainability in Nano Unmanned Aerial Vehicles: A Blimp Case Study

Proceedings of the Computing Frontiers Conference

Pub Date : 2017-05-15 DOI: 10.1145/3075564.3075580

D. Palossi, Andres Gomez, Stefan Draskovic, K. Keller, L. Benini, L. Thiele

Nowadays, nano Unmanned Aerial Vehicles (UAVs), such as quad-copters, have very limited flight times, tens of minutes at most. The main constraints are the energy density of the batteries and the engine power required for flight. In this work, we present a nano-sized blimp platform, consisting of a helium balloon and a rotorcraft. Thanks to the lift provided by helium, the blimp requires relatively little energy to remain at a stable altitude. We also introduce the concept of duty-cycling high-power actuators to reduce the energy requirements for hovering even further. With the addition of a solar panel, it is even feasible to sustain tens or hundreds of flight hours in modest lighting conditions (including indoor usage). A functioning 52-gram prototype was thoroughly characterized, and its lifetime was measured in different harvesting conditions. Both our system model and the experimental results indicate that our proposed platform requires less than 200 mW to hover in a self-sustainable fashion. This represents, to the best of our knowledge, the first nano-size UAV for long-term hovering with low power requirements.
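
The duty-cycling argument in this abstract comes down to simple average-power arithmetic. The following is a minimal sketch of that arithmetic only, not the authors' system model; the function names and all numeric values are hypothetical and chosen purely for illustration.

```python
def average_power_mw(active_mw: float, idle_mw: float, duty_cycle: float) -> float:
    """Mean power of a load that is active for a fraction `duty_cycle` of each period."""
    if not 0.0 <= duty_cycle <= 1.0:
        raise ValueError("duty_cycle must be between 0 and 1")
    return duty_cycle * active_mw + (1.0 - duty_cycle) * idle_mw


def is_self_sustainable(harvested_mw: float, consumed_mw: float) -> bool:
    """Harvesting must cover the average consumption for indefinite operation."""
    return harvested_mw >= consumed_mw


# Hypothetical figures (not taken from the paper): motors draw 600 mW when on,
# the rest of the system draws 40 mW, and the motors run 25% of the time.
avg = average_power_mw(active_mw=600.0, idle_mw=40.0, duty_cycle=0.25)
print(avg)                              # 180.0
print(is_self_sustainable(200.0, avg))  # True
```

Lowering the duty cycle reduces the average draw linearly, which is why a small harvesting source can in principle cover it.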

Citations: 14

Cloud Workload Prediction by Means of Simulations

Proceedings of the Computing Frontiers Conference

Pub Date : 2017-05-15 DOI: 10.1145/3075564.3075589

G. Kecskeméti, A. Kertész, Z. Németh

Clouds hide the complexity of maintaining a physical infrastructure, but with a disadvantage: they also hide their internal workings. Should users need to know about these details, e.g., to increase the reliability or performance of their applications, they would need to detect slight behavioural changes in the underlying system. Existing solutions for such purposes offer limited capabilities. This paper proposes a technique for predicting background workload by means of simulations that provide knowledge of the underlying clouds to support activities like cloud orchestration or workflow enactment. We propose using these predictions to select more suitable execution environments for scientific workflows. We validate the proposed prediction approach with a biochemical application.

Citations: 4

BC-AMAT: Considering Blocked Time in Memory System Measurement

Proceedings of the Computing Frontiers Conference

Pub Date : 2017-05-15 DOI: 10.1145/3075564.3076264

Qi Yu, Libo Huang, Cheng Qian, Zhiying Wang

The "memory wall" problem requires not only increasingly aggressive techniques to reduce the latency of the memory system, but also more accurate memory metrics. C-AMAT, an extension of AMAT that considers both locality and concurrency of memory accesses, can evaluate the performance of modern memory systems more accurately. However, C-AMAT only counts the cycles consumed by memory accesses, ignoring the blocked time caused by techniques such as hardware prefetching, which may result in inaccurate evaluation. In this paper, we propose a more comprehensive memory metric called Blocked C-AMAT (BC-AMAT). It extends C-AMAT to take the blocked cycles into consideration. Experimental results show that BC-AMAT correlates much better with IPC than C-AMAT does when a few prefetch strategies are applied in both single-core and multi-core modes. In addition, a case study is provided in which BC-AMAT is used to adjust the prefetching degree dynamically. The result shows that BC-AMAT achieves a higher performance improvement than C-AMAT, demonstrating its usefulness in system optimization. BC-AMAT is more accurate and comprehensive than C-AMAT in evaluating modern memory systems, and it provides more insight for architecture design.
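
For readers unfamiliar with the two baseline metrics named in this abstract, the sketch below computes them. AMAT is the textbook definition; the C-AMAT form follows the way it is commonly stated in the C-AMAT literature, which is an assumption here because the abstract gives no formulas. The BC-AMAT definition is not reproduced, since it does not appear in the abstract. All parameter names and sample values are illustrative.

```python
def amat(hit_time: float, miss_rate: float, miss_penalty: float) -> float:
    """Average Memory Access Time: hit time plus the expected miss cost."""
    return hit_time + miss_rate * miss_penalty


def c_amat(hit_time: float, hit_concurrency: float,
           pure_miss_rate: float, pure_miss_penalty: float,
           miss_concurrency: float) -> float:
    """Concurrent AMAT as commonly stated (assumed form): H/C_H + pMR * pAMP/C_M."""
    return (hit_time / hit_concurrency
            + pure_miss_rate * pure_miss_penalty / miss_concurrency)


# Illustrative values in cycles, not measurements from the paper.
print(amat(hit_time=2, miss_rate=0.25, miss_penalty=40))   # 12.0
print(c_amat(hit_time=2, hit_concurrency=2,
             pure_miss_rate=0.125, pure_miss_penalty=32,
             miss_concurrency=4))                           # 2.0
```

With all concurrency terms set to 1 and every miss treated as a pure miss, the second function reduces to the first.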

Citations: 4

Extending the comfort zone: DAVIDE

Proceedings of the Computing Frontiers Conference

Pub Date : 2017-05-15 DOI: 10.1145/3075564.3095084

F. Magugliani

A comfort zone is an artificial mental boundary within which you maintain a sense of security. A couple of years ago, PRACE (Partnership for Advanced Computing in Europe) challenged the technology providers of Europe to propose new architectures and new concepts, and to build a High-Performance Computer system mixing old and proven technology with advanced new components. E4 took the challenge and proposed an innovative and uncomfortable approach: DAVIDE. The talk will present the rationale for the technological and architectural choices made in building DAVIDE, the key innovative concepts, the software ecosystems and some preliminary performance results.

Citations: 0

Work Stealing in a Shared Virtual-Memory Heterogeneous Environment: A Case Study with Betweenness Centrality

Proceedings of the Computing Frontiers Conference

Pub Date : 2017-05-15 DOI: 10.1145/3075564.3075567

Shuai Che, Marc S. Orr, J. Gallmeier

This paper uses betweenness centrality as a case study to investigate efficient work stealing in a heterogeneous system environment. Betweenness centrality is an important algorithm in graph processing. It exhibits multiple levels of parallelism and is an interesting target for various optimizations. We investigate queue-based work stealing to distribute its tasks across GPU compute units (CUs) and across the CPU and the GPU, which has not been done in prior work. In particular, we demonstrate how to leverage the new platform-atomic operations on AMD Accelerated Processing Units (APUs) to operate cross-device queues in a lock-free manner in shared virtual memory. To make the work-stealing runtime and the application more efficient, we apply new architectural features, including atomic operations with different memory scopes and orderings for different synchronization scenarios. We implement our solution using the Heterogeneous System Architecture (HSA). Our results show that betweenness centrality with CPU-GPU work stealing achieves an average performance improvement of 15% (up to 30%) over GPU-only execution for diverse graph inputs. Our work-stealing solution can also be applied widely to other applications. Finally, we analyze important parameters critical for queuing and stealing.
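
The general idea of queue-based work stealing can be shown with a small deterministic simulation. This is only a conceptual sketch: it is single-threaded, uses no atomics, and does not model HSA, memory scopes or the paper's cross-device queues. The worker names and the stealing policy (take the oldest task from the fullest queue) are assumptions made for illustration.

```python
from collections import deque


def run(queues: dict[str, deque]) -> dict[str, list]:
    """Round-robin simulation of work stealing.

    Each worker pops the newest task from its own queue; an idle worker
    steals the oldest task from whichever queue is currently longest.
    """
    done = {worker: [] for worker in queues}
    while any(queues.values()):
        for worker, own in queues.items():
            if own:
                done[worker].append(own.pop())          # owner takes from the tail
            else:
                victim = max(queues.values(), key=len)
                if victim:
                    done[worker].append(victim.popleft())  # thief takes from the head
    return done


# All tasks start on one (hypothetical) device queue; the other starts idle.
result = run({"gpu": deque([1, 2, 3, 4, 5]), "cpu": deque()})
print(result)  # {'gpu': [5, 4, 3], 'cpu': [1, 2]}
```

Owners and thieves work on opposite ends of the queue, which is the property that lets real implementations keep contention between them low.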

Citations: 3

Big Data Analytics on Large-Scale Scientific Datasets in the INDIGO-DataCloud Project

Proceedings of the Computing Frontiers Conference

Pub Date : 2017-05-15 DOI: 10.1145/3075564.3078884

S. Fiore, Cosimo Palazzo, Alessandro D'Anca, D. Elia, E. Londero, C. Knapic, S. Monna, N. Marcucci, F. Aguilar, M. Plóciennik, J. M. D. Lucas, G. Aloisio

In the context of the EU H2020 INDIGO-DataCloud project, several use cases on large-scale scientific data analysis from different research communities have been implemented. All of them require the availability of large amounts of data, either the output of simulations or observed data from sensors, and need scientific (big) data solutions to run data analysis experiments. More specifically, the paper presents the case studies related to the following research communities: (i) the European Multidisciplinary Seafloor and water column Observatory (INGV-EMSO), (ii) the Large Binocular Telescope, (iii) LifeWatch, and (iv) the European Network for Earth System Modelling (ENES).

Citations: 9

Selective off-loading to Memory: Task Partitioning and Mapping for PIM-enabled Heterogeneous Systems

Proceedings of the Computing Frontiers Conference

Pub Date : 2017-05-15 DOI: 10.1145/3075564.3075584

Dawen Xu, Yi Liao, Ying Wang, Huawei Li, Xiaowei Li

Processing-in-Memory (PIM) is returning as a promising solution to the memory wall as computing systems gradually step into the big data era. Researchers have continually proposed various PIM architectures combined with novel memory devices or 3D integration technology, but a universal task scheduling method for the new heterogeneous platform is still lacking. In this paper, we propose a formalized model to quantify the performance and energy of the PIM+CPU heterogeneous parallel system. In addition, we are the first to build a task partitioning and mapping framework to exploit different PIM engines. In this framework, an application is divided into subtasks that are mapped onto appropriate execution units by the proposed PIM-oriented Earliest-Finish-Time (PEFT) algorithm to maximize the performance gains brought by PIM. Experimental evaluations show that our PIM-aware framework significantly improves system performance compared to conventional processor architectures.
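
The earliest-finish-time idea behind the mapping step can be illustrated with a generic greedy scheduler. This is not the paper's PEFT algorithm, whose details are not given in the abstract; it ignores task dependencies, data movement and energy, and the task names, unit names and cost figures are hypothetical.

```python
def map_tasks(costs: dict[str, dict[str, float]]) -> dict[str, tuple[str, float]]:
    """Greedy earliest-finish-time mapping for independent tasks.

    costs[task][unit] is the estimated run time of `task` on `unit`.
    Returns task -> (chosen unit, finish time), visiting tasks in the given order.
    """
    ready_at: dict[str, float] = {}   # time at which each unit becomes free
    schedule: dict[str, tuple[str, float]] = {}
    for task, per_unit in costs.items():
        # Pick the unit on which this task would finish soonest.
        best = min(per_unit, key=lambda unit: ready_at.get(unit, 0.0) + per_unit[unit])
        finish = ready_at.get(best, 0.0) + per_unit[best]
        ready_at[best] = finish
        schedule[task] = (best, finish)
    return schedule


# Hypothetical cost estimates: memory-bound tasks run faster on the PIM engine.
costs = {
    "stream_scan": {"cpu": 8.0, "pim": 3.0},
    "reduce":      {"cpu": 2.0, "pim": 6.0},
    "gather":      {"cpu": 5.0, "pim": 3.0},
}
schedule = map_tasks(costs)
print(schedule)
# {'stream_scan': ('pim', 3.0), 'reduce': ('cpu', 2.0), 'gather': ('pim', 6.0)}
```

The last task goes to the PIM unit even though that unit is already busy, because waiting for it still finishes earlier than running on the CPU.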

Citations: 2

ExanaDBT: A Dynamic Compilation System for Transparent Polyhedral Optimizations at Runtime

Proceedings of the Computing Frontiers Conference

Pub Date : 2017-05-15 DOI: 10.1145/3075564.3077627

Yukinori Sato, Tomoya Yuki, Toshio Endo

In this paper, we present a dynamic compilation system called ExanaDBT for transparently optimizing and parallelizing binaries at runtime based on the polyhedral model. Starting from hot spot detection during execution, ExanaDBT dynamically estimates the gains from optimization, translates the target region into highly optimized code, and switches execution from the original code to the optimized one. To realize advanced loop-level optimizations beyond the trace or instruction level, ExanaDBT uses a polyhedral optimizer and performs loop transformations to obtain sustainable performance gains on systems with deeper memory hierarchies. In particular, we reveal that a simple conversion from the original binaries to LLVM IR is not enough to represent the code in the polyhedral model, and we then investigate a feasible way to lift binaries to an IR amenable to polyhedral optimizations. We implement a proof-of-concept design of ExanaDBT and evaluate it. From the evaluation results, we confirm that ExanaDBT realizes dynamic optimization in a fully automated fashion. The results also show that ExanaDBT speeds up execution on average by 3.2 times over unoptimized serial code in single-thread execution and by 11.9 times in 16-thread parallel execution.

Citations: 10

Evolution of Friendship: a case study of MobiClique

Proceedings of the Computing Frontiers Conference

Pub Date : 2017-05-15 DOI: 10.1145/3075564.3075595

Jooyoung Lee, Konstantin Lopatin, Rasheed Hussain, Waqas Nawaz

Understanding how relationships among users evolve through generic interactions is the key driving force of this study. We model the evolution of friendship in the social network of MobiClique using observations of interactions among users. MobiClique is a mobile ad-hoc network setting where Bluetooth-enabled mobile devices communicate directly with each other as they meet opportunistically. We first apply existing topological methods to predict future friendship in MobiClique and then compare the results with the proposed interaction-based method. Our approach combines four types of user activity information to measure the similarity between users at any specific time. We also define a temporal accuracy evaluation metric and show that interaction data with temporal information is a good indicator for predicting temporal social ties. The experimental evaluation suggests that the well-known static topological metrics do not perform well in ad-hoc network scenarios. The results suggest that to accurately predict the evolution of friendship, or the topology of the network, it is necessary to utilise some interaction information.
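
The "existing topological methods" mentioned in this abstract are not named, so the sketch below shows two standard static link-prediction scores, common neighbours and the Jaccard coefficient, as representative examples. It is an assumption that these are among the baselines; the toy friendship graph is invented for illustration.

```python
def common_neighbours(graph: dict[str, set[str]], u: str, v: str) -> int:
    """Number of friends that users u and v share."""
    return len(graph[u] & graph[v])


def jaccard(graph: dict[str, set[str]], u: str, v: str) -> float:
    """Shared friends divided by the size of the combined friend sets."""
    union = graph[u] | graph[v]
    return len(graph[u] & graph[v]) / len(union) if union else 0.0


# Toy undirected friendship graph stored as adjacency sets (hypothetical users).
graph = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b"},
    "d": {"b"},
}
print(common_neighbours(graph, "a", "d"))  # 1
print(jaccard(graph, "a", "d"))            # 0.5
```

Both scores depend only on the current graph structure, which is the limitation the abstract points to: they ignore when and how often users actually interact.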

Citations: 6
