Fast and Scalable Thread Migration for Multi-core Architectures

2015 IEEE 13th International Conference on Embedded and Ubiquitous Computing. Pub Date: 2015-10-21. DOI: 10.1109/EUC.2015.36

Miguel Rodrigues, N. Roma, P. Tomás


Citations: 6

Abstract


Heterogeneous computing is a promising approach to tackling the thermal, power and energy constraints of modern desktop and embedded computing systems. Migrating application threads to the most appropriate cores can yield further significant gains in performance and energy efficiency. However, the considerably large overheads usually imposed by software-based thread migration procedures only allow migrations to be exploited at a coarse-grained level, which limits the effectiveness of such techniques. Accordingly, this paper proposes a fast and efficient hardware-based thread migration mechanism that can be easily plugged into any core architecture. To minimize the thread migration overhead and latency, the proposed approach considers both soft- and hard-migration procedures, and adopts a conventional "most recently used" prediction scheme to identify the cache blocks that should be migrated along with the thread context. Experimental results show that the proposed scheme is lightweight and requires limited hardware resources, while attaining migration latencies below 100 clock cycles and reducing post-migration overheads by up to 60%, making it particularly appropriate for exploiting short-lived application phases.
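
The "most recently used" selection mentioned in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the authors' hardware design: the abstract does not describe the predictor's structure, so the class name `MRUBlockTracker`, its `capacity` parameter, and the methods `touch` and `blocks_to_migrate` are all hypothetical. The sketch simply tracks recent block accesses and returns the most recently used ones as candidates to transfer along with the thread context.

```python
from collections import OrderedDict


class MRUBlockTracker:
    """Hypothetical software model of a most-recently-used block predictor.

    This is an illustration of the general idea only; the structure and
    names are assumptions and do not come from the paper.
    """

    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        # Keys are block addresses; the last key is the most recently used.
        self._blocks = OrderedDict()

    def touch(self, block_addr):
        """Record an access to a cache block."""
        if block_addr in self._blocks:
            # Already tracked: mark it as the most recently used.
            self._blocks.move_to_end(block_addr)
        else:
            self._blocks[block_addr] = True
            if len(self._blocks) > self.capacity:
                # Table is full: forget the least recently used block.
                self._blocks.popitem(last=False)

    def blocks_to_migrate(self, n):
        """Return up to n tracked block addresses, most recently used first."""
        return list(reversed(self._blocks))[:n]
```

In this model, a migration controller would call `touch` on every cache access of the running thread and, when a migration is triggered, call `blocks_to_migrate(n)` to choose which blocks to send to the destination core. The value of `n` trades migration latency against post-migration cache misses; the abstract gives no specific value, so it is left as a parameter.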
