DARTS: Techniques and Tools for Predictably Fast Memory Using Integrated Data Allocation and Real-Time Task Scheduling

2010 16th IEEE Real-Time and Embedded Technology and Applications Symposium. Pub Date: 2010-04-12. DOI: 10.1109/RTAS.2010.36

Sangyeol Kang, A. Dean


Citations: 15

Abstract

Hardware-managed caches introduce large amounts of timing variability, complicating real-time system design. One alternative is a memory system with scratchpad memories, which improve system performance while eliminating such timing variability. Prior work introduced the DARTS approach, which combines static allocation of data into scratchpad memories with task scheduling for preemptive, multi-threaded, hard real-time embedded systems.

This study offers several significant contributions. First, it introduces a method to split a stack frame across multiple memory units, offering fine-grain allocation of automatic variables with very low run-time overhead. This enables more effective use of fast memory, improving run times. Second, it introduces the completed tool-chain based on DARTS, which reallocates static and automatic variables across multiple memory banks and now targets the ARM7 architecture. Third, it evaluates the performance improvement from DARTS using experimental results from code running on real hardware in a preemptively scheduled, RTOS-based multi-tasking environment. This hands-on experimental approach ensures a high level of confidence in the results; previous studies have generally stopped at estimating performance rather than building and measuring a real implementation.

In our experiments, the execution time of each task is reduced by up to 24% relative to the baseline external SRAM configurations. We show that our methods improve task execution time enough to achieve 37% to 99% of the performance improvement of an ideal unlimited-capacity scratchpad memory system. Finally, we find that our allocations provide on average 2/3 of the performance enhancement of an equivalently sized cache, yet with easily predicted performance.
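To make the idea of splitting a stack frame between fast and slow memory more concrete, the following Python sketch models one function's automatic variables and places the most frequently accessed ones in a small scratchpad. It is an illustration only, not the DARTS allocation algorithm: the greedy accesses-per-byte heuristic, the variable names, the access counts, the 16-byte capacity, and the 1-cycle and 3-cycle access costs are all assumptions chosen for the example. The `fraction_of_ideal` helper shows one plausible reading of the "percentage of the ideal improvement" metric quoted in the abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    """One automatic variable in a stack frame (all values are hypothetical)."""
    name: str
    size: int      # bytes occupied in the frame
    accesses: int  # assumed number of accesses per task run


def split_frame(variables, fast_capacity):
    """Greedy split: favor variables with the most accesses per byte.

    This heuristic is an assumption for illustration, not the paper's method.
    """
    fast, slow, used = [], [], 0
    ranked = sorted(variables, key=lambda v: v.accesses / v.size, reverse=True)
    for v in ranked:
        if used + v.size <= fast_capacity:
            fast.append(v)
            used += v.size
        else:
            slow.append(v)
    return fast, slow


def access_cycles(fast, slow, fast_cost=1, slow_cost=3):
    """Total memory-access cycles under assumed per-access costs."""
    return (sum(v.accesses for v in fast) * fast_cost
            + sum(v.accesses for v in slow) * slow_cost)


def fraction_of_ideal(baseline, achieved, ideal):
    """Share of the ideal (unlimited scratchpad) improvement actually obtained."""
    return (baseline - achieved) / (baseline - ideal)


# Hypothetical frame: two hot scalars, a large buffer, and a small temporary.
frame = [Var("i", 4, 1000), Var("acc", 4, 800),
         Var("buf", 256, 512), Var("tmp", 8, 40)]

fast, slow = split_frame(frame, fast_capacity=16)
baseline = access_cycles([], frame)   # everything in external SRAM
ideal = access_cycles(frame, [])      # everything in scratchpad
achieved = access_cycles(fast, slow)  # split frame
ratio = fraction_of_ideal(baseline, achieved, ideal)

print([v.name for v in fast], [v.name for v in slow], round(ratio, 3))
```

In this toy frame the large buffer stays in slow memory while the small, hot variables move to the scratchpad, which is the kind of fine-grain placement that whole-frame allocation cannot express.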
