IEEE Transactions on Multi-Scale Computing Systems: Latest Articles

A Health Decision Support System for Disease Diagnosis Based on Wearable Medical Sensors and Machine Learning Ensembles
Pub Date : 2017-03-31 DOI: 10.1109/TMSCS.2017.2710194
Hongxu Yin;Niraj K. Jha
Even with an annual expenditure of more than $3 trillion, the U.S. healthcare system is far from optimal. For example, the third leading cause of death in the U.S. is preventable medical error, immediately after heart disease and cancer. Computer-based clinical decision support systems (CDSSs) have been proposed to address such deficiencies and have significantly improved clinical practice over the past decade. However, they remain limited to clinics and hospitals, and do not take advantage of patient data that are obtained on a daily basis using wearable medical sensors (WMSs) that have the ability to bridge this information gap. WMSs can collect physiological signals from anyone anywhere anytime. Thus, they have the potential to usher in an era of pervasive healthcare. However, most prior work on WMSs only focuses on hardware and protocol design, and not on an information system that can fully utilize the collected signals for efficient disease diagnosis. In this paper, for the first time, we introduce a hierarchical health decision support system for disease diagnosis that integrates health data from WMSs into CDSSs. The proposed system has a multi-tier structure, starting with a WMS tier, backed by robust machine learning, that enables diseases to be tracked individually by a disease diagnosis module. We demonstrate the feasibility of such a system through six disease diagnosis modules aimed at four ICD-10-CM disease categories. We show that the system is scalable using five more disease categories. Just the WMS tier offers impressive diagnostic accuracies for various diseases: arrhythmia (86 percent), type-2 diabetes (78 percent), urinary bladder disorder (99 percent), renal pelvis nephritis (94 percent), and hypothyroid (95 percent). We estimate that the disease diagnosis modules of all known 69,000 human diseases would require just 62 GB of storage space in the WMS tier. This is practical even in today's cloud or base station oriented WMS systems.
Published in: IEEE Transactions on Multi-Scale Computing Systems, vol. 3, no. 4, pp. 228-241
Citations: 85
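As a rough sanity check on the storage estimate in the abstract above, the per-module footprint can be worked out from the two quoted figures (69,000 diseases, 62 GB). This is only a sketch and is not taken from the paper: the function name `bytes_per_module` is hypothetical, the storage is assumed to be split evenly across modules, and because the abstract does not say whether GB is decimal or binary, both readings are computed.

```python
# Back-of-the-envelope check of the "62 GB for 69,000 diseases" estimate.
# Assumption (not from the paper): storage is divided evenly across modules.

def bytes_per_module(total_bytes: float, num_modules: int) -> float:
    """Average storage per disease diagnosis module, in bytes."""
    if num_modules <= 0:
        raise ValueError("num_modules must be positive")
    return total_bytes / num_modules


TOTAL_GB = 62          # figure quoted in the abstract
NUM_DISEASES = 69_000  # figure quoted in the abstract

# Decimal reading: 1 GB = 10**9 bytes.
decimal_avg = bytes_per_module(TOTAL_GB * 10**9, NUM_DISEASES)
# Binary reading: 1 GB = 2**30 bytes.
binary_avg = bytes_per_module(TOTAL_GB * 2**30, NUM_DISEASES)

if __name__ == "__main__":
    print(f"decimal GB: {decimal_avg / 1e6:.2f} MB per module")
    print(f"binary GB:  {binary_avg / 1e6:.2f} MB per module")
```

Under either reading the average comes to a little under 1 MB per module, which is consistent with the abstract's claim that the full set is practical to store.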
ARTEMIS: An Aging-Aware Runtime Application Mapping Framework for 3D NoC-Based Chip Multiprocessors
Pub Date : 2017-03-23 DOI: 10.1109/TMSCS.2017.2686856
Venkata Yaswanth Raparti;Nishit Kapadia;Sudeep Pasricha
In emerging 3D NoC-based chip multiprocessors (CMPs), aging in circuits due to bias temperature instability (BTI) stress is expected to cause gate-delay degradation that, if left unchecked, can lead to untimely failure. Simultaneously, the effects of electromigration (EM) induced aging in the on-chip wires, especially those in the 3D power delivery network (PDN), are expected to notably reduce chip lifetime. A commonly proposed solution to mitigate circuit-slowdown due to aging is to hike the supply voltage; however, this increases current-densities in the PDN due to the increased power consumption on the die, which in turn expedites PDN-aging. We thus note that mechanisms to enhance lifetime reliability in 3D NoC-based CMPs must consider circuit-aging together with PDN-aging. In this paper, we propose a novel runtime framework (ARTEMIS) for intelligent dynamic application-mapping and voltage-scaling to simultaneously manage aging in circuits and the PDN, and enhance the performance and lifetime of 3D NoC-based CMPs. We also propose an aging-enabled routing algorithm that balances the degree of aging between NoC routers and cores, thereby increasing the combined lifetime of both. Our framework also considers dark-silicon power constraints that are becoming a major design challenge in scaled technologies, particularly for 3D stacked CMPs. Our experimental results indicate that ARTEMIS enables the execution of 25 percent more applications over the chip lifetime compared to state-of-the-art prior work.
Published in: IEEE Transactions on Multi-Scale Computing Systems, vol. 3, no. 2, pp. 72-85
Citations: 0
Internet of Everything: A Large-Scale Autonomic IoT Gateway
Pub Date : 2017-03-18 DOI: 10.1109/TMSCS.2017.2705683
Byungseok Kang;Daecheon Kim;Hyunseung Choo
Gateways are emerging as a key element of bringing legacy and next generation devices to the Internet of Things (IoT). They integrate protocols for networking, help manage storage and edge analytics on the data, and facilitate data flow securely between edge devices and the cloud. Current IoT gateways solve the communication gap between field control/sensor nodes and customer cloud, enabling field data to be harnessed for manufacturing process optimization, remote management, and preventive maintenance. However, these gateways do not support fully automatic configuration of newly added IoT devices. In this paper, we propose a self-configurable gateway featuring real-time detection and configuration of smart things over wireless networks. This novel gateway's main features are dynamic discovery of home IoT devices, automatic updates of hardware changes, and connection management of smart things connected over AllJoyn. We use the 'option' field for automatic configuration of IoT devices rather than modifying the standard format of the CoAP protocol. The proposed gateway functionality has been validated on a large-scale IoT testbed.
Published in: IEEE Transactions on Multi-Scale Computing Systems, vol. 3, no. 3, pp. 206-214
Citations: 98
Thoroughly Exploring GPU Buffering Options for Stencil Code by Using an Efficiency Measure and a Performance Model
Pub Date : 2017-03-17 DOI: 10.1109/TMSCS.2017.2705139
Yue Hu;David M. Koppelman;Steven Robert Brandt
Stencil computations form the basis for computer simulations across almost every field of science, such as computational fluid dynamics, data mining, and image processing. Their mostly regular data access patterns potentially enable them to take advantage of the high computation and data bandwidth of GPUs, but only if data buffering and other issues are handled properly. Finding a good code generation strategy presents a number of challenges, one of which is the best way to make use of memory. GPUs have several types of on-chip storage including registers, shared memory, and a read-only cache. Choosing the type of storage and how it is used (a buffering strategy) for each stencil array (grid function, GF) requires not only a good understanding of its stencil pattern but also of the efficiency of each type of storage for the GF, to avoid squandering storage that would be more beneficial to another GF. For a stencil computation with $N$ GFs, the total number of possible assignments is $\beta^{N}$, where $\beta$ is the number of buffering strategies. Our code-generation framework supports five buffering strategies ($\beta = 5$). Large, complex stencil kernels may consist of dozens of GFs, resulting in significant search overhead. In this work, we present an analytic performance model for stencil computations on GPUs and study the behavior of read-only cache and L2 cache. Next, we propose an efficiency-based assignment algorithm which operates by scoring a change in buffering strategy for a GF using a combination of (a) the predicted execution time and (b) on-chip storage usage. By using this scoring, an assignment for $N$ GFs can be determined in $(\beta - 1)N(N+1)/2$ steps. Results show that the performance model has good accuracy and that the assignment strategy is highly efficient.
Published in: IEEE Transactions on Multi-Scale Computing Systems, vol. 4, no. 3, pp. 477-490
Citations: 1
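To get a feel for the search-space reduction described in the abstract above, the two quoted expressions can be evaluated side by side: the number of possible assignments, beta^N, and the step count of the scored search, (beta - 1)N(N+1)/2. This is a minimal sketch and not the authors' code. The function names are hypothetical, and N = 24 is an illustrative stand-in for "dozens of GFs", not a figure from the paper.

```python
# Compare the exhaustive assignment count with the scored-search step count
# quoted in the abstract. Only the two formulas come from the source;
# everything else here is illustrative.

def exhaustive_assignments(num_gfs: int, num_strategies: int) -> int:
    """Total possible assignments: beta ** N."""
    return num_strategies ** num_gfs


def scored_search_steps(num_gfs: int, num_strategies: int) -> int:
    """Steps quoted for the efficiency-based search: (beta - 1) * N * (N + 1) / 2.

    N * (N + 1) is always even, so integer division is exact.
    """
    return (num_strategies - 1) * num_gfs * (num_gfs + 1) // 2


if __name__ == "__main__":
    BETA = 5  # number of buffering strategies stated in the abstract
    for n in (4, 12, 24):  # illustrative GF counts (assumption)
        print(n, exhaustive_assignments(n, BETA), scored_search_steps(n, BETA))
```

With five strategies and 24 GFs, exhaustive enumeration would cover roughly 6 x 10^16 assignments, while the quoted step count is 1,200. The exhaustive count grows exponentially in N and the step count only quadratically.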
2016 Reviewers List
Pub Date : 2017-03-17 DOI: 10.1109/TMSCS.2017.2664638
Presents a listing of reviewers who contributed to this publication in 2016.
Published in: IEEE Transactions on Multi-Scale Computing Systems, vol. 3, no. 1, pp. 62-63
Citations: 0
2016 Index IEEE Transactions on Multi-Scale Computing Systems Vol. 2
Pub Date : 2017-03-17 DOI: 10.1109/TMSCS.2016.2647518
Presents the 2016 author/subject index for this publication.
Published in: IEEE Transactions on Multi-Scale Computing Systems, vol. 3, no. 1, pp. 64-69
Citations: 0
Dynamic Power Budgeting for Mobile Systems Running Graphics Workloads
Pub Date : 2017-03-16 DOI: 10.1109/TMSCS.2017.2683487
Ujjwal Gupta;Raid Ayoub;Michael Kishinevsky;David Kadjo;Niranjan Soundararajan;Ugurkan Tursun;Umit Y. Ogras
Competitive graphics performance is crucial for the success of state-of-the-art mobile processors. High graphics performance comes at the cost of higher power consumption, which elevates the temperature due to limited cooling solutions. To avoid thermal violations, the system needs to operate within a power budget. Since the power budget is a shared resource, there is a strong demand for effective dynamic power budgeting techniques. This paper presents a novel technique to efficiently distribute the power budget among the CPU and GPU cores, while maximizing performance. The proposed technique is evaluated using a state-of-the-art mobile platform using industrial benchmarks, and an in-house simulator. The experiments on the mobile platform show up to 15% increase in average frame rate compared to default power allocation algorithms.
Published in: IEEE Transactions on Multi-Scale Computing Systems, vol. 4, no. 1, pp. 30-40
Citations: 24
Analytical Modeling of the SMART NoC
Pub Date : 2017-03-15 DOI: 10.1109/TMSCS.2017.2704101
Debajit Bhattacharya;Niraj K. Jha
With the increasing number of components in multiprocessor systems-on-chip, standard bus based communication architectures face a formidable scalability challenge. Network-on-chip (NoC), a type of communication architecture, addresses this scalability problem by distributing the communication resources among the communicating components. A huge bottleneck to modern NoC design is its extremely slow simulation/prototyping phase. Traditionally, analytical performance models have been used to speed up this phase. However, the currently available analytical models are not applicable to state-of-the-art NoCs. This forces NoC designers to rely on simulation/prototyping. In this work, we propose an analytical NoC performance analysis methodology for modeling the state-of-the-art single-cycle multi-hop asynchronous repeated traversal (SMART) NoC that enables packets to partially or completely bypass routers from source to destination. To the best of our knowledge, this is the first work on analytical modeling of NoCs that enable bypassing of routers. Our method registers a prediction error in network latency that is as low as 1 percent, and on an average below 2.5 and 8.4 percent, respectively, compared with the cycle-accurate GARNET network simulator and the gem5 full-system simulator running the PARSEC benchmark suite. The method also leads to two orders of magnitude speedup in computation time. It can account for variations in NoC design parameters, such as the maximum number of hops per cycle, number of virtual channels, flit size, buffer depth per virtual channel, etc. Even when these NoC design parameters are varied, our method's results remain within 5 percent of GARNET's results.
Published in: IEEE Transactions on Multi-Scale Computing Systems, vol. 3, no. 4, pp. 242-254
Citations: 5
Keep the Stress Away with SoDA: Stress Detection and Alleviation System
Pub Date : 2017-03-11 DOI: 10.1109/TMSCS.2017.2703613
Ayten Ozge Akmandor;Niraj K. Jha
Long-term exposure to stress may lead to serious health problems such as those related to the immune, cardiovascular, and endocrine systems. Once having arisen, these problems require a considerable investment of time and money to recover from. With early detection and treatment, however, these health problems may be nipped in the bud, thus improving quality of life. We present an automatic stress detection and alleviation system, called SoDA, to address this issue. SoDA takes advantage of emerging wearable medical sensors (WMSs), specifically, electrocardiogram (ECG), galvanic skin response (GSR), respiration rate, blood pressure, and blood oximeter, to continuously monitor human stress levels and mitigate stress as it arises. It performs stress detection and alleviation in a user-transparent manner, i.e., without the need for user intervention. When it detects stress, SoDA employs a stress alleviation technique in an adaptive manner based on the stress response of the user. We establish the effectiveness of the proposed system through a detailed analysis of data collected from 32 participants. A total of four stressors and three stress reduction techniques are employed. In the stress detection stage, SoDA achieves 95.8 percent accuracy with a distinct combination of supervised feature selection and unsupervised dimensionality reduction. In the stress alleviation stage, we compare SoDA with the 'no alleviation' baseline and validate its efficacy in responding to and alleviating stress.
Journal: IEEE Transactions on Multi-Scale Computing Systems, vol. 3, no. 4, pp. 269-282
Citations: 72
PowerTrader: Enforcing Autonomous Power Management for Future Large-Scale Many-Core Processors
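The SoDA abstract above mentions combining supervised feature selection with unsupervised dimensionality reduction, but it does not name the specific algorithms. The sketch below only illustrates the general idea under stated assumptions: a Fisher-like score stands in for the supervised step, and a single principal component found by power iteration stands in for the unsupervised step. The function names, the score, and the sample sensor columns are all hypothetical and are not taken from the paper.

```python
import math


def select_features(X, y, k):
    """Supervised step (assumed): rank features with a Fisher-like score using 0/1 labels."""
    n_features = len(X[0])
    scores = []
    for f in range(n_features):
        a = [row[f] for row, label in zip(X, y) if label == 1]
        b = [row[f] for row, label in zip(X, y) if label == 0]
        mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
        var_a = sum((v - mean_a) ** 2 for v in a) / len(a)
        var_b = sum((v - mean_b) ** 2 for v in b) / len(b)
        # Large gap between class means relative to within-class spread scores high.
        scores.append((mean_a - mean_b) ** 2 / (var_a + var_b + 1e-12))
    ranked = sorted(range(n_features), key=lambda f: scores[f], reverse=True)
    return sorted(ranked[:k])


def first_principal_component(X, iterations=200):
    """Unsupervised step (assumed): leading PCA direction via power iteration; labels unused."""
    n, d = len(X), len(X[0])
    means = [sum(row[f] for row in X) / n for f in range(d)]
    centered = [[row[f] - means[f] for f in range(d)] for row in X]
    cov = [[sum(r[i] * r[j] for r in centered) / n for j in range(d)] for i in range(d)]
    # Asymmetric start vector; a start exactly orthogonal to the leading
    # direction would not converge to it, which this toy version does not handle.
    v = [1.0 / (i + 1) for i in range(d)]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(c * c for c in w))
        if norm == 0:
            break
        v = [c / norm for c in w]
    return means, v


def select_then_project(X, y, k):
    """Keep the k best-scoring features, then project them onto one component."""
    kept = select_features(X, y, k)
    reduced = [[row[f] for f in kept] for row in X]
    means, v = first_principal_component(reduced)
    projections = [
        sum((row[i] - means[i]) * v[i] for i in range(len(kept))) for row in reduced
    ]
    return kept, projections


# Hypothetical readings: column 0 is uninformative, columns 1 and 2 differ by class.
X = [
    [0.5, 60, 2.0], [0.4, 62, 2.2], [0.6, 61, 2.1],   # label 0 (relaxed)
    [0.5, 90, 6.0], [0.4, 92, 6.3], [0.6, 91, 6.1],   # label 1 (stressed)
]
y = [0, 0, 0, 1, 1, 1]
kept, projections = select_then_project(X, y, k=2)
print(kept, [round(p, 2) for p in projections])
```

In this toy data the uninformative column is dropped and the remaining two are compressed into one value per sample, which a downstream classifier could then threshold. A real pipeline would likely use an established library rather than hand-written routines.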
Pub Date : 2017-03-05 DOI: 10.1109/TMSCS.2017.2701795
Hang Lu;Guihai Yan;Yinhe Han;Xiaowei Li
Existing power management approaches for modern many-core processors rely on a “centralized” design concept, aiming to optimize chip performance under a fixed power budget. Unfortunately, the centralized approach, which usually relies on a dedicated on-chip power manager, faces limitations such as poor scalability and high implementation overhead, and hence cannot be deployed in future large-scale many-core processors. This article proposes PowerTrader, an autonomous power management scheme. PowerTrader gives each core the autonomy to issue power control requests at any time and to obtain its desired power quota by negotiating with neighboring cores. It avoids the power allocation and statistics collection overheads that are inevitable in centralized approaches, while keeping chip power consumption below the preset power budget. This article also elaborates on the key design tradeoff in autonomous power management (i.e., Mean-Time-to-Stable versus application power efficiency), and provides a thorough design space exploration to justify the efficacy of the proposed approach. Experimental results show that PowerTrader achieves substantial improvements in both performance and power, and exhibits superior scalability compared with the state of the art.
Journal: IEEE Transactions on Multi-Scale Computing Systems, vol. 3, no. 4, pp. 283-295
Citations: 1
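The PowerTrader abstract above describes cores obtaining power quota by negotiating with nearby cores while the chip stays under a preset budget. The abstract does not give the protocol, so the following is only a toy illustration of that idea, assuming a ring of cores, a fixed trade step, and a rule that a core gives away only quota it does not currently need. The function name `negotiate`, the ring topology, and all parameters are hypothetical.

```python
def negotiate(quota, demand, step=1.0, max_rounds=1000):
    """Toy neighbor-to-neighbor power trading on a ring of cores (not the paper's protocol).

    quota[i]  : power currently granted to core i
    demand[i] : power core i would like to have
    Returns the final quotas and the number of rounds in which trades occurred.
    """
    quota = list(quota)
    n = len(quota)
    budget = sum(quota)  # trades only move quota around, so the total never grows
    for rounds in range(max_rounds):
        traded = False
        for i in range(n):
            need = demand[i] - quota[i]
            if need <= 0:
                continue
            # Ask the left and right neighbors only; there is no central manager.
            for j in ((i - 1) % n, (i + 1) % n):
                surplus = quota[j] - demand[j]
                if surplus <= 0:
                    continue
                amount = min(step, need, surplus)
                quota[j] -= amount
                quota[i] += amount
                need -= amount
                traded = True
                if need <= 0:
                    break
        # The chip-wide budget is preserved after every round.
        assert abs(sum(quota) - budget) < 1e-9
        if not traded:
            return quota, rounds  # stable: no core could improve its quota
    return quota, max_rounds


# Hypothetical example: four cores share a 40 W budget equally at the start.
final_quota, rounds_to_stable = negotiate([10, 10, 10, 10], [14, 6, 10, 9])
print(final_quota, rounds_to_stable)
```

The round count returned here is a crude stand-in for the time-to-stable notion the abstract mentions: a smaller `step` makes each trade gentler but takes more rounds to settle, which hints at the kind of tradeoff the authors explore.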