
2017 18th International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT): Latest Papers

A New Lower Bound of Privacy Budget for Distributed Differential Privacy
Zhigang Lu, Hong Shen
Distributed data aggregation via summation (counting) helps us learn the insights behind raw data. However, such computation carries a high privacy risk from malicious collusion attacks, in which colluding adversaries infer a victim's private information from the gaps between the aggregation outputs and their source data. Among the defenses against such collusion attacks, Distributed Differential Privacy (DDP) preserves privacy effectively. Specifically, a DDP scheme guarantees global differential privacy (the presence or absence of any data curator barely affects the aggregation outputs) by ensuring local differential privacy at each data curator. To guarantee the overall privacy of a distributed data aggregation system against malicious collusion attacks, some existing work on DDP schemes aims to provide an estimated lower bound of the privacy budget for global differential privacy. However, two main problems remain: low data utility caused by using a large global function sensitivity, and an unknown privacy guarantee when the aggregation sensitivity of the whole system is less than the sum of the data curators' aggregation sensitivities. To address these problems while ensuring distributed differential privacy, we provide a new lower bound of the privacy budget, which works with an unconditional aggregation sensitivity of the whole distributed system. Moreover, we study the performance of our privacy bound in different data update scenarios. Both theoretical and experimental evaluations show that our privacy bound offers better global privacy performance than existing work.
DOI: https://doi.org/10.1109/PDCAT.2017.00014 (published 2017-12-01)
Citations: 9
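For the distributed differential privacy entry above, the idea of protecting a summation by adding noise at each data curator can be sketched with the generic Laplace mechanism. This is only an illustration under stated assumptions, not the privacy bound proposed in the paper: the function names, the per-curator `sensitivity`, and the single shared `epsilon` are all hypothetical choices made for the example.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Draw zero-mean Laplace noise as the difference of two exponential samples."""
    # The difference of two i.i.d. Exp(rate = 1/scale) variables is Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def local_noisy_sum(values, sensitivity: float, epsilon: float, rng: random.Random) -> float:
    """One curator's local sum, perturbed with Laplace noise of scale sensitivity / epsilon."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    return sum(values) + laplace_noise(sensitivity / epsilon, rng)


def aggregate(curator_data, sensitivity: float, epsilon: float, rng: random.Random) -> float:
    """Add up the independently perturbed local sums reported by all curators."""
    return sum(local_noisy_sum(v, sensitivity, epsilon, rng) for v in curator_data)


if __name__ == "__main__":
    rng = random.Random(42)
    # Three hypothetical curators, each holding a few counts.
    data = [[1, 0, 1], [1, 1], [0, 0, 1, 1]]
    print(aggregate(data, sensitivity=1.0, epsilon=0.5, rng=rng))
```

A smaller `epsilon` gives a larger noise scale and therefore stronger privacy at the cost of a less accurate aggregate, which is the utility trade-off the abstract refers to.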
Robust Low-Rank Approximation of Images for Background and Foreground Separation
H. Nakouri, Mhamed-Ali El-Aroui, M. Limam
Background and foreground separation is a major task in video surveillance systems for detecting moving or suspicious objects. Robust Principal Component Analysis, whose formulation relies on a low-rank plus sparse matrix decomposition, provides a suitable framework for separating moving objects from the background. The optimization problem is transformed into a sequence of convex programs that minimize the sum of the L1-norm and the nuclear norm of the two component matrices, which are solved efficiently by an Augmented Lagrangian Multiplier based solver. In this paper, we propose two new robust schemes for low-rank approximation of numerical matrices. The proposed algorithms allow batch and incremental robust low-rank approximation of matrices used in static and real-time foreground extraction to detect moving objects. Experiments reveal that the proposed methods are deterministic and converge well and quickly; in addition, they achieve accurate background and foreground separation.
DOI: https://doi.org/10.1109/PDCAT.2017.00040 (published 2017-12-01)
Citations: 3
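For the robust low-rank approximation entry above, the low-rank plus sparse decomposition it mentions is commonly written as the following convex program. The notation is an assumption, since the abstract gives none: $M$ is the data matrix (for example, video frames stacked as columns), $L$ the low-rank background, $S$ the sparse foreground, and $\lambda > 0$ a weighting parameter. This is the standard textbook form and may differ from the schemes the paper proposes.

```latex
\min_{L,\,S} \; \|L\|_{*} + \lambda \, \|S\|_{1}
\quad \text{subject to} \quad M = L + S
```

Here $\|L\|_{*}$ is the nuclear norm (the sum of the singular values of $L$) and $\|S\|_{1}$ is the entrywise L1-norm of $S$.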
A Deep Learning Based Framework for Power Demand Forecasting with Deep Belief Networks
Boyi Zhang, Xiaolin Xu, Hongwei Xing, Yidong Li
Power demand forecasting plays a very important role in many electricity-dependent industries, such as modern high-speed railways and urban railways. Accurate forecasting will guarantee that electrical equipment, such as electric traction systems for trains, works in a safe, robust and efficient state. Recently, many studies have adopted learning-based methods to predict power demand. However, most of these studies use traditional classification or clustering algorithms, which may not satisfy the accuracy and efficiency requirements due to the complex features of the smart grid. In this paper, we focus on solving the power demand forecasting problem with deep learning structures. We first propose a deep learning based framework for power demand forecasting with a Deep Belief Network (DBN). Then, we use the AdaBoost algorithm to combine weak learners into a strong learner, which can increase accuracy significantly in real-world scenarios. The load status is predicted by analyzing historical distribution transformer load, weather, electricity population and other related information. It is also worth noting that the training of these DBNs can be parallelized, which effectively shortens the processing time and makes real-time prediction possible. Our experiments on real-world data from an electrical company show that the deep learning based methods can increase forecasting accuracy and significantly shorten prediction time.
DOI: https://doi.org/10.1109/PDCAT.2017.00039 (published 2017-12-01)
Citations: 6
Load Regulation on Energy Saving Mechanisms of EPON Networks
Chien-Ping Liu, Ho-Ting Wu, Chia-Chih Chien, Kai-Wei Ke
The Ethernet passive optical network (EPON) is one of the most efficient transmission technologies for broadband access. However, on an energy-saving EPON network, unregulated transmission can waste energy if an optical network unit (ONU) transmits too few or too many packets in one time cycle. Transmitting too few packets causes unnecessary power consumption, since a power-saving ONU has to wake up early to transmit only a few packets. Transmitting too many packets lets one ONU stay in active mode longer but pushes other ONUs out of the transmission cycle. Load regulation enables an energy-saving EPON system to provide a near-constant load. This paper studies an appropriate threshold for an ONU to switch between active and energy-saving modes. Performance results reveal that satisfactory power savings can be achieved via proper load regulation on both upstream and downstream channels, at the cost of the delay performance of low-priority packets.
DOI: https://doi.org/10.1109/PDCAT.2017.00060 (published 2017-12-01)
Citations: 1
Algorithm for Readers Arrangement without Collision in RFID Networks
A. Meddeb, Atef Jaballah
Radio Frequency IDentification (RFID) has been identified as one of the ten best technologies of the 21st century. This technology is frequently used in different sectors: industrial, agricultural and academic. In RFID networks, readers and tags communicate wirelessly through electromagnetic signals. For optimized tag coverage, multiple readers must be deployed in the same working area, causing reader-to-reader and/or reader-to-tag collisions. In addition, an RFID reader is characterized by a maximum number of tags that it can read and a maximum interrogation range. The problem of activating the RFID readers and adjusting their interrogation ranges in order to cover the maximum number of tags without collisions is therefore a hot research topic in RFID networks. This problem is known as the Reader Coverage Collision Avoidance Arrangement (RCCAA) problem. In the literature, an algorithm called the Maximum-Weight-Independent-Set-Based Algorithm (MWISBA) was put forward to solve the RCCAA problem. In this algorithm, only the interrogation ranges of readers were adjusted; the interference range was not taken into account. Thus, a reader-to-reader collision could occur if a reader interrogated a tag located in the overlap of its interrogation area with the interference area of another reader. To fill this gap, we propose an improvement of MWISBA, called MWISBAII, which is able to solve the RCCAA problem while avoiding all types of collisions. The experimental results show the superiority of our algorithm compared with state-of-the-art solutions.
DOI: https://doi.org/10.1109/PDCAT.2017.00059 (published 2017-12-01)
Citations: 11
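For the RFID reader arrangement entry above, the underlying idea of picking a set of readers that do not collide while covering as many tags as possible can be illustrated with a greedy heuristic on a conflict graph. This is a minimal sketch and is neither MWISBA nor MWISBAII: the function name, the use of covered-tag counts as weights, and the precomputed conflict pairs are all assumptions for the example, and a greedy pass does not guarantee a maximum-weight set.

```python
def greedy_weighted_independent_set(weights, conflicts):
    """Greedily activate non-conflicting readers, heaviest first.

    weights:   dict mapping reader -> number of tags it would cover (hypothetical weight)
    conflicts: iterable of (reader_a, reader_b) pairs that would collide if both were active
    """
    neighbours = {reader: set() for reader in weights}
    for a, b in conflicts:
        neighbours[a].add(b)
        neighbours[b].add(a)

    chosen = []
    blocked = set()
    # Visit readers by descending weight; ties are broken by name for determinism.
    for reader in sorted(weights, key=lambda r: (-weights[r], r)):
        if reader not in blocked:
            chosen.append(reader)
            # Every neighbour of an active reader must stay inactive.
            blocked |= neighbours[reader]
    return chosen


if __name__ == "__main__":
    weights = {"r1": 5, "r2": 4, "r3": 3}
    conflicts = [("r1", "r2"), ("r2", "r3")]
    print(greedy_weighted_independent_set(weights, conflicts))  # ['r1', 'r3']
```

In this toy instance `r2` conflicts with both other readers, so activating `r1` and `r3` covers 8 tags without any collision.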
A Case of Electrical Circuit Switched Interconnection Network for Parallel Computers
Yao Hu, T. Kudoh, M. Koibuchi
Circuit switching is a way to minimize network latency and maximize network bandwidth when a limited number of source-and-destination pairs exchange predictable messages. Although there are many studies of optical circuit switching (OCS) on HPC systems and datacenters, the technology is still not mature. In this context, we explore the use of electrical circuit switching (ECS) for low-latency communication on HPC systems and datacenters. ECS has the same link bandwidth as existing electrical packet switched (EPS) networks, and inherits the quick update of input-and-output connections from electrical switches. We develop a network topology generator for ECS that minimizes the number of time slots, optimized for target applications whose traffic patterns are predictable. Using a quantitative discrete-event simulation, we show that an ideal ECS network outperforms its EPS counterparts. Evaluation results show that the minimum necessary number of slots (MNNS) can be reduced to a small number in a generated topology while keeping the resource amount less than that of a standard mesh network.
DOI: https://doi.org/10.1109/PDCAT.2017.00052 (published 2017-12-01)
Citations: 10
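For the electrical circuit switching entry above, the notion of packing predictable source-and-destination pairs into a small number of time slots can be illustrated with a first-fit assignment. This sketch makes a strong simplifying assumption that is not taken from the paper: all nodes hang off one crossbar-like switch, so within a slot each node may send on at most one circuit and receive on at most one. The function name is hypothetical, and the paper's topology generator and its MNNS metric may be defined differently.

```python
def assign_time_slots(pairs):
    """First-fit assignment of (source, destination) pairs to time slots.

    Within one slot, each node is the source of at most one circuit and the
    destination of at most one circuit. Returns a list of slots, each a list of pairs.
    """
    slots = []  # each entry: (busy sources, busy destinations, pairs in this slot)
    for src, dst in pairs:
        for busy_src, busy_dst, members in slots:
            if src not in busy_src and dst not in busy_dst:
                busy_src.add(src)
                busy_dst.add(dst)
                members.append((src, dst))
                break
        else:
            # No existing slot can host this circuit, so open a new one.
            slots.append(({src}, {dst}, [(src, dst)]))
    return [members for _, _, members in slots]


if __name__ == "__main__":
    traffic = [(0, 1), (0, 2), (1, 2), (2, 0)]
    print(assign_time_slots(traffic))
    # [[(0, 1), (1, 2), (2, 0)], [(0, 2)]]
```

Node 0 sends to two different destinations, so at least two slots are needed here, and first-fit happens to reach that minimum; in general first-fit is only a heuristic.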
An Efficient Locality-Aware Task Assignment Algorithm for Minimizing Shared Cache Contention
Song Liu, Xiao Xie, Yuanzhen Cui, Weiguo Wu
Task scheduling can improve the performance of parallel execution by optimizing the utilization of on-chip computing resources, and thus it has been widely studied. Most previous work uses data access locality to predict cache behavior for task scheduling, but usually suffers from accuracy and computational time complexity issues. This paper proposes an efficient task assignment algorithm to minimize contention for shared caches on multi-core processors among parallel, independent, process-level tasks. The proposed algorithm leverages the footprint property to approximately estimate the locality parameter of parallel tasks, quickly choosing the best grouping of tasks with the minimum locality value for task assignment. The calculation time is therefore significantly reduced, and the algorithm complexity is O(nlog2n). Meanwhile, the algorithm accuracy is very high. On an Intel 8-core dual-processor system, the experimental results show that the task assignment algorithm achieves over 99% of the actual optimal performance on average and outperforms the default Linux task scheduling method by over 5% on average for two sets of different parallel tasks.
DOI: https://doi.org/10.1109/PDCAT.2017.00017 (published 2017-12-01)
Citations: 0
Multilevel Concatenated Codes with Parallel Construction
Shang-Chih Ma, Chang-Hong Lee, Hong Chang
Using multilevel concatenation, long block codes can be constructed from shorter component codes, resulting in much lower decoding complexity. The component codes can themselves be constructed by multilevel concatenation, so that a hierarchical multilevel concatenated structure is built. The decoding complexity of the hierarchical scheme can be further decreased.
DOI: https://doi.org/10.1109/PDCAT.2017.00058 (published 2017-12-01)
Citations: 0
Dynamically Improving Resiliency to Timing Errors for Stream Processing Workloads
Geoffrey Phi C. Tran, J. Walters, S. Crago
Large-scale data processing paradigms, such as stream processing, are widespread in academic and corporate workloads. These environments are commonly subject to real-time requirements, such as latency and throughput, and resiliency requirements to node or network failures. These requirements have generally been approached as separate problems. Intermittent timing delays due to factors such as garbage collection can further complicate the management of the stream processing workload. Insufficient resource allocations can also lead to poor performance. Currently, tuning these applications is done manually. We show that improper configuration can greatly affect performance. It is reported that even 100ms of increased latency in online sales platforms can potentially result in lower sales. In this paper we propose Dynamo, a framework and monitor that implements a methodology for addressing both the performance and timing error problems by increasing the resiliency of stream processing frameworks to timing delays. Dynamo autonomously adjusts the resource allocation by using a performance profile that is generated through application profiling. Dynamo partitions an application’s allocated resources into active and passive partitions that are dynamically adjusted to match an application’s multi-modal behavior. The distribution of resources determines the amount of computation that Dynamo can duplicate and process redundantly, thereby reducing the probability of timing errors that affect a tuple’s total execution time. In our experiments, we observed improvements in the number of tuples with missed deadlines. Our results show that Dynamo is able to consistently improve the resiliency to timing errors over a number of differing occurrence rates. Furthermore, we show that the improvement in the number of missed deadlines increases with the amount of spare resources, with a 71.40% reduction in the best case.
DOI: 10.1109/PDCAT.2017.00080 (https://doi.org/10.1109/PDCAT.2017.00080). Published 2017-12-01 in 2017 18th International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT).
Citations: 2
Towards Energy-Efficient Multi-Level Cell STT-MRAM Caches with Content Awareness
Qi Zeng, R. Jha, J. Peir
Spin-Transfer Torque Magnetoresistive Random Access Memory (STT-MRAM) is a promising memory technology, which has high density, low leakage power, fast read speed, and non-volatility, and is suitable for on-chip last-level caches with large capacity. Recently, Multi-Level Cell (MLC) STT-MRAM records two bits in a single cell to further improve the density for building even bigger on-chip caches. However, MLC worsens write energy consumption and endurance. The magnetization directions of its hard and soft domains cannot be flipped to two opposite directions simultaneously, which leads to the two-step transition problem for certain combinations of updating the 2-bit value. The two-step transition incurs extra flips in the soft domains, consumes high energy, and negatively impacts the lifetime of MLC STT-MRAM. In this paper, we present a new dimension to alleviate the high write energy issue in MLC. During cache replacement, we select among a few candidate blocks close to the LRU position for replacement to lower the write energy with minimum impact on cache performance. We propose a novel block content encoding method to represent a whole block with a few bits and use the encoding bits for better cache replacement. After picking the replacement block, we apply an intelligent remap of each updated 2-bit value to further reduce the write energy. Performance evaluation results show this content-aware cache replacement can lower the write energy by 26.1% in comparison with MLC caches using a regular Pseudo-LRU replacement policy.
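The idea of choosing a victim among a few near-LRU candidates by write cost can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's method: it compares full block contents, whereas the paper uses a compact block content encoding that the abstract does not specify, and it counts write steps as a stand-in for energy. The transition model is also an assumption: the upper bit of each 2-bit cell is treated as the hard domain, flipping it is assumed to drive the soft domain in the same direction, and a second step is needed whenever the target soft bit differs from the target hard bit. All function names are hypothetical.

```python
from typing import Sequence


def transition_steps(old: int, new: int) -> int:
    """Write steps to change one 2-bit cell from `old` to `new` (assumed model)."""
    if old == new:
        return 0
    old_hard, new_hard = old >> 1, new >> 1
    new_soft = new & 1
    if old_hard == new_hard:
        return 1  # only the soft domain has to flip
    # The hard domain must flip; assume this also sets the soft domain to the
    # same direction, so a differing soft target costs a second step.
    return 1 if new_soft == new_hard else 2


def block_write_steps(old: bytes, new: bytes) -> int:
    """Total assumed write steps to overwrite block `old` with `new`."""
    if len(old) != len(new):
        raise ValueError("blocks must have the same length")
    total = 0
    for o, n in zip(old, new):
        for shift in (0, 2, 4, 6):  # four 2-bit cells per byte
            total += transition_steps((o >> shift) & 0b11, (n >> shift) & 0b11)
    return total


def pick_victim(candidates: Sequence[bytes], incoming: bytes) -> int:
    """Index of the near-LRU candidate that is cheapest to overwrite.

    `candidates` holds the contents of the few blocks closest to the LRU
    position; ties go to the earliest candidate in the sequence.
    """
    return min(range(len(candidates)),
               key=lambda i: block_write_steps(candidates[i], incoming))


if __name__ == "__main__":
    near_lru = [b"\x00\x00", b"\xaa\xaa", b"\xff\xff"]
    print(pick_victim(near_lru, b"\xaa\xab"))
```

Restricting the choice to blocks near the LRU position is what keeps the impact on hit rate small; how many candidates to consider is a tuning parameter that the abstract does not state.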
DOI: 10.1109/PDCAT.2017.00062 (https://doi.org/10.1109/PDCAT.2017.00062). Published 2017-12-01 in 2017 18th International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT).
Citations: 0