
Latest publications from the 2015 IEEE International Parallel and Distributed Processing Symposium Workshop

On the Impact of Execution Models: A Case Study in Computational Chemistry
Pub Date : 2015-05-25 DOI: 10.1109/IPDPSW.2015.111
D. Chavarría-Miranda, M. Halappanavar, S. Krishnamoorthy, J. Manzano, Abhinav Vishnu, A. Hoisie
Efficient utilization of high-performance computing (HPC) platforms is an important and complex problem. Execution models, abstract descriptions of the dynamic runtime behavior of the execution stack, have a significant impact on the utilization of HPC systems. Using a computational chemistry kernel as a case study and a wide variety of execution models combined with load balancing techniques, we explore the impact of execution models on the utilization of an HPC system. We demonstrate a 50 percent improvement in performance by using work stealing relative to a more traditional static scheduling approach. We also use a novel semi-matching technique for load balancing that has comparable performance to a traditional hypergraph-based partitioning implementation, which is computationally expensive. Using this study, we found that execution model design choices and assumptions can limit critical optimizations such as global, dynamic load balancing and finding the correct balance between available work units and different system and runtime overheads. With the emergence of multi- and many-core architectures and the consequent growth in the complexity of HPC platforms, we believe that these lessons will be beneficial to researchers tuning diverse applications on modern HPC platforms, especially on emerging dynamic platforms with energy-induced performance variability.
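To make the scheduling contrast concrete, the sketch below shows work stealing on top of an initial static distribution of work units: each worker owns a deque, takes local work from the front, and an idle worker steals from the tail of another deque. This is purely illustrative (a coarse-locked thread pool in Python), not the paper's runtime or its chemistry kernel; all names and parameters are invented.

```python
import random
import threading
from collections import deque

def work_stealing_run(work_units, n_workers=4):
    deques = [deque() for _ in range(n_workers)]
    for i, unit in enumerate(work_units):          # initial static distribution
        deques[i % n_workers].append(unit)
    lock = threading.Lock()
    results = []

    def worker(wid):
        while True:
            with lock:
                if deques[wid]:
                    task = deques[wid].popleft()          # take local work first
                else:
                    victims = [d for d in deques if d]
                    if not victims:
                        return                            # nothing left anywhere
                    task = random.choice(victims).pop()   # steal from a victim's tail
            results.append(task())                        # run outside the lock
    threads = [threading.Thread(target=worker, args=(w,)) for w in range(n_workers)]
    for t in threads: t.start()
    for t in threads: t.join()
    return results

print(sum(work_stealing_run([lambda i=i: i * i for i in range(100)])))
```

With uneven task costs, the stealing path keeps otherwise idle workers busy, which is the effect the study measures against purely static scheduling.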
Citations: 1
Combining Backward and Forward Recovery to Cope with Silent Errors in Iterative Solvers
Pub Date : 2015-05-25 DOI: 10.1109/IPDPSW.2015.22
M. Fasi, Y. Robert, B. Uçar
Several recent papers have introduced a periodic verification mechanism to detect silent errors in iterative solvers. Chen [PPoPP'13, pp. 167 -- 176] has shown how to combine such a verification mechanism (a stability test checking the orthogonality of two vectors and recomputing the residual) with checkpointing: the idea is to verify every d iterations, and to checkpoint every c × d iterations. When a silent error is detected by the verification mechanism, one can roll back to, and re-execute from, the last checkpoint. In this paper, we also propose to combine checkpointing and verification, but we use ABFT rather than stability tests. ABFT can be used for error detection, but also for error detection and correction, allowing a forward recovery (and no rollback nor re-execution) when a single error is detected. We introduce an abstract performance model to compute the performance of all schemes, and we instantiate it using the Conjugate Gradient algorithm. Finally, we validate our new approach through a set of simulations.
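A minimal sketch of the verify-every-d, checkpoint-every-c×d control flow described above, assuming transient errors: `step` and `verify` are placeholders for the solver iteration and the stability test, and the toy usage is invented. It illustrates only the backward-recovery variant, not the paper's ABFT-based forward recovery.

```python
import copy

def iterate_with_verification(x0, step, verify, d=10, c=5, max_iters=200):
    # Verify every d iterations, checkpoint every c*d iterations,
    # and roll back to the last checkpoint when verification fails.
    ckpt_state, ckpt_iter = copy.deepcopy(x0), 0
    x, i = x0, 0
    while i < max_iters:
        x = step(x)
        i += 1
        if i % d == 0 and not verify(x):
            # silent error detected: backward recovery, re-execute from checkpoint
            x, i = copy.deepcopy(ckpt_state), ckpt_iter
            continue
        if i % (c * d) == 0:
            ckpt_state, ckpt_iter = copy.deepcopy(x), i   # save a verified state
    return x

# toy usage: a contraction toward 1.0 with a (trivially passing) verification test
print(iterate_with_verification(0.0, step=lambda x: 0.5 * (x + 1.0),
                                verify=lambda x: 0.0 <= x <= 1.0))
```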
Citations: 6
Scalable Task-Parallel SGD on Matrix Factorization in Multicore Architectures
Pub Date : 2015-05-25 DOI: 10.1109/IPDPSW.2015.135
Yusuke Nishioka, K. Taura
Recommendation is an indispensable technique, especially in e-commerce services such as Amazon or Netflix, for suggesting items that users are likely to prefer. Matrix factorization is a well-known algorithm for recommendation which estimates affinities between users and items solely based on ratings explicitly given by users. To handle large amounts of data, stochastic gradient descent (SGD), an online loss minimization algorithm, can be applied to matrix factorization. SGD is an effective method in terms of both convergence speed and memory consumption, but is difficult to parallelize due to its essentially sequential nature. FPSGD by Zhuang et al. is an existing parallel SGD method for matrix factorization that divides the rating matrix into many small blocks. Threads work on blocks such that they do not update the same rows or columns of the factor matrices. Because of this technique, FPSGD achieves higher convergence speed than other existing methods. Still, as we demonstrate in this paper, FPSGD does not scale beyond 32 cores on the 1.4GB Netflix dataset because assigning non-conflicting blocks to threads needs a lock operation. In this work, we propose an alternative approach to SGD for matrix factorization using a task-parallel programming model. As a result, we have successfully overcome the bottleneck of FPSGD and achieved higher scalability with 64 cores.
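For context, here is a hedged sketch of the plain, sequential SGD update for matrix factorization that both FPSGD and the proposed task-parallel scheme parallelize. The blocking and task decomposition are omitted, and the hyperparameters and toy ratings are illustrative; the parallel variants arrange updates so that concurrent workers never touch the same rows of P or Q.

```python
import numpy as np

def sgd_mf(ratings, n_users, n_items, k=16, lr=0.01, reg=0.05, epochs=20, seed=0):
    # For each observed rating r_ui, nudge the user factor p_u and the item
    # factor q_i along the gradient of the squared error plus L2 regularization.
    rng = np.random.default_rng(seed)
    P = 0.1 * rng.standard_normal((n_users, k))
    Q = 0.1 * rng.standard_normal((n_items, k))
    for _ in range(epochs):
        for u, i, r in ratings:                    # (user, item, rating) triples
            err = r - P[u] @ Q[i]
            pu, qi = P[u].copy(), Q[i].copy()      # use old values in both updates
            P[u] += lr * (err * qi - reg * pu)
            Q[i] += lr * (err * pu - reg * qi)
    return P, Q

# toy usage: 3 users, 3 items, a handful of ratings
P, Q = sgd_mf([(0, 0, 5.0), (0, 1, 3.0), (1, 1, 4.0), (2, 2, 1.0)], 3, 3, k=4)
print(P @ Q.T)   # reconstructed rating estimates
```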
Citations: 8
Storm Pub-Sub: High Performance, Scalable Content Based Event Matching System Using Storm
Pub Date : 2015-05-25 DOI: 10.1109/IPDPSW.2015.95
M. Shah, D. Kulkarni
Storm pub-sub is a novel high-performance publish-subscribe system designed to efficiently match events with subscriptions at high throughput. We move a content-based pub-sub system first to a local cluster and then to a distributed cluster framework for high performance and scalability. We depart from the use of broker overlays, where each server must support the whole range of operations of a pub-sub service, as well as overlay management and routing functionality. In this system, the different operations involved in pub-sub are separated to leverage their natural potential for parallelization using bolts. Storm pub-sub is compared with the traditional pub-sub system Siena, a broker-based architecture. Through experimentation on a local cluster as well as on a distributed cluster, we show that our approach of building a publish-subscribe system on Storm scales well for high volumes of data. The Storm pub-sub system produces approximately 2200 events/s on a distributed cluster. In this paper we describe the design and implementation of Storm pub-sub and evaluate it in terms of scalability and throughput.
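As background for the matching step, here is a tiny, hedged sketch of content-based matching: a subscription is a conjunction of attribute constraints and an event matches when it satisfies all of them. In the system described above this work would run inside Storm bolts; the operators, attribute names, and subscriptions below are invented for illustration, not taken from the paper.

```python
import operator

OPS = {"=": operator.eq, "<": operator.lt, ">": operator.gt}

def matches(event, subscription):
    # an event satisfies a subscription when every (attribute, op, value)
    # constraint holds on the event's attribute values
    return all(attr in event and OPS[op](event[attr], value)
               for attr, op, value in subscription)

subs = {
    "alerts":  [("temp", ">", 30), ("room", "=", "lab")],
    "archive": [("temp", ">", 0)],
}
event = {"temp": 35, "room": "lab"}
print([name for name, s in subs.items() if matches(event, s)])  # ['alerts', 'archive']
```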
Citations: 6
A Branch-and-Estimate Heuristic Procedure for Solving Nonconvex Integer Optimization Problems
Pub Date : 2015-05-25 DOI: 10.1109/IPDPSW.2015.43
Prashant Palkar, Ashutosh Mahajan
We present a method for solving nonconvex mixed-integer nonlinear programs using a branch-and-bound framework. At each node in the search tree, we solve the continuous nonlinear relaxation multiple times using an existing non-linear solver. Since the relaxation we create is in general not convex, this method may not find an optimal solution. In order to mitigate this difficulty, we solve the relaxation multiple times in parallel starting from different initial points. Our preliminary computational experiments show that this approach gives optimal or near-optimal solutions on benchmark problems, and that the method benefits well from parallelism.
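A hedged sketch of the multistart idea only: solve a nonconvex continuous relaxation from several random starting points and keep the best local optimum as the node estimate. SciPy's generic local solver stands in for the NLP solver mentioned in the abstract, the runs are sequential rather than parallel, and the objective is a made-up nonconvex function.

```python
import numpy as np
from scipy.optimize import minimize

def multistart_estimate(objective, bounds, n_starts=8, seed=0):
    rng = np.random.default_rng(seed)
    lo = np.array([b[0] for b in bounds])
    hi = np.array([b[1] for b in bounds])
    best = None
    for _ in range(n_starts):
        x0 = rng.uniform(lo, hi)                      # random starting point
        res = minimize(objective, x0, bounds=bounds)  # local NLP solve
        if best is None or res.fun < best.fun:
            best = res                                # keep the best local optimum
    return best

# toy nonconvex objective with several local minima
f = lambda x: np.sin(3 * x[0]) + (x[0] - 1) ** 2 + np.cos(4 * x[1]) + x[1] ** 2
print(multistart_estimate(f, bounds=[(-2, 2), (-2, 2)]).x)
```

Since each start is independent, the local solves parallelize trivially, which is where the method gains from additional cores.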
Citations: 0
Implementing Uniform Reliable Broadcast in Anonymous Distributed Systems with Fair Lossy Channels
Pub Date : 2015-05-25 DOI: 10.1109/IPDPSW.2015.23
Jian Tang, M. Larrea, S. Arévalo, Ernesto Jiménez
Uniform Reliable Broadcast (URB) is an important abstraction in distributed systems, offering a delivery guarantee when spreading messages among processes. Informally, URB guarantees that if a process (correct or not) delivers a message m, then all correct processes deliver m. This abstraction has been extensively investigated in distributed systems where all processes have different identifiers. Furthermore, the majority of papers in the literature usually assume that the communication channels of the system are reliable, which is not always the case in real systems. In this paper, the URB abstraction is investigated in anonymous asynchronous message passing systems with fair lossy communication channels. Firstly, a simple algorithm is given to solve URB in such a system model assuming a majority of correct processes. Then a new failure detector class AT is proposed. With AT, URB can be implemented with any number of correct processes. Due to the message loss caused by fair lossy communication channels, every correct process in this first algorithm has to broadcast all URB-delivered messages forever, which makes the algorithm non-quiescent. In order to obtain a quiescent URB algorithm in anonymous asynchronous systems, a perfect anonymous failure detector AP* is proposed. Finally, a quiescent URB algorithm using AT and AP* is given.
Citations: 1
Towards Detecting Patterns in Failure Logs of Large-Scale Distributed Systems
Pub Date : 2015-05-25 DOI: 10.1109/IPDPSW.2015.109
Nentawe Gurumdimma, A. Jhumka, Maria Liakata, Edward Chuah, J. Browne
The ability to automatically detect faults or fault patterns to enhance system reliability is important for system administrators in reducing system failures. To achieve this objective, the message logs from the cluster system are augmented with failure information, i.e., the raw log data is labelled. However, tagging or labelling of raw log data is very costly. In this paper, our objective is to detect failure patterns in the message logs using unlabelled data. To achieve our aim, we propose a methodology whereby a pre-processing step is first performed in which redundant data is removed. A clustering algorithm is then executed on the resulting logs, and we further developed an unsupervised algorithm to detect failure patterns in the clustered log by harnessing the characteristics of these sequences. We evaluated our methodology on large production data, and the results show that, on average, an f-measure of 78% can be obtained without having data labels. The implication of our methodology is that a system administrator with little knowledge of the system can detect failure runs with reasonably high accuracy.
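A generic, hedged sketch of the kind of pipeline the abstract outlines: deduplicate raw log lines, embed them, and cluster. Everything here (TF-IDF features, k-means, the toy messages) is a stand-in for illustration, not the paper's unsupervised pattern-detection algorithm or its sequence features.

```python
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

raw_logs = [
    "node12 ECC error corrected on dimm 3",
    "node12 ECC error corrected on dimm 3",       # redundant duplicate
    "node07 link retry count exceeded on port 1",
    "node07 link retry count exceeded on port 2",
    "scheduler: job 4411 started on 64 nodes",
]
messages = list(dict.fromkeys(raw_logs))          # pre-processing: drop duplicates
X = TfidfVectorizer().fit_transform(messages)     # embed each message
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
for msg, lab in zip(messages, labels):            # messages grouped by cluster id
    print(lab, msg)
```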
Citations: 13
Bulk GCD Computation Using a GPU to Break Weak RSA Keys
Pub Date : 2015-05-25 DOI: 10.1109/IPDPSW.2015.54
Toru Fujita, K. Nakano, Yasuaki Ito
RSA is one of the most well-known public-key cryptosystems, widely used for secure data transfer. An RSA encryption key includes a modulus n which is the product of two large prime numbers p and q. If an RSA modulus n can be decomposed into p and q, the corresponding decryption key can be computed easily from them and the original message can be obtained using it. The RSA cryptosystem relies on the hardness of factoring the RSA modulus. Suppose that we have a lot of encryption keys collected from the Web. If some of them are inappropriately generated so that they share the same prime number, then they can be decomposed by computing their GCD (Greatest Common Divisor). Indeed, a previously published investigation showed that a certain ratio of RSA moduli in encryption keys on the Web share prime numbers. We may find such weak RSA moduli n by computing the GCD of many pairs of RSA moduli. The main contribution of this paper is to present a new Euclidean algorithm for computing the GCD of all pairs of encryption moduli. The idea of our new Euclidean algorithm, which we call the Approximate Euclidean algorithm, is to compute an approximation of the quotient with just one 64-bit division and to use it to reduce the number of iterations of the Euclidean algorithm. We also present an implementation of the Approximate Euclidean algorithm optimized for CUDA-enabled GPUs. The experimental results show that our implementation of 1024-bit GCD on a GeForce GTX 780Ti runs more than 80 times faster than the Intel Xeon CPU implementation. Further, our GPU implementation is more than 9 times faster than the best known published GCD computation using the same generation of GPU.
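To make the shared-prime observation concrete, here is a hedged CPU-side sketch using Python's built-in gcd over all pairs of moduli: a non-trivial GCD exposes a shared prime, after which both moduli factor immediately. The paper's contribution, the GPU-optimized Approximate Euclidean algorithm, is not reproduced here, and the toy "moduli" are built from tiny primes rather than real 1024-bit keys.

```python
from itertools import combinations
from math import gcd

def find_shared_factors(moduli):
    weak = {}
    for n1, n2 in combinations(moduli, 2):
        g = gcd(n1, n2)
        if 1 < g < n1:                      # shared prime found: both keys break
            weak[n1] = (g, n1 // g)
            weak[n2] = (g, n2 // g)
    return weak

# toy moduli built from small primes; the first and third share the factor 101
moduli = [101 * 103, 107 * 109, 101 * 113]
print(find_shared_factors(moduli))
```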
Citations: 10
Optimizing Defensive Investments in Energy-Based Cyber-Physical Systems
Pub Date : 2015-05-25 DOI: 10.1109/IPDPSW.2015.112
Paul C. Wood, S. Bagchi, Alefiya Hussain
Interdependent cyber-physical systems (CPS) connect physical resources between independently motivated actors who seek to maximize profits while providing physical services to consumers. Cyber attacks in seemingly distant parts of these systems have local consequences, and techniques are needed to analyze and optimize defensive costs in the face of increasing cyber threats. This paper presents a technique for transforming physical interconnections between independent actors into a dependency analysis that can be applied to find optimal defensive investment strategies to protect assets from financially motivated adversaries in electric power grids.
Citations: 2
Phylogenetic Analysis Using MapReduce Programming Model
Pub Date : 2015-05-25 DOI: 10.1109/IPDPSW.2015.57
G. Siddesh, K. Srinivasa, Ishank Mishra, Abhinav Anurag, E. Uppal
Phylogenetic analysis has become an essential part of research on the evolutionary tree of life. Distance-matrix methods of phylogenetic analysis explicitly rely on a measure of "genetic distance" between the sequences being classified, and therefore they require multiple sequence alignments as an input. Distance methods attempt to construct an all-to-all matrix from the sequence query set describing the distance between each sequence pair. Dynamic programming algorithms like the Needleman-Wunsch algorithm (NWA) and the Smith-Waterman algorithm (SWA) produce accurate alignments, but are computation intensive and are limited by the number and size of the sequences. The paper focuses on optimizing phylogenetic analysis of large quantities of data using the Hadoop MapReduce programming model. The proposed approach depends on NWA to produce sequence alignments and neighbor-joining methods, specifically UPGMA (Unweighted Pair Group Method with Arithmetic mean), to produce rooted trees. The experimental results demonstrate that the proposed solution achieves significant improvements with respect to performance and throughput. The dynamic nature of the NWA coupled with the data and computational parallelism of the Hadoop MapReduce programming model improves the throughput and accuracy of sequence alignment. Hence the proposed approach intends to carve out a new methodology for optimizing phylogenetic analysis by achieving significant performance gains.
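Since the pipeline hinges on pairwise alignment scores, here is a compact, hedged Needleman-Wunsch scoring kernel of the kind the paper distributes with Hadoop MapReduce. The scoring parameters and test sequences are illustrative, and the MapReduce wrapping and UPGMA tree construction are omitted.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    # global alignment score via the classic O(len(a)*len(b)) dynamic program
    rows, cols = len(a) + 1, len(b) + 1
    dp = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        dp[i][0] = i * gap
    for j in range(1, cols):
        dp[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            s = match if a[i - 1] == b[j - 1] else mismatch
            dp[i][j] = max(dp[i - 1][j - 1] + s,   # align a[i-1] with b[j-1]
                           dp[i - 1][j] + gap,     # gap in b
                           dp[i][j - 1] + gap)     # gap in a
    return dp[-1][-1]

print(needleman_wunsch("GATTACA", "GCATGCU"))
```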
Citations: 3