SECO will present how the EU-funded AXIOM project makes it possible to adopt edge computing platforms and technologies and bring them to the industrial embedded market.
{"title":"AXIOM project: from applied research towards embedded systems","authors":"Davide Catani","doi":"10.1145/3075564.3095085","DOIUrl":"https://doi.org/10.1145/3075564.3095085","url":null,"abstract":"SECO is going to present how UE funded AXIOM project makes it possible to approach edge computing platforms and technologies in order to bring them to the industrial embedded market.","PeriodicalId":398898,"journal":{"name":"Proceedings of the Computing Frontiers Conference","volume":"CE-32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-05-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126546575","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Proceedings of the Computing Frontiers Conference","authors":"R. Giorgi, M. Becchi, F. Palumbo","doi":"10.1145/3075564","DOIUrl":"https://doi.org/10.1145/3075564","url":null,"abstract":"","PeriodicalId":398898,"journal":{"name":"Proceedings of the Computing Frontiers Conference","volume":"180 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-05-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133513482","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Alejandro Rico, José A. Joao, Chris Adeniyi-Jones, E. V. Hensbergen
ARM's involvement in funded international projects has helped pave the way towards ARM-based supercomputers. ARM and its partners have collaboratively grown an HPC ecosystem with software and hardware solutions that provide choice within a unified software ecosystem. Partners have announced important HPC deployments resulting from collaborations around the globe. One of the key enabling technologies for ARM in HPC is the Scalable Vector Extension, an instruction set extension for vector processing. This paper discusses ARM's journey into HPC, the current state of the ARM HPC ecosystem, the approach to HPC node architecture co-design, and details of the Scalable Vector Extension as a future technology representing the reemergence of vectors.
{"title":"ARM HPC Ecosystem and the Reemergence of Vectors: Invited Paper","authors":"Alejandro Rico, José A. Joao, Chris Adeniyi-Jones, E. V. Hensbergen","doi":"10.1145/3075564.3095086","DOIUrl":"https://doi.org/10.1145/3075564.3095086","url":null,"abstract":"ARM's involvement in funded international projects has helped pave the road towards ARM-based supercomputers. ARM and its partners have collaborately grown an HPC ecosystem with software and hardware solutions that provide choice in a unified software ecosystem. Partners have announced important HPC deployments resulting from collaborations around the globe. One of the key enabling technologies for ARM in HPC is the Scalable Vector Extension, an instruction set extension for vector processing. This paper discusses ARM's journey into HPC, the current state of the ARM HPC ecosystem, the approach to HPC node architecture co-design, and details on the Scalable Vector Extension as a future technology representing the reemergence of vectors.","PeriodicalId":398898,"journal":{"name":"Proceedings of the Computing Frontiers Conference","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-05-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124881172","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Side channel analysis and active fault analysis are now major threats even to mathematically robust cryptographic algorithms that are otherwise resistant to classical cryptanalysis. It is necessary to design suitable countermeasures to protect cryptographic primitives against such attacks. This paper focuses on designing encryption schemes that are innately secure against fault analysis. The paper formally proves that one such design strategy, namely the use of key-dependent S-Boxes, is only partially secure against differential fault analysis (DFA). The paper then examines the fault tolerance of encryption schemes that use a key-independent secret tweak value for randomization. In particular, the paper focuses on a linear-tweak-based and a non-linear-tweak-based version of the recently proposed block cipher DRECON. The paper demonstrates that while both versions are secure against classical DFA, the non-linear-tweak-based version provides greater fault coverage against stronger fault models. This fact, together with the DPA resistance provided by the use of variable S-Boxes, makes DRECON a strong candidate for the design of secure cryptographic primitives. All claims have been validated by experimental results on a SASEBO GII platform.
{"title":"Using Tweaks To Design Fault Resistant Ciphers (Full Version)","authors":"Sikhar Patranabis, Debapriya Basu Roy, Debdeep Mukhopadhyay","doi":"10.1145/3075564.3091965","DOIUrl":"https://doi.org/10.1145/3075564.3091965","url":null,"abstract":"Side channel analysis and active fault analysis are now major threats to even mathematically robust cryptographic algorithms that are otherwise resistant to classical cryptanalysis. It is necessary to design suitable countermeasures to protect cryptographic primitives against such attacks. This paper focuses on designing encryption schemes that are innately secure against fault analysis. The paper formally proves that one such design strategy namely the use of key-dependent S-Boxes, is only partially secure against DFA. The paper then examines the fault tolerance of encryption schemes that use a key-independent secret tweak value for randomization. In particular, the paper focuses on a linear tweak based and a non-linear tweak based version of a recently proposed block cipher DRECON. The paper demonstrates that while both versions are secure against classical DFA, the non-linear tweak based version provides greater fault coverage against stronger fault models. This fact, together with the DPA resistance provided by the use of variable S-Boxes, makes DRECON a strong candidate for the design of secure cryptographic primitives. All claims have been validated by experimental results on a SASEBO GII platform.","PeriodicalId":398898,"journal":{"name":"Proceedings of the Computing Frontiers Conference","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-05-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127671335","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
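As a rough illustration of the tweak idea (a toy sketch, not the actual DRECON construction): a key-independent random tweak selects which of several precomputed S-boxes encrypts each block, so a fault injected in one execution hits a mapping the attacker cannot predict across runs. The affine S-boxes and 4-bit block size here are invented for the example.

```python
import secrets

# Four precomputed invertible S-boxes over 4-bit values: x -> (k*x + 3) mod 16
# is a permutation whenever k is odd (gcd(k, 16) == 1).
SBOXES = [[(k * x + 3) % 16 for x in range(16)] for k in (1, 3, 5, 7)]

def encrypt_nibble(x, tweak):
    """Encrypt a 4-bit value with the S-box selected by the tweak."""
    return SBOXES[tweak % len(SBOXES)][x]

# The tweak is drawn fresh per encryption and is independent of any key,
# so an attacker cannot predict which S-box a fault will corrupt.
tweak = secrets.randbelow(len(SBOXES))
c1 = encrypt_nibble(9, tweak)
c2 = encrypt_nibble(9, tweak)
assert c1 == c2  # same tweak -> deterministic within one execution
```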
Guillaume Chapuis, H. Djidjev, Georg Hahn, Guillaume Rizk
This paper assesses the performance of the D-Wave 2X (DW) quantum annealer for finding a maximum clique in a graph, one of the most fundamental and important NP-hard problems. Because the size of the largest graphs DW can solve directly is quite small (usually around 45 vertices), we also consider decomposition algorithms intended for larger graphs and analyze their performance. For smaller graphs that fit DW, we provide formulations of the maximum clique problem as a quadratic unconstrained binary optimization (QUBO) problem, which is one of the two input types (together with the Ising model) accepted by the machine, and compare several quantum implementations to current classical algorithms such as simulated annealing, Gurobi, and third-party clique-finding heuristics. We further estimate the contributions of the quantum phase of the annealer and of the classical post-processing phase typically used to enhance each solution returned by DW. We demonstrate that on random graphs that fit DW, no quantum speedup can be observed compared with the classical algorithms. On the other hand, for instances specifically designed to fit the DW qubit interconnection network well, we observe substantial speed-ups in computing time over classical approaches.
{"title":"Finding Maximum Cliques on a Quantum Annealer","authors":"Guillaume Chapuis, H. Djidjev, Georg Hahn, Guillaume Rizk","doi":"10.1145/3075564.3075575","DOIUrl":"https://doi.org/10.1145/3075564.3075575","url":null,"abstract":"This paper assesses the performance of the D-Wave 2X (DW) quantum annealer for finding a maximum clique in a graph, one of the most fundamental and important NP-hard problems. Because the size of the largest graphs DW can directly solve is quite small (usually around 45 vertices), we also consider decomposition algorithms intended for larger graphs and analyze their performance. For smaller graphs that fit DW, we provide formulations of the maximum clique problem as a quadratic unconstrained binary optimization (QUBO) problem, which is one of the two input types (together with the Ising model) acceptable by the machine, and compare several quantum implementations to current classical algorithms such as simulated annealing, Gurobi, and third-party clique finding heuristics. We further estimate the contributions of the quantum phase of the quantum annealer and the classical post-processing phase typically used to enhance each solution returned by DW. We demonstrate that on random graphs that fit DW, no quantum speedup can be observed compared with the classical algorithms. On the other hand, for instances specifically designed to fit well the DW qubit interconnection network, we observe substantial speed-ups in computing time over classical approaches.","PeriodicalId":398898,"journal":{"name":"Proceedings of the Computing Frontiers Conference","volume":"93 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-05-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125914552","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
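The QUBO formulation referred to above can be sketched as follows: reward every selected vertex and penalize every selected pair of vertices that is not an edge, so the minimum-energy assignment is a maximum clique. This toy version replaces the annealer with exhaustive search; the penalty weight of 2 is one common choice (any value greater than 1 works).

```python
from itertools import combinations, product

def max_clique_qubo(n, edges, penalty=2.0):
    """QUBO for maximum clique: minimize -sum_i x_i + penalty * sum_{(i,j) not an edge} x_i x_j."""
    non_edges = set(combinations(range(n), 2)) - {tuple(sorted(e)) for e in edges}
    Q = {(i, i): -1.0 for i in range(n)}  # reward for selecting each vertex
    for i, j in non_edges:
        Q[(i, j)] = penalty               # forbid selecting both ends of a non-edge
    return Q

def brute_force(Q, n):
    """Exhaustive search over bitstrings, standing in for the annealer."""
    best, best_e = None, float("inf")
    for bits in product((0, 1), repeat=n):
        e = sum(c * bits[i] * bits[j] for (i, j), c in Q.items())
        if e < best_e:
            best, best_e = bits, e
    return best, best_e

# 5-vertex graph whose unique largest clique is the triangle {0, 1, 2}
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4)]
Q = max_clique_qubo(5, edges)
solution, energy = brute_force(Q, 5)
print(solution, energy)  # (1, 1, 1, 0, 0) -3.0
```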
Hadoop has been used widely for data analytics tasks in various domains. At the same time, data volumes are expected to grow even further in the coming years. Hadoop recently introduced the concept of Archival Storage, an automated tiered storage technique for increasing storage capacity for long-term storage. However, the Hadoop Distributed File System's scalability is limited by the total number of files that can be stored, and the number of files is likely to grow quickly when it is used for archival purposes. This paper presents an approach for improving HDFS's scalability when using it as an archival storage system. We present a tool that extends Hadoop Archive to an appendable file format. New files are appended to one of the existing archive data files efficiently, without rewriting the whole archive. To this end, a first-fit algorithm is used to fill up the often not fully utilized fixed-size data blocks of the archive data files. Index files are updated using a red-black tree, providing guaranteed fast lookup and insert performance. We show that the tool performs well for different archive sizes and numbers of files to add. By distributing new files efficiently, we also reduce the number of data blocks needed for archiving and thus reduce the memory footprint on the NameNode.
{"title":"Addressing Hadoop's Small File Problem With an Appendable Archive File Format","authors":"T. Renner, Johannes Müller, L. Thamsen, O. Kao","doi":"10.1145/3075564.3078888","DOIUrl":"https://doi.org/10.1145/3075564.3078888","url":null,"abstract":"Hadoop has been used widely for data analytic tasks in various domains. At the same time, data volume is expected to grow even further in the next years. Hadoop recently introduced the concept Archival Storage, an automated tiered storage technique for increasing storage capacity for long-term storage. However, Hadoop Distributed File System's scalability is limited by the total number of files that can be stored, and it is likely that the number of files increases fast when using it for archival purposes. This paper presents an approach for improving HDFS' scalability when using it as an archival storage. We present a tool that extends Hadoop Archive to an appendable file format. New files are appended to one of the existing archive data files efficiently without rewriting the whole archive. Therefore, a first fit algorithm is used to fill up the often not fully utilized fixed-sized data blocks of the archive data files. Index files are updated using a red-black tree providing guaranteed fast lookup and insert performance. We show that the tool performs well for different sizes of archives and number of files to add. By distributing new files efficiently, we also reduce the number of data blocks needed for archiving and, thus, reduce the memory footprint on the NameNode.","PeriodicalId":398898,"journal":{"name":"Proceedings of the Computing Frontiers Conference","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-05-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125916244","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
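The first-fit placement described above can be sketched as follows. The 128 MiB block size matches the HDFS default, and the list-of-lists representation of archive data blocks is a simplification for illustration, not the tool's actual data layout.

```python
BLOCK_SIZE = 128 * 1024 * 1024  # HDFS default block size: 128 MiB

def first_fit(blocks, file_size, block_size=BLOCK_SIZE):
    """Place a new file into the first archive data block with enough
    free space; open a new block only when no existing block fits."""
    for block in blocks:
        if block_size - sum(block) >= file_size:
            block.append(file_size)
            return blocks
    blocks.append([file_size])  # no block fits: allocate a new one
    return blocks

# Three files of 60/60/40 MiB pack into two 128 MiB blocks instead of three
mib = 1024 * 1024
blocks = []
for size in (60 * mib, 60 * mib, 40 * mib):
    first_fit(blocks, size)
print(len(blocks))  # 2
```

Filling partially used blocks this way is what reduces the total block count, and with it the per-block metadata the NameNode must keep in memory.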
Francis B. Moreira, M. Diener, P. Navaux, I. Koren
Detecting anomalous application executions is a challenging problem due to the diversity of anomalies that can occur, such as programming bugs, silent data corruption, or even malicious code corruption. Moreover, such anomalous executions, especially in the case of silent data corruption, can closely resemble regular executions, which makes them difficult to distinguish from normal behavior. In this paper, we develop a mechanism that detects such anomalous executions based on changes in the memory access pattern of an application. We analyze memory patterns using a two-level machine learning approach. First, we classify the behavior of different memory access periods within applications using Gaussian mixtures. Then, based on these classifications, we construct matrix representations of Markov chains to capture the temporal behavior of these memory accesses. Based on metrics of matrix similarity, we can classify whether the application behaves as expected or anomalously. Using gradient boosting on these matrix similarity metrics, our technique correctly classifies more than 85% of all executions, identifying both instances of the same application and different applications. We can also detect a range of faulty executions caused by benign or malicious permanent bit flips in the code section.
{"title":"Data mining the memory access stream to detect anomalous application behavior","authors":"Francis B. Moreira, M. Diener, P. Navaux, I. Koren","doi":"10.1145/3075564.3075578","DOIUrl":"https://doi.org/10.1145/3075564.3075578","url":null,"abstract":"Detecting anomalous application executions is a challenging problem, due to the diversity of anomalies that can occur, such as programming bugs, silent data corruption, or even malicious code corruption. Moreover, the similarity to a regular execution that can occur in these cases, especially in silent data corruption, makes distinction from normal executions difficult. In this paper, we develop a mechanism that can detect such anomalous executions based on changes in the memory access pattern of an application. We analyze memory patterns using a two-level machine learning approach. First, we classify the behavior of different memory access periods within applications using Gaussian mixtures. Then, based on these classifications, we construct matrix representations of Markov chains to obtain information regarding the temporal behavior of these memory accesses. Based on metrics of matrix similarity, we can classify whether the application behaves as expected or anomalously. Using gradient boosting on the metrics of matrix similarity, our technique correctly classifies more than 85% of all executions, identifying instances of the same application and different applications. We can also detect a range of faulty executions caused by benign or malicious permanent bit flips in the code section.","PeriodicalId":398898,"journal":{"name":"Proceedings of the Computing Frontiers Conference","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-05-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128368770","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
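A minimal sketch of the second level of this approach, assuming the first level has already labeled each memory access period with a cluster id: build a row-normalized Markov transition matrix per execution and compare matrices with one possible similarity metric, the Frobenius norm of their difference (the paper's exact metrics and features may differ).

```python
import math

def transition_matrix(labels, n_classes):
    """Row-normalized Markov transition matrix from a sequence of
    per-period behavior classes (e.g., Gaussian-mixture cluster ids)."""
    counts = [[0.0] * n_classes for _ in range(n_classes)]
    for a, b in zip(labels, labels[1:]):
        counts[a][b] += 1.0
    for row in counts:
        total = sum(row)
        if total:
            for j in range(n_classes):
                row[j] /= total
    return counts

def frobenius_distance(m1, m2):
    """Frobenius norm of the element-wise difference of two matrices."""
    return math.sqrt(sum((a - b) ** 2
                         for r1, r2 in zip(m1, m2)
                         for a, b in zip(r1, r2)))

normal = transition_matrix([0, 1, 0, 1, 0, 1, 0, 1], 2)
anomalous = transition_matrix([0, 0, 0, 1, 1, 1, 0, 0], 2)
print(frobenius_distance(normal, normal))     # 0.0 for identical behavior
print(frobenius_distance(normal, anomalous))  # > 0 signals a deviation
```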
The Breadth-First Search (BFS) algorithm is an important building block for graph analysis of large datasets. Parallelising BFS has been shown to be challenging because of inherent characteristics that limit its scalability: irregular memory access patterns, data dependencies and workload imbalance. We investigate the optimisation and vectorisation of hybrid BFS (a combination of the top-down and bottom-up approaches to BFS) on the Xeon Phi, which has advanced vector processing capabilities. The results show that our new implementation improves performance by 33% on a graph with one million vertices, compared to the state of the art.
{"title":"Vectorization of Hybrid Breadth First Search on the Intel Xeon Phi","authors":"Mireya Paredes, G. Riley, M. Luján","doi":"10.1145/3075564.3075573","DOIUrl":"https://doi.org/10.1145/3075564.3075573","url":null,"abstract":"The Breadth-First Search (BFS) algorithm is an important building block for graph analysis of large datasets. The BFS parallelisation has been shown to be challenging because of its inherent characteristics, including irregular memory access patterns, data dependencies and workload imbalance, that limit its scalability. We investigate the optimisation and vectorisation of the hybrid BFS (a combination of top-down and bottom-up approaches for BFS) on the Xeon Phi, which has advanced vector processing capabilities. The results show that our new implementation improves by 33%, for a one million vertices graph, compared to the state-of-the-art.","PeriodicalId":398898,"journal":{"name":"Proceedings of the Computing Frontiers Conference","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-04-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126783760","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
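The hybrid (direction-optimizing) BFS that the paper vectorizes can be sketched in scalar form: run top-down while the frontier is small, and switch to bottom-up once the frontier grows past a threshold, so that unvisited vertices search for a parent instead of the frontier expanding outward. The `alpha` threshold here is a placeholder heuristic, not the paper's switching policy.

```python
def hybrid_bfs(adj, source, alpha=0.25):
    """Direction-optimizing BFS over an adjacency dict {vertex: [neighbors]}."""
    n = len(adj)
    dist = {source: 0}
    frontier = {source}
    level = 0
    while frontier:
        level += 1
        next_frontier = set()
        if len(frontier) <= alpha * n:
            # top-down: expand each frontier vertex's neighbors
            for u in frontier:
                for v in adj[u]:
                    if v not in dist:
                        dist[v] = level
                        next_frontier.add(v)
        else:
            # bottom-up: each unvisited vertex looks for a parent in the frontier
            for v in adj:
                if v not in dist and any(u in frontier for u in adj[v]):
                    dist[v] = level
                    next_frontier.add(v)
        frontier = next_frontier
    return dist

# small undirected graph: 0-1, 0-2, 1-3, 2-3, 3-4
adj = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2, 4], 4: [3]}
print(hybrid_bfs(adj, 0))  # {0: 0, 1: 1, 2: 1, 3: 2, 4: 3}
```

The bottom-up step is what makes the inner loop amenable to vectorisation: checking many candidate parents against the frontier is a data-parallel membership test, e.g. over a bitmap.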
In recent years, the traditional ways of keeping hardware performance growing at the rate predicted by Moore's Law have vanished. When uni-cores were the norm, hardware design was decoupled from the software stack thanks to a well-defined Instruction Set Architecture (ISA). This simple interface allowed developers to write applications without worrying too much about the underlying hardware, while computer architects proposed techniques to aggressively exploit Instruction-Level Parallelism (ILP) in superscalar processors. Current multi-cores are designed as simple symmetric multiprocessors on a chip. While these designs are able to compensate for the stagnation of clock frequencies, they face multiple problems in terms of power consumption, programmability, resilience and memory. The solution is to give more responsibility to the runtime system and to let it collaborate tightly with the hardware. The runtime has to drive the design of future multi-core architectures. In this talk, we introduce an approach towards a Runtime-Aware Architecture (RAA), a massively parallel architecture designed from the runtime's perspective.
{"title":"Runtime Aware Architectures","authors":"M. Valero","doi":"10.1109/IPDPS.2017.130","DOIUrl":"https://doi.org/10.1109/IPDPS.2017.130","url":null,"abstract":"In the last years the traditional ways to keep the increase of hardware performance to the rate predicted by the Moore's Law vanished. When uni-cores were the norm, hardware design was decoupled from the software stack thanks to a well defined Instruction Set Architecture (ISA). This simple interface allowed developing applications without worrying too much about the underlying hardware, while computer architects proposed techniques to aggressively exploit Instruction-Level Parallelism (ILP) in superscalar processors. Current multi-cores are designed as simple symmetric multiprocessors on a chip. While these designs are able to compensate the clock frequency stagnation, they face multiple problems in terms of power consumption, programmability, resilience or memory. The solution is to give more responsibility to the runtime system and to let it tightly collaborate with the hardware. The runtime has to drive the design of future multi-cores architectures. In this talk, we introduce an approach towards a Runtime-Aware Architecture (RAA), a massively parallel architecture designed from the runtime's perspective.","PeriodicalId":398898,"journal":{"name":"Proceedings of the Computing Frontiers Conference","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122828762","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}