B. E. Moutafis, C. Filelis-Papadopoulos, G. Gravvanis, J. Morrison
Over the last decade, Cloud environments have gained significant attention from the scientific community, due to their flexibility in the allocation of resources and the variety of applications they host. Recently, high performance computing applications have been migrating to Cloud environments. Efficient methods are sought for solving the very large sparse linear systems that occur in scientific fields such as Computational Fluid Dynamics, N-Body simulations and Computational Finance. Herein, parallel multi-projection type methods are reviewed and implementation issues for IaaS-type Cloud environments are discussed. Moreover, phenomena arising from the "noisy neighbor" problem, varying interconnection speeds and load imbalance are studied. Furthermore, the extent to which specialized hardware residing in modern CPUs is exposed through the different layers of software is examined. Finally, numerical results concerning the applicability and effectiveness of multi-projection type methods in OpenStack-based Cloud environments are presented.
{"title":"On Issues Concerning Cloud Environments in Scope of Scalable Multi-Projection Methods","authors":"B. E. Moutafis, C. Filelis-Papadopoulos, G. Gravvanis, J. Morrison","doi":"10.1109/SYNASC.2016.061","DOIUrl":"https://doi.org/10.1109/SYNASC.2016.061","url":null,"abstract":"Over the last decade, Cloud environments have gained significant attention by the scientific community, due to their flexibility in the allocation of resources and the various applications hosted in such environments. Recently, high performance computing applications are migrating to Cloud environments. Efficient methods are sought for solving very large sparse linear systems occurring in various scientific fields such as Computational Fluid Dynamics, N-Body simulations and Computational Finance. Herewith, the parallel multi-projection type methods are reviewed and discussions concerning the implementation issues for IaaS-type Cloud environments are given. Moreover, phenomena occurring due to the \"noisy neighbor\" problem, varying interconnection speeds as well as load imbalance are studied. Furthermore, the level of exposure of specialized hardware residing in modern CPUs through the different layers of software is also examined. Finally, numerical results concerning the applicability and effectiveness of multi-projection type methods in Cloud environments based on OpenStack are presented.","PeriodicalId":268635,"journal":{"name":"2016 18th International Symposium on Symbolic and Numeric Algorithms for Scientific Computing (SYNASC)","volume":"123 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122667713","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Nowadays MapReduce and its open source implementation, Apache Hadoop, are the most widespread solutions for handling massive datasets on clusters of commodity hardware. At the expense of somewhat reduced performance in comparison to HPC technologies, the MapReduce framework provides fault tolerance and automatic parallelization without any effort from developers. Since in many cases Hadoop is adopted to support business-critical activities, it is often important to predict with fair confidence the execution time of submitted jobs, for instance when SLAs are established with end-users. In this work, we propose and validate a hybrid approach exploiting both queuing networks and support vector regression, in order to achieve good accuracy without too many costly experiments on a real setup. The experimental results show that the proposed approach attains a 21% improvement in accuracy over applying machine learning techniques without any support from analytical models.
{"title":"A Combined Analytical Modeling Machine Learning Approach for Performance Prediction of MapReduce Jobs in Cloud Environment","authors":"Ehsan Ataie, E. Gianniti, D. Ardagna, A. Movaghar","doi":"10.1109/SYNASC.2016.072","DOIUrl":"https://doi.org/10.1109/SYNASC.2016.072","url":null,"abstract":"Nowadays MapReduce and its open source implementation, Apache Hadoop, are the most widespread solutions for handling massive dataset on clusters of commodity hardware. At the expense of a somewhat reduced performance in comparison to HPC technologies, the MapReduce framework provides fault tolerance and automatic parallelization without any efforts by developers. Since in many cases Hadoop is adopted to support business critical activities, it is often important to predict with fair confidence the execution time of submitted jobs, for instance when SLAs are established with end-users. In this work, we propose and validate a hybrid approach exploiting both queuing networks and support vector regression, in order to achieve a good accuracy without too many costly experiments on a real setup. The experimental results show how the proposed approach attains a 21% improvement in accuracy over applying machine learning techniques without any support from analytical models.","PeriodicalId":268635,"journal":{"name":"2016 18th International Symposium on Symbolic and Numeric Algorithms for Scientific Computing (SYNASC)","volume":"176 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131643967","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Diana-Lucia Miholca, G. Czibula, Ioan-Gabriel Mircea, I. Czibula
In this paper we approach from a machine learning perspective the problem of identifying the sex of archaeological remains from anthropometric data, an important problem within the field of bioarchaeology. As the conditions for detecting the sex of a skeleton are not entirely known, machine learning based data mining models are appropriate to address this problem since they are able to capture unobservable patterns in data. These patterns could be relevant for classifying a skeletal remain as male or female. We propose two machine learning models based on artificial neural networks for identifying the sex of human skeletons from bone measurements. The proposed models are experimentally evaluated on case studies generated from two data sets publicly available in the archaeological literature. The obtained results show that the proposed data mining models are effective for detecting the sex of archaeological remains, confirming the potential of our proposal.
{"title":"Machine Learning Based Approaches for Sex Identification in Bioarchaeology","authors":"Diana-Lucia Miholca, G. Czibula, Ioan-Gabriel Mircea, I. Czibula","doi":"10.1109/SYNASC.2016.056","DOIUrl":"https://doi.org/10.1109/SYNASC.2016.056","url":null,"abstract":"In this paper we approach from a machine learning perspective the problem of identifying the sex of archaeological remains from anthropometric data, an important problem within the field of bioarchaeology. As the conditions for detecting the sex of a skeleton are not entirely known, machine learning based data mining models are appropriate to address this problem since they are able to capture unobservable patterns in data. These patterns could be relevant for classifying a skeletal remain as male or female. We propose two machine learning models based on artificial neural networks for identifying the sex of human skeletons from bone measurements. The proposed models are experimentally evaluated on case studies generated from two data sets publicly available in the archaeological literature. The obtained results show that the proposed data mining models are effective for detecting the sex of archaeological remains, confirming the potential of our proposal.","PeriodicalId":268635,"journal":{"name":"2016 18th International Symposium on Symbolic and Numeric Algorithms for Scientific Computing (SYNASC)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130742580","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We revisit the seminal classical work of T. Sundara Row on the geometry in paper folding published in 1893. After 123 years, the significance of the book remains. This note is intended to provide a short description of Sundara Row’s masterpiece from the viewpoint of the current mathematical theory of origami and show how various geometrical shapes that Sundara Row drew in his book can be produced by a modern tool of computational origami. Furthermore, the tool enables a reader to manipulate, by a simple scripting language, graphics of the produced shapes and the internal algebraic representations, as well as to perform algebraic proofs of the lemmas and theorems that Sundara Row wrote down in his book.
{"title":"Revisit of \"Geometric Exercise in Paper Folding\" from a Viewpoint of Computational Origami","authors":"T. Ida","doi":"10.1109/SYNASC.2016.017","DOIUrl":"https://doi.org/10.1109/SYNASC.2016.017","url":null,"abstract":"We revisit the seminal classical work of T. Sundara Row on the geometry in paper folding published in 1893. After 123 years, the significance of the book remains. This note is intended to provide a short description of Sundara Row’s masterpiece from the viewpoint of the current mathematical theory of origami and show how various geometrical shapes that Sundara Row drew in his book can be produced by a modern tool of computational origami. Furthermore, the tool enables a reader to manipulate, by a simple scripting language, graphics of the produced shapes and the internal algebraic representations, as well as to perform algebraic proofs of the lemmas and theorems that Sundara Row wrote down in his book.","PeriodicalId":268635,"journal":{"name":"2016 18th International Symposium on Symbolic and Numeric Algorithms for Scientific Computing (SYNASC)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131242853","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We show a simple, yet effective, method for storing images, such that retrieval of nearby images is both fast and accurate. The main ingredients are discrete Fourier transforms to extract low frequency components, principal components analysis (PCA) for further compression, and storage in k-D trees. We illustrate the quality of results on the MNIST digit suite and also apply it to chromosome segments.
{"title":"Linking Fourier and PCA Methods for Image Look-Up","authors":"Daniel Lichtblau","doi":"10.1109/SYNASC.2016.028","DOIUrl":"https://doi.org/10.1109/SYNASC.2016.028","url":null,"abstract":"We show a simple, yet effective, method for storing images, such that retrieval of nearby images is both fast and accurate. The main ingredients are discrete Fourier transforms to extract low frequency components, principal components analysis (PCA) for further compression, and storage in k-D trees. We illustrate the quality of results on the MNIST digit suite and also apply it to chromosome segments.","PeriodicalId":268635,"journal":{"name":"2016 18th International Symposium on Symbolic and Numeric Algorithms for Scientific Computing (SYNASC)","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133989679","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
J. Horácek, M. Kreuzer, Ange-Salomé Messeng Ekossono
Given a 0-dimensional polynomial system in a polynomial ring over F_2 having only F_2-rational solutions, we optimize the Border Basis Algorithm (BBA) for solving this system by introducing a Boolean BBA. This algorithm is further improved by optimizing the linear algebra steps. We discuss ways to combine it with SAT solvers, optimized methods for performing the combinatorial steps involved in the algorithm, and various approaches to implement the linear algebra steps. Based on our C++ implementation, we provide some timings that compare sparse and dense representations of the coefficient matrices and compare the algorithm with Gröbner basis methods.
{"title":"Computing Boolean Border Bases","authors":"J. Horácek, M. Kreuzer, Ange-Salomé Messeng Ekossono","doi":"10.1109/SYNASC.2016.076","DOIUrl":"https://doi.org/10.1109/SYNASC.2016.076","url":null,"abstract":"Given a 0-dimensional polynomial system in a polynomial ring over F_2 having only F_2-rational solutions, we optimize the Border Basis Algorithm (BBA) for solving this system by introducing a Boolean BBA. This algorithm is further improved by optimizing the linear algebra steps. We discuss ways to combine it with SAT solvers, optimized methods for performing the combinatorial steps involved in the algorithm, and various approaches to implement the linear algebra steps. Based on our C++ implementation, we provide some timings to compare sparse and dense representations of the coefficient matrices and to Gröebner basis methods.","PeriodicalId":268635,"journal":{"name":"2016 18th International Symposium on Symbolic and Numeric Algorithms for Scientific Computing (SYNASC)","volume":"110 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116107286","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
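As a rough illustration of what a linear algebra step over F_2 with a dense representation can look like, the sketch below performs Gauss-Jordan elimination on rows stored as bit-packed integers. It is not the authors' C++ implementation; the function name `rref_gf2` and the convention that bit `c` of an integer holds column `c` are assumptions made for this example.

```python
def rref_gf2(rows, ncols):
    """Reduced row echelon form over F_2.

    Each row is a Python int used as a bit vector: bit c is the entry in column c.
    Addition over F_2 is XOR, so eliminating a row is a single `^=`.
    """
    rows = list(rows)
    pivot_row = 0
    for col in range(ncols):
        mask = 1 << col
        # Look for a row at or below pivot_row with a 1 in this column.
        for r in range(pivot_row, len(rows)):
            if rows[r] & mask:
                rows[pivot_row], rows[r] = rows[r], rows[pivot_row]
                break
        else:
            continue  # no pivot in this column
        # Clear the column in every other row.
        for r in range(len(rows)):
            if r != pivot_row and rows[r] & mask:
                rows[r] ^= rows[pivot_row]
        pivot_row += 1
    # Drop the zero rows left over from linearly dependent inputs.
    return [r for r in rows if r]


# x0 + x1, x1 + x2, x0 + x2: the third row is the sum of the first two.
reduced = rref_gf2([0b011, 0b110, 0b101], ncols=3)
print([format(r, "03b") for r in reduced])
```

Packing a row into one integer makes each row operation a machine-level XOR, which is one reason dense bit-level storage can be attractive when the matrices are not very sparse.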
Automated file analysis is important in malware research for identifying malicious files in large collections of samples. This paper describes an automatic system that can classify a file as infected based on the dynamic behavior of the file observed inside a controlled, monitored environment. Based on features revealed at runtime, we train a Support Vector Machine classifier that can then be used to identify malicious files. The paper analyses the classifier performance based on several types of features, from raw runtime information to heuristics generated by expert systems, and provides guidelines for the feature selection process when dealing with this type of data. We show that by enlarging the feature domain, our classifier gains proactivity and is able to detect previously unseen samples, even if they belong to different malware families.
{"title":"Malware Classification Based on Dynamic Behavior","authors":"George Cabau, Magda Buhu, Ciprian Oprișa","doi":"10.1109/SYNASC.2016.057","DOIUrl":"https://doi.org/10.1109/SYNASC.2016.057","url":null,"abstract":"Automated file analysis is important in malware research for identifying malicious files in large collection of samples. This paper describes an automatic system that can classify a file as infected based on the dynamic behavior of the file observed inside a controlled monitored environment. Based on features revealed at runtime, we train a Support Vector Machine classifier that can be further used to identify malicious files. The paper analyses the classifier performance based on several types of features, from raw runtime information to heuristics generated by expert systems and provides guidelines for the features selection process when dealing with this type of data. We show that by enlarging the features domain, our classifier gains proactivity and is able to detect previously unseen samples, even if they belong to different malware families.","PeriodicalId":268635,"journal":{"name":"2016 18th International Symposium on Symbolic and Numeric Algorithms for Scientific Computing (SYNASC)","volume":"46 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116579453","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Liviu Octavian Mafteiu-Scai, Calin Alexandru Cornigeanu
This paper proposes two parallel hybrid heuristics for reducing the average bandwidth of sparse matrices, a process used in preconditioning systems of equations. Based on direct processing of the matrix, the first method combines a heuristic inspired by the laws of physics with a greedy selection of rows/columns to be interchanged. The second one improves the previous heuristic through the use of an exact formula for determining the most favorable interchanges. Experimental results obtained on an IBM Blue Gene/P supercomputer show that the proposed parallel heuristics lead to better results with respect to time efficiency, speedup, parallel efficiency and solution quality.
{"title":"Parallel Heuristics for Systems of Equations Preconditioning","authors":"Liviu Octavian Mafteiu-Scai, Calin Alexandru Cornigeanu","doi":"10.1109/SYNASC.2016.058","DOIUrl":"https://doi.org/10.1109/SYNASC.2016.058","url":null,"abstract":"This paper proposes two parallel hybrid heuristics aiming for the reduction of the average bandwidth of sparse matrices, process used in systems of equations preconditioning. Based on a direct processing of the matrix, the first method combines a heuristic inspired from the laws of physics, with a greedy selection of rows/columns to be interchanged. The second one improves the previous heuristic through the use of an exact formula for determining the most favorable interchanges. Experimental results obtained on an IBM Blue Gene /P supercomputer illustrate the fact that the proposed parallel heuristics lead to better results, with respect to time efficiency, speedup, efficiency and solution.","PeriodicalId":268635,"journal":{"name":"2016 18th International Symposium on Symbolic and Numeric Algorithms for Scientific Computing (SYNASC)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129511613","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
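To make the idea of greedily choosing row/column interchanges concrete, here is a small serial sketch. It assumes that the average bandwidth is the mean of |i - j| over the nonzero positions and that an interchange swaps both rows and columns of the same two indices; neither the definition nor the function names come from the abstract, and the sketch covers neither the physics-inspired heuristic nor the exact formula nor the parallelization the paper describes.

```python
from itertools import combinations


def average_bandwidth(nonzeros):
    """Mean of |i - j| over the nonzero positions (i, j); an assumed definition."""
    return sum(abs(i - j) for i, j in nonzeros) / len(nonzeros)


def interchange(nonzeros, a, b):
    """Swap rows a, b and columns a, b of the sparsity pattern."""
    def relabel(k):
        return b if k == a else a if k == b else k
    return {(relabel(i), relabel(j)) for i, j in nonzeros}


def greedy_interchange(nonzeros, n):
    """Apply the single interchange that lowers the average bandwidth most, if any."""
    best, best_value = nonzeros, average_bandwidth(nonzeros)
    for a, b in combinations(range(n), 2):
        candidate = interchange(nonzeros, a, b)
        value = average_bandwidth(candidate)
        if value < best_value:
            best, best_value = candidate, value
    return best


# 4x4 pattern: full diagonal plus the two far corners.
pattern = {(0, 0), (1, 1), (2, 2), (3, 3), (0, 3), (3, 0)}
improved = greedy_interchange(pattern, 4)
print(average_bandwidth(pattern), average_bandwidth(improved))
```

This brute-force step evaluates every pair from scratch, so it costs far more than a heuristic that predicts the most favorable interchange directly, which is the kind of saving the second method in the abstract aims for.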
Artificial Neural Networks often suffer from overfitting, whether trained through backpropagation or evolved through a Genetic Algorithm. An attempt at mitigating the overfitting of GA-evolved ANNs is made by using High-Probability Mutation (≈0.95) on binary-encoded ANN weights. The benchmark used is predicting the evolution of an Internet social network using real-world data. A lower bound is put on the overfit, and both prediction error and overfit are further broken down according to ANN hidden-layer size.
{"title":"Lowering Evolved Artificial Neural Network Overfitting through High-Probability Mutation","authors":"Croitoru Nicolae-Eugen","doi":"10.1109/SYNASC.2016.059","DOIUrl":"https://doi.org/10.1109/SYNASC.2016.059","url":null,"abstract":"Artificial Neural Networks often suffer from overfitting, both when trained through backpropagation or evolved through a Genetic Algorithm. An attempt at mitigating the overfitting of GA-evolved ANNs is made by using High-Probability Mutation (≈0.95) on binary-encoded ANN weights. The benchmark used is predicting the evolution of an Internet social network using real-world data. A lower bound is put on the overfit, and both prediction error and overfit are further broken down according to ANN hidden-layers size.","PeriodicalId":268635,"journal":{"name":"2016 18th International Symposium on Symbolic and Numeric Algorithms for Scientific Computing (SYNASC)","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115520190","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
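A minimal sketch of bit-flip mutation at a high per-bit probability follows. The 16-bit fixed-range weight encoding, the range [-1, 1], and the names `mutate_bits` and `decode_weight` are assumptions chosen for illustration; the abstract states only that the weights are binary-encoded and that the mutation probability is about 0.95.

```python
import random


def mutate_bits(bits, p_mut=0.95, rng=random):
    """Flip each bit independently with probability p_mut."""
    return [b ^ 1 if rng.random() < p_mut else b for b in bits]


def decode_weight(bits, lo=-1.0, hi=1.0):
    """Map a bit list (most significant bit first) to a weight in [lo, hi].

    This fixed-range encoding is a hypothetical choice for the example.
    """
    as_int = int("".join(map(str, bits)), 2)
    return lo + (hi - lo) * as_int / (2 ** len(bits) - 1)


rng = random.Random(0)  # seeded only to make the example repeatable
parent = [rng.randint(0, 1) for _ in range(16)]
child = mutate_bits(parent, 0.95, rng)
flipped = sum(p != c for p, c in zip(parent, child))
print(f"flipped {flipped} of {len(parent)} bits")
print(decode_weight(parent), decode_weight(child))
```

At a probability this close to 1, the child is nearly the bitwise complement of its parent, so the operator behaves very differently from the small-probability mutation usually used in genetic algorithms.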
Customer Relationship Management (CRM) has become the best practice for any business that wishes to create, develop and enhance customer value and, implicitly, shareholder value. Businesses have become more aware that in the long term, beyond the first sale, customer retention is of crucial importance. However, in most cases, the first sale creates the first impression of the business. Being able to manage customer expectations through aspect level sentiment analysis and proper guidance towards the first purchase can make the difference between a strong retention rate and a weak one. In this paper we present an approach for designing a multi-agent expert system using product aspect level sentiment analysis. The goal is to ease the conversion of a prospect to a customer by giving proper recommendations to accelerate the sale. Aspect level sentiment analysis takes into account not only the overall sentiment of the interaction but also the granular sentiment at the feature level of the products to be sold, such as price or quality. The multi-agent technology extends CRM systems and provides scalability, robustness and simplicity of design. Furthermore, a prototype was developed, and its design and results are presented and discussed.
{"title":"Multi-Agent Aspect Level Sentiment Analysis in CRM Systems","authors":"Doru Rotovei","doi":"10.1109/SYNASC.2016.068","DOIUrl":"https://doi.org/10.1109/SYNASC.2016.068","url":null,"abstract":"Customer Relationship Management (CRM) became the best practice for any business that wishes to create, develop and enhance the customer value and implicitly the business shareholders value. Businesses became more aware that in the long term beyond the first sale customer retention is of crucial importance. However, in most cases, the first sale creates the first impression of the business. Being able to manage the customer expectations through aspect level sentiment analysis and proper guidance towards the first purchase, can make the difference between a strong retention rate and a weak retention rate. In this paper we present an approach for designing a multi-agent expert system using product aspect level sentiment analysis. The goal is to ease the conversion of a prospect to a customer by giving proper recommendations to accelerate the sale. Aspect level sentiment analysis takes into account not only the overall sentiment of the interaction but also the granular sentiment on the feature level of the products to be sold like for example price or quality. The multi-agent technology extends the CRM Systems and provides scalability, robustness and simplicity of design. Furthermore a prototype was developed and its design and results are presented and discussed.","PeriodicalId":268635,"journal":{"name":"2016 18th International Symposium on Symbolic and Numeric Algorithms for Scientific Computing (SYNASC)","volume":"17 5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130989680","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}