Pub Date: 2020-11-24 | DOI: 10.1109/CloudTech49835.2020.9365918
Aadil Alshammari, A. Rezgui
Online social networks have grown rapidly over the past few years. One of the critical factors driving their success and growth is the friendship recommender algorithm, which suggests relationships between users. Current friending algorithms are designed to recommend friendship connections that are easily accepted. Yet most of these accepted relationships do not lead to any interaction; we refer to them as weak connections. Facebook’s Friends-of-Friends (FoF) algorithm is an example of a friending algorithm that generates recommendations with a high acceptance rate; however, a considerably high percentage of its recommendations are weak connections. Measuring the accuracy of friendship recommenders by acceptance rate does not correlate with the level of interaction, i.e., how much connected friends interact with one another. Consequently, new metrics and friendship recommenders are needed to form the next generation of social networks by generating better edges instead of merely growing the social graph with weak ones. This paper is a step towards this vision. We first introduce a new metric that measures the accuracy of friending recommendations by the probability that they lead to interactions. We then briefly investigate existing recommender systems and their limitations, and highlight the consequences of recommending weak relationships within online social networks. To overcome the limitations of current friending algorithms, we present and evaluate a novel approach that generates friendship recommendations with a higher probability of leading to interactions between users than existing friending algorithms.
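The proposed interaction-driven metric can be sketched in a few lines: score a batch of recommendations by the fraction of accepted ones that led to at least one interaction, rather than by acceptance alone. The record schema (`accepted`, `interactions`) is a hypothetical illustration, not the paper's actual data model.

```python
def interaction_accuracy(recommendations):
    """Fraction of accepted friendship recommendations that led to at
    least one interaction (hypothetical schema: each recommendation is a
    dict with boolean 'accepted' and an integer 'interactions' count)."""
    accepted = [r for r in recommendations if r["accepted"]]
    if not accepted:
        return 0.0
    interacting = sum(1 for r in accepted if r["interactions"] > 0)
    return interacting / len(accepted)


recs = [
    {"accepted": True, "interactions": 5},   # strong edge
    {"accepted": True, "interactions": 0},   # weak edge: accepted, never used
    {"accepted": True, "interactions": 2},
    {"accepted": False, "interactions": 0},  # rejected, ignored by the metric
]
print(interaction_accuracy(recs))  # acceptance rate is 3/4, but only 2/3 interact
```

Under this metric, the second recommendation drags the score down even though a pure acceptance-rate metric would count it as a success.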
Title: Better Edges not Bigger Graphs: An Interaction-Driven Friendship Recommender Algorithm for Social Networks
Pub Date: 2020-11-24 | DOI: 10.1109/CloudTech49835.2020.9365878
Badr Hirchoua, B. Ouhbi, B. Frikh
Automated trading can be cast as an online decision-making problem in which agents aim to buy at a low price and sell at a higher one. In financial theory, market trading produces noisy, seemingly random behavior involving highly imperfect information. Developing a profitable strategy in such dynamic and complex stock market environments is therefore very difficult. This paper introduces a new deep reinforcement learning (DRL) method based on an encouragement window policy for automatic stock trading. Motivated by the advantage function, the proposed approach trains a DRL agent to handle the trading environment’s dynamicity and generate substantial profits. On the one hand, the advantage function estimates the relative value of the actions selected in the current state; it consists of the discounted sum of rewards and a baseline estimate. On the other hand, the encouragement window is based only on the most recent rewards, providing a dense, synthesized experience instead of a noisy signal. This process progressively improves action quality by balancing action selection against state uncertainty. The self-learned rules drive the agent’s policy to choose productive actions that achieve high returns across the environment. Experimental results on four real-world stocks demonstrate the proposed system’s efficiency: it produces strong performance, executes more effective trades with a small number of transactions, and outperforms several baselines.
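The interaction between the advantage function and the encouragement window can be illustrated with a minimal sketch: discount only the last few rewards and subtract a baseline. The window size, discount factor, and baseline source here are assumptions for illustration, not the paper's hyper-parameters.

```python
def windowed_advantage(rewards, baseline, window=5, gamma=0.99):
    """Advantage estimate over an 'encouragement window': the discounted
    sum of only the most recent `window` rewards, minus a baseline value.
    A loose sketch of the idea; window, gamma and the baseline source are
    assumed, not taken from the paper."""
    recent = rewards[-window:]                         # keep only the last rewards
    discounted = sum(gamma ** i * r for i, r in enumerate(recent))
    return discounted - baseline


rewards = [0.1, -0.3, 0.5, 0.2, 0.4, 0.6, -0.1]        # per-step trading rewards
adv = windowed_advantage(rewards, baseline=0.5, window=3, gamma=1.0)
print(round(adv, 2))  # 0.4: recent experience beat the baseline
```

A positive advantage encourages the actions taken in the window; restricting the sum to recent rewards keeps stale, noisy history out of the signal.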
Title: Rules Based Policy for Stock Trading: A New Deep Reinforcement Learning Method
Pub Date: 2020-11-24 | DOI: 10.1109/CloudTech49835.2020.9365897
Abdelhak Khalil, Mustapha Belaïssaoui
Nowadays, NoSQL technologies are gaining significant ground and are widely considered the future of data storage, especially for the huge data volumes that data warehouse solutions must handle. NoSQL databases provide high scalability and good performance compared with relational databases, which become time-consuming and struggle with large data volumes. The growing popularity of NoSQL, and of related terms such as big data, motivates the use of this technology in decision support systems. The purpose of this paper is to investigate the possibility of instantiating a big data mart on one of the most popular and least complicated types of NoSQL databases, namely the key-value store. The main challenge is to reconcile the old-school approach of data warehousing, based on traditional databases that favor data integrity, with the opportunities offered by the new generation of database management systems. The paper describes the transformation process from a multidimensional conceptual schema to the logical model following three approaches, and outlines the strengths and weaknesses of each one based on practical experience with Oracle NoSQL Database.
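One common way to map a multidimensional schema onto a key-value store is to flatten each fact-table row into a single pair: the key is the ordered concatenation of the dimension keys (mimicking a major-key path), and the value holds the measures. The sketch below illustrates that general idea with a plain dict; it is not the paper's specific Oracle NoSQL schema, and the field names are invented.

```python
def fact_to_kv(fact, dim_keys, measure_keys):
    """Flatten one fact row into a (key, value) pair: the key is the
    '/'-joined sequence of dimension keys, the value the measures.
    Illustrative mapping only; the paper evaluates three approaches."""
    key = "/".join(str(fact[d]) for d in dim_keys)
    value = {m: fact[m] for m in measure_keys}
    return key, value


store = {}  # stand-in for the key-value database
sale = {"date_id": "2020-11-24", "store_id": "S12", "product_id": "P7",
        "quantity": 3, "amount": 149.5}
k, v = fact_to_kv(sale, ["date_id", "store_id", "product_id"],
                  ["quantity", "amount"])
store[k] = v
print(k)  # 2020-11-24/S12/P7
```

Because the dimension order fixes the key prefix, range scans over a leading dimension (here, the date) stay cheap, while queries on trailing dimensions require a full scan, which is exactly the kind of trade-off the three transformation approaches differ on.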
Title: New approach for implementing big datamart using NoSQL key-value stores
Pub Date: 2020-11-24 | DOI: 10.1109/cloudtech49835.2020.9365910
Title: CloudTech 2020 Preface
Pub Date: 2020-11-24 | DOI: 10.1109/CloudTech49835.2020.9365921
Manal Nejjari, A. Meziane
Over the last few years, many researchers have paid special attention to sentiment analysis (SA), thanks to its applications in many domains. Most studies have tackled SA in English; those dealing with SA in Arabic remain limited, owing to the complexity of computationally processing this morphologically rich language (MRL). Deep learning, and in particular recurrent neural networks (RNNs), has recently proved an efficient tool for SA challenges, and approaches based on the long short-term memory (LSTM) architecture have provided adequate solutions for SA in Arabic. In this paper, we study SA in Arabic and propose an enhanced LSTM-based model, called SAHAR-LSTM, for performing SA of hotels’ Arabic reviews. The model is evaluated on a dataset of hotel reviews written in Modern Standard Arabic (MSA) and is implemented together with two dimensionality reduction techniques: Latent Semantic Analysis (LSA) and Chi-Square. The experimental results are promising and show that our approach achieves an accuracy of 83.6% with the LSA and Chi-Square methods and 92% with the LSTM classification model.
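The Chi-Square dimensionality reduction step mentioned above can be sketched from scratch: score each term by the chi-square statistic of its 2x2 term/class contingency table and keep only the highest-scoring terms. This is the standard formulation of chi-square feature selection, not the authors' exact pipeline; the toy review data is invented.

```python
def chi2_term_score(docs, labels, term):
    """Chi-square relevance of a term to a binary sentiment label,
    from the 2x2 contingency table: chi2 = N*(AD - BC)^2 /
    ((A+B)(C+D)(A+C)(B+D)). Standard feature-selection formula."""
    A = sum(1 for d, y in zip(docs, labels) if term in d and y == 1)
    B = sum(1 for d, y in zip(docs, labels) if term in d and y == 0)
    C = sum(1 for d, y in zip(docs, labels) if term not in d and y == 1)
    D = sum(1 for d, y in zip(docs, labels) if term not in d and y == 0)
    N = A + B + C + D
    denom = (A + B) * (C + D) * (A + C) * (B + D)
    return 0.0 if denom == 0 else N * (A * D - B * C) ** 2 / denom


# toy hotel reviews as term sets; 1 = positive, 0 = negative
docs = [{"clean", "friendly"}, {"dirty", "noisy"},
        {"friendly", "spacious"}, {"noisy", "small"}]
labels = [1, 0, 1, 0]
print(chi2_term_score(docs, labels, "friendly"))  # 4.0, perfectly discriminative here
```

Terms with high scores are kept as features before training the classifier; terms that appear equally in both classes score near zero and are dropped, shrinking the input dimensionality.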
Title: SAHAR-LSTM: An enhanced Model for Sentiment Analysis of Hotels’ Arabic Reviews based on LSTM
Pub Date: 2020-11-24 | DOI: 10.1109/CloudTech49835.2020.9365922
N. Luwes, Sarel J.B. Lubbe
Photovoltaic generation is the process of converting solar radiation into electrical energy. At this stage in the development of photovoltaic technology, the efficiency of such systems is low, and any further loss of efficiency should be prevented to keep a system optimal. We propose an IoT (Internet of Things) device that helps prevent power loss on large solar farms by monitoring each array separately, giving feedback on efficiency, and enabling early detection of problematic arrays. A case study calculation on a 50 MW solar farm shows the possible financial impact of the system, and the construction and operation of the system are described. The literature review presents the equations used to assess the quality and accuracy of the instrument, discusses the sensors and hardware used, and introduces a real-world 50 MW photovoltaic plant case study. The methodology explains the construction and evaluation of the instrument, as well as how to calculate the cost impact of deploying it at a photovoltaic generation station. The results explain the relevance of the calculations, and conclusions discuss the outcome and overall relevance, demonstrating the cost saving achievable at a typical 50 MW photovoltaic generation station. The instrument’s low production cost means it could be incorporated into large- or small-scale photovoltaic generation systems.
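The per-array monitoring idea lends itself to a simple early-detection rule: with every array logged separately, an array whose output falls well below its peers' median is flagged for inspection. The sketch below illustrates that rule; the 90% threshold and the readings are assumptions, not values from the paper.

```python
from statistics import median


def flag_underperformers(array_power, tolerance=0.9):
    """Flag PV arrays whose measured power falls below `tolerance` times
    the median power of all arrays on the farm. Assumed threshold; the
    paper's instrument supplies the per-array readings this rule needs."""
    med = median(array_power.values())
    return sorted(a for a, p in array_power.items() if p < tolerance * med)


readings = {"A1": 48.2, "A2": 47.9, "A3": 31.0, "A4": 48.5}  # kW, hypothetical
print(flag_underperformers(readings))  # ['A3'] needs inspection
```

Comparing against the farm's own median rather than a fixed rating makes the check robust to conditions that affect every array equally, such as cloud cover, so a flag points at a genuine local fault.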
Title: An IoT data logging instrument for monitoring and early efficiency loss detection at a photovoltaic generation plant
Pub Date: 2020-11-24 | DOI: 10.1109/CloudTech49835.2020.9365902
R. E. Sibai, J. B. Abdo, C. A. Jaoude, J. Demerjian, Yousra Chabchoub, Raja Chiky
Water monitoring is one of the critical battles of sustainability for a better future of humanity. Forty-four countries are considered at high risk of water crisis, 28 of which are developing countries with limited capacity to deploy a national-scale solution. In response to the United Nations’ sustainability goals and initiatives, this paper proposes an intelligent water monitoring service that acts as a foundational infrastructure for future water management systems. It also provides municipalities, non-governmental organizations, and other private initiatives with the tools needed to establish local water monitoring at the scale of villages or rural areas with a very small initial investment.
Title: A cloud-based foundational infrastructure for water management ecosystem
Pub Date: 2020-11-24 | DOI: 10.1109/CloudTech49835.2020.9365920
Olivier Debauche, S. Mahmoudi, Yahya Moussaoui
The Internet of Things (IoT) is increasingly present in our daily lives and affects all areas of activity. More and more devices capable of interacting with each other are being designed and brought to market, and learning about IoT technologies is becoming unavoidable in education. In this article, we propose a demonstrator for learning, through use cases, the essential concepts of IoT applied to smart homes. From basic use cases implemented in a model building, the general public can more easily understand the operating principles of these new applications, which opens the door to imagining new ones.
Title: Internet of Things Learning: a Practical Case for Smart Building automation
Pub Date: 2020-11-24 | DOI: 10.1109/CloudTech49835.2020.9365872
Assia Brighen, Hachem Slimani, A. Rezgui, H. Kheddouci
Vertex graph coloring (VGC) is a well-known problem in graph theory with many applications in domains such as telecommunications, bioinformatics, and the Internet; it is one of Karp’s 21 NP-complete problems. Several large-graph processing frameworks, including Pregel, GraphX, and Giraph, have emerged as effective options for tackling VGC. Giraph, in particular, is among the most popular large-graph processing frameworks in both industry and academia. In this paper, we present a novel graph coloring algorithm designed to exploit the simple parallelization model provided by Giraph, or by any other vertex-centric framework. We compared our algorithm with existing Giraph graph coloring algorithms with regard to solution quality (number of colors used) and CPU runtime, using several large graph datasets. The results show that the proposed algorithm is substantially more efficient than existing Giraph algorithms.
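The vertex-centric model the paper targets can be illustrated with a toy greedy coloring: in each superstep, every vertex reads its neighbors' current colors and takes the smallest color not used by its higher-priority (smaller-id) neighbors, repeating until no vertex changes. This sketch simulates the supersteps sequentially; it is a generic vertex-centric greedy scheme, not the authors' algorithm.

```python
def vertex_centric_coloring(adj):
    """Greedy coloring in the vertex-centric (Pregel/Giraph-like) style.
    Each sweep of the loop plays the role of one superstep: every vertex
    picks the smallest color not taken by its smaller-id neighbors, and
    iteration stops when a sweep changes nothing. Toy simulation only."""
    color = {v: 0 for v in adj}
    changed = True
    while changed:                       # one sweep ~ one superstep
        changed = False
        for v in sorted(adj):
            taken = {color[u] for u in adj[v] if u < v}  # defer to smaller ids
            c = 0
            while c in taken:
                c += 1
            if color[v] != c:
                color[v] = c
                changed = True
    return color


# triangle 1-2-3 with a pendant vertex 4 attached to 3
adj = {1: [2, 3], 2: [1, 3], 3: [1, 2, 4], 4: [3]}
print(vertex_centric_coloring(adj))  # {1: 0, 2: 1, 3: 2, 4: 0}
```

Breaking ties by vertex id guarantees convergence to a proper coloring, since at the fixed point every vertex differs from all its smaller-id neighbors, and each larger neighbor in turn differs from it.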
Title: A distributed large graph coloring algorithm on Giraph
Pub Date: 2020-11-24 | DOI: 10.1109/CloudTech49835.2020.9365893
K. E. Moutaouakil, Abdellatif el Ouissari, A. Touhafi, N. Aharrane
Support Vector Machines (SVMs) are classification models based on the duality optimization approach: the non-zero Lagrange multipliers correspond to the data points selected as support vectors, which define the margin decision. Unfortunately, SVMs have two major drawbacks: noisy and redundant data cause overfitting, and the number of local minima increases with the size of the data, which worsens for Big Data. To overcome these shortcomings, we propose a new version of SVM, called Density Based Support Vector Machine (DBSVM), which proceeds in three steps. First, we set two parameters: the radius of the neighborhood and its size. Second, we determine three types of points: noisy, cord, and interior. Third, we solve the dual problem based on the cord data only. To justify this choice, we demonstrate that the cord points cannot be support vectors. Moreover, we show that kernel functions do not change the nature of cord points. DBSVM is benchmarked on several datasets and compared with a variety of methods from the literature. The tests show that the proposed algorithm provides very competitive results in terms of runtime, classification performance, and capacity to handle very large datasets. Finally, to show the consistency of DBSVM, several tests were performed for different values of the radius and the neighborhood size.
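The first two steps, categorizing points by neighborhood density before training, can be sketched as follows: count each point's neighbors within the radius, and label points with too few neighbors as noisy, moderately dense points as cord, and dense points as interior. The thresholds and the exact cord/interior rule here are assumptions for illustration; the paper defines its own criteria.

```python
from math import dist


def categorize_points(points, radius=1.0, min_pts=2):
    """Label each point 'noisy', 'cord' or 'interior' from the number of
    neighbors within `radius`, in the spirit of DBSVM's first two steps.
    Thresholds (min_pts, 2*min_pts) are assumed, not the paper's rule."""
    cats = []
    for i, p in enumerate(points):
        n = sum(1 for j, q in enumerate(points)
                if i != j and dist(p, q) <= radius)
        if n < min_pts:
            cats.append("noisy")        # isolated point, likely noise
        elif n < 2 * min_pts:
            cats.append("cord")         # sparse boundary region
        else:
            cats.append("interior")     # dense core of a class
    return cats


points = [(0, 0), (0.5, 0), (0, 0.5), (0.5, 0.5), (0.25, 0.25),
          (1.2, 0),   # fringe of the cluster
          (5, 5)]     # far-away outlier
print(categorize_points(points))
```

Only the subset kept for training then enters the dual problem, which shrinks the optimization and removes the noisy points that drive overfitting.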
Title: An Improved Density Based Support Vector Machine (DBSVM)