SHAMan. Sophie Robert, S. Zertal, G. Goret. doi: 10.1145/3419604.3419775
Like most modern computer systems, High Performance Computing (HPC) machines integrate many highly configurable hardware devices and software components. Finding their optimal parametrization is a complex task, as the size of the parametric space and the non-linear behavior of HPC systems make hand tuning, theoretical modeling or exhaustive sampling unsuitable in most cases. Auto-tuning methods relying on black-box optimization have emerged as a promising solution for finding a system's best parametrization without making any assumption about its behavior. In this paper, we present the architecture of an auto-tuning framework, called Smart HPC Application MANager (SHAMan), that integrates black-box optimization heuristics to find the optimal parametrization of an Input/Output (I/O) accelerator for an HPC application. We describe the conceptual and technical architecture of the framework and its native support for the HPC cluster ecosystem. We detail in depth the stand-alone optimization engine and its integration as a service provided by a Web application. We deployed and tested the framework by tuning an I/O accelerator developed by Atos on an HPC cluster running in production. The tuner's performance is evaluated by optimizing 90 different I/O-oriented applications. We show a median speed-up improvement of 29% over the default parametrization, and this improvement rises to 98% for a certain class of applications.
Self-tuning PID of the Solenoid Response Based on Fiber Squeezer. Abdallah Zahidi, S. Amrane, N. Azami, Naoual Nasser. doi: 10.1145/3419604.3419769
Solenoids, electromagnetic actuators driven by nonlinear magnetic forces, are widely used in many applications. Polarization controllers using fiber squeezers are attractive for their low loss as well as their low penalty in coherent optical fiber trunk systems. However, for polarization controllers using solenoids as actuators, the stability problem due to the saturation of their magnetic circuit must be studied. In their conventional configuration, limited open-loop stability affects performance and restricts applications. Moreover, fluctuations in the performance of solenoids are another major problem, especially in industrial applications. These fluctuations essentially stem from changes in the spring constant, the coefficient of friction, and the inductance and resistance of the coil. Preventive maintenance by controlling these parameters is necessary to avoid the effects of parameter variations on the responses of these actuators. This paper proposes a new methodology for smart control of the solenoid response in polarization controllers, where solenoids act as mechanical actuators exerting pressure on an optical fiber. The pressure induces optical birefringence that modifies the polarization of the light. First, a circuit with PID correctors is proposed to improve stability performance. Then, a Matlab-Simulink simulation examines the influence of the solenoid parameters on the corrector constants. The simulation results show that, if the system parameters change, the constants Kp, Ki and Kd of the PID corrector must be adjusted to maintain an optimized dynamic response.
Language representation learning models: A comparative study. Sanae Achsas, E. Nfaoui. doi: 10.1145/3419604.3419773
Recently, Natural Language Processing has shown significant development, especially in text mining and analysis. An important task in this area is learning vector-space representations of text, since various machine learning algorithms require their inputs in vector format. In this paper, we highlight the most important language representation learning models used in the literature, ranging from context-free approaches such as word2vec and GloVe to recent contextualized approaches such as ELMo, BERT, and XLNet. We present and discuss their main architectures as well as their main strengths and limitations.
Comparison of MCDM Methods for Multi-echelon Inventory System Selection Problem. Nouçaiba Sbai, L. Benabbou, A. Berrado. doi: 10.1145/3419604.3419783
Nowadays, Multi Criteria Decision Making (MCDM) methods are increasingly used to solve supply chain management problems because of the conflicting criteria involved in this area. MCDM methods help Decision Makers (DMs) evaluate multiple alternatives while taking into account their preferences expressed as numerous criteria. Controlling stocks in multi-echelon inventory systems presents many challenges due to the complexity of supply chains and the inter-dependencies between their nodes. In particular, choosing a multi-echelon inventory policy to send a batch from one installation to another may depend on the inventory status at all sites, which makes this decision difficult to take. Most of the time, the Decision Maker tends to fit the decision problem to the MCDM method's framework rather than adjusting the method to the problem situation. In this paper, we aim to guide DMs in choosing the appropriate MCDM method that will help them select the best multi-echelon inventory policies for their supply chains. This work compares different MCDM methods using a set of evaluation criteria, and we intend to provide a framework for comparing multiple MCDM methods for the multi-echelon inventory system selection problem.
Identifying Software Cost Attributes of Software Project Management in Global Software Development: An Integrative Framework. Manal El Bajta, A. Idri. doi: 10.1145/3419604.3419780
The management of global and distributed software projects is a very difficult task, further complicated by the emergence of new challenges inherent in stakeholder dispersion. Software cost estimation plays a central role in facing these challenges in the context of Global Software Development (GSD). The objective of this study is to identify software cost attributes related to the GSD context and to present an integrative framework encompassing these attributes. Thirty cost attributes were identified using a Systematic Literature Review (SLR) and later compiled into a framework inspired by the Software Engineering Institute (SEI) taxonomy.
The hybrid recommendation of digital educational resources in a distance learning environment: the case of MOOC. Hamid Slimani, Oussama Hamal, N. E. Faddouli, S. Bennani, Naila Amrous. doi: 10.1145/3419604.3419621
Accompanying and following up learners in online training aims to help each learner carry out his or her training and to guarantee adapted, high-quality learning. During the learning process, personalized search and recommendation of digital educational resources are aspects of this accompaniment. This article presents a search engine and a hybrid recommender for digital educational resources. On the one hand, this engine provides filtering and personalized search by supplying resources adapted to users' profiles; on the other hand, it combines collaborative, content-based and semantic filtering to propose additional resources. The semantic filtering is based on SPARQL queries issued by the proposed system and executed on a remote server containing reusable vocabularies formalized according to linked-data principles and technologies, such as the LOD Cloud. The result is a set of terms linked to the keywords specified in the search query; these terms are then used to extend the search. We used a test set of keyword searches entered via a form and then manually analyzed the linked terms obtained and the documents returned. The results obtained by our approach are satisfactory.
Construction of an accurate automatic lexicon for Arabic sentiment analysis. Ibtissam Touahri, A. Mazroui. doi: 10.1145/3419604.3419627
Sentiment analysis has aroused the interest of many studies in recent years. Given its high importance in extracting decisional information, research continues to focus on it. The first step of a sentiment analysis system is the construction of the basic knowledge, namely the linguistic resources. The classical methods of lexicon building are manual, semi-automatic, or automatic. Both the manual and semi-automatic methods need a manual check that consumes time and effort, whereas the automatic approach neglects word semantics. Herein, we intend to automate the lexicon extraction method as well as to assign accurate polarities. To perform this task and achieve satisfying results, we extract a Bag-of-Words and then apply several filters to keep only clean, sentiment-bearing terms. This paper also explores the effectiveness of a supervised approach based on the Bag-of-Words model in defining the sentiment polarity of the processed reviews, in order to shed light on its usefulness.
A Review of Open Source Software Maintenance Effort Estimation. Chaymae Miloudi, Laila Cheikhi, A. Idri. doi: 10.1145/3419604.3419809
Open Source Software (OSS) is gaining the interest of the software engineering community as well as industry practitioners with the growth of the Internet. Studies estimating the maintenance effort (MEE) of such software products have been published in the literature in order to provide better estimates. The aim of this study is to provide a review of studies related to maintenance effort estimation for open source software (OSSMEE). To this end, a set of 60 primary empirical studies is selected from six electronic databases and discussed according to eight research questions (RQs) related to: publication year, publication source, datasets (OSS projects), metrics (independent variables), techniques, maintenance effort (dependent variable), validation methods, and accuracy criteria used in the empirical validation. This study found that popular OSS projects have been used as datasets, that Linear Regression, Naïve Bayes and k-Nearest Neighbors were frequently used techniques, and that bug resolution was the most commonly estimated form of maintenance effort for future releases. A set of gaps is identified and recommendations for researchers are also provided.
Recommender E-Learning platform using sentiment analysis aggregation. Jamal Mawane, A. Naji, M. Ramdani. doi: 10.1145/3419604.3419784
The ubiquity and fast growth of online resources have made it increasingly difficult to respect the differences between learners in terms of cognitive ability and knowledge structure. This is even clearer with recommendation algorithms based on traditional collaborative filtering, as they struggle to identify helpful, user-friendly and easy learning resources. On top of that, incoherent recommended content and the complex, nonlinear data on online learners cannot be handled effectively, making the recommendations less efficient. To increase the efficiency of learning resource recommendations, this paper introduces a two-step resource recommendation model. This model is based on an unsupervised deep learning machine that identifies learning styles and user clusters, and a sentiment-analyzer bonus system, based on user experience, that promotes or demotes items in the recommendation list. The model also involves teachers, encouraging them to enhance the quality and the success rate of the selected items. The elaboration of such a model requires a considerable quantity of data: learners' features, course content and assessment attributes. Furthermore, the model needs to incorporate learner interaction features. These are the requirements for building the learner feature vector used as input to the first step and the learner-content rating vector used to choose the most suitable learning resource to recommend.
K-means, HAC and FCM: Which Clustering Approach for Arabic Text? Lahbib Ajallouda, F. Z. Fagroud, A. Zellou, E. Benlahmar. doi: 10.1145/3419604.3419779
Today, we are witnessing rapid growth in Web resources that allow Internet users to express and share their ideas, opinions, and judgments on a variety of issues. Several classification approaches have been proposed to classify textual data, but they all require labeling the classes we want to obtain, which in reality is not available because we do not know in advance what information these opinions may convey. To overcome this constraint, clustering approaches such as K-means, HAC or FCM can be exploited. In this paper, we present and compare these approaches and show the importance of clustering algorithms for classifying and analyzing Arabic textual data, by applying them to a real case that has created a great debate in Morocco: teachers contracting with academies.