Pub Date : 2021-06-01 DOI: 10.1109/icdata52997.2021.00003
{"title":"[Copyright notice]","authors":"","doi":"10.1109/icdata52997.2021.00003","DOIUrl":"https://doi.org/10.1109/icdata52997.2021.00003","url":null,"abstract":"","PeriodicalId":231714,"journal":{"name":"2021 International Conference on Digital Age & Technological Advances for Sustainable Development (ICDATA)","volume":"67 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124329452","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2021-06-01 DOI: 10.1109/ICDATA52997.2021.00033
Eba Victoire Kié, M. Babri, C. Lishou, S. Ouya
Several research studies have shown that the socio-constructivist model allows learners to develop skills. This model advocates collaborative work between peers and pedagogical activities through practical work. With the advent of the COVID-19 pandemic and its constraints, universities have turned to online courses. Despite the use of online collaboration tools, we find that some real-time practical work based on screen sharing is difficult to carry out because of the considerable latency between transmission and reception. To overcome this latency problem, we propose in this paper an algorithm that optimizes routing and resource allocation in software-defined optical data center networks, using Virtual Extensible LANs (VXLANs). This allows socio-constructivist models to be implemented while minimizing latency. Our solution was tested at the Cheikh Anta Diop University of Dakar and significantly reduced latency, making online courses and practical work in the context of COVID-19 fluid and synchronous.
{"title":"Improvement of online real-time practical work platforms by optimizing routing and resource allocation in software-defined elastic optical data center networks: The case of the Cheikh Anta Diop University of Dakar","authors":"Eba Victoire Kié, M. Babri, C. Lishou, S. Ouya","doi":"10.1109/ICDATA52997.2021.00033","DOIUrl":"https://doi.org/10.1109/ICDATA52997.2021.00033","url":null,"PeriodicalId":231714,"journal":{"name":"2021 International Conference on Digital Age & Technological Advances for Sustainable Development (ICDATA)","volume":"48 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132309400","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2021-06-01 DOI: 10.1109/ICDATA52997.2021.00023
Naoufal El Allali, Mourad Fariss, H. Asaidi, Mohamed Bellouki
The large number of web services, together with their diversity of types and their wide diffusion, poses a challenge for researchers. It can make it difficult to identify the relevant service during the discovery or composition process. To tackle this problem, we propose a new method to categorize semantic web services based on the Naive Bayes algorithm with a term-weighting method (TF-IDF), which weights the terms of the description offered by the service provider according to their importance so that each service is assigned to a relevant class. The method improves performance by using a compatible combination of natural language processing (NLP) preprocessing techniques to achieve a better classification result. It was tested on the OWLS-TC dataset, categorized into seven classes, and reached an accuracy of 93%.
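The combination described above (TF-IDF weighting feeding a multinomial Naive Bayes classifier) can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the authors' implementation: the tokenizer is a crude stand-in for their NLP preprocessing, the smoothed IDF formula and the Laplace parameter `alpha` are common choices picked here, and the four service descriptions and the two class labels are made up for the example rather than taken from OWLS-TC.

```python
import math
from collections import Counter, defaultdict

def tokenize(text):
    """Lowercase and split on non-letters (stand-in for real NLP preprocessing)."""
    return "".join(c.lower() if c.isalpha() else " " for c in text).split()

def fit_idf(docs):
    """Smoothed inverse document frequency for every term seen in training."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {term: math.log((1 + n) / (1 + count)) + 1.0 for term, count in df.items()}

def tfidf(doc, idf):
    """TF-IDF weights of one tokenized document; unseen terms are dropped."""
    tf = Counter(t for t in doc if t in idf)
    return {t: c * idf[t] for t, c in tf.items()}

def train_nb(docs, labels, idf, alpha=1.0):
    """Multinomial Naive Bayes over TF-IDF weights with Laplace smoothing."""
    weight = defaultdict(lambda: defaultdict(float))
    class_count = Counter(labels)
    for doc, label in zip(docs, labels):
        for term, w in tfidf(doc, idf).items():
            weight[label][term] += w
    vocab = list(idf)
    model = {}
    for label, count in class_count.items():
        total = sum(weight[label].values()) + alpha * len(vocab)
        log_prior = math.log(count / len(labels))
        log_like = {t: math.log((weight[label].get(t, 0.0) + alpha) / total) for t in vocab}
        model[label] = (log_prior, log_like)
    return model

def predict(text, idf, model):
    """Return the class with the highest posterior log-score."""
    feats = tfidf(tokenize(text), idf)
    scores = {label: lp + sum(w * ll[t] for t, w in feats.items())
              for label, (lp, ll) in model.items()}
    return max(scores, key=scores.get)

# Hypothetical toy descriptions and labels, for illustration only.
train_texts = [
    "book a hotel room in a city",
    "find flight and hotel price",
    "hospital diagnosis for a patient",
    "medical treatment and drug information",
]
train_labels = ["travel", "travel", "medical", "medical"]

train_docs = [tokenize(t) for t in train_texts]
idf = fit_idf(train_docs)
model = train_nb(train_docs, train_labels, idf)
```

With this toy data, `predict("hotel price lookup", idf, model)` falls in the hypothetical `travel` class because the weighted terms `hotel` and `price` only occur in that class's training descriptions.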
{"title":"Multinomial Naive Bayes Categorization for Semantic Web Services","authors":"Naoufal El Allali, Mourad Fariss, H. Asaidi, Mohamed Bellouki","doi":"10.1109/ICDATA52997.2021.00023","DOIUrl":"https://doi.org/10.1109/ICDATA52997.2021.00023","url":null,"PeriodicalId":231714,"journal":{"name":"2021 International Conference on Digital Age & Technological Advances for Sustainable Development (ICDATA)","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122329026","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2021-06-01 DOI: 10.1109/ICDATA52997.2021.00013
Jbari Olaya, Chakkor Otman
Beta-divergences have been widely used in the machine learning literature. In this paper, we go into detail about what they are, where they come from, how they relate to the Bregman divergence, and why they are so useful in many machine learning algorithms. In particular, we present Nonnegative Matrix Factorization (NMF) as an example, using a Majorization-Minimization approach.
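For readers without access to the paper, the standard definitions behind this abstract can be written out as follows. The abstract itself gives no formulas, so the notation is an assumption made here: $V$ is the nonnegative data matrix, $W$ and $H$ are its nonnegative factors, $\odot$ and the fraction bar denote element-wise product and division, and powers of matrices are element-wise.

```latex
% Scalar beta-divergence between x > 0 and y > 0
d_\beta(x \mid y) =
\begin{cases}
\dfrac{x^{\beta} + (\beta - 1)\,y^{\beta} - \beta\,x\,y^{\beta - 1}}{\beta(\beta - 1)},
  & \beta \in \mathbb{R} \setminus \{0, 1\}, \\[2ex]
x \log\dfrac{x}{y} - x + y,
  & \beta = 1 \quad \text{(Kullback-Leibler)}, \\[2ex]
\dfrac{x}{y} - \log\dfrac{x}{y} - 1,
  & \beta = 0 \quad \text{(Itakura-Saito)}.
\end{cases}

% NMF objective
\min_{W \ge 0,\; H \ge 0} \; D_\beta(V \mid WH)
  = \sum_{i,j} d_\beta\bigl(V_{ij} \mid [WH]_{ij}\bigr)

% Multiplicative updates
H \leftarrow H \odot
  \left(
    \frac{W^{\top}\bigl[(WH)^{\beta - 2} \odot V\bigr]}
         {W^{\top}(WH)^{\beta - 1}}
  \right)^{\gamma(\beta)},
\qquad
W \leftarrow W \odot
  \left(
    \frac{\bigl[(WH)^{\beta - 2} \odot V\bigr] H^{\top}}
         {(WH)^{\beta - 1} H^{\top}}
  \right)^{\gamma(\beta)}

% Exponent obtained from the majorization-minimization derivation
\gamma(\beta) =
\begin{cases}
\dfrac{1}{2 - \beta}, & \beta < 1, \\[1.5ex]
1, & 1 \le \beta \le 2, \\[1.5ex]
\dfrac{1}{\beta - 1}, & \beta > 2.
\end{cases}
```

Setting $\beta = 2$ in the first case gives $\tfrac{1}{2}(x - y)^2$, the squared Euclidean distance, so the family interpolates between the three costs most often used for NMF. Whether the paper uses exactly this parameterization is not stated in the abstract.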
{"title":"Beta-divergence for Nonnegative Matrix Factorization","authors":"Jbari Olaya, Chakkor Otman","doi":"10.1109/ICDATA52997.2021.00013","DOIUrl":"https://doi.org/10.1109/ICDATA52997.2021.00013","url":null,"PeriodicalId":231714,"journal":{"name":"2021 International Conference on Digital Age & Technological Advances for Sustainable Development (ICDATA)","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126608701","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2021-06-01 DOI: 10.1109/ICDATA52997.2021.00049
Loubna Moumeni, Ilham Slimani, Ilhame El Farissi, M. Saber, M. Belkasmi
Many studies in big data focus on the uses of data available to researchers, leaving untreated the data that sits on servers but of which researchers are unaware [1]. This unmanaged, unrated, and unknown data is prevalent in most companies. This overlooked information takes up valuable storage capacity, may contain hidden risks, and usually lies abandoned on the periphery of the information governance agenda. In this article, we present and discuss the value of dark data and focus on the risks it causes. To this end, we provide statistics from IDC (International Data Corporation) and other companies. We also explain how dark data, a new term in the world of data, can be exploited to avoid these risks.
{"title":"Dark data as a new challenge to improve business performances: review and perspectives","authors":"Loubna Moumeni, Ilham Slimani, Ilhame El Farissi, M. Saber, M. Belkasmi","doi":"10.1109/ICDATA52997.2021.00049","DOIUrl":"https://doi.org/10.1109/ICDATA52997.2021.00049","url":null,"PeriodicalId":231714,"journal":{"name":"2021 International Conference on Digital Age & Technological Advances for Sustainable Development (ICDATA)","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133674770","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2021-06-01 DOI: 10.1109/ICDATA52997.2021.00048
Duygu Çınar Umdu, E. Alakavuk, Aleyna Koyuncu
The meeting of Sustainable Development and Smart Growth thinking with standardization and technology has enabled eco-labels, assessment models, and guidelines to emerge and become a trend. With Agenda 21, this trend was widely adopted, and eco-labels, assessment models, and certifications began to be created for many disciplines. The concept of the sustainable neighbourhood, strengthened by the development of the Eco-Urbanism concept, constitutes an important basis for sustainable urban design, and the Neighbourhood Sustainability Assessment (NSA) concept grows from this basis. BREEAM stands out as the first example developed among sustainability models for architecture and urban design. This study reviews the criteria and indicators, purpose, and status of the BREEAM Communities model, developed for smart and sustainable urban design, which is of growing importance; the scoring system was examined through BREEAM data and the literature. In addition, the differences and similarities between 5 Neighbourhood Sustainability Assessments (among them LEED ND, CASBEE UD, Citylab, and BEAM Plus Neighbourhood) by 4 organizations from the 70 members of the World Green Building Council, including BRE, which develops the BREEAM system, were investigated. As a result of all these studies, the strengths and weaknesses of BREEAM Communities were determined.
{"title":"BREEAM Communities: Criteria Aim, Status, Strengths and Weaknesses","authors":"Duygu Çınar Umdu, E. Alakavuk, Aleyna Koyuncu","doi":"10.1109/ICDATA52997.2021.00048","DOIUrl":"https://doi.org/10.1109/ICDATA52997.2021.00048","url":null,"PeriodicalId":231714,"journal":{"name":"2021 International Conference on Digital Age & Technological Advances for Sustainable Development (ICDATA)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117290784","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2021-06-01 DOI: 10.1109/ICDATA52997.2021.00041
J. Lamterkati, L. Ouboubker, M. Khafallah
In conventional Direct Power Control (DPC), the active and reactive powers and the line currents have high ripples caused by variations in the switching frequency. To remedy this, Space Vector Modulation (SVM) is used in this paper. The paper first develops a new SVM-DPC structure based on Fuzzy Logic Controllers (FLCs) to overcome the limitations of SVM-DPC based on Proportional-Integral (PI) controllers, such as sensitivity to external disturbances and parameter uncertainties. To rectify the disadvantages of the traditional SVM-DPC approach, active power and reactive power fuzzy logic controllers were designed to replace the classical active and reactive power PI controllers. Each fuzzy logic controller has two inputs (the active/reactive power error and its rate of change) and one output (the active/reactive component of the reference voltage vector). Numerical simulation results are given for the conventional PI controllers and the fuzzy logic controllers to demonstrate the performance of the proposed strategy. The simulation verifies that fuzzy SVM-DPC effectively improves control performance in terms of settling time, overshoot, steady state, line current harmonics, and total harmonic distortion.
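A controller with the two inputs and one output described above can be illustrated with a very small fuzzy inference step. The sketch below is an assumption-heavy illustration, not the controller from the paper: the abstract does not give the membership functions, the rule base, or the defuzzification method, so three triangular sets (N, Z, P) on inputs normalized to [-1, 1], a 3x3 rule table, and a weighted average of output singletons are all chosen here for brevity.

```python
def tri(x, a, b, c):
    """Triangular membership function with feet at a and c and peak at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def fuzzify(x):
    """Membership degrees of a normalized input in Negative, Zero, Positive."""
    x = max(-1.0, min(1.0, x))  # saturate outside the normalized range
    return {
        "N": tri(x, -2.0, -1.0, 0.0),
        "Z": tri(x, -1.0, 0.0, 1.0),
        "P": tri(x, 0.0, 1.0, 2.0),
    }

# Assumed rule base: (error label, change-of-error label) -> output label.
RULES = {
    ("N", "N"): "N", ("N", "Z"): "N", ("N", "P"): "Z",
    ("Z", "N"): "N", ("Z", "Z"): "Z", ("Z", "P"): "P",
    ("P", "N"): "Z", ("P", "Z"): "P", ("P", "P"): "P",
}

# Assumed output singletons in normalized units.
OUTPUT = {"N": -1.0, "Z": 0.0, "P": 1.0}

def fuzzy_step(error, d_error):
    """One inference step: min for AND, weighted average for defuzzification."""
    mu_e, mu_de = fuzzify(error), fuzzify(d_error)
    num = den = 0.0
    for label_e, m_e in mu_e.items():
        for label_de, m_de in mu_de.items():
            strength = min(m_e, m_de)
            num += strength * OUTPUT[RULES[(label_e, label_de)]]
            den += strength
    return num / den if den else 0.0
```

In an SVM-DPC loop, one such block would be fed the active power error and another the reactive power error, and the normalized outputs would be scaled by gains (not modeled here) to form the two components of the reference voltage vector.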
{"title":"Direct Power Fuzzy Control of Three-Phase AC-DC Converter using a Space Vector Modulation","authors":"J. Lamterkati, L. Ouboubker, M. Khafallah","doi":"10.1109/ICDATA52997.2021.00041","DOIUrl":"https://doi.org/10.1109/ICDATA52997.2021.00041","url":null,"PeriodicalId":231714,"journal":{"name":"2021 International Conference on Digital Age & Technological Advances for Sustainable Development (ICDATA)","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124238336","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2021-06-01 DOI: 10.1109/ICDATA52997.2021.00037
Said Agharass, M. Laaboubi, A. Saddik, R. Latif
Hardware/software co-design (HSCD) is an essential part of the configuration flow of current electronic system-level (ESL) devices [1]. This paper gives an overview of the hardware/software co-design approach, which studies the algorithmic and temporal constraints in order to design a robust system that takes maximum advantage of the target architecture. In addition, we present a comparative study of Sobel filters for the preprocessing stage of image processing algorithms. This study was based on an Intel-Altera CPU-FPGA board. We implemented the algorithm in two different ways, the first based on the VHDL hardware description language and the second on the OpenCL high-level language. The results showed that the VHDL implementation consumes fewer resources than the OpenCL-based implementation, but coding in this HDL is complex. OpenCL, for its part, is very strong at speeding up algorithms that process massive data.
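As a reminder of what the compared implementations compute, here is a plain software reference model of the Sobel filter. It is a sketch for illustration only and is unrelated to the authors' VHDL or OpenCL code: the `|gx| + |gy|` magnitude approximation (often preferred in hardware because it avoids a square root) and the choice to leave border pixels at zero are assumptions made here.

```python
# Standard 3x3 Sobel kernels for horizontal and vertical gradients.
KX = [[-1, 0, 1],
      [-2, 0, 2],
      [-1, 0, 1]]
KY = [[-1, -2, -1],
      [0, 0, 0],
      [1, 2, 1]]

def sobel(image):
    """Return the approximate gradient magnitude |gx| + |gy| of a 2D list.

    Border pixels are left at 0 because the 3x3 window does not fit there.
    """
    height, width = len(image), len(image[0])
    out = [[0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = gy = 0
            for j in range(3):
                for i in range(3):
                    pixel = image[y + j - 1][x + i - 1]
                    gx += KX[j][i] * pixel
                    gy += KY[j][i] * pixel
            out[y][x] = abs(gx) + abs(gy)
    return out

# A 4x4 test image with a vertical edge between columns 1 and 2.
edge_image = [[0, 0, 10, 10] for _ in range(4)]
edges = sobel(edge_image)
```

Each output pixel depends only on its own 3x3 neighbourhood, which is why the filter maps naturally onto both a pipelined HDL design and a data-parallel OpenCL kernel.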
{"title":"Hardware Software Co-design based CPU-FPGA Architecture: Overview and Evaluation","authors":"Said Agharass, M. Laaboubi, A. Saddik, R. Latif","doi":"10.1109/ICDATA52997.2021.00037","DOIUrl":"https://doi.org/10.1109/ICDATA52997.2021.00037","url":null,"PeriodicalId":231714,"journal":{"name":"2021 International Conference on Digital Age & Technological Advances for Sustainable Development (ICDATA)","volume":"69 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125621086","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2021-06-01 DOI: 10.1109/ICDATA52997.2021.00011
Mohamed Sraitih, Y. Jabrane, Abdelghafour Atlas
The electrocardiogram (ECG) is one of the most widely used tools for diagnosing heart problems, but manual diagnosis of heartbeat classes by cardiologists can be time-consuming. Automated computer aid remains important in such cases to facilitate some tasks and help specialists rapidly identify and classify arrhythmias. In this paper, we investigate the intra- and inter-patient paradigms used for ECG arrhythmia classification, with and without the Association for the Advancement of Medical Instrumentation (AAMI) standards, and present several papers that worked on each type. We discuss a variety of limitations and show that there is still room for improvement in classification performance, especially for the non-AAMI inter-patient paradigm. Improving it would further boost confidence in the applicability of such solutions to clinical diagnosis, since the inter-patient paradigm conforms more closely to clinical practice, in which a variety of ECG signals are collected from different subjects. Such an organized literature survey gives researchers an unobstructed view of all aspects of ECG classification, helping to identify gaps and research issues that remain unmet.
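The difference between the two evaluation paradigms comes down to how heartbeats are divided between training and test sets, which a short sketch makes concrete. The record layout (a dict with `patient` and `label` keys), the function names, and the synthetic beats below are assumptions made for illustration; they do not come from the surveyed papers or from any real ECG database.

```python
import random

def intra_patient_split(beats, test_ratio=0.2, seed=0):
    """Shuffle all beats together, so one patient can appear in both sets."""
    rng = random.Random(seed)
    shuffled = beats[:]
    rng.shuffle(shuffled)
    cut = len(shuffled) - round(len(shuffled) * test_ratio)
    return shuffled[:cut], shuffled[cut:]

def inter_patient_split(beats, test_patients):
    """Hold out whole patients, so no subject is shared between the sets."""
    held_out = set(test_patients)
    train = [b for b in beats if b["patient"] not in held_out]
    test = [b for b in beats if b["patient"] in held_out]
    return train, test

def patients(beats):
    """Set of patient identifiers present in a list of beats."""
    return {b["patient"] for b in beats}

# Synthetic beats: 4 hypothetical patients with 5 labelled beats each.
beats = [
    {"patient": p, "label": "V" if i % 4 == 0 else "N"}
    for p in ("p1", "p2", "p3", "p4")
    for i in range(5)
]

intra_train, intra_test = intra_patient_split(beats)
inter_train, inter_test = inter_patient_split(beats, ["p4"])
```

Because beats from the same person tend to resemble each other, an intra-patient split can let a classifier score well by recognizing the subject, whereas the inter-patient split tests generalization to unseen subjects, which matches the clinical setting the abstract describes.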
{"title":"An overview on intra- and inter-patient paradigm for ECG Heartbeat Arrhythmia Classification","authors":"Mohamed Sraitih, Y. Jabrane, Abdelghafour Atlas","doi":"10.1109/ICDATA52997.2021.00011","DOIUrl":"https://doi.org/10.1109/ICDATA52997.2021.00011","url":null,"PeriodicalId":231714,"journal":{"name":"2021 International Conference on Digital Age & Technological Advances for Sustainable Development (ICDATA)","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127771116","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2021-06-01 DOI: 10.1109/ICDATA52997.2021.00022
Zineb Nassr, N. Sael, F. Benabbou
Natural Language Processing (NLP) is a branch of artificial intelligence (AI) that helps computers understand, interpret, and manipulate human language. NLP draws on many disciplines, including computer science and computational linguistics, in its pursuit of bridging the gap between human communication and computer understanding. In the information age, using NLP to optimize information search, text summarization, and text and data analysis systems has become increasingly important. To achieve accuracy, redundant words with little or no semantic meaning must be filtered out. These words are known as stop words. Stop word lists have been developed for languages such as Arabic, English, Chinese, and French, but a standard stop word list is still missing for dialects such as the Moroccan dialect. Manual identification of stop words for the Moroccan dialect is a difficult task, especially given the many ways in which a simple stop word can be written. In this work, we propose a novel method for generating Moroccan dialect stop words. To achieve this objective, we first apply preprocessing steps to reduce noise, then create a stop word dictionary to enrich our database for training purposes, and finally use word embeddings to build stop word clusters. The list is generated from data from three popular social networks: Facebook, Twitter, and YouTube.
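The last step of the pipeline, grouping the many spellings of a stop word by their embedding similarity, can be sketched as follows. This is a loose illustration under several assumptions and not the authors' method: the abstract does not name the embedding model or the clustering rule, so a simple cosine-similarity threshold around seed words from the dictionary is used here, the word vectors are assumed to have been trained elsewhere, and the tokens and two-dimensional vectors below are placeholders rather than real Moroccan dialect data.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def build_clusters(seeds, vectors, threshold=0.9):
    """Attach to each seed stop word the words whose vectors lie close to it."""
    clusters = {}
    for seed in seeds:
        if seed not in vectors:
            continue  # seed never seen in the corpus
        clusters[seed] = sorted(
            word for word, vec in vectors.items()
            if word != seed and cosine(vectors[seed], vec) >= threshold
        )
    return clusters

# Placeholder tokens and toy 2-D vectors (hypothetical, for illustration only).
vectors = {
    "seed_a": [1.0, 0.0],
    "variant_a1": [0.95, 0.1],
    "variant_a2": [0.9, 0.2],
    "content_x": [0.0, 1.0],
}

clusters = build_clusters(["seed_a", "seed_missing"], vectors)
```

The idea is that spelling variants of one stop word occur in the same contexts and so end up with nearby vectors, while content words such as the placeholder `content_x` stay outside the cluster. The threshold is a tuning parameter that would need to be set on real data.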
{"title":"Generate a list of Stop Words in Moroccan Dialect from Social Network Data Using Word Embedding","authors":"Zineb Nassr, N. Sael, F. Benabbou","doi":"10.1109/ICDATA52997.2021.00022","DOIUrl":"https://doi.org/10.1109/ICDATA52997.2021.00022","url":null,"PeriodicalId":231714,"journal":{"name":"2021 International Conference on Digital Age & Technological Advances for Sustainable Development (ICDATA)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128499521","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}