Pub Date: 2018-11-01 DOI: 10.1109/ACIT.2018.8672669
Deena Al-Ashwal, Eman Zaid Al-Sewari, A. A. Al-Shargabi
During program testing, developers face two types of errors: syntax errors and logical errors. Logical errors are generally more difficult to detect. To find their cause, the developer must trace the source code manually to locate the instructions that may cause the problem. Consequently, testing consumes a lot of time, effort, and cost. The cost becomes problematic for large-scale systems, and it doubles during evolution, confirmation testing, and regression testing. This paper introduces a prototype of a CASE tool for detecting logical errors in Java programs using static and dynamic testing techniques. The research uses the JUnit and PMD tools to detect logical errors and to analyze their potential causes based on lists of common Java logical errors. The prototype is tested on several Java programs under different conditions.
Title: A CASE Tool for JAVA Programs Logical Errors Detection: Static and Dynamic Testing. Venue: 2018 International Arab Conference on Information Technology (ACIT).
Pub Date: 2018-11-01 DOI: 10.1109/ACIT.2018.8672713
Zeinab Abbas, C. Fakih, Ali Saad, M. Ayache
Intra-Cytoplasmic Sperm Injection (ICSI) represents the best chance of having a baby for couples with an infertility problem. ICSI treatment is expensive, and a number of factors affect its success. This work aims to classify and predict ICSI treatment results using (1) a classical statistical method (logistic regression) and (2) artificial intelligence (neural networks). For this purpose, data were extracted from real patients. The data contain parameters such as age, endometrial receptivity, the endometrial and myometrial vascularity index, the number of embryo transfers, the day of transfer, and the quality of the embryos transferred. These parameters may affect the result of ICSI treatment. Overall, logistic regression predicts the ICSI outcome with an accuracy of 75%. The neural network, in turn, achieved an accuracy of 79.5% with all parameters and 75% with only the significant parameters.
Title: Vaginal Power Doppler Parameters as New Predictors of Intra-Cytoplasmic Sperm Injection Outcome. Venue: 2018 International Arab Conference on Information Technology (ACIT).
Pub Date: 2018-11-01 DOI: 10.1109/ACIT.2018.8672695
Nadia Minkara, Nafez Haddad, Walid Kamali
Myocardial infarction (MI), whether due to ischemic or other causes, leads to heart attack and death. An electrocardiogram (ECG) and simple blood testing can diagnose the causes of heart failure. The ECG records the electrical activity of the heart as waveform complexes called PQRST, which represent the sequence of polarization, depolarization, and repolarization events that take place in the heart at each heartbeat. The ST segment of the PQRST complex is a phase of ventricular repolarization and is of great importance in diagnosing cardiac failure. In certain heart conditions and/or failure, the ST segment may rise above or fall below a reference line. Besides ECG recording, a blood test measures troponin, a protein released by the heart muscle and reported in ng/L, whose level in the bloodstream becomes elevated in response to MI, angina pectoris, and/or coronary ischemia or other related medical conditions. Together, ECG recording and troponin testing would likely confirm or exclude the occurrence of a possible MI. Troponin testing, with a sensitivity of 59.46% and a specificity of 85.1%, indicated myocardial infarction in some cases, while other findings indicated related cardiac troubles.
Title: Study of Myocardial Infarction Versus ECG ST Segment and Cardiac Marker Enzyme, High Sensitive Troponin Testing. Venue: 2018 International Arab Conference on Information Technology (ACIT).
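The sensitivity and specificity quoted for troponin testing are ratios over a confusion matrix. A minimal sketch of that arithmetic follows; the counts are hypothetical and chosen only for illustration, since the abstract does not report the underlying numbers.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # share of true MI cases the test flags
    specificity = tn / (tn + fp)  # share of non-MI cases the test clears
    return sensitivity, specificity


# Hypothetical counts, NOT taken from the paper.
sens, spec = sensitivity_specificity(tp=30, fn=20, tn=40, fp=10)
print(f"sensitivity={sens:.2%}, specificity={spec:.2%}")
```

A test with moderate sensitivity and higher specificity, as reported here, is more reliable at ruling MI in when positive than at ruling it out when negative.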
Pub Date: 2018-11-01 DOI: 10.1109/ACIT.2018.8672671
Mohamad Mourad, Ahmad Diab, M. Khalil, C. Marque
Over the last ten years, the rate of mortality related to preterm labor has increased. Many studies have addressed the causes of gestation and of preterm labor. Current studies deal with detecting uterine contraction signals, which can then be analyzed to determine their features and their role in labor. In this study, several nonlinear methods are used to extract features from the signal in order to differentiate between pregnancy and labor signals and to monitor pregnancy for each week before labor (WBL): Lempel-Ziv complexity (LZC), fractal dimension (FD), and Hjorth parameters. All data were recorded in Lebanon and France, from 12 WBL until 1 WBL and during labor, using a 4×4 matrix of electrodes. The methods were first tested on synthetic signals to assess their sensitivity to changes in nonlinearity, and were then applied to real signals. The results show that nonlinear signal-processing methods can differentiate between contraction signals: the variation and sensitivity of the pregnancy and labor signals with respect to these methods can help detect normal and preterm labor.
Title: Pregnancy/Labor Discrimination and Monitoring: An Investigation Using Nonlinear Methods. Venue: 2018 International Arab Conference on Information Technology (ACIT).
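Two of the features named in the abstract, the Hjorth parameters and Lempel-Ziv complexity, can be sketched in a few lines. This is an illustrative sketch using the textbook definitions, not the authors' implementation; the median-threshold binarization and the helper names are assumptions.

```python
import math


def variance(xs):
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / len(xs)


def diff(xs):
    """First difference, a discrete stand-in for the time derivative."""
    return [b - a for a, b in zip(xs, xs[1:])]


def hjorth(xs):
    """Return (activity, mobility, complexity); assumes a non-constant signal."""
    d1, d2 = diff(xs), diff(diff(xs))
    activity = variance(xs)
    mobility = math.sqrt(variance(d1) / activity)
    complexity = math.sqrt(variance(d2) / variance(d1)) / mobility
    return activity, mobility, complexity


def binarize(xs):
    """Assumed preprocessing: 1 above the median sample, 0 otherwise."""
    threshold = sorted(xs)[len(xs) // 2]
    return "".join("1" if x > threshold else "0" for x in xs)


def lempel_ziv_complexity(bits):
    """Count phrases in the Lempel-Ziv (1976) parsing of a 0/1 string."""
    i, count, n = 0, 0, len(bits)
    while i < n:
        length = 1
        # Grow the phrase while it can still be copied from earlier text.
        while i + length <= n and bits[i:i + length] in bits[:i + length - 1]:
            length += 1
        count += 1
        i += length
    return count


signal = [0, 1, 0, -1, 0, 1, 0, -1]  # toy signal, not real uterine EMG data
print(hjorth(signal))
print(lempel_ziv_complexity("0001101001000101"))  # → 6
```

The last call uses a standard textbook string whose parsing is 0 | 001 | 10 | 100 | 1000 | 101, giving six phrases.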
Pub Date: 2018-11-01 DOI: 10.1109/ACIT.2018.8672717
Leila Ouahrani, Djamel Bennouar
The quality of teaching and learning can be enhanced by designing, implementing, and making effective use of assessment practices. In this paper we address the task of computer-assisted assessment of short student answers. We describe a new statistical approach used to design a short answer grading system adapted to the Arabic language. The approach builds a semantic space that gives a distributional representation of words based on word co-occurrences in text corpora. Semantic similarity is computed using the summation vector model. The similarity score is enhanced by weighting with individual normalized term frequencies and then combining it with the index of common words between the model answer and the student answer, computed with the syntactic Dice coefficient. A great advantage of this statistical approach is that it does not require any existing word data models. It is particularly suitable when no large, publicly available linguistic resources can be found for the target language. Evaluated on two datasets, the proposed approach yielded an 81.49% correlation and a root mean squared error of 0.97 with respect to human grading scores. The proposed approach comes close to some works in the literature and outperforms others. This shows that such an approach can be as effective as approaches using sophisticated similarity calculations that make a system difficult to build and to use in practice.
Title: A Vector Space Based Approach for Short Answer Grading System. Venue: 2018 International Arab Conference on Information Technology (ACIT).
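The two similarity ingredients mentioned in the abstract, a summation vector model and the Dice coefficient over common words, can be sketched as follows. The word vectors, the whitespace tokenization, and the English toy answers are assumptions made for illustration; the paper targets Arabic and builds its vectors from corpus co-occurrences, and its exact weighting and combination scheme is not reproduced here.

```python
import math


def dice_coefficient(model_answer, student_answer):
    """Dice overlap of the word sets: 2|A ∩ B| / (|A| + |B|)."""
    a = set(model_answer.lower().split())
    b = set(student_answer.lower().split())
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


def summation_vector(words, word_vectors, dim):
    """Represent a text as the sum of its word vectors (unknown words add 0)."""
    total = [0.0] * dim
    for word in words:
        for i, value in enumerate(word_vectors.get(word, [0.0] * dim)):
            total[i] += value
    return total


def cosine(u, v):
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


# Hypothetical 2-dimensional word vectors, for illustration only.
vectors = {"cat": [1.0, 0.0], "sat": [0.0, 1.0], "ran": [0.2, 0.9]}
model = summation_vector("cat sat".split(), vectors, 2)
student = summation_vector("cat ran".split(), vectors, 2)
print(cosine(model, student), dice_coefficient("the cat sat", "the cat ran"))
```

The cosine term rewards answers that use semantically close words ("sat" versus "ran" in the toy vectors), while the Dice term rewards literal word overlap.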
Pub Date: 2018-11-01 DOI: 10.1109/ACIT.2018.8672673
R. Tlemsani, Khadidja Belbachir
This work presents a survey, implementation, and test of a neural network, the TDNN (Time Delay Neural Network), applied to on-line handwritten character recognition. We present the design of a recognizer for on-line Arabic handwriting. On-line recognition of Arabic script is a complex problem, since the script is naturally both cursive and unconstrained. The system interprets a script represented by the pen trajectory, a technique used notably in electronic tablets. We construct a database with several writers. Before the recognition phase, there is a sample-construction phase in which Arabic characters are acquired and digitized from an electronic tablet (NOUN DATABASE). The scores obtained show the effectiveness of the proposed approach based on convolutional neural networks.
Title: An Improved Arabic On-Line Characters Recognition System. Venue: 2018 International Arab Conference on Information Technology (ACIT).
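A time-delay layer can be read as a one-dimensional convolution sliding along the pen trajectory: each output looks at a short window of consecutive frames. The sketch below shows a single such unit in plain Python. The feature layout (two values per frame), the weights, and the tanh activation are assumptions for illustration; the abstract does not describe the network's actual architecture.

```python
import math


def tdnn_layer(frames, weights, bias):
    """Apply one time-delay unit over a sequence of feature frames.

    frames  -- list of T feature vectors (e.g. pen offsets per time step)
    weights -- list of D weight vectors, one per delay tap
    Returns T - D + 1 activations, one per window position.
    """
    delays = len(weights)
    outputs = []
    for t in range(len(frames) - delays + 1):
        total = bias
        for d in range(delays):
            total += sum(w * x for w, x in zip(weights[d], frames[t + d]))
        outputs.append(math.tanh(total))
    return outputs


# Hypothetical pen-trajectory frames (dx, dy) and weights, illustration only.
frames = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0], [0.0, 0.0], [1.0, 1.0]]
weights = [[0.5, -0.5], [0.25, 0.25], [-0.5, 0.5]]
print(tdnn_layer(frames, weights, bias=0.0))
```

Because the same weights are reused at every window position, the unit responds to a stroke pattern regardless of where it occurs in time, which is why the approach suits unconstrained cursive input.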
Pub Date: 2018-11-01 DOI: 10.1109/ACIT.2018.8672725
Samar Sindian, A. Samhat, M. Crussiére, J. Hélard, Ayman Khalil
The shared-superframe, multihop nature of wireless personal area networks (WPANs) poses fundamental challenges to the design of effective and optimal resource allocation algorithms with respect to resource utilization and fairness across different network devices. In this paper, we propose a distributed optimization framework for a resource allocation scheme in IEEE 802.15.5 hop-1 that fairly shares network resources among contending stations. A suite of problem formulations for the hop-1 IEEE 802.15.5 devices is proposed. Simulation results show that the different formulations yield different, but high, satisfaction and fairness indexes. Consequently, a trade-off between satisfaction and fairness should be made when choosing the optimal formulation.
Title: Optimization Framework for Resource Allocation in IEEE 802.15.5 Hop-1. Venue: 2018 International Arab Conference on Information Technology (ACIT).
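The abstract does not say which satisfaction and fairness indexes are used. As an assumption, the sketch below uses Jain's fairness index, a common choice for resource allocation, and defines satisfaction as the granted share of each device's request; the slot counts are hypothetical.

```python
def jain_fairness_index(allocations):
    """Jain's index: (sum x)^2 / (n * sum x^2); 1.0 means perfectly equal."""
    n = len(allocations)
    total = sum(allocations)
    squares = sum(x * x for x in allocations)
    return (total * total) / (n * squares) if squares else 0.0


def satisfaction(requested, granted):
    """Assumed definition: mean fraction of each request that was granted."""
    ratios = [min(g / r, 1.0) for r, g in zip(requested, granted) if r > 0]
    return sum(ratios) / len(ratios) if ratios else 1.0


# Hypothetical time-slot requests and grants for four hop-1 devices.
requested = [4, 4, 2, 2]
granted = [2, 2, 2, 2]
print(jain_fairness_index(granted), satisfaction(requested, granted))
```

In this toy case the grants are perfectly fair (index 1.0) but only 75% of demand is satisfied on average, which illustrates the kind of trade-off between the two criteria that the abstract refers to.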
Pub Date: 2018-11-01 DOI: 10.1109/ACIT.2018.8672677
Maram Assi, R. Haraty
The Knapsack Problem (KP) is one of the most studied combinatorial problems. There are many variations of the problem, along with many real-life applications. KP seeks to select a subset of the available items whose total weight is maximal without exceeding a given limit L. Knapsack problems have been used to tackle real-life problems in a variety of fields, including cryptography and applied mathematics. In this paper, we consider the different instances of the knapsack problem, along with its applications and various approaches to solving it.
Title: A Survey of the Knapsack Problem. Venue: 2018 International Arab Conference on Information Technology (ACIT).
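One classical way to solve the 0/1 variant is dynamic programming over capacities. The sketch below is a generic textbook solution, not a method taken from the survey; it assumes integer weights and lets each item carry a separate value, so the weight-only formulation in the abstract is the special case where values equal weights.

```python
def knapsack(weights, values, limit):
    """Best total value of a subset whose total weight stays within limit."""
    best = [0] * (limit + 1)  # best[c] = best value achievable with capacity c
    for weight, value in zip(weights, values):
        # Go through capacities downward so each item is used at most once.
        for capacity in range(limit, weight - 1, -1):
            best[capacity] = max(best[capacity], best[capacity - weight] + value)
    return best[limit]


# Illustrative instances, not taken from the paper.
print(knapsack([1, 3, 4, 5], [1, 4, 5, 7], limit=7))  # → 9
print(knapsack([3, 5, 6], [3, 5, 6], limit=10))       # → 9 (weight-only case)
```

The running time is proportional to the number of items times L, which is pseudo-polynomial: practical for moderate limits but not for very large ones.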
Pub Date: 2018-11-01 DOI: 10.1109/ACIT.2018.8672582
Kamal Alieyan, Mohammed Anbar, Ammar Almomani, R. Abdullah, Mohammad Alauthman
A botnet is considered a serious threat to cyber security. It is a means used by cybercriminals to carry out illegal activities, such as click fraud and DDoS attacks. This paper proposes a new filtering approach called “The Gunner System”. The approach uses rule-based Domain Name System (DNS) features to detect botnets. Through this approach, the researchers expect to enhance the accuracy of DNS-based botnet detection.
Title: Botnets Detecting Attack Based on DNS Features. Venue: 2018 International Arab Conference on Information Technology (ACIT).
Pub Date: 2018-11-01 DOI: 10.1109/ACIT.2018.8672697
Mariam Khader, A. Awajan, Ghazi Al-Naymat
Social networks are one of the main sources of big data. They continuously produce huge volumes of varied types of data at high velocity. This huge volume of data contains valuable information whose extraction requires efficient and scalable analysis techniques. Hadoop/MapReduce is considered the most suitable framework for handling big data because of its scalability, reliability, and simplicity. One of the basic applications for extracting valuable information from data is sentiment analysis, which studies people's opinions by classifying their written text into positive or negative polarity. In this work, a sentiment analysis method for a Twitter data set is analyzed. The method uses the Naive Bayes algorithm to classify text into positive and negative polarity. Several linguistic and NLP preprocessing techniques were applied to the data set in order to study their effects on the quality of big data classification. The applied preprocessing techniques improved the classification accuracy of the Naive Bayes algorithm. The experiments show that NLP and linguistic processing enhance the performance of sentiment analysis by 5%, yielding an accuracy of 73% on the data set used.
Title: The Effects of Natural Language Processing on Big Data Analysis: Sentiment Analysis Case Study. Venue: 2018 International Arab Conference on Information Technology (ACIT).
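The Naive Bayes polarity classifier at the core of the method can be sketched on a single machine as follows. This is a minimal multinomial Naive Bayes with add-one smoothing and lowercase whitespace tokenization; the training sentences are invented for illustration, and the Hadoop/MapReduce distribution and the paper's specific preprocessing steps are not reproduced.

```python
import math
from collections import Counter, defaultdict


def train(samples):
    """Count class frequencies and per-class word frequencies."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in samples:
        class_counts[label] += 1
        for word in text.lower().split():
            word_counts[label][word] += 1
            vocabulary.add(word)
    return class_counts, word_counts, vocabulary


def classify(text, model):
    """Return the label with the highest log-posterior score."""
    class_counts, word_counts, vocabulary = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, doc_count in class_counts.items():
        score = math.log(doc_count / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word in vocabulary:  # skip words never seen in training
                count = word_counts[label][word] + 1  # add-one smoothing
                score += math.log(count / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Invented toy tweets, NOT from the paper's data set.
model = train([
    ("good great happy", "positive"),
    ("love this great", "positive"),
    ("bad awful sad", "negative"),
    ("hate this bad", "negative"),
])
print(classify("great happy", model), classify("awful bad", model))
```

Preprocessing such as lowercasing or stop-word removal changes which tokens reach the counters, which is the mechanism through which it can shift classification accuracy.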