Abstract Machine learning based sentiment analysis is an interdisciplinary approach to opinion mining, particularly in media and communication research. Despite their different backgrounds, researchers have collaborated to train, test, and retest machine learning approaches that collect and analyse large datasets and draw meaningful insights from them. This research classifies microblog texts (tweets) into positive and negative responses about a particular phenomenon. The study also demonstrates how to compile a corpus for sentiment review, clean the body of text into meaningful content, detect people’s emotions about it, and interpret the findings. To date, public sentiment after the abrogation of Article 370 has not been studied, which adds novelty to this scientific study. The dataset collected from Twitter comprises 66.7 % positive and 34.3 % negative tweets about the abrogation of Article 370. Experimental testing reveals that the proposed methodology is considerably more effective than the previously proposed one. The study compares unsupervised lexicon-based models (TextBlob, AFINN, VADER Sentiment) and supervised machine learning models (KNN, SVM, Random Forest and Naïve Bayes) for sentiment analysis. This is the first study of cyber public opinion on the abrogation of Article 370. The authors collected more than 2 lakh (200 000) tweets from Twitter; after cleaning, 29 732 tweets were selected for analysis. Among the supervised models, Random Forest performs best with 99 % accuracy, whereas among the unsupervised models TextBlob achieves the highest accuracy at 88 %. The performance of the proposed supervised machine learning models also surpasses the results of a recent 2023 study on sentiment analysis.
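As a minimal illustration of the lexicon-based scoring idea behind tools such as AFINN: each word carries a hand-assigned polarity weight, and a tweet's label is the sign of the summed weights. The tiny lexicon below is a made-up stand-in, not the real AFINN word list.

```python
# Sketch of AFINN-style lexicon scoring. The word weights here are
# illustrative stand-ins, not values from the actual AFINN lexicon.
AFINN_LIKE = {"good": 3, "great": 4, "support": 2,
              "bad": -3, "protest": -2, "angry": -3}

def score(tweet: str) -> str:
    # Sum the polarity weights of known words; unknown words score 0.
    total = sum(AFINN_LIKE.get(w, 0) for w in tweet.lower().split())
    return "positive" if total > 0 else "negative" if total < 0 else "neutral"

print(score("great move, full support"))   # positive
print(score("angry protest in the city"))  # negative
```

Real lexicon tools such as AFINN, TextBlob, and VADER additionally handle negation, punctuation, emoticons, and intensity modifiers; this sketch only shows the core weighted-sum idea.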
{"title":"Empirical Analysis of Supervised and Unsupervised Machine Learning Algorithms with Aspect-Based Sentiment Analysis","authors":"Satwinder Singh, Harpreet Kaur, Rubal Kanozia, Gurpreet Kaur","doi":"10.2478/acss-2023-0012","DOIUrl":"https://doi.org/10.2478/acss-2023-0012","url":null,"abstract":"Abstract Machine learning based sentiment analysis is an interdisciplinary approach in opinion mining, particularly in the field of media and communication research. In spite of their different backgrounds, researchers have collaborated to test, train and again retest the machine learning approach to collect, analyse and withdraw a meaningful insight from large datasets. This research classifies the texts of micro-blog (tweets) into positive and negative responses about a particular phenomenon. The study also demonstrates the process of compilation of corpus for review of sentiments, cleaning the body of text to make it a meaningful text, find people’s emotions about it, and interpret the findings. Till date the public sentiment after abrogation of Article 370 has not been studied, which adds the novelty to this scientific study. This study includes the dataset collection from Twitter that comprises 66.7 % of positive tweets and 34.3 % of negative tweets of the people about the abrogation of Article 370. Experimental testing reveals that the proposed methodology is much more effective than the previously proposed methodology. This study focuses on comparison of unsupervised lexicon-based models (TextBlob, AFINN, Vader Sentiment) and supervised machine learning models (KNN, SVM, Random Forest and Naïve Bayes) for sentiment analysis. This is the first study with cyber public opinion over the abrogation of Article 370. Twitter data of more than 2 lakh tweets were collected by the authors. After cleaning, 29732 tweets were selected for analysis. 
As per the results among supervised learning, Random Forest performs the best, whereas among unsupervised learning TextBlob achieves the highest accuracy of 99 % and 88 %, respectively. Performance parameters of the proposed supervised machine learning models also surpass the result of the recent study performed in 2023 for sentiment analysis.","PeriodicalId":41960,"journal":{"name":"Applied Computer Systems","volume":"139 1","pages":"125 - 136"},"PeriodicalIF":1.0,"publicationDate":"2023-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77990829","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Abstract The coronavirus spreads very quickly and has therefore had very destructive effects in many areas worldwide. Because X-ray imaging is easily accessible, fast, and inexpensive, it is widely used to diagnose COVID-19. This study detects COVID-19 from X-ray images using pre-trained VGG16, VGG19, InceptionV3, and ResNet50 CNN architectures as well as modified versions of these architectures, in which the fully connected layers of the pre-trained networks have been reorganized. These architectures were trained on binary and three-class datasets to reveal their classification performance. The dataset was collected from four different sources and consists of 594 COVID-19, 1345 viral pneumonia, and 1341 normal X-ray images. Models were built with the TensorFlow and Keras libraries in Python. The dataset was preprocessed by resizing, normalization, and one-hot encoding. Model performance was evaluated with 5-fold cross-validation using metrics such as recall, specificity, accuracy, precision, F1-score, the confusion matrix, and ROC analysis. The highest classification performance was obtained by the modified VGG19 model with 99.84 % accuracy for binary classification (COVID-19 vs. Normal) and by the modified VGG16 model with 98.26 % accuracy for three-class classification (COVID-19 vs. Pneumonia vs. Normal). These models achieve a higher accuracy rate than other studies in the literature. In addition, the number of COVID-19 X-ray images in the dataset used in this study is approximately two times higher than in other studies; because the images come from different sources, the data are irregular and non-standardized. Despite this, it is noteworthy that higher classification performance was achieved than in previous studies. The modified VGG16 and VGG19 models (available at github.com/akaraci/LargeDatasetCovid19) can serve as an auxiliary tool for healthcare organizations facing a shortage of specialists to detect COVID-19.
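The preprocessing the abstract lists (intensity normalization and one-hot label encoding) can be sketched as follows; the array shapes, pixel range, and three-class labels are illustrative assumptions, not the paper's exact pipeline:

```python
import numpy as np

def preprocess(images: np.ndarray, labels: np.ndarray, n_classes: int):
    # Normalisation: scale 8-bit pixel intensities into [0, 1].
    x = images.astype("float32") / 255.0
    # One-hot encoding: label k becomes row k of the identity matrix.
    y = np.eye(n_classes, dtype="float32")[labels]
    return x, y

# Four toy 224x224 RGB "X-ray" images with labels from three classes
# (e.g. 0 = COVID-19, 1 = pneumonia, 2 = normal -- an assumed mapping).
imgs = np.random.randint(0, 256, size=(4, 224, 224, 3))
x, y = preprocess(imgs, np.array([0, 2, 1, 0]), 3)
print(x.max() <= 1.0, y.shape)  # True (4, 3)
```

Resizing, the third step mentioned, would normally be done with an image library before this stage.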
{"title":"Predicting COVID-19 Cases on a Large Chest X-Ray Dataset Using Modified Pre-trained CNN Architectures","authors":"Abdulkadir Karac","doi":"10.2478/acss-2023-0005","DOIUrl":"https://doi.org/10.2478/acss-2023-0005","url":null,"abstract":"Abstract The Coronavirus is a virus that spreads very quickly. Therefore, it has had very destructive effects in many areas worldwide. Because X-ray images are an easily accessible, fast, and inexpensive method, they are widely used worldwide to diagnose COVID-19. This study tried detecting COVID-19 from X-ray images using pre-trained VGG16, VGG19, InceptionV3, and Resnet50 CNN architectures and modified versions of these architectures. The fully connected layers of the pre-trained architectures have been reorganized in the modified CNN architectures. These architectures were trained on binary and three-class datasets, revealing their classification performance. The data set was collected from four different sources and consisted of 594 COVID-19, 1345 viral pneumonia, and 1341 normal X-ray images. Models are built using Tensorflow and Keras Libraries with Python programming language. Preprocessing was performed on the dataset by applying resizing, normalization, and one hot encoding operation. Model performances were evaluated according to many performance metrics such as recall, specificity, accuracy, precision, F1-score, confusion matrix, ROC analysis, etc., using 5-fold cross-validation. The highest classification performance was obtained in the modified VGG19 model with 99.84 % accuracy for binary classification (COVID-19 vs. Normal) and in the modified VGG16 model with 98.26 % accuracy for triple classification (COVID-19 vs. Pneumonia vs. Normal). These models have a higher accuracy rate than other studies in the literature. In addition, the number of COVID-19 X-ray images in the dataset used in this study is approximately two times higher than in other studies. 
Since it is obtained from different sources, it is irregular and does not have a standard. Despite this, it is noteworthy that higher classification performance was achieved than in previous studies. Modified VGG16 and VGG19 models (available at github.com/akaraci/LargeDatasetCovid19) can be used as an auxiliary tool in slight healthcare organizations’ shortage of specialists to detect COVID-19.","PeriodicalId":41960,"journal":{"name":"Applied Computer Systems","volume":"10 1","pages":"44 - 57"},"PeriodicalIF":1.0,"publicationDate":"2023-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81994837","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Abstract Underwater simultaneous localization and mapping (SLAM) poses significant challenges for modern visual SLAM systems. The integration of deep learning networks within computer vision offers promising potential for addressing these difficulties. Our research draws inspiration from deep learning approaches applied to interest point detection and matching, single-image depth prediction, and underwater image enhancement. In response, we propose 3D-Net, a deep learning-assisted network designed to tackle these three tasks simultaneously. The network consists of three branches, each serving a distinct purpose: interest point detection, descriptor generation, and depth prediction. The interest point detector and descriptor generator can effectively serve as a front end for a classical SLAM system. The predicted depth information is akin to a virtual depth camera, opening up possibilities for various applications, and we provide quantitative and qualitative evaluations to illustrate some of these potential uses. The network was trained in several steps, first on in-air datasets and then on generated underwater datasets. Further, the network is integrated into the feature-based SLAM systems ORB-SLAM2 and ORB-SLAM3, providing a comprehensive assessment of its effectiveness for underwater navigation.
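The classical front-end step that a learned detector/descriptor pair feeds can be sketched as nearest-neighbour descriptor matching with Lowe's ratio test; the random descriptors below are stand-ins for network outputs, not the paper's data:

```python
import numpy as np

def match(desc_a: np.ndarray, desc_b: np.ndarray, ratio: float = 0.8):
    # For each descriptor in frame A, find its two nearest neighbours in
    # frame B and accept the match only if the best is clearly closer
    # than the second best (Lowe's ratio test).
    matches = []
    for i, d in enumerate(desc_a):
        dists = np.linalg.norm(desc_b - d, axis=1)
        best, second = np.argsort(dists)[:2]
        if dists[best] < ratio * dists[second]:
            matches.append((i, int(best)))
    return matches

rng = np.random.default_rng(0)
a = rng.normal(size=(5, 32))            # 5 toy 32-D descriptors, frame A
b = a + 0.01 * rng.normal(size=a.shape)  # near-copies of them, frame B
print(match(a, b))  # [(0, 0), (1, 1), (2, 2), (3, 3), (4, 4)]
```

In a real pipeline, such matches would then be passed to the pose-estimation back end of a system like ORB-SLAM2.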
{"title":"UW Deep SLAM-CNN Assisted Underwater SLAM","authors":"Chinthaka Amarasinghe, A. Ratnaweera, Sanjeeva Maitripala","doi":"10.2478/acss-2023-0010","DOIUrl":"https://doi.org/10.2478/acss-2023-0010","url":null,"abstract":"Abstract Underwater simultaneous localization and mapping (SLAM) poses significant challenges for modern visual SLAM systems. The integration of deep learning networks within computer vision offers promising potential for addressing these difficulties. Our research draws inspiration from deep learning approaches applied to interest point detection and matching, single image depth prediction and underwater image enhancement. In response, we propose 3D-Net, a deep learning-assisted network designed to tackle these three tasks simultaneously. The network consists of three branches, each serving a distinct purpose: interest point detection, descriptor generation, and depth prediction. The interest point detector and descriptor generator can effectively serve as a front end for a classical SLAM system. The predicted depth information is akin to a virtual depth camera, opening up possibilities for various applications. We provide quantitative and qualitative evaluations to illustrate some of these potential uses. The network was trained in in several steps, using in-air datasets and followed by generated underwater datasets. 
Further, the network is integrated into feature-based SALM systems ORBSLAM2 and ORBSSLAM3, providing a comprehensive assessment of its effectiveness for underwater navigation.","PeriodicalId":41960,"journal":{"name":"Applied Computer Systems","volume":"44 1","pages":"100 - 113"},"PeriodicalIF":1.0,"publicationDate":"2023-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79011478","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Abstract The advent of medical imaging has significantly assisted disease diagnosis and treatment. This study introduces a framework for detecting several human body parts in Computerised Tomography (CT) images stored as DICOM files. In addition, the method can highlight the bone areas inside CT images and transform 2D slices into a visual 3D model that illustrates the structure of human body parts. First, we leveraged shallow convolutional neural networks to classify body parts and detect bone areas in each part. Then, Grad-CAM was applied to highlight the bone areas. Finally, the Insight and Visualization libraries were utilized to visualize the slices of a body part in 3D. As a result, the classifiers achieved an F1-score of 98 % in classifying human body parts on a CT image dataset comprising 1234 slices from a female subject for training and 1245 slices from a male subject for testing. In addition, distinguishing between bone and non-bone images reaches an F1-score of 97 % on a dataset generated by setting a threshold value to reveal bone areas in CT images. Moreover, the Grad-CAM-based approach provides clear, accurate visualizations with segmented bones in the image. We also successfully converted the 2D slice images of a body part into a 3D model that provides a more intuitive view from any angle. The proposed approach is expected to provide a useful visual tool for supporting doctors in medical image-based disease diagnosis.
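The thresholding step used to reveal bone areas can be illustrated as follows; in CT, bone typically has high Hounsfield-unit (HU) values, so a simple intensity threshold yields a bone mask. The HU threshold of 300 and the toy 2x2 "slice" are assumptions, not the paper's values:

```python
import numpy as np

def bone_mask(slice_hu: np.ndarray, threshold: float = 300.0) -> np.ndarray:
    # Boolean mask: True where the voxel intensity is bone-like.
    return slice_hu >= threshold

# A toy 2x2 CT slice in Hounsfield units.
slice_hu = np.array([[-1000.0,  40.0],    # air, soft tissue
                     [  350.0, 900.0]])   # trabecular, cortical bone
mask = bone_mask(slice_hu)
print(int(mask.sum()))  # 2 voxels classified as bone
```

Masks like this, computed per slice, are one way to produce the bone / non-bone labels that the classifier in the abstract is trained on.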
{"title":"Recognition and 3D Visualization of Human Body Parts and Bone Areas Using CT Images","authors":"H. T. Nguyen, My N. Nguyen, Bang Anh Nguyen, Linh Chi Nguyen, Linh Duong Phung","doi":"10.2478/acss-2023-0007","DOIUrl":"https://doi.org/10.2478/acss-2023-0007","url":null,"abstract":"Abstract The advent of medical imaging significantly assisted in disease diagnosis and treatment. This study introduces to a framework for detecting several human body parts in Computerised Tomography (CT) images formatted in DICOM files. In addition, the method can highlight the bone areas inside CT images and transform 2D slices into a visual 3D model to illustrate the structure of human body parts. Firstly, we leveraged shallow convolutional Neural Networks to classify body parts and detect bone areas in each part. Then, Grad-CAM was applied to highlight the bone areas. Finally, Insight and Visualization libraries were utilized to visualize slides in 3D of a body part. As a result, the classifiers achieved 98 % in F1-score in the classification of human body parts on a CT image dataset, including 1234 slides capturing body parts from a woman for the training phase and 1245 images from a male for testing. In addition, distinguishing between bone and non-bone images can reach 97 % in F1-score on the dataset generated by setting a threshold value to reveal bone areas in CT images. Moreover, the Grad-CAM-based approach can provide clear, accurate visualizations with segmented bones in the image. Also, we successfully converted 2D slice images of a body part into a lively 3D model that provided a more intuitive view from any angle. 
The proposed approach is expected to provide an interesting visual tool for supporting doctors in medical image-based disease diagnosis.","PeriodicalId":41960,"journal":{"name":"Applied Computer Systems","volume":"305 1","pages":"66 - 77"},"PeriodicalIF":1.0,"publicationDate":"2023-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83444446","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Abstract Cloud computing remains an active and dominant player in the field of information technology. To meet the rapidly growing demand for computational processes and storage resources, cloud providers deploy efficient data centres globally that comprise thousands of IT servers. Because of their tremendous energy and resource utilization, a reliable cloud platform necessarily has to be optimized, and effective load balancing is a good way to achieve this. However, load balancing difficulties, such as increased computational complexity, the risk of losing client data during task rescheduling, and heavy memory consumption on the host and the new VM (Virtual Machine), need appropriate optimization. Hence, the study aims to create the newly developed IG-WA (Inquisitive Genetic–Wolf Optimization) framework, which meritoriously detects the optimal virtual machine in an environment. For this purpose, the system utilises the GWO (Grey Wolf Optimization) method with an evolutionary mechanism to achieve a proper compromise between exploitation and exploration, thereby accelerating convergence and improving accuracy. Furthermore, a fitness function evaluated with an inquisitive genetic algorithm adds to the overall efficacy. Performance evaluation shows that the proposed IG-WA system outperforms alternatives in terms of energy consumption, execution time and cost, makespan, CPU utilization, and memory utilization. Further, the system attains more comprehensive and better results than state-of-the-art methods.
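The GWO component can be illustrated with a bare-bones implementation of the standard position-update rule: each wolf moves toward the three best solutions (alpha, beta, delta) while the control parameter a decays from 2 to 0, shifting the swarm from exploration to exploitation. The toy 1-D cost function below stands in for the paper's VM-selection fitness, which additionally involves the inquisitive genetic algorithm:

```python
import random

def gwo(cost, n_wolves=10, iters=50, lo=-10.0, hi=10.0, seed=1):
    rng = random.Random(seed)
    wolves = [rng.uniform(lo, hi) for _ in range(n_wolves)]
    for t in range(iters):
        a = 2 - 2 * t / iters                    # decays linearly 2 -> 0
        alpha, beta, delta = sorted(wolves, key=cost)[:3]
        for i, x in enumerate(wolves):
            pos = []
            for leader in (alpha, beta, delta):
                A = a * (2 * rng.random() - 1)   # |A| shrinks over time
                C = 2 * rng.random()
                pos.append(leader - A * abs(C * leader - x))
            wolves[i] = sum(pos) / 3             # average pull of leaders
    return min(wolves, key=cost)

best = gwo(lambda x: (x - 3) ** 2)  # toy fitness with its minimum at x = 3
print(round(best, 2))
```

In the IG-WA setting, a "position" would encode a candidate VM assignment and `cost` would be the genetic-algorithm-evaluated fitness rather than this toy quadratic.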
{"title":"Inquisitive Genetic-Based Wolf Optimization for Load Balancing in Cloud Computing","authors":"Suman Sansanwal, Nitin Jain","doi":"10.2478/acss-2023-0017","DOIUrl":"https://doi.org/10.2478/acss-2023-0017","url":null,"abstract":"Abstract Cloud remains an active and dominant player in the field of information technology. Hence, to meet the rapidly growing requirement of computational processes and storage resources, the cloud provider deploys efficient data centres globally that comprise thousands of IT servers. Because of tremendous energy and resource utilization, a reliable cloud platform has to be necessarily optimized. Effective load balancing is a great option to overcome these issues. However, loading balancing difficulties, such as increased computational complexity, the chance of losing the client data during task rescheduling, and consuming huge memory of the host, and new VM (Virtual Machine), need appropriate optimization. Hence, the study aims to create a newly developed IG-WA (Inquisitive Genetic–Wolf Optimization) framework that meritoriously detects the optimized virtual machine in an environment. For this purpose, the system utilises the GWO (Grey Wolf Optimization) method with an evolutionary mechanism for achieving a proper compromise between exploitation and exploration, thereby accelerating the convergence and achieving optimized accuracy. Furthermore, the fitness function evaluated with an inquisitive genetic algorithm adds value to the overall efficacy. Performance evaluation brings forward the outperformance of the proposed IGWO system in terms of energy consumption, execution time and cost, makespan, CPU utilization, and memory utilization. 
Further, the system attains more comprehensive and better results when compared to the state of art methods.","PeriodicalId":41960,"journal":{"name":"Applied Computer Systems","volume":"30 1","pages":"170 - 179"},"PeriodicalIF":1.0,"publicationDate":"2023-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82328639","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Abstract Any set of devices for controlling home appliances that link to a common network and can be controlled independently or remotely is typically referred to as smart home technology. Smart homes and home automation are no longer unknown to people; smart devices and sensors are part of daily life in the 21st century. Among other benefits, home automation devices make it possible to manage home appliances and monitor resource usage and security. It is essential to gather practical information about smart home users and possible use cases. The current survey covers the benefits and challenges of smart home usage for users. The study presents the results of information collected from different countries, with participants from a variety of age groups and occupations. The questionnaire, which contains both qualitative and quantitative questions, was distributed through internet channels such as blog posts and social network groups. Furthermore, to generate the survey questions we conducted a literature review to gain a better understanding of the subject and related work. The research provides a better foundation for future smart home development. As a result of this survey-based study, in addition to identifying desirable home automation features, we discovered how much money users are ready to spend to automate their homes. Connecting the favourite smart home features to their users and the amount of money they are ready to spend on them can provide a bigger picture for the smart home industry as a whole, and can particularly benefit developers and start-ups.
{"title":"Who are Smart Home Users and What do they Want? – Insights from an International Survey","authors":"Ashkan Yaldaie, J. Porras, O. Drögehorn","doi":"10.2478/acss-2023-0011","DOIUrl":"https://doi.org/10.2478/acss-2023-0011","url":null,"abstract":"Abstract Any set of devices for controlling home appliances that link to a common network and may be controlled independently or remotely are typically referred to as smart home technology. Smart homes and home automation are not completely unknown to people anymore; smart devices and sensors are part of daily life in the 21st century. Among other benefits of home automation devices, it is possible to manage home appliances and monitor resource usage, and security. It is essential to find practical information about smart home users, and possible use cases. The current survey covers smart home usage benefits and challenges for the users. The study presents the result of the collected information from different countries, and the participants are people from a variety of age groups and occupations. The questionnaire that contains both qualitative and quantitative questions was distributed through internet channels such as blog posts and social network groups. Furthermore, to generate the survey questions we conducted a literature review to gain a better understating of the subject and the related work. The research provides a better foundation for future smart home development. As a result of this survey-based study and in addition to finding the desirable home automation features, we discovered the amount of money users are ready to spend to automate their homes. 
Connecting the favourite smart home features to its users and the amount of money they are ready to spend on them can provide a bigger picture for the smart home industry as a whole and particularly be beneficial for developers and start-ups.","PeriodicalId":41960,"journal":{"name":"Applied Computer Systems","volume":"42 1","pages":"114 - 124"},"PeriodicalIF":1.0,"publicationDate":"2023-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89449245","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Abstract Plant diseases are a primary hazard to crop productivity, which affects food security and decreases farmers’ profitability. Consequently, identification of plant diseases becomes a crucial task. Taking the right remedial measures in the early stages of these diseases can drastically help fend off losses in productivity and profit, so an intelligent and automated solution becomes a necessity. This can be achieved with the help of machine learning techniques. The process involves a number of steps: image acquisition; image pre-processing using filtering and contrast enhancement techniques; image segmentation, a crucial part of the disease detection system, performed here by applying a genetic algorithm; and extraction of colour and texture features using local binary patterns. The novelty of this approach lies in applying the genetic algorithm for image segmentation and in combining the propositions of all the learning classifiers with an ensemble method, thus exploiting the strengths of each classifier. System accuracy is evaluated using precision, recall, and accuracy measures. The analysis of the results clearly shows that the ensemble models deliver very good accuracy of over 92 %, compared to the individual SVM, Naïve Bayes, and KNN classifiers.
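The ensemble combination step can be sketched as simple hard voting: each trained classifier casts a label for an image, and the majority label wins. The classifier outputs below are made-up stand-ins for SVM, Naïve Bayes, and KNN predictions (the paper's actual combination rule may differ):

```python
from collections import Counter

def majority_vote(predictions):
    # predictions: one predicted label per classifier for a single image.
    # Counter.most_common(1) returns the label with the most votes.
    return Counter(predictions).most_common(1)[0][0]

# Hypothetical per-classifier labels for one banana leaf image.
svm_pred, nb_pred, knn_pred = "sigatoka", "healthy", "sigatoka"
print(majority_vote([svm_pred, nb_pred, knn_pred]))  # sigatoka
```

Soft voting (averaging class probabilities) is a common alternative when the base classifiers expose calibrated scores.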
{"title":"Detection and Classification of Banana Leaf Disease Using Novel Segmentation and Ensemble Machine Learning Approach","authors":"Vandana Chaudhari, M. Patil","doi":"10.2478/acss-2023-0009","DOIUrl":"https://doi.org/10.2478/acss-2023-0009","url":null,"abstract":"Abstract Plant diseases are a primary hazard to the productiveness of crops, which impacts food protection and decreases the profitability of farmers. Consequently, identification of plant diseases becomes a crucial task. By taking the right nurturing measures to remediate these diseases in the early stages can drastically help in fending off the reduction in productivity/profit. Providing an intelligent and automated solution becomes a necessity. This can be achieved with the help of machine learning techniques. It involves a number of steps like image acquisition, image pre-processing using filtering and contrast enhancement techniques. Image segmentation, which is a crucial part in disease detection system, is done by applying genetic algorithm and the colour, texture features extracted using a local binary pattern. The novelty of this approach is applying the genetic algorithm for image segmentation and combining a set of propositions from all the learning classifiers with an ensemble method and calculating the results. This obeys the optimistic features of all the learning classifiers. System accuracy is evaluated using precision, recall, and accuracy measures. 
After analysing the results, it clearly shows that the ensemble models deliver very good accuracy of over 92 % as compared to an individual SVM, Naïve Bayes, and KNN classifiers.","PeriodicalId":41960,"journal":{"name":"Applied Computer Systems","volume":"19 1","pages":"92 - 99"},"PeriodicalIF":1.0,"publicationDate":"2023-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81880880","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Abstract The Content Based Image Retrieval (CBIR) system is a framework for finding images in huge datasets that are similar to a given image. The main component of a CBIR system is its retrieval strategy. Many strategies are available, and most rely on a single extracted feature; a single-feature strategy may not be efficient for all types of images, and retrieval may also become inefficient on larger datasets. Hence, this article proposes a system comprising two retrieval stages with different features at each stage: the first stage performs coarse retrieval and the second fine retrieval. The proposed framework is validated on standard benchmark images and compared with existing frameworks. The results are recorded in graphical and numerical form, supporting the efficiency of the proposed system.
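The two-tier idea can be sketched as follows: a cheap coarse feature prunes the dataset, then a second feature re-ranks the survivors. The 4-D feature vectors, the candidate-set size, and the distance metric are illustrative assumptions, not the paper's actual features:

```python
import numpy as np

def retrieve(query_coarse, query_fine, db_coarse, db_fine, keep=3, top=2):
    # Stage 1 (coarse): keep the `keep` nearest images under feature 1.
    d1 = np.linalg.norm(db_coarse - query_coarse, axis=1)
    candidates = np.argsort(d1)[:keep]
    # Stage 2 (fine): re-rank only the candidates under feature 2.
    d2 = np.linalg.norm(db_fine[candidates] - query_fine, axis=1)
    return candidates[np.argsort(d2)][:top].tolist()

rng = np.random.default_rng(42)
db_c, db_f = rng.random((10, 4)), rng.random((10, 4))  # 10 toy images
hits = retrieve(db_c[7], db_f[7], db_c, db_f)
print(hits)  # image 7 itself ranks first (zero distance in both stages)
```

The payoff of the two stages is that the (usually more expensive) fine feature is only compared against `keep` candidates instead of the whole dataset.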
{"title":"Efficient Content-Based Image Retrieval System with Two-Tier Hybrid Frameworks","authors":"Fatima Shaheen, R. Raibagkar","doi":"10.2478/acss-2022-0018","DOIUrl":"https://doi.org/10.2478/acss-2022-0018","url":null,"abstract":"Abstract The Content Based Image Retrieval (CBIR) system is a framework for finding images from huge datasets that are similar to a given image. The main component of CBIR system is the strategy for retrieval of images. There are many strategies available and most of these rely on single feature extraction. The single feature-based strategy may not be efficient for all types of images. Similarly, due to a larger set of data, image retrieval may become inefficient. Hence, this article proposes a system that comprises of two-stage retrieval with different features at every stage where the first stage will be coarse retrieval and the second will be fine retrieval. The proposed framework is validated on standard benchmark images and compared with existing frameworks. The results are recorded in graphical and numerical form, thus supporting the efficiency of the proposed system.","PeriodicalId":41960,"journal":{"name":"Applied Computer Systems","volume":"77 1","pages":"166 - 182"},"PeriodicalIF":1.0,"publicationDate":"2022-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76568751","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Abstract In software development, defects influence quality and cost in an undesirable way. Software defect prediction (SDP) is a technique that improves software quality and testing efficiency by early identification of defects (bugs/faults/errors). Thus, several approaches have been suggested for defect prediction (DP). DP methods mainly utilise historical project data for constructing prediction models. SDP performs well within a project as long as an adequate amount of data is available to train the models. However, if the data for the same project are inadequate or limited, researchers mainly use Cross-Project Defect Prediction (CPDP). CPDP is a possible alternative that refers to anticipating defects using prediction models built on historical data from other projects. CPDP is challenging due to data distribution and domain difference problems. The proposed framework is an effective two-stage approach for CPDP, i.e., model generation and the prediction process. In the model generation phase, a conglomeration of different pre-processing steps, including feature selection and a class reweighting technique, is used to improve the initial data quality. Finally, a fine-tuned, efficient bagging- and boosting-based hybrid ensemble model is developed, which avoids model over-fitting/under-fitting and helps enhance prediction performance. In the prediction process phase, the generated model classifies historical data from other projects as defective or clean. The framework is evaluated using 25 software projects obtained from public repositories. The result analysis shows that the proposed model has achieved a 0.71±0.03 F1-score, which significantly improves upon state-of-the-art approaches by 23 % to 60 %.
{"title":"Cross-Project Defect Prediction with Metrics Selection and Balancing Approach","authors":"Meetesh Nevendra, Pradeep Singh","doi":"10.2478/acss-2022-0015","DOIUrl":"https://doi.org/10.2478/acss-2022-0015","url":null,"abstract":"Abstract In software development, defects influence the quality and cost in an undesirable way. Software defect prediction (SDP) is one of the techniques which improves the software quality and testing efficiency by early identification of defects(bug/fault/error). Thus, several experiments have been suggested for defect prediction (DP) techniques. Mainly DP method utilises historical project data for constructing prediction models. SDP performs well within projects until there is an adequate amount of data accessible to train the models. However, if the data are inadequate or limited for the same project, the researchers mainly use Cross-Project Defect Prediction (CPDP). CPDP is a possible alternative option that refers to anticipating defects using prediction models built on historical data from other projects. CPDP is challenging due to its data distribution and domain difference problem. The proposed framework is an effective two-stage approach for CPDP, i.e., model generation and prediction process. In model generation phase, the conglomeration of different pre-processing, including feature selection and class reweights technique, is used to improve the initial data quality. Finally, a fine-tuned efficient bagging and boosting based hybrid ensemble model is developed, which avoids model over -fitting/under-fitting and helps enhance the prediction performance. In the prediction process phase, the generated model predicts the historical data from other projects, which has defects or clean. The framework is evaluated using25 software projects obtained from public repositories. 
The result analysis shows that the proposed model has achieved a 0.71±0.03 f1-score, which significantly improves the state-of-the-art approaches by 23 % to 60 %.","PeriodicalId":41960,"journal":{"name":"Applied Computer Systems","volume":"58 1","pages":"137 - 148"},"PeriodicalIF":1.0,"publicationDate":"2022-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90396663","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
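The two-phase CPDP pipeline described above (feature selection, class reweighting, and a bagging/boosting hybrid ensemble, then prediction on another project's data) might look roughly like this in scikit-learn. The synthetic data, parameters, and specific estimators are illustrative assumptions, not the paper's actual configuration.

```python
# A minimal cross-project sketch: select features, reweight classes,
# and combine a bagging-style and a boosting-style learner by soft voting.
from sklearn.datasets import make_classification
from sklearn.ensemble import (GradientBoostingClassifier,
                              RandomForestClassifier, VotingClassifier)
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.metrics import f1_score
from sklearn.pipeline import make_pipeline

# Stand-ins for two projects: train on "project A", predict "project B".
X_a, y_a = make_classification(n_samples=400, n_features=20, weights=[0.8], random_state=0)
X_b, y_b = make_classification(n_samples=200, n_features=20, weights=[0.8], random_state=1)

model = make_pipeline(
    SelectKBest(f_classif, k=10),  # metrics (feature) selection
    VotingClassifier([
        ("bag", RandomForestClassifier(class_weight="balanced", random_state=0)),
        ("boost", GradientBoostingClassifier(random_state=0)),
    ], voting="soft"),
)
model.fit(X_a, y_a)      # model generation phase (project A only)
pred = model.predict(X_b)  # prediction phase on the other project's data
print(f"cross-project F1: {f1_score(y_b, pred):.2f}")
```

Training on one project and evaluating on another exposes exactly the distribution-shift problem the abstract highlights; class reweighting addresses the defect/clean imbalance that is typical of such datasets.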
Abstract The research examines the accuracy of current solution models for Arabic text sentiment classification, including traditional machine learning and deep learning algorithms. The main aim is to detect the opinion and emotion expressed in telecom companies’ customer tweets. Three supervised machine learning algorithms, Logistic Regression (LR), Support Vector Machine (SVM), and Random Forest (RF), and one deep learning algorithm, Convolutional Neural Network (CNN), were applied to classify the sentiment of 1098 unique Arabic textual tweets. The results show that a deep learning CNN using word embeddings achieved the highest performance, with an F1 score of 0.81. Furthermore, in the aspect classification task, the results reveal that applying Part-of-Speech (POS) features with the deep learning CNN algorithm was efficient and reached 75 % accuracy on a dataset of 1277 tweets. Additionally, this study addressed the further task of extracting geographical location information from tweet content. The location detection model achieved precision values of 0.6 and 0.89 for Point of Interest (POI) and city (CIT), respectively.
{"title":"Aspect-based Sentiment Analysis and Location Detection for Arabic Language Tweets","authors":"N. Alshammari, Amal Almansour","doi":"10.2478/acss-2022-0013","DOIUrl":"https://doi.org/10.2478/acss-2022-0013","url":null,"abstract":"Abstract The research examines the accuracy of current solution models for the Arabic text sentiment classification, including traditional machine learning and deep learning algorithms. The main aim is to detect the opinion and emotion expressed in Telecom companies’ customers tweets. Three supervised machine learning algorithms, Logistic Regression (LR), Support Vector Machine (SVM), and Random Forest (RF), and one deep learning algorithm, Convolutional Neural Network (CNN) were applied to classify the sentiment of 1098 unique Arabic textual tweets. The research results show that deep learning CNN using Word Embedding achieved higher performance in terms of accuracy with F1 score = 0.81. Furthermore, in the aspect classification task, the results reveal that applying Part of Speech (POS) features with deep learning CNN algorithm was efficient and reached 75 % accuracy using a dataset consisting of 1277 tweets. Additionally, in this study, we added an additional task of extracting the geographical location information from the tweet content. 
The location detection model achieved the following precision values: 0.6 and 0.89 for both Point of Interest (POI) and city (CIT).","PeriodicalId":41960,"journal":{"name":"Applied Computer Systems","volume":"105 1","pages":"119 - 127"},"PeriodicalIF":1.0,"publicationDate":"2022-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77810937","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
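As a toy illustration of the traditional-ML baselines compared in the abstract above (LR, SVM, and RF over bag-of-words-style features), here is a minimal sketch. The English stand-in tweets and default hyperparameters are assumptions for demonstration only; the study's CNN/word-embedding model and its Arabic preprocessing are not reproduced here.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.svm import LinearSVC

# Toy stand-ins for telecom customer tweets: 1 = positive, 0 = negative.
tweets = ["great network coverage", "terrible customer service",
          "love the new plan", "worst internet speed ever",
          "helpful support team", "billing errors again"]
labels = [1, 0, 1, 0, 1, 0]

X = TfidfVectorizer().fit_transform(tweets)
scores = {}
for name, clf in [("LR", LogisticRegression()),
                  ("SVM", LinearSVC()),
                  ("RF", RandomForestClassifier(random_state=0))]:
    clf.fit(X, labels)
    scores[name] = clf.score(X, labels)  # training accuracy on the toy set
    print(name, scores[name])
```

On a real corpus, the tweets would be tokenised and normalised for Arabic morphology before vectorisation, and the classifiers would be compared on held-out data rather than training accuracy.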