Increasing dependence on Information and Communication Technologies (ICT), and especially on the Internet, in Industrial Control Systems (ICS) has made these systems a primary target of cyber-attacks. As ICS are extensively used in Critical Infrastructures (CI), this makes CI more vulnerable to cyber-attacks, and their protection becomes an important issue. On the other hand, cyber-attacks can exploit not only software but also physics; that is, they can target the fundamental physical aspects of computation. The recently discovered RowHammer (RH) fault injection attack is a serious vulnerability targeting the reliability and security of DRAM (Dynamic Random Access Memory) hardware. Studies of this vulnerability raise serious security concerns. The purpose of this study was to give an overview of the RH phenomenon in DRAMs and its possible security risks for ICS, and to discuss a few realistic RH attack scenarios for ICS. The results of the study reveal that RH is a serious security threat to any computer-based system containing DRAM, and this also applies to ICS.
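To make the threat concrete, the following is a minimal conceptual sketch, not an exploit: it only models the downstream effect of a single RH-induced bit flip on a hypothetical ICS setpoint held in DRAM. Real attacks require low-level primitives (e.g. cache-flush hammering loops in C and knowledge of the physical memory layout); the variable names and values here are illustrative assumptions.

```python
# Conceptual illustration (not an exploit): the effect of one
# RowHammer-induced bit flip on a process value stored in DRAM.
import struct

def flip_bit(buf: bytearray, bit_index: int) -> None:
    """Flip one bit in a raw memory buffer, as an RH fault might."""
    buf[bit_index // 8] ^= 1 << (bit_index % 8)

# A hypothetical pressure setpoint stored as a 64-bit float.
setpoint = 10.0  # bar (assumed value)
buf = bytearray(struct.pack("<d", setpoint))

flip_bit(buf, 62)  # flip a high exponent bit of the IEEE-754 double

corrupted = struct.unpack("<d", bytes(buf))[0]
print(f"setpoint before: {setpoint} bar, after one bit flip: {corrupted} bar")
```

A single flipped exponent bit changes the value by hundreds of orders of magnitude, which is why an RH fault in control-loop memory is a safety issue and not merely a reliability one.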
{"title":"CYBER SECURITY IN INDUSTRIAL CONTROL SYSTEMS (ICS): A SURVEY OF ROWHAMMER VULNERABILITY","authors":"Hakan Aydin, A. Sertbas","doi":"10.35784/acs-2022-15","DOIUrl":"https://doi.org/10.35784/acs-2022-15","url":null,"abstract":"Increasing dependence on Information and Communication Technologies (ICT) and especially on the Internet in Industrial Control Systems (ICS) has made these systems the primary target of cyber-attacks. As ICS are extensively used in Critical Infrastructures (CI), this makes CI more vulnerable to cyber-attacks and their protection becomes an important issue. On the other hand, cyberattacks can exploit not only software but also physics; that is, they can target the fundamental physical aspects of computation. The newly discovered RowHammer (RH) fault injection attack is a serious vulnerability targeting hardware on reliability and security of DRAM (Dynamic Random Access Memory). Studies on this vulnerability issue raise serious security concerns. The purpose of this study was to overview the RH phenomenon in DRAMs and its possible security risks on ICSs and to discuss a few possible realistic RH attack scenarios for ICSs. The results of the study revealed that RH is a serious security threat to any computer-based system having DRAMs, and this also applies to ICS.","PeriodicalId":36379,"journal":{"name":"Applied Computer Science","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-06-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43846177","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Plant diseases are a foremost risk to the safety of food. They have the potential to significantly reduce the quality and quantity of agricultural products. Recognizing plant diseases is the most prominent challenge in the agricultural sector. In computer vision, the Convolutional Neural Network (CNN) produces good results when solving image classification tasks, and many deep learning architectures have been applied to plant disease diagnosis. This paper introduces a transfer-learning-based model for detecting tomato leaf diseases, using DenseNet201 as the transfer-learning backbone with a CNN classifier. A comparison study between four deep learning models (VGG16, Inception V3, ResNet152V2 and DenseNet201) was conducted to determine which achieves the best accuracy when using transfer learning for plant disease detection. The image dataset used contains 22,930 photos of tomato leaves in 10 different classes: 9 disorders and one healthy class. In our experiments, the results show that the proposed model achieves the highest training accuracy of 99.84% and a validation accuracy of 99.30%.
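A minimal transfer-learning sketch in the spirit of the described model, using tf.keras with an ImageNet-pretrained DenseNet201 backbone; the input size, classifier head and hyperparameters are assumptions, not the authors' exact training setup.

```python
# Transfer-learning sketch: frozen DenseNet201 features + a small
# trainable head for the 10 tomato-leaf classes (9 disorders + healthy).
import tensorflow as tf

base = tf.keras.applications.DenseNet201(
    weights="imagenet", include_top=False, input_shape=(224, 224, 3))
base.trainable = False  # freeze ImageNet features for transfer learning

model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(256, activation="relu"),
    tf.keras.layers.Dropout(0.3),
    tf.keras.layers.Dense(10, activation="softmax"),  # 10 classes
])
model.compile(optimizer="adam",
              loss="categorical_crossentropy", metrics=["accuracy"])
model.summary()
```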
{"title":"TOMATO DISEASE DETECTION MODEL BASED ON DENSENET AND TRANSFER LEARNING","authors":"Mahmoud Bakr, Sayed Abdel-Gaber, M. Nasr, M. Hazman","doi":"10.35784/acs-2022-13","DOIUrl":"https://doi.org/10.35784/acs-2022-13","url":null,"abstract":"Plant diseases are a foremost risk to the safety of food. They have the potential to significantly reduce agricultural products quality and quantity. In agriculture sectors, it is the most prominent challenge to recognize plant diseases. In computer vision, the Convolutional Neural Network (CNN) produces good results when solving image classification tasks. For plant disease diagnosis, many deep learning architectures have been applied. This paper introduces a transfer learning based model for detecting tomato leaf diseases. This study proposes a model of DenseNet201 as a transfer learning-based model and CNN classifier. A comparison study between four deep learning models (VGG16, Inception V3, ResNet152V2 and DenseNet201) done in order to determine the best accuracy in using transfer learning in plant disease detection. The used images dataset contains 22930 photos of tomato leaves in 10 different classes, 9 disorders and one healthy class. In our experimental, the results shows that the proposed model achieves the highest training accuracy of 99.84% and validation accuracy of 99.30%.","PeriodicalId":36379,"journal":{"name":"Applied Computer Science","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-06-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46391960","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Tandem mass spectrometry is an analytical technique widely used in proteomics for the high-throughput characterization of proteins in biological samples. Modern in-depth proteomic studies require the collection of even millions of mass spectra representing short protein fragments (peptides). In order to identify the peptides, the measured spectra are most often scored against a database of amino acid sequences of known proteins. Due to the volume of input data and the sizes of proteomic databases, this is a resource-intensive task, which requires an efficient and scalable computational strategy. Here, we present SparkMS, an algorithm for peptide and protein identification from mass spectrometry data explicitly designed to work in a distributed computational environment. To achieve the required performance and scalability, we use Apache Spark, a modern framework that is becoming increasingly popular not only in the field of “big data” analysis but also in bioinformatics. This paper describes the algorithm in detail and demonstrates its performance on a large proteomic dataset. Experimental results indicate that SparkMS scales with the number of worker nodes and the increasing complexity of the search task. Furthermore, it exhibits a protein identification efficiency comparable to X!Tandem, a widely-used proteomic search engine.
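SparkMS's algorithm is not reproduced here, but the general Spark pattern such tools follow can be sketched: broadcast the peptide database once to every worker, then score spectra in parallel across partitions. The score() function and the toy data below are placeholders, not the actual SparkMS scoring model.

```python
# Distributed spectrum-vs-database scoring pattern with PySpark.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("spectra-scoring").getOrCreate()
sc = spark.sparkContext

peptides = ["PEPTIDEK", "SAMPLER", "PROTEINK"]   # stand-in database
pep_bcast = sc.broadcast(peptides)               # shipped once per worker

def score(spectrum, peptide):
    # Placeholder similarity; real engines compare fragment ion masses.
    return sum(1 for peak in spectrum if peak % len(peptide) == 0)

def best_match(spectrum):
    # Each worker scores its spectra against the broadcast database.
    return max(((p, score(spectrum, p)) for p in pep_bcast.value),
               key=lambda t: t[1])

spectra = sc.parallelize([[100, 200, 300], [150, 250]])  # toy spectra
print(spectra.map(best_match).collect())
spark.stop()
```

Because the database is broadcast rather than joined, adding worker nodes only partitions the spectra, which is consistent with the scaling behaviour the paper reports.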
{"title":"A DISTRIBUTED ALGORITHM FOR PROTEIN IDENTIFICATION FROM TANDEM MASS SPECTROMETRY DATA","authors":"Katarzyna Orzechowska, T. Rubel, R. Kurjata, Krzysztof Zaremba","doi":"10.35784/acs-2022-10","DOIUrl":"https://doi.org/10.35784/acs-2022-10","url":null,"abstract":"Tandem mass spectrometry is an analytical technique widely used in proteomics for the high-throughput characterization of proteins in biological samples. Modern in-depth proteomic studies require the collection of even millions of mass spectra representing short protein fragments (peptides). In order to identify the peptides, the measured spectra are most often scored against a database of amino acid sequences of known proteins. Due to the volume of input data and the sizes of proteomic databases, this is a resource-intensive task, which requires an efficient and scalable computational strategy. Here, we present SparkMS, an algorithm for peptide and protein identification from mass spectrometry data explicitly designed to work in a distributed computational environment. To achieve the required performance and scalability, we use Apache Spark, a modern framework that is becoming increasingly popular not only in the field of “big data” analysis but also in bioinformatics. This paper describes the algorithm in detail and demonstrates its performance on a large proteomic dataset. Experimental results indicate that SparkMS scales with the number of worker nodes and the increasing complexity of the search task. Furthermore, it exhibits a protein identification efficiency comparable to X!Tandem, a widely-used proteomic search engine.","PeriodicalId":36379,"journal":{"name":"Applied Computer Science","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-06-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43968129","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In this work, we present a computer simulation model that generates the propagation of sound waves to solve the forward problem in ultrasound transmission tomography. The simulator can be used to create data sets for the supervised learning process. A solution to the "free-space" boundary problem is proposed, and memory consumption was significantly optimized, from O(n²) to O(n). The given method of simulating wave scattering enables control of the noise extinction time within the tomographic probe and of the permeability of the sound wave. The presented version of the script simulates the classic variant of a circular probe with sensors evenly distributed around the circumference.
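A minimal sketch of the underlying finite-difference update, assuming a standard second-order explicit scheme: only three time slices are kept in memory, which is how storing the full time history is avoided. The grid size, wave speed and source below are illustrative, and the periodic boundaries implied by np.roll stand in for the paper's "free-space" boundary treatment, which is not reproduced here.

```python
# 2-D acoustic wave propagation with an explicit finite-difference
# scheme; memory stays proportional to the grid because only the
# previous and current time slices are retained and buffers are reused.
import numpy as np

n, c, dt, dx, steps = 200, 1500.0, 4e-7, 1e-3, 300  # grid, m/s, s, m
r2 = (c * dt / dx) ** 2      # Courant term; must be <= 0.5 in 2-D (here 0.36)

prev = np.zeros((n, n))
curr = np.zeros((n, n))
curr[n // 2, n // 2] = 1.0   # point source at the centre of the probe

for _ in range(steps):
    # Discrete Laplacian (np.roll gives periodic, not free-space, edges).
    lap = (np.roll(curr, 1, 0) + np.roll(curr, -1, 0) +
           np.roll(curr, 1, 1) + np.roll(curr, -1, 1) - 4 * curr)
    nxt = 2 * curr - prev + r2 * lap
    prev, curr = curr, nxt   # rotate the two retained time slices

print("peak amplitude after", steps, "steps:", np.abs(curr).max())
```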
{"title":"APPLICATION OF FINITE DIFFERENCE METHOD FOR MEASUREMENT SIMULATION IN ULTRASOUND TRANSMISSION TOMOGRAPHY","authors":"Konrad Kania, Mariusz Mazurek, T. Rymarczyk","doi":"10.35784/acs-2022-16","DOIUrl":"https://doi.org/10.35784/acs-2022-16","url":null,"abstract":"In this work, we present a computer simulation model that generates the propagation of sound waves to solve a forward problem in ultrasound transmission tomography. The simulator can be used to create data sets used in the supervised learning process. A solution to the \"free-space\" boundary problem was proposed, and the memory consumption was significantly optimized from O(n2) to O(n). The given method of simulating wave scattering enables the control of the noise extinction time within the tomographic probe and the permeability of the sound wave. The presented version of the script simulates the classic variant of a circular probe with evenly distributed sensors around the circumference.","PeriodicalId":36379,"journal":{"name":"Applied Computer Science","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-06-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48351771","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This study investigated the use of computer games to detect the symptoms of mild cognitive impairment (MCI), an early stage of dementia, in the elderly. To this end, three serious games were used to measure visuo-perceptual coordination and psychomotor abilities, spatial memory, and short-term digit-span memory. Subsequently, the correlations between the results of the games and the results of the Korean Mini-Mental State Examination (K-MMSE), a dementia screening test, were analyzed. In addition, the game results of normal elderly persons were compared with those of elderly patients who exhibited MCI symptoms. The results indicated that game play time and the frequency of errors had significant correlations with K-MMSE scores. Significant differences were also found in several factors between the control group and the group with MCI. Based on these findings, the advantages and disadvantages of using serious games as tools for screening mild cognitive impairment are discussed.
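As a hedged illustration of the kind of correlation analysis described, the following computes a Pearson correlation between one game metric and K-MMSE scores; all values are synthetic stand-ins, not the study's data.

```python
# Pearson correlation between a game metric and K-MMSE scores.
from scipy.stats import pearsonr

play_time = [42, 55, 61, 38, 70, 49, 66]   # seconds per trial (synthetic)
kmmse = [29, 26, 24, 30, 22, 27, 23]       # K-MMSE scores (synthetic)

r, p = pearsonr(play_time, kmmse)
print(f"Pearson r = {r:.2f}, p = {p:.3f}")  # longer play time ~ lower K-MMSE
```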
{"title":"USE OF SERIOUS GAMES FOR THE ASSESSMENT OF MILD COGNITIVE IMPAIRMENT IN THE ELDERLY","authors":"Moon-gee Choi","doi":"10.35784/acs-2022-9","DOIUrl":"https://doi.org/10.35784/acs-2022-9","url":null,"abstract":"This study investigated the use of computer games to detect the symptoms of mild cognitive impairment (MCI), an early stage of dementia, in the elderly. To this end, three serious games were used to measure the visio-perception coordination and psycho-motor abilities, spatial memory, and short-term digit span memory. Subsequently, the correlations between the results of the games and the results of the Korean Mini-Mental State Examination (K-MMSE), a dementia screening test, were analyzed. In addition, the game results of normal elderly persons were compared with those of elderly patients who exhibited MCI symptoms. The results indicated that the game play time and the frequency of errors had significant correlations with K-MMSE. Significant differences were also found in several factors between the control group and the group with MCI. Based on these findings, the advantages and disadvantages of using serious games as tools for screening mild cognitive impairment were discussed.","PeriodicalId":36379,"journal":{"name":"Applied Computer Science","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-03-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46956379","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The healthcare industry is one of many that could benefit greatly from advances in the technology it utilizes. Artificial intelligence (AI) technologies, and specifically deep learning (DL), a highly useful data-driven technology, are especially integral. DL is applied in a variety of methods, depending mainly on the structure of the available data; with varying applications, it produces data in different contexts with particular connotations. Scan images play a great role in identifying the presence of disease in a patient, and automating their processing with CNN-based models is highly effective in reducing the human errors that otherwise arise from such large volumes of data. Hence, this study presents a hybrid deep learning architecture for classifying histopathology images to identify the presence of cancer in a patient. The proposed models are parallelized using the TensorFlow-GPU framework to accelerate the training of these deep CNN (Convolutional Neural Network) architectures. The study uses transfer learning during training, and early-stopping criteria are applied to avoid overfitting in the training phase. The models impose a parallel LSTM layer on four considered architectures: MobileNet, VGG16, and ResNet with 101 and 152 layers. The experimental results produced by these hybrid models show that the Hybrid ResNet101 and Hybrid ResNet152 architectures are highly suitable, with accuracies of 90% and 92%, respectively. Finally, this study concludes that the proposed Hybrid ResNet152 architecture is highly efficient in classifying histopathology images. This well-focused and detailed experimental study will further help researchers understand which deep CNN architectures to apply in application development.
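The exact wiring of the hybrid models is not specified in the abstract, so the following is a speculative sketch of one plausible "parallel LSTM layer" arrangement: the frozen backbone's feature map feeds both a pooling branch and, reshaped into a sequence, an LSTM branch, and the two branches are concatenated before the classifier.

```python
# Speculative hybrid parallel CNN + LSTM sketch (tf.keras functional API).
import tensorflow as tf
from tensorflow.keras import layers

backbone = tf.keras.applications.ResNet152(
    weights="imagenet", include_top=False, input_shape=(224, 224, 3))
backbone.trainable = False                    # transfer learning, frozen

x = backbone.output                           # (7, 7, 2048) feature map
gap = layers.GlobalAveragePooling2D()(x)      # CNN branch
seq = layers.Reshape((49, 2048))(x)           # feature map as a 49-step sequence
lstm = layers.LSTM(256)(seq)                  # parallel LSTM branch

merged = layers.Concatenate()([gap, lstm])
out = layers.Dense(2, activation="softmax")(merged)  # cancer / no cancer

model = tf.keras.Model(backbone.input, out)
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```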
{"title":"HISTOPATHOLOGY IMAGE CLASSIFICATION USING HYBRID PARALLEL STRUCTURED DEEP-CNN MODELS","authors":"K. Dsouza, Z. Ansari","doi":"10.35784/acs-2022-2","DOIUrl":"https://doi.org/10.35784/acs-2022-2","url":null,"abstract":"The healthcare industry is one of the many out there that could majorly benefit from advancement in the technology it utilizes. Artificial intelligence (AI) technologies are especially integral and specifically deep learning (DL); a highly useful data-driven technology. It is applied in a variety of different methods but it mainly depends on the structure of the available data. However, with varying applications, this technology produces data in different contexts with particular connotations. Reports which are the images of scans play a great role in identifying the existence of the disease in a patient. Further, the automation in processing these images using technology like CNN-based models makes it highly efficient in reducing human errors otherwise resulting in large data. Hence this study presents a hybrid deep learning architecture to classify the histopathology images to identify the presence of cancer in a patient. Further, the proposed models are parallelized using the TensorFlow-GPU framework to accelerate the training of these deep CNN (Convolution Neural Networks) architectures. This study uses the transfer learning technique during training and early stopping criteria are used to avoid overfitting during the training phase. these models use LSTM parallel layer imposed in the model to experiment with four considered architectures such as MobileNet, VGG16, and ResNet with 101 and 152 layers. The experimental results produced by these hybrid models show that the capability of Hybrid ResNet101 and Hybrid ResNet152 architectures are highly suitable with an accuracy of 90% and 92%. Finally, this study concludes that the proposed Hybrid ResNet-152 architecture is highly efficient in classifying the histopathology images. The proposed study has conducted a well-focused and detailed experimental study which will further help researchers to understand the deep CNN architectures to be applied in application development.","PeriodicalId":36379,"journal":{"name":"Applied Computer Science","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-03-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41613980","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This paper investigates the strength of a conceptual main rotor blade dedicated to an unmanned helicopter. The blade is made of smart materials in order to optimize the efficiency of the aircraft by increasing its aerodynamic performance. This purpose was achieved by performing a series of strength calculations for the blade of a prototype main rotor used in an unmanned helicopter. The calculations were done with the Finite Element Method (FEM) using CAE (Computer-Aided Engineering) software, which employs advanced techniques for computer modeling of loads in composite structures. Our analysis included CAD (Computer-Aided Design) modeling of the rotor blade, importing the solid model into the CAE software, defining the simulation boundary conditions, and performing strength calculations of the blade spar for selected materials used in aviation, i.e. fiberglass and carbon fiber laminate. This paper presents the results and analysis of the numerical calculations.
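FEM results of this kind are often sanity-checked against a closed-form estimate. As a hedged companion, the following computes the analytic maximum bending stress, sigma = M*c/I, for a rectangular spar cross-section; all dimensions and loads are illustrative assumptions, not the prototype blade's actual values.

```python
# Analytic bending-stress check for a rectangular spar section.
M = 150.0           # bending moment at the blade root, N*m (assumed)
b, h = 0.04, 0.02   # section width and height, m (assumed)

I = b * h**3 / 12   # second moment of area, m^4
c = h / 2           # distance from neutral axis to outer fibre, m
sigma = M * c / I   # maximum bending stress, Pa

print(f"max bending stress: {sigma / 1e6:.1f} MPa")
```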
{"title":"STRENGTH ANALYSIS OF A PROTOTYPE COMPOSITE HELICOPTER ROTOR BLADE SPAR","authors":"R. Kliza, Karol Ścisłowski, K. Siadkowska, Jacek Padyjasek, M. Wendeker","doi":"10.35784/acs-2022-1","DOIUrl":"https://doi.org/10.35784/acs-2022-1","url":null,"abstract":"This paper investigates the strenght of a conceptual main rotor blade dedicated to an unmanned helicopter. The blade is made of smart materials in order to optimize the efficiency of the aircraft by increasing its aerodynamic performance. This purpose was achieved by performing a series of strength calculations for the blade of a prototype main rotor used in an unmanned helicopter. The calculations were done with the Finite Element Method (FEM) and software like CAE (Computer-Aided Engineering) which uses advanced techniques of computer modeling of load in composite structures. Our analysis included CAD (Computer-Aided Design) modeling the rotor blade, importing the solid model into the CAE software, defining the simulation boundary conditions and performing strength calculations of the blade spar for selected materials used in aviation, i.e. fiberglass and carbon fiber laminate. This paper presents the results and analysis of the numerical calculations.","PeriodicalId":36379,"journal":{"name":"Applied Computer Science","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-03-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43308177","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Detection and classification of vegetation is a crucial technical task in the management of natural resources, since vegetation serves as a foundation for all living things and has a significant impact on climate change, for example by affecting terrestrial carbon dioxide (CO2). Traditional approaches to acquiring vegetation cover, such as field surveys, map interpretation, and collateral data analysis, are ineffective, as they are time-consuming and expensive. In this paper, vegetation regions are automatically detected by applying the simple but effective vegetation indices Normalized Difference Vegetation Index (NDVI) and Soil Adjusted Vegetation Index (SAVI) to the red (R) and near-infrared (NIR) bands of a Landsat-8 satellite image. Remote sensing technology makes it possible to analyze vegetation cover across wide areas in a cost-effective manner. Mapping vegetation from remotely sensed images involves a number of factors, techniques, and methodologies, and the rapid improvement of remote sensing technologies broadens the possibilities for image sources, making remotely sensed images more accessible. The dataset used in this paper consists of the R and NIR bands of a Level-1 Tier 1 Landsat-8 optical remote sensing image acquired on 6 September 2013, processed and made available to users on 2 May 2017. Pre-processing, involving a subsetting operation, is performed on the R and NIR bands using the ERDAS Imagine tool. The NDVI and SAVI are used to extract vegetation features automatically in the Python language. Finally, by establishing a threshold, the vegetation cover of the research area is detected and then classified.
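The two indices are simple band-wise formulas, NDVI = (NIR - R)/(NIR + R) and SAVI = (NIR - R)(1 + L)/(NIR + R + L) with soil adjustment factor L = 0.5, so they can be sketched directly in NumPy; the tiny reflectance arrays and the 0.4 threshold below are illustrative, not the study's values.

```python
# NDVI and SAVI from red and near-infrared reflectance bands.
import numpy as np

red = np.array([[0.10, 0.30], [0.25, 0.05]])   # R reflectance (toy values)
nir = np.array([[0.60, 0.35], [0.30, 0.55]])   # NIR reflectance (toy values)

ndvi = (nir - red) / (nir + red + 1e-9)         # NDVI = (NIR - R)/(NIR + R)
L = 0.5                                         # soil adjustment factor
savi = (nir - red) * (1 + L) / (nir + red + L)  # SAVI with L = 0.5

vegetation = ndvi > 0.4                         # threshold-based detection
print(ndvi.round(2), savi.round(2), vegetation, sep="\n")
```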
{"title":"DETECTION AND CLASSIFICATION OF VEGETATION AREAS FROM RED AND NEAR INFRARED BANDS OF LANDSAT-8 OPTICAL SATELLITE IMAGE","authors":"Anusha Nallapareddy","doi":"10.35784/acs-2022-4","DOIUrl":"https://doi.org/10.35784/acs-2022-4","url":null,"abstract":"Detection and classification of vegetation is a crucial technical task in the management of natural resources since vegetation serves as a foundation for all living things and has a significant impact on climate change such as impacting terrestrial carbon dioxide (CO2). Traditional approaches for acquiring vegetation covers such as field surveys, map interpretation, collateral and data analysis are ineffective as they are time consuming and expensive. In this paper vegetation regions are automatically detected by applying simple but effective vegetation indices Normalized Difference Vegetation Index (NDVI) and Soil Adjusted Vegetation Index (SAVI) on red(R) and near infrared (NIR) bands of Landsat-8 satellite image. Remote sensing technology makes it possible to analyze vegetation cover across wide areas in a cost-effective manner. Using remotely sensed images, the mapping of vegetation requires a number of factors, techniques, and methodologies. The rapid improvement of remote sensing technologies broadens possibilities for image sources making remotely sensed images more accessible. The dataset used in this paper is the R and NIR bands of Level-1 Tier 1 Landsat-8 optical remote sensing image acquired on 6th September 2013, is processed and made available to users on 2nd May 2017. The pre-processing involving sub-setting operation is performed using the ERDAS Imagine tool on R and NIR bands of Landsat-8 image. The NDVI and SAVI are utilized to extract vegetation features automatically by using python language. Finally by establishing a threshold, vegetation cover of the research area is detected and then classified.","PeriodicalId":36379,"journal":{"name":"Applied Computer Science","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-03-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49162887","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In the paper, the authors present the outcome of web scraping software allowing for the automated classification of source code. The software system was prepared for a discussion forum for software developers, to find fragments of source code that were published without being marked as code snippets. The analyzer uses a machine learning binary classification model to differentiate between programming language source code and highly technical text about software. The analyzer model was prepared using an AutoML subsystem, without human intervention or fine-tuning, and its accuracy on the described problem exceeds 95%. The analyzer based on the automatically generated model has been deployed, and after the first year of continuous operation its False Positive Rate is less than 3%. A similar process may be introduced in document management within the software development process, where automatic tagging and searching for code or pseudo-code may be useful for archiving purposes.
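The AutoML-generated model itself is not published, but a hand-rolled stand-in shows the shape of the task: a character n-gram TF-IDF pipeline with logistic regression separating source code from technical prose. The training snippets below are toy examples, not the forum dataset.

```python
# Code-vs-prose binary text classifier (stand-in for the AutoML model).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = [
    "for (int i = 0; i < n; i++) { sum += a[i]; }",
    "def fib(n): return n if n < 2 else fib(n-1) + fib(n-2)",
    "The garbage collector pauses the application threads briefly.",
    "Dependency injection decouples object construction from use.",
]
labels = [1, 1, 0, 0]   # 1 = source code, 0 = technical prose

clf = make_pipeline(
    TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4)),  # chars catch syntax
    LogisticRegression())
clf.fit(texts, labels)
print(clf.predict(["x = [i**2 for i in range(10)]"]))  # expect [1]
```

Character n-grams rather than word tokens are used here because punctuation-heavy syntax (braces, semicolons, operators) is what most distinguishes code from prose.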
{"title":"DETECTION OF SOURCE CODE IN INTERNET TEXTS USING AUTOMATICALLY GENERATED MACHINE LEARNING MODELS","authors":"M. Badurowicz","doi":"10.35784/acs-2022-7","DOIUrl":"https://doi.org/10.35784/acs-2022-7","url":null,"abstract":"In the paper, the authors are presenting the outcome of web scraping software allowing for the automated classification of source code. The software system was prepared for a discussion forum for software developers to find fragments of source code that were published without marking them as code snippets. The analyzer software is using a Machine Learning binary classification model for differentiating between a programming language source code and highly technical text about software. The analyzer model was prepared using the AutoML subsystem without human intervention and fine-tuning and its accuracy in a described problem exceeds 95%. The analyzer based on the automatically generated model has been deployed and after the first year of continuous operation, its False Positive Rate is less than 3%. The similar process may be introduced in document management in software development process, where automatic tagging and search for code or pseudo-code may be useful for archiving purposes.","PeriodicalId":36379,"journal":{"name":"Applied Computer Science","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-03-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43145154","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Nowadays, heart disease is the major cause of death globally. According to a survey conducted by the World Health Organization, almost 18 million people die of heart diseases (or cardiovascular diseases) every year. There should therefore be a system for the early detection and prevention of heart disease. Detection of heart disease mostly depends on huge amounts of pathological and clinical data that are quite complex, so researchers and other medical professionals are showing keen interest in the accurate prediction of heart disease. Heart disease is a general term for a large number of medical conditions related to the heart, one of which is coronary heart disease (CHD), caused by the accumulation of plaque on the artery walls. In this paper, various machine learning base classifiers and ensemble classifiers have been applied to a heart disease dataset for efficient prediction of coronary heart disease. The machine learning classifiers employed include k-nearest neighbor, multilayer perceptron, multinomial naïve Bayes, logistic regression, decision tree, random forest and support vector machine classifiers. The ensemble classifiers used include majority voting, weighted average, bagging and boosting classifiers. The dataset used in this study is obtained from the Framingham Heart Study, a long-term, ongoing cardiovascular study of people from the city of Framingham in Massachusetts, USA. To evaluate the performance of the classifiers, various evaluation metrics, including accuracy, precision, recall and F1 score, have been used. According to our results, the best accuracies were achieved by the logistic regression, random forest, majority voting, weighted average and bagging classifiers, with the highest accuracy among these achieved by the weighted average ensemble classifier.
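A minimal sketch of the weighted-average ensemble that performed best, using scikit-learn's VotingClassifier with soft voting and per-model weights; synthetic data stands in for the Framingham features, and the weights and base models are assumptions, not the paper's exact configuration.

```python
# Weighted soft-voting ensemble over three base classifiers.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=500, n_features=15, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

ensemble = VotingClassifier(
    estimators=[("lr", LogisticRegression(max_iter=1000)),
                ("rf", RandomForestClassifier(random_state=0)),
                ("knn", KNeighborsClassifier())],
    voting="soft",        # average predicted class probabilities...
    weights=[2, 2, 1])    # ...weighted per base model
ensemble.fit(X_tr, y_tr)
print("test accuracy:", ensemble.score(X_te, y_te))
```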
{"title":"IMPROVING CORONARY HEART DISEASE PREDICTION BY OUTLIER ELIMINATION","authors":"Lubna Riyaz, M. A. Butt, Majid Zaman","doi":"10.35784/acs-2022-6","DOIUrl":"https://doi.org/10.35784/acs-2022-6","url":null,"abstract":"Nowadays, heart disease is the major cause of deaths globally. According to a survey conducted by the World Health Organization, almost 18 million people die of heart diseases (or cardiovascular diseases) every day. So, there should be a system for early detection and prevention of heart disease. Detection of heart disease mostly depends on the huge pathological and clinical data that is quite complex. So, researchers and other medical professionals are showing keen interest in accurate prediction of heart disease. Heart disease is a general term for a large number of medical conditions related to heart and one of them is the coronary heart disease (CHD). Coronary heart disease is caused by the amassing of plaque on the artery walls. In this paper, various machine learning base and ensemble classifiers have been applied on heart disease dataset for efficient prediction of coronary heart disease. Various machine learning classifiers that have been employed include k-nearest neighbor, multilayer perceptron, multinomial naïve bayes, logistic regression, decision tree, random forest and support vector machine classifiers. Ensemble classifiers that have been used include majority voting, weighted average, bagging and boosting classifiers. The dataset used in this study is obtained from the Framingham Heart Study which is a long-term, ongoing cardiovascular study of people from the Framingham city in Massachusetts, USA. To evaluate the performance of the classifiers, various evaluation metrics including accuracy, precision, recall and f1 score have been used. According to our results, the best accuracy was achieved by logistic regression, random forest, majority voting, weighted average and bagging classifiers but the highest accuracy among these was achieved using weighted average ensemble classifier. ","PeriodicalId":36379,"journal":{"name":"Applied Computer Science","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-03-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42392134","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}