Performance Analysis of Rough Set–Based Hybrid Classification Systems in the Case of Missing Values
Pub Date: 2021-10-01 | DOI: 10.2478/jaiscr-2021-0018
R. Nowicki, R. Seliga, Dariusz Żelasko, Y. Hayashi
Abstract The paper presents a performance analysis of selected rough set-based classification systems. They are hybrid solutions designed to process information with missing values. Rough set-based classification systems combine various classification methods, such as support vector machines, k-nearest neighbours, fuzzy systems, and neural networks, with rough set theory. When all input values are available and take the form of real numbers, the structure of the classifier reduces to its non-rough set version. The performance of four such systems has been analysed based on classification results obtained for benchmark databases downloaded from the University of California at Irvine (UCI) Machine Learning Repository.
{"title":"Performance Analysis of Rough Set–Based Hybrid Classification Systems in the Case of Missing Values","authors":"R. Nowicki, R. Seliga, Dariusz Żelasko, Y. Hayashi","doi":"10.2478/jaiscr-2021-0018","DOIUrl":"https://doi.org/10.2478/jaiscr-2021-0018","url":null,"abstract":"Abstract The paper presents a performance analysis of a selected few rough set–based classification systems. They are hybrid solutions designed to process information with missing values. Rough set-–based classification systems combine various classification methods, such as support vector machines, k–nearest neighbour, fuzzy systems, and neural networks with the rough set theory. When all input values take the form of real numbers, and they are available, the structure of the classifier returns to a non–rough set version. The performance of the four systems has been analysed based on the classification results obtained for benchmark databases downloaded from the machine learning repository of the University of California at Irvine.","PeriodicalId":48494,"journal":{"name":"Journal of Artificial Intelligence and Soft Computing Research","volume":"11 1","pages":"307 - 318"},"PeriodicalIF":2.8,"publicationDate":"2021-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41855870","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Handwritten Word Recognition Using Fuzzy Matching Degrees
Pub Date: 2021-05-29 | DOI: 10.2478/jaiscr-2021-0014
Michal R. Wróbel, Janusz T. Starczewski, J. Fijałkowska, A. Siwocha, Christian Napoli
Abstract Handwritten text recognition systems interpret scanned script images as text composed of letters. In this paper, efficient offline methods using fuzzy degrees, as well as interval type-2 fuzzy degrees, are proposed to recognize letters that have first been decomposed into strokes. For such strokes, first-stage methods are used to create a set of hypotheses as to whether a group of strokes matches letter or digit patterns. Subsequently, second-stage methods are employed to select the most promising set of hypotheses with the use of fuzzy degrees. In the primary version of the second-stage system, standard fuzzy memberships are used to measure compatibility between strokes and character patterns. As an extension of the system thus created, interval type-2 fuzzy degrees are employed to select hypotheses that fit multiple handwriting typefaces.
{"title":"Handwritten Word Recognition Using Fuzzy Matching Degrees","authors":"Michal R. Wróbel, Janusz T. Starczewski, J. Fijałkowska, A. Siwocha, Christian Napoli","doi":"10.2478/jaiscr-2021-0014","DOIUrl":"https://doi.org/10.2478/jaiscr-2021-0014","url":null,"abstract":"Abstract Handwritten text recognition systems interpret the scanned script images as text composed of letters. In this paper, efficient offline methods using fuzzy degrees, as well as interval fuzzy degrees of type-2, are proposed to recognize letters beforehand decomposed into strokes. For such strokes, the first stage methods are used to create a set of hypotheses as to whether a group of strokes matches letter or digit patterns. Subsequently, the second-stage methods are employed to select the most promising set of hypotheses with the use of fuzzy degrees. In a primary version of the second-stage system, standard fuzzy memberships are used to measure compatibility between strokes and character patterns. As an extension of the system thus created, interval type-2 fuzzy degrees are employed to perform a selection of hypotheses that fit multiple handwriting typefaces.","PeriodicalId":48494,"journal":{"name":"Journal of Artificial Intelligence and Soft Computing Research","volume":"11 1","pages":"229 - 242"},"PeriodicalIF":2.8,"publicationDate":"2021-05-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45700583","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Learning Novelty Detection Outside a Class of Random Curves with Application to COVID-19 Growth
Pub Date: 2021-05-29 | DOI: 10.2478/jaiscr-2021-0012
Wojciech Rafajłowicz
Abstract Let a class of proper curves be specified by positive examples only. We aim to propose a learning novelty detection algorithm that decides whether a new curve is outside this class or not. In contrast to the majority of the literature, two sources of curve variability are present, namely the variability inherent to curves from the proper class and observation errors. Therefore, a decision function is first trained on historical data, and then descriptors of each curve to be classified are learned from noisy observations. When the intrinsic variability is Gaussian, a decision threshold can be established from the Hotelling T² distribution and tuned to more general cases. Expansion coefficients in a selected orthogonal series are taken as descriptors, and an algorithm for learning them is proposed that follows nonparametric curve fitting approaches. Its fast version is derived for descriptors based on the cosine series. Additionally, the asymptotic normality of the learned descriptors and a bound for the probability of their large deviations are proved. The influence of this bound on the decision threshold is also discussed. The proposed approach covers curves described as functional data projected onto a finite-dimensional subspace of a Hilbert space, as well as a shape-sensitive description of curves known as the square-root velocity (SRV) representation. It was tested both on synthetic data and on real-life observations of COVID-19 growth curves.
{"title":"Learning Novelty Detection Outside a Class of Random Curves with Application to COVID-19 Growth","authors":"Wojciech Rafajłowicz","doi":"10.2478/jaiscr-2021-0012","DOIUrl":"https://doi.org/10.2478/jaiscr-2021-0012","url":null,"abstract":"Abstract Let a class of proper curves is specified by positive examples only. We aim to propose a learning novelty detection algorithm that decides whether a new curve is outside this class or not. In opposite to the majority of the literature, two sources of a curve variability are present, namely, the one inherent to curves from the proper class and observations errors’. Therefore, firstly a decision function is trained on historical data, and then, descriptors of each curve to be classified are learned from noisy observations.When the intrinsic variability is Gaussian, a decision threshold can be established from T 2 Hotelling distribution and tuned to more general cases. Expansion coefficients in a selected orthogonal series are taken as descriptors and an algorithm for their learning is proposed that follows nonparametric curve fitting approaches. Its fast version is derived for descriptors that are based on the cosine series. Additionally, the asymptotic normality of learned descriptors and the bound for the probability of their large deviations are proved. The influence of this bound on the decision threshold is also discussed.The proposed approach covers curves described as functional data projected onto a finite-dimensional subspace of a Hilbert space as well a shape sensitive description of curves, known as square-root velocity (SRV). It was tested both on synthetic data and on real-life observations of the COVID-19 growth curves.","PeriodicalId":48494,"journal":{"name":"Journal of Artificial Intelligence and Soft Computing Research","volume":"11 1","pages":"195 - 215"},"PeriodicalIF":2.8,"publicationDate":"2021-05-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44979290","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A New Approach to Detection of Changes in Multidimensional Patterns - Part II
Pub Date: 2021-05-29 | DOI: 10.2478/jaiscr-2021-0013
T. Gałkowski, A. Krzyżak, Zofia Patora-Wysocka, Z. Filutowicz, Lipo Wang
Abstract In the paper we develop an algorithm based on the Parzen kernel estimate for the detection of sudden changes in 3-dimensional shapes that occur along edge curves. Such problems commonly arise in various areas of computer vision, e.g., in edge detection, bioinformatics and the processing of satellite imagery. In many engineering problems abrupt change detection may help in fault protection, e.g., the detection of jumps in functions describing the static and dynamic properties of objects in mechanical systems. The developed algorithm is nonparametric in nature and utilizes Parzen regression estimates of multivariate functions and their derivatives. In tests we apply this method, particularly but not exclusively, to functions of two variables.
{"title":"A New Approach to Detection of Changes in Multidimensional Patterns - Part II","authors":"T. Gałkowski, A. Krzyżak, Zofia Patora-Wysocka, Z. Filutowicz, Lipo Wang","doi":"10.2478/jaiscr-2021-0013","DOIUrl":"https://doi.org/10.2478/jaiscr-2021-0013","url":null,"abstract":"Abstract In the paper we develop an algorithm based on the Parzen kernel estimate for detection of sudden changes in 3-dimensional shapes which happen along the edge curves. Such problems commonly arise in various areas of computer vision, e.g., in edge detection, bioinformatics and processing of satellite imagery. In many engineering problems abrupt change detection may help in fault protection e.g. the jump detection in functions describing the static and dynamic properties of the objects in mechanical systems. We developed an algorithm for detecting abrupt changes which is nonparametric in nature and utilizes Parzen regression estimates of multivariate functions and their derivatives. In tests we apply this method, particularly but not exclusively, to the functions of two variables.","PeriodicalId":48494,"journal":{"name":"Journal of Artificial Intelligence and Soft Computing Research","volume":"11 1","pages":"217 - 227"},"PeriodicalIF":2.8,"publicationDate":"2021-05-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41727224","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Hardware Implementation of a Takagi-Sugeno Neuro-Fuzzy System Optimized by a Population Algorithm
Pub Date: 2021-05-29 | DOI: 10.2478/jaiscr-2021-0015
P. Dziwiński, A. Przybył, P. Trippner, J. Paszkowski, Y. Hayashi
Abstract Over the last several decades, neuro-fuzzy systems (NFS) have been widely analyzed and described in the literature because of their many advantages. They can model the uncertainty characteristic of human reasoning and offer universal approximation capability. These properties allow, for example, for the implementation of nonlinear control and modeling systems of better quality than would be possible with classical methods. However, according to the authors, the number of NFS applications deployed so far is not large enough. This is because the implementation of NFS on typical digital platforms, such as microcontrollers, has not led to sufficiently high performance. On the other hand, the world literature describes many cases of NFS hardware implementations in field-programmable gate arrays (FPGAs) offering sufficiently high performance. Unfortunately, the complexity and cost of such systems were so high that the solutions were not very successful. This paper proposes a method for the hardware implementation of MRBF-TS systems. Such systems are created by modifying a subclass of Takagi-Sugeno (TS) fuzzy-neural structures, i.e., the group of NFS functionally equivalent to networks with radial basis functions (RBF). The structure of the MRBF-TS is designed to be well suited to implementation on an FPGA. Thanks to this, it is possible to obtain both very high computing efficiency and high accuracy with relatively low consumption of hardware resources. This paper describes both the method of implementing MRBF-TS structures on an FPGA and the method of designing such structures based on a population algorithm. The described solution allows for the implementation of control or modeling systems whose realization has so far been impossible for technical or economic reasons.
{"title":"Hardware Implementation of a Takagi-Sugeno Neuro-Fuzzy System Optimized by a Population Algorithm","authors":"P. Dziwiński, A. Przybył, P. Trippner, J. Paszkowski, Y. Hayashi","doi":"10.2478/jaiscr-2021-0015","DOIUrl":"https://doi.org/10.2478/jaiscr-2021-0015","url":null,"abstract":"Abstract Over the last several decades, neuro-fuzzy systems (NFS) have been widely analyzed and described in the literature because of their many advantages. They can model the uncertainty characteristic of human reasoning and the possibility of a universal approximation. These properties allow, for example, for the implementation of nonlinear control and modeling systems of better quality than would be possible with the use of classical methods. However, according to the authors, the number of NFS applications deployed so far is not large enough. This is because the implementation of NFS on typical digital platforms, such as, for example, microcontrollers, has not led to sufficiently high performance. On the other hand, the world literature describes many cases of NFS hardware implementation in programmable gate arrays (FPGAs) offering sufficiently high performance. Unfortunately, the complexity and cost of such systems were so high that the solutions were not very successful. This paper proposes a method of the hardware implementation of MRBF-TS systems. Such systems are created by modifying a subclass of Takagi-Sugeno (TS) fuzzy-neural structures, i.e. the NFS group functionally equivalent to networks with radial basis functions (RBF). The structure of the MRBF-TS is designed to be well suited to the implementation on an FPGA. Thanks to this, it is possible to obtain both very high computing efficiency and high accuracy with relatively low consumption of hardware resources. This paper describes both, the method of implementing MRBFTS type structures on the FPGA and the method of designing such structures based on the population algorithm. The described solution allows for the implementation of control or modeling systems, the implementation of which was impossible so far due to technical or economic reasons.","PeriodicalId":48494,"journal":{"name":"Journal of Artificial Intelligence and Soft Computing Research","volume":"11 1","pages":"243 - 266"},"PeriodicalIF":2.8,"publicationDate":"2021-05-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42812454","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Bandwidth Selection for Kernel Generalized Regression Neural Networks in Identification of Hammerstein Systems
Pub Date: 2021-05-29 | DOI: 10.2478/jaiscr-2021-0011
Jiaqing Lv, M. Pawlak
Abstract This paper addresses the issue of data-driven smoothing parameter (bandwidth) selection in the context of nonparametric identification of dynamic systems. In particular, we examine the identification problem of the block-oriented Hammerstein cascade system. A class of kernel-type Generalized Regression Neural Networks (GRNN) is employed as the identification algorithm. The statistical accuracy of the kernel GRNN estimate is critically influenced by the choice of the bandwidth. Given the need for data-driven bandwidth specification, we propose several automatic selection methods, which are compared by means of simulation studies. Our experiments reveal that the method referred to as the partitioned cross-validation algorithm can be recommended as a practical procedure for choosing the bandwidth of the kernel GRNN estimate, in terms of both its statistical accuracy and its implementation aspects.
{"title":"Bandwidth Selection for Kernel Generalized Regression Neural Networks in Identification of Hammerstein Systems","authors":"Jiaqing Lv, M. Pawlak","doi":"10.2478/jaiscr-2021-0011","DOIUrl":"https://doi.org/10.2478/jaiscr-2021-0011","url":null,"abstract":"Abstract This paper addresses the issue of data-driven smoothing parameter (bandwidth) selection in the context of nonparametric system identification of dynamic systems. In particular, we examine the identification problem of the block-oriented Hammerstein cascade system. A class of kernel-type Generalized Regression Neural Networks (GRNN) is employed as the identification algorithm. The statistical accuracy of the kernel GRNN estimate is critically influenced by the choice of the bandwidth. Given the need of data-driven bandwidth specification we propose several automatic selection methods that are compared by means of simulation studies. Our experiments reveal that the method referred to as the partitioned cross-validation algorithm can be recommended as the practical procedure for the bandwidth choice for the kernel GRNN estimate in terms of its statistical accuracy and implementation aspects.","PeriodicalId":48494,"journal":{"name":"Journal of Artificial Intelligence and Soft Computing Research","volume":"11 1","pages":"181 - 194"},"PeriodicalIF":2.8,"publicationDate":"2021-05-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42312435","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Machine Learning and Traditional Econometric Models: A Systematic Mapping Study
Pub Date: 2021-04-01 | DOI: 10.2478/jaiscr-2022-0006
M. Pérez-Pons, Javier Parra-Domínguez, S. Omatu, E. Herrera-Viedma, J. Corchado
Abstract Context: Machine Learning (ML) is a disruptive concept that has given rise to, and generated interest in, different applications in many fields of study. The purpose of Machine Learning is to solve real-life problems by automatically learning from and improving with experience, without being explicitly programmed for a specific problem but rather for a generic type of problem. This article reviews the different applications of ML in a series of econometric methods. Objective: The objective of this research is to identify the latest applications and conduct a comparative study of the performance of econometric and ML models. The study aimed to find empirical evidence on whether the performance of ML algorithms is superior to that of traditional econometric models. The methodology of systematic mapping of the literature has been followed to carry out this research, according to the guidelines established by [39] and [58], which facilitate the identification of studies published on this subject. Results: The results show that in most cases ML outperforms econometric models, while in other cases the best performance has been achieved by combining traditional methods and ML applications. Conclusion: Inclusion and exclusion criteria have been applied, and 52 closely related articles have been reviewed. The conclusion drawn from this research is that this is a growing field, and that there is no certainty that the performance of ML is always superior to that of econometric models.
{"title":"Machine Learning and Traditional Econometric Models: A Systematic Mapping Study","authors":"M. Pérez-Pons, Javier Parra-Domínguez, S. Omatu, E. Herrera-Viedma, J. Corchado","doi":"10.2478/jaiscr-2022-0006","DOIUrl":"https://doi.org/10.2478/jaiscr-2022-0006","url":null,"abstract":"Abstract Context: Machine Learning (ML) is a disruptive concept that has given rise to and generated interest in different applications in many fields of study. The purpose of Machine Learning is to solve real-life problems by automatically learning and improving from experience without being explicitly programmed for a specific problem, but for a generic type of problem. This article approaches the different applications of ML in a series of econometric methods. Objective: The objective of this research is to identify the latest applications and do a comparative study of the performance of econometric and ML models. The study aimed to find empirical evidence for the performance of ML algorithms being superior to traditional econometric models. The Methodology of systematic mapping of literature has been followed to carry out this research, according to the guidelines established by [39], and [58] that facilitate the identification of studies published about this subject. Results: The results show, that in most cases ML outperforms econometric models, while in other cases the best performance has been achieved by combining traditional methods and ML applications. Conclusion: inclusion and exclusions criteria have been applied and 52 articles closely related articles have been reviewed. The conclusion drawn from this research is that it is a field that is growing, which is something that is well known nowadays and that there is no certainty as to the performance of ML being always superior to that of econometric models.","PeriodicalId":48494,"journal":{"name":"Journal of Artificial Intelligence and Soft Computing Research","volume":"12 1","pages":"79 - 100"},"PeriodicalIF":2.8,"publicationDate":"2021-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45119459","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A Progressive and Cross-Domain Deep Transfer Learning Framework for Wrist Fracture Detection
Pub Date: 2021-04-01 | DOI: 10.2478/jaiscr-2022-0007
Christophe Karam, Julia El Zini, M. Awad, C. Saade, L. Naffaa, Mohammad Ali K. El Amine
Abstract There has been an amplified focus on, and benefit from, the adoption of artificial intelligence (AI) in medical imaging applications. However, deep learning approaches involve training with massive amounts of annotated data in order to guarantee generalization and achieve high accuracies. Gathering and annotating large sets of training images require expertise that is both expensive and time-consuming, especially in the medical field. Furthermore, in health care systems where mistakes can have catastrophic consequences, there is a general mistrust of the black-box aspect of AI models. In this work, we focus on improving the performance of medical imaging applications when limited data is available, while also addressing the interpretability of the proposed AI model. This is achieved by employing a novel transfer learning framework, progressive transfer learning, an automated annotation technique, and a correlation analysis experiment on the learned representations. Progressive transfer learning helps jump-start the training of deep neural networks while improving performance by gradually transferring knowledge from two source tasks into the target task. It is empirically tested on the wrist fracture detection application by first training a general radiology network, RadiNet, and using its weights to initialize RadiNet_wrist, which is then trained on wrist images to detect fractures. Experiments show that RadiNet_wrist achieves an accuracy of 87% and a ROC AUC of 94%, as opposed to 83% and 92% when it is pre-trained on the ImageNet dataset. This improvement in performance is investigated within an explainable AI framework. More concretely, the learned deep representations of RadiNet_wrist are compared to those learned by the baseline model by conducting a correlation analysis experiment. The results show that, when transfer learning is applied gradually, some features are learned earlier in the network. Moreover, the deep layers in the progressive transfer learning framework are shown to encode features that are not encountered when traditional transfer learning techniques are applied. In addition to the empirical results, a clinical study is conducted in which the performance of RadiNet_wrist is compared to that of an expert radiologist. We found that RadiNet_wrist exhibited performance similar to that of radiologists with more than 20 years of experience. This motivates follow-up research to train on more data in order to feasibly surpass radiologists' performance, and to investigate the interpretability of AI models in the healthcare domain, where the decision-making process needs to be credible and transparent.
{"title":"A Progressive and Cross-Domain Deep Transfer Learning Framework for Wrist Fracture Detection","authors":"Christophe Karam, Julia El Zini, M. Awad, C. Saade, L. Naffaa, Mohammad Ali K. El Amine","doi":"10.2478/jaiscr-2022-0007","DOIUrl":"https://doi.org/10.2478/jaiscr-2022-0007","url":null,"abstract":"Abstract There has been an amplified focus on and benefit from the adoption of artificial intelligence (AI) in medical imaging applications. However, deep learning approaches involve training with massive amounts of annotated data in order to guarantee generalization and achieve high accuracies. Gathering and annotating large sets of training images require expertise which is both expensive and time-consuming, especially in the medical field. Furthermore, in health care systems where mistakes can have catastrophic consequences, there is a general mistrust in the black-box aspect of AI models. In this work, we focus on improving the performance of medical imaging applications when limited data is available while focusing on the interpretability aspect of the proposed AI model. This is achieved by employing a novel transfer learning framework, progressive transfer learning, an automated annotation technique and a correlation analysis experiment on the learned representations. Progressive transfer learning helps jump-start the training of deep neural networks while improving the performance by gradually transferring knowledge from two source tasks into the target task. It is empirically tested on the wrist fracture detection application by first training a general radiology network RadiNet and using its weights to initialize RadiNetwrist, that is trained on wrist images to detect fractures. Experiments show that RadiNetwrist achieves an accuracy of 87% and an AUC ROC of 94% as opposed to 83% and 92% when it is pre-trained on the ImageNet dataset. This improvement in performance is investigated within an explainable AI framework. More concretely, the learned deep representations of RadiNetwrist are compared to those learned by the baseline model by conducting a correlation analysis experiment. The results show that, when transfer learning is gradually applied, some features are learned earlier in the network. Moreover, the deep layers in the progressive transfer learning framework are shown to encode features that are not encountered when traditional transfer learning techniques are applied. In addition to the empirical results, a clinical study is conducted and the performance of RadiNetwrist is compared to that of an expert radiologist. We found that RadiNetwrist exhibited similar performance to that of radiologists with more than 20 years of experience. 
This motivates follow-up research to train on more data to feasibly surpass radiologists’ performance, and investigate the interpretability of AI models in the healthcare domain where the decision-making process needs to be credible and transparent.","PeriodicalId":48494,"journal":{"name":"Journal of Artificial Intelligence and Soft Computing Research","volume":"12 1","pages":"101 - 120"},"PeriodicalIF":2.8,"publicationDate":"2021-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43988768","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
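A schematic sketch of the two-stage transfer described above, written with torch; the tiny `make_radinet` architecture and the training stubs are stand-ins, not the paper's network.

```python
import torch
import torch.nn as nn

def make_radinet(num_classes):
    # a toy CNN standing in for the RadiNet backbone + classification head
    return nn.Sequential(
        nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
        nn.Flatten(), nn.Linear(32, num_classes),
    )

# stage 1: a general radiology network trained on a broad source task
radinet = make_radinet(num_classes=14)
# ... train radinet on the large source dataset here ...

# stage 2: initialize RadiNet_wrist from radinet's backbone, re-create only the
# head (binary fracture / no-fracture), then fine-tune on wrist images
radinet_wrist = make_radinet(num_classes=2)
backbone = {k: v for k, v in radinet.state_dict().items() if not k.startswith("7.")}
radinet_wrist.load_state_dict(backbone, strict=False)  # copy shared weights only

opt = torch.optim.Adam(radinet_wrist.parameters(), lr=1e-4)
x, y = torch.randn(8, 1, 64, 64), torch.randint(0, 2, (8,))  # dummy wrist batch
loss = nn.functional.cross_entropy(radinet_wrist(x), y)
loss.backward(); opt.step()
```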
Position-Encoding Convolutional Network to Solving Connected Text Captcha
Pub Date: 2021-04-01 | DOI: 10.2478/jaiscr-2022-0008
Ke Qing, Rongsheng Zhang
Abstract Text-based CAPTCHA is a convenient and effective safety mechanism that has been widely deployed across websites. Efficient end-to-end models of scene text recognition, consisting of a CNN and an attention-based RNN, show limited performance in solving text-based CAPTCHAs. In contrast with street-view images and documents, the character sequence in a CAPTCHA is non-semantic. The RNN thus loses its ability to learn the semantic context and only implicitly encodes the relative position of the extracted features. Meanwhile, the security features, which prevent characters from being segmented and recognized, greatly increase the complexity of CAPTCHAs. The performance of such a model is sensitive to different CAPTCHA schemes. In this paper, we analyze the properties of text-based CAPTCHAs and accordingly treat solving them as a highly position-relative character sequence recognition task. We propose a network named PosConv to leverage the position information in the character sequence without an RNN. PosConv uses a novel padding strategy and a modified convolution, explicitly encoding the relative position into the local features of characters. This mechanism makes the features extracted from CAPTCHAs more informative and robust. We validate PosConv on six text-based CAPTCHA schemes, and it achieves state-of-the-art or competitive recognition accuracy with significantly fewer parameters and faster convergence.
{"title":"Position-Encoding Convolutional Network to Solving Connected Text Captcha","authors":"Ke Qing, Rongsheng Zhang","doi":"10.2478/jaiscr-2022-0008","DOIUrl":"https://doi.org/10.2478/jaiscr-2022-0008","url":null,"abstract":"Abstract Text-based CAPTCHA is a convenient and effective safety mechanism that has been widely deployed across websites. The efficient end-to-end models of scene text recognition consisting of CNN and attention-based RNN show limited performance in solving text-based CAPTCHAs. In contrast with the street view image and document, the character sequence in CAPTCHA is non-semantic. The RNN loses its ability to learn the semantic context and only implicitly encodes the relative position of extracted features. Meanwhile, the security features, which prevent characters from segmentation and recognition, extensively increase the complexity of CAPTCHAs. The performance of this model is sensitive to different CAPTCHA schemes. In this paper, we analyze the properties of the text-based CAPTCHA and accordingly consider solving it as a highly position-relative character sequence recognition task. We propose a network named PosConv to leverage the position information in the character sequence without RNN. PosConv uses a novel padding strategy and modified convolution, explicitly encoding the relative position into the local features of characters. This mechanism of PosConv makes the extracted features from CAPTCHAs more informative and robust. We validate PosConv on six text-based CAPTCHA schemes, and it achieves state-of-the-art or competitive recognition accuracy with significantly fewer parameters and faster convergence speed.","PeriodicalId":48494,"journal":{"name":"Journal of Artificial Intelligence and Soft Computing Research","volume":"12 1","pages":"121 - 133"},"PeriodicalIF":2.8,"publicationDate":"2021-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42874156","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Handling Realistic Noise in Multi-Agent Systems with Self-Supervised Learning and Curiosity
Pub Date: 2021-04-01 | DOI: 10.2478/jaiscr-2022-0009
Marton Szemenyei, Patrik Reizinger
Abstract Most reinforcement learning benchmarks – especially in multi-agent tasks – do not go beyond observations with simple noise; nonetheless, real scenarios induce more elaborate vision pipeline failures: false sightings, misclassifications or occlusion. In this work, we propose a lightweight, 2D environment for robot soccer and autonomous driving that can emulate the above discrepancies. Besides establishing a benchmark for accessible multi-agent reinforcement learning research, our work addresses the challenges the simulator imposes. For handling realistic noise, we use self-supervised learning to enhance scene reconstruction and extend curiosity-driven learning to model longer horizons. Our extensive experiments show that the proposed methods achieve state-of-the-art performance compared against actor-critic methods, ICM, and PPO.
{"title":"Handling Realistic Noise in Multi-Agent Systems with Self-Supervised Learning and Curiosity","authors":"Marton Szemenyei, Patrik Reizinger","doi":"10.2478/jaiscr-2022-0009","DOIUrl":"https://doi.org/10.2478/jaiscr-2022-0009","url":null,"abstract":"Abstract 1Most reinforcement learning benchmarks – especially in multi-agent tasks – do not go beyond observations with simple noise; nonetheless, real scenarios induce more elaborate vision pipeline failures: false sightings, misclassifications or occlusion. In this work, we propose a lightweight, 2D environment for robot soccer and autonomous driving that can emulate the above discrepancies. Besides establishing a benchmark for accessible multi-agent reinforcement learning research, our work addresses the challenges the simulator imposes. For handling realistic noise, we use self-supervised learning to enhance scene reconstruction and extend curiosity-driven learning to model longer horizons. Our extensive experiments show that the proposed methods achieve state-of-the-art performance, compared against actor-critic methods, ICM, and PPO.","PeriodicalId":48494,"journal":{"name":"Journal of Artificial Intelligence and Soft Computing Research","volume":"12 1","pages":"135 - 148"},"PeriodicalIF":2.8,"publicationDate":"2021-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43654733","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}