Attention-based Deep Learning Model for Arabic Handwritten Text Recognition
Pub Date: 2022-12-15. DOI: 10.22630/mgv.2022.31.1.3
Takwa Ben Aïcha Gader, Afef Kacem Echi
This work proposes a segmentation-free approach to Arabic Handwritten Text Recognition (AHTR): an attention-based Convolutional Neural Network - Recurrent Neural Network - Connectionist Temporal Classification (CNN-RNN-CTC) deep learning architecture. The model receives an image as input and, through a CNN, produces a sequence of essential features, which are transferred to an attention-based Bidirectional Long Short-Term Memory network (BLSTM). The BLSTM outputs the feature sequence in order, and the attention mechanism selects the relevant information from that sequence. The selected information is then fed to the CTC, which computes the loss and predicts the transcription. The contribution lies in extending the CNN with dropout layers, batch normalization, and regularization parameters to prevent over-fitting. The output of the RNN block is passed through an attention mechanism that flexibly exploits the most relevant parts of the input sequence. This solution improves on previous methods by increasing the CNN's speed and performance and by controlling model over-fitting. The proposed system achieves an accuracy of 97.1% on the IFN-ENIT Arabic script database, which competes with the current state of the art. It was also tested on the modern English handwriting of the IAM database, attaining a Character Error Rate of 2.9%, which confirms the model's script independence.
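To make the pipeline concrete, here is a minimal sketch of such a CNN-BLSTM-attention model in Keras. The input size, layer widths, attention variant, and character-set size are assumptions for illustration, not the authors' exact configuration; CTC loss would be applied to the per-timestep outputs during training.

```python
import tensorflow as tf
from tensorflow.keras import layers

NUM_CLASSES = 100  # assumed character-set size, including the CTC blank label

inputs = layers.Input(shape=(64, 256, 1), name="image")        # H x W x 1 line image
x = inputs
for filters in (32, 64, 128):                                  # CNN feature extractor
    x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    x = layers.BatchNormalization()(x)                         # batch norm, as in the paper
    x = layers.MaxPooling2D((2, 2))(x)
    x = layers.Dropout(0.2)(x)                                 # dropout regularization
x = layers.Permute((2, 1, 3))(x)                               # (W, H, C): sequence runs along width
x = layers.Reshape((32, 8 * 128))(x)                           # 32 timesteps of 1024 features
x = layers.Bidirectional(layers.LSTM(128, return_sequences=True))(x)
x = layers.Attention()([x, x])                                 # select the most relevant timesteps
outputs = layers.Dense(NUM_CLASSES, activation="softmax")(x)   # per-timestep character probabilities
model = tf.keras.Model(inputs, outputs)
# During training, CTC loss (e.g. tf.nn.ctc_loss) aligns these per-timestep
# outputs with the ground-truth transcription without character segmentation.
```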
{"title":"Attention-based Deep Learning Model for Arabic Handwritten Text Recognition","authors":"Takwa Ben Aïcha Gader, Afef Kacem Echi","doi":"10.22630/mgv.2022.31.1.3","DOIUrl":"https://doi.org/10.22630/mgv.2022.31.1.3","url":null,"abstract":"This work proposes a segmentation-free approach to Arabic Handwritten Text Recognition (AHTR): an attention-based Convolutional Neural Network - Recurrent Neural Network - Connectionist Temporal Classification (CNN-RNN-CTC) deep learning architecture. The model receives as input an image and provides, through a CNN, a sequence of essential features, which are transferred to an Attention-based Bidirectional Long Short-Term Memory Network (BLSTM). The BLSTM gives features sequence in order, and the attention mechanism allows the selection of relevant information from the features sequences. The selected information is then fed to the CTC, enabling the loss calculation and the transcription prediction. The contribution lies in extending the CNN by dropout layers, batch normalization, and dropout regularization parameters to prevent over-fitting. The output of the RNN block is passed through an attention mechanism to utilize the most relevant parts of the input sequence in a flexible manner. This solution enhances previous methods by improving the CNN speed and performance and controlling over model over-fitting. The proposed system achieves the best accuracy of 97.1% for the IFN-ENIT Arabic script database, which competes with the current state-of-the-art. It was also tested for the modern English handwriting of the IAM database, and the Character Error Rate of 2.9% is attained, which confirms the model's script independence.","PeriodicalId":39750,"journal":{"name":"Machine Graphics and Vision","volume":"20 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-12-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84015511","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Chemical ripening and contaminations detection using neural networks-based image features and spectrometric signatures
Pub Date: 2021-12-01. DOI: 10.22630/mgv.2021.30.1.2
R. R
In this pandemic-prone era, health is of utmost concern for everyone, and eating good-quality fruit is essential for sound health. Unfortunately, naturally ripened fruits are now difficult to obtain, owing to the prevalence of fruits ripened artificially with hazardous chemicals such as calcium carbide. Most state-of-the-art techniques focus primarily on identifying chemically ripened fruits with computer vision-based approaches, which are less effective at quantifying the chemical contamination present in the sample fruits. To address these issues, a new framework for chemical ripening and contamination detection is presented, which employs both visual and IR spectrometric signatures in two stages. Experiments conducted on both the GUI tool and the hardware-based setup clearly demonstrate the efficiency of the proposed framework in terms of detection confidence levels and the percentage of chemicals present in the sample fruit.
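The two-stage idea can be sketched as follows. The screening model, the decision threshold, and the reference-spectrum comparison are invented for illustration only; the paper's actual models and spectrometric calibration are not reproduced here.

```python
import numpy as np

def stage1_visual_screen(image: np.ndarray, cnn_predict) -> float:
    """Stage 1: a neural network scores the fruit image for chemical ripening."""
    return float(cnn_predict(image[None, ...])[0])      # confidence in [0, 1]

def stage2_quantify(ir_spectrum: np.ndarray, reference: np.ndarray) -> float:
    """Stage 2: compare the IR signature against a clean-fruit reference and
    report the relative deviation as a contamination percentage (assumed proxy)."""
    deviation = np.abs(ir_spectrum - reference)
    return 100.0 * deviation.sum() / reference.sum()

def detect(image, ir_spectrum, reference, cnn_predict, threshold=0.5):
    confidence = stage1_visual_screen(image, cnn_predict)
    if confidence < threshold:                           # looks naturally ripened: stop early
        return {"chemically_ripened": False, "confidence": confidence}
    percent = stage2_quantify(ir_spectrum, reference)    # only suspects reach stage 2
    return {"chemically_ripened": True, "confidence": confidence,
            "contamination_percent": percent}
```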
{"title":"Chemical ripening and contaminations detection using neural networks-based image features and spectrometric signatures","authors":"R. R","doi":"10.22630/mgv.2021.30.1.2","DOIUrl":"https://doi.org/10.22630/mgv.2021.30.1.2","url":null,"abstract":"In this pandemic-prone era, health is of utmost concern for everyone and hence eating good quality fruits is very much essential for sound health. Unfortunately, nowadays it is quite very difficult to obtain naturally ripened fruits, due to existence of chemically ripened fruits being ripened using hazardous chemicals such as calcium carbide. However, most of the state-of-the art techniques are primarily focusing on identification of chemically ripened fruits with the help of computer vision-based approaches, which are less effective towards quantification of chemical contaminations present in the sample fruits. To solve these issues, a new framework for chemical ripening and contamination detection is presented, which employs both visual and IR spectrometric signatures in two different stages. The experiments conducted on both the GUI tool as well as hardware-based setups, clearly demonstrate the efficiency of the proposed framework in terms of detection confidence levels followed by the percentage of presence of chemicals in the sample fruit.","PeriodicalId":39750,"journal":{"name":"Machine Graphics and Vision","volume":"31 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82051345","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
On the use of CNNs with patterned stride for medical image analysis
Pub Date: 2021-06-26. DOI: 10.22630/mgv.2021.30.1.1
Oge Marques, Luiz Zaniolo
The use of deep learning techniques for early and accurate medical image diagnosis has grown significantly in recent years, with encouraging results across many medical specialties, pathologies, and image types. One of the most popular deep neural network architectures is the convolutional neural network (CNN), widely used for medical image classification and segmentation, among other tasks. One of the configuration parameters of a CNN, the stride, regulates how sparsely the image is sampled during convolution. This paper explores the idea of applying a patterned stride strategy: pixels close to the center are processed with a smaller stride, concentrating the information sampled there, while pixels away from the center are processed with larger strides and are consequently sampled more sparsely. We apply this method to different medical image classification tasks and demonstrate experimentally that the proposed patterned stride mechanism outperforms a baseline solution with the same computational cost (processing and memory). We also discuss the relevance and potential future extensions of the proposed method.
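A minimal sketch of the patterned-stride sampling idea, assuming a simple two-level schedule (dense sampling inside the central half of the image, sparser outside); the paper's actual stride pattern may differ.

```python
import numpy as np

def patterned_coords(size: int, inner_stride: int = 1, outer_stride: int = 2) -> np.ndarray:
    """Coordinates along one axis: small stride near the centre, large near the edges."""
    centre, quarter = size // 2, size // 4
    inner = np.arange(centre - quarter, centre + quarter, inner_stride)  # central half
    left = np.arange(0, centre - quarter, outer_stride)                  # sparse periphery
    right = np.arange(centre + quarter, size, outer_stride)
    return np.concatenate([left, inner, right])

def patterned_sample(image: np.ndarray) -> np.ndarray:
    rows = patterned_coords(image.shape[0])
    cols = patterned_coords(image.shape[1])
    return image[np.ix_(rows, cols)]        # dense centre, sparse borders

img = np.random.rand(224, 224)
print(patterned_sample(img).shape)          # smaller than 224x224, centre-weighted
```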
{"title":"On the use of CNNs with patterned stride for medical image analysis","authors":"Oge Marques, Luiz Zaniolo","doi":"10.22630/mgv.2021.30.1.1","DOIUrl":"https://doi.org/10.22630/mgv.2021.30.1.1","url":null,"abstract":"The use of deep learning techniques for early and accurate medical image diagnosis has grown significantly in recent years, with some encouraging results across many medical specialties, pathologies, and image types. One of the most popular deep neural network architectures is the convolutional neural network (CNN), widely used for medical image classification and segmentation, among other tasks. One of the configuration parameters of a CNN is called stride and it regulates how sparsely the image is sampled during the convolutional process. This paper explores the idea of applying a patterned stride strategy: pixels closer to the center are processed with a smaller stride concentrating the amount of information sampled, and pixels away from the center are processed with larger strides consequently making those areas to be sampled more sparsely. We apply this method to different medical image classification tasks and demonstrate experimentally how the proposed patterned stride mechanism outperforms a baseline solution with the same computational cost (processing and memory). We also discuss the relevance and potential future extensions of the proposed method.","PeriodicalId":39750,"journal":{"name":"Machine Graphics and Vision","volume":"71 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2021-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80820308","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Multi-View Attention-based Late Fusion (MVALF) CADx system for breast cancer using deep learning
Pub Date: 2020-12-01. DOI: 10.22630/mgv.2020.29.1.4
H. Iftikhar, H. Khan, B. Raza, Ahmad Shahir
Breast cancer is a leading cause of death among women. Early detection can significantly reduce mortality among women and improve their prognosis. Mammography is the first-line procedure for early diagnosis. Early conventional Computer-Aided Diagnosis (CADx) systems for breast lesion diagnosis were based on single-view information only. The last decade saw the use of two mammogram views in CADx systems: the Medio-Lateral Oblique (MLO) and Cranio-Caudal (CC) views. The most recent studies show the effectiveness of training CADx systems on four mammogram views with a feature-fusion strategy for the classification task. In this paper, we propose an end-to-end Multi-View Attention-based Late Fusion (MVALF) CADx system that fuses the predictions of four view models, each trained separately. These separate models have different predictive ability for each class, and an appropriate fusion of the multi-view models can achieve better diagnostic performance, so proper weights must be assigned to the multi-view classification models. To this end, an attention-based weighting mechanism assigns the proper weights to the trained models in the fusion strategy. The proposed methodology is used to classify mammograms into normal, mass, calcification, and malignant and benign masses. The publicly available CBIS-DDSM and mini-MIAS datasets are used for the experimentation. The results show that the proposed system achieves an AUC of 0.996 for normal vs. abnormal, 0.922 for mass vs. calcification, and 0.896 for malignant vs. benign masses. The proposed approach yields superior results for the classification of malignant vs. benign masses, higher than those of single-view, two-view, and four-view early-fusion systems. The overall results at each level show the potential of multi-view late fusion with transfer learning in the diagnosis of breast cancer.
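A compact sketch of attention-based late fusion over four view-specific models. Representing the learned weighting as a softmax over per-view relevance scores is an assumption made here for illustration, not necessarily the paper's exact attention formulation.

```python
import numpy as np

def attention_late_fusion(view_probs: np.ndarray, view_scores: np.ndarray) -> np.ndarray:
    """view_probs: (4, n_classes) predictions from the four view models.
    view_scores: (4,) learned relevance scores, one per view (assumed to come
    from a small attention network trained alongside the fusion)."""
    weights = np.exp(view_scores) / np.exp(view_scores).sum()  # softmax attention weights
    return weights @ view_probs                                # weighted fused prediction

# Toy example: four views voting on a two-class task.
probs = np.array([[0.9, 0.1], [0.7, 0.3], [0.6, 0.4], [0.8, 0.2]])
scores = np.array([2.0, 0.5, 0.1, 1.0])
print(attention_late_fusion(probs, scores))  # fused class probabilities
```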
{"title":"Multi-View Attention-based Late Fusion (MVALF) CADx system for breast cancer using deep learning","authors":"H. Iftikhar, H. Khan, B. Raza, Ahmad Shahir","doi":"10.22630/mgv.2020.29.1.4","DOIUrl":"https://doi.org/10.22630/mgv.2020.29.1.4","url":null,"abstract":"Breast cancer is a leading cause of death among women. Early detection can significantly reduce the mortality rate among women and improve their prognosis. Mammography is the first line procedure for early diagnosis. In the early era, conventional Computer-Aided Diagnosis (CADx) systems for breast lesion diagnosis were based on just single view information. The last decade evidence the use of two views mammogram: Medio-Lateral Oblique (MLO) and Cranio-Caudal (CC) view for the CADx systems. Most recent studies show the effectiveness of four views of mammogram to train CADx system with feature fusion strategy for classification task. In this paper, we proposed an end-to-end Multi-View Attention-based Late Fusion (MVALF) CADx system that fused the obtained predictions of four view models, which is trained for each view separately. These separate models have different predictive ability for each class. The appropriate fusion of multi-view models can achieve better diagnosis performance. So, it is necessary to assign the proper weights to the multi-view classification models. To resolve this issue, attention-based weighting mechanism is adopted to assign the proper weights to trained models for fusion strategy. The proposed methodology is used for the classification of mammogram into normal, mass, calcification, malignant masses and benign masses. The publicly available datasets CBIS-DDSM and mini-MIAS are used for the experimentation. The results show that our proposed system achieved 0.996 AUC for normal vs. abnormal, 0.922 for mass vs. calcification and 0.896 for malignant vs. benign masses. Superior results are seen for the classification of malignant vs benign masses with our proposed approach, which is higher than the results using single view, two views and four views early fusion-based systems. The overall results of each level show the potential of multi-view late fusion with transfer learning in the diagnosis of breast cancer.","PeriodicalId":39750,"journal":{"name":"Machine Graphics and Vision","volume":"19 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82548587","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Skull stripping using traditional and soft-computing approaches for Magnetic Resonance images: A semi-systematic meta-analysis
Pub Date: 2020-12-01. DOI: 10.22630/mgv.2020.29.1.3
H. Azam, Humera Tariq
An MRI scanner captures the skull along with the brain, and the skull needs to be removed for enhanced reliability and validity of medical diagnostic practices. Skull stripping from brain MR images is thus a core task in medical applications. Segmenting an image manually for skull stripping is complicated, and it is not only time-consuming but expensive as well, so an automated skull-stripping method with good efficiency and effectiveness is required. A number of skull-stripping methods are currently in practice. This review discusses many soft-computing segmentation techniques. The purpose of this study is to review the existing literature and compare the traditional and modern methods used for skull stripping from brain MR images, along with their merits and demerits. The semi-systematic review of the literature was carried out using the meta-synthesis approach. Broadly, the analyses are bifurcated into traditional and modern, i.e. soft-computing, methods proposed, experimented with, or applied in practice for effective skull stripping. Popular databases containing the required brain MR images are also identified, categorized, and discussed, as are the CPU- and GPU-based computer systems, with their specifications, used by different researchers for skull stripping. Finally, the research gap is identified, along with a proposed lead for future work.
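As one concrete example of the "traditional" family surveyed here, a thresholding-plus-morphology skull strip can be sketched in a few lines; this illustrates the class of methods, not any specific algorithm from the review, and the structuring-element sizes are assumptions.

```python
import numpy as np
from scipy import ndimage
from skimage import filters, morphology

def simple_skull_strip(slice2d: np.ndarray) -> np.ndarray:
    """Rough brain mask for one axial MR slice (intensity array)."""
    mask = slice2d > filters.threshold_otsu(slice2d)             # head foreground mask
    mask = morphology.binary_erosion(mask, morphology.disk(3))   # break skull-brain bridges
    labels, n = ndimage.label(mask)
    if n == 0:
        return np.zeros_like(mask)
    sizes = ndimage.sum(mask, labels, range(1, n + 1))
    brain = labels == (np.argmax(sizes) + 1)                     # keep largest component (brain)
    return morphology.binary_dilation(brain, morphology.disk(3)) # restore eroded margin
```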
{"title":"Skull stripping using traditional and soft-computing approaches for Magnetic Resonance images: A semi-systematic meta-analysis","authors":"H. Azam, Humera Tariq","doi":"10.22630/mgv.2020.29.1.3","DOIUrl":"https://doi.org/10.22630/mgv.2020.29.1.3","url":null,"abstract":"MRI scanner captures the skull along with the brain and the skull needs to be removed for enhanced reliability and validity of medical diagnostic practices. Skull Stripping from Brain MR Images is significantly a core area in medical applications. It is a complicated task to segment an image for skull stripping manually. It is not only time consuming but expensive as well. An automated skull stripping method with good efficiency and effectiveness is required. Currently, a number of skull stripping methods are used in practice. In this review paper, many soft-computing segmentation techniques have been discussed. The purpose of this research study is to review the existing literature to compare the existing traditional and modern methods used for skull stripping from Brain MR images along with their merits and demerits. The semi-systematic review of existing literature has been carried out using the meta-synthesis approach. Broadly, analyses are bifurcated into traditional and modern, i.e. soft-computing methods proposed, experimented with, or applied in practice for effective skull stripping. Popular databases with desired data of Brain MR Images have also been identified, categorized and discussed. Moreover, CPU and GPU based computer systems and their specifications used by different researchers for skull stripping have also been discussed. In the end, the research gap has been identified along with the proposed lead for future research work.","PeriodicalId":39750,"journal":{"name":"Machine Graphics and Vision","volume":"32 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84683692","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Text area detection in handwritten documents scanned for further processing
Pub Date: 2020-01-01. DOI: 10.22630/mgv.2020.29.1.2
J. Pach, Izabella Antoniuk, A. Krupa
In this paper we present an approach to text area detection using binary images, the Constrained Run Length Algorithm (CRLA), and other noise-reduction methods for removing artefacts. Text processing includes various activities, most of which are related to preparing the input data for further operations in the best possible way, so as not to hinder the OCR algorithms. This is especially the case for handwritten manuscripts, and even more so for very old documents. We present our methodology for the text area detection problem, which is capable of removing most irrelevant objects, including elements such as page edges, stains, and folds. At the same time, the presented method can handle multi-column texts and varying line thickness. The generated mask accurately marks the actual text area, so that the output image can easily be used in further text-processing steps.
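The horizontal pass of the Constrained Run Length Algorithm can be sketched as follows: background runs shorter than a constraint C that lie between ink pixels are filled, merging nearby strokes into solid text blocks (a vertical pass works the same way on columns). The value C = 20 is an assumed parameter.

```python
import numpy as np

def crla_horizontal(binary: np.ndarray, c: int = 20) -> np.ndarray:
    """binary: 2D array with 1 = text (ink), 0 = background."""
    out = binary.copy()
    for row in out:                       # rows are views, so edits persist
        run_start, seen_ink = None, False
        for j, v in enumerate(row):
            if v == 1:
                if seen_ink and run_start is not None and j - run_start <= c:
                    row[run_start:j] = 1  # fill a short gap between ink pixels
                run_start, seen_ink = None, True
            elif run_start is None:
                run_start = j             # a background run begins
    return out
```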
{"title":"Text area detection in handwritten documents scanned for further processing","authors":"J. Pach, Izabella Antoniuk, A. Krupa","doi":"10.22630/mgv.2020.29.1.2","DOIUrl":"https://doi.org/10.22630/mgv.2020.29.1.2","url":null,"abstract":"In this paper we present an approach to text area detection using binary images, Constrained Run Length Algorithm and other noise reduction methods of removing the artefacts. Text processing includes various activities, most of which are related to preparing input data for further operations in the best possible way, that will not hinder the OCR algorithms. This is especially the case when handwritten manuscripts are considered, and even more so with very old documents. We present our methodology for text area detection problem, which is capable of removing most of irrelevant objects, including elements such as page edges, stains, folds etc. At the same time the presented method can handle multi-column texts or varying line thickness. The generated mask can accurately mark the actual text area, so that the output image can be easily used in further text processing steps.","PeriodicalId":39750,"journal":{"name":"Machine Graphics and Vision","volume":"55 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2020-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84398100","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Normal Patch Retinex robust algorithm for white balancing in digital microscopy
Pub Date: 2020-01-01. DOI: 10.22630/mgv.2020.29.1.5
Izabella Antoniuk, A. Krupa, Radosław Roszczyk
The acquisition of accurately coloured, balanced images in an optical microscope can be a challenge even for experienced microscope operators. This article presents an entirely automatic white-balancing mechanism that adequately corrects microscopic colour images. The results of the algorithm were confirmed experimentally on a set of two hundred microscopic images containing scans of three microscopic specimens commonly used in pathomorphology. The results were also compared with white-balance algorithms commonly used in digital photography. For microscopic images stained with hematoxylin-phloxine-saffron and for immunohistochemical staining images, the algorithm applied in this work is more effective than the classical algorithms used in colour photography.
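For reference, a baseline white-patch Retinex correction of the kind used in digital photography is sketched below; the paper's Normal Patch Retinex algorithm itself is not reproduced here, and the percentile-based illuminant estimate is an assumed robustness tweak.

```python
import numpy as np

def white_patch_retinex(rgb: np.ndarray, percentile: float = 99.0) -> np.ndarray:
    """rgb: float image in [0, 1] with shape (H, W, 3)."""
    # Estimate the illuminant per channel from near-maximal pixels;
    # a high percentile is more robust to specular outliers than the raw max.
    illuminant = np.percentile(rgb.reshape(-1, 3), percentile, axis=0)
    corrected = rgb / np.maximum(illuminant, 1e-6)  # scale so the white patch maps to white
    return np.clip(corrected, 0.0, 1.0)
```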
{"title":"Normal Patch Retinex robust algorithm for white balancing in digital microscopy","authors":"Izabella Antoniuk, A. Krupa, Radosław Roszczyk","doi":"10.22630/mgv.2020.29.1.5","DOIUrl":"https://doi.org/10.22630/mgv.2020.29.1.5","url":null,"abstract":"The acquisition of accurately coloured, balanced images in an optical microscope can be a challenge even for experienced microscope operators. This article presents an entirely automatic mechanism for balancing the white level that allows the correction of the microscopic colour images adequately. The results of the algorithm have been confirmed experimentally on a set of two hundred microscopic images. The images contained scans of three microscopic specimens commonly used in pathomorphology. Also, the results achieved were compared with other commonly used white balance algorithms in digital photography. The algorithm applied in this work is more effective than the classical algorithms used in colour photography for microscopic images stained with hematoxylin-phloxine-saffron and for immunohistochemical staining images.","PeriodicalId":39750,"journal":{"name":"Machine Graphics and Vision","volume":"9 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2020-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88351578","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Critical hypersurfaces and instability for reconstruction of scenes in high dimensional projective spaces
Pub Date: 2020-01-01. DOI: 10.22630/mgv.2020.29.1.1
M. Bertolini, L. Magri
In the context of multiple view geometry, images of static scenes are modeled as linear projections from a projective space P^3 to a projective plane P^2; similarly, videos or images of suitable dynamic or segmented scenes can be modeled as linear projections from P^k to P^h, with k > h >= 2. In these settings, the projective reconstruction of a scene consists in recovering the positions of the projected objects and the projections themselves from their images, after identifying enough correspondences between the images. A critical locus for the reconstruction problem is a configuration of points and centers of projection in the ambient space for which the reconstruction of a scene fails. Critical loci turn out to be suitable algebraic varieties. In this paper we investigate those critical loci which are hypersurfaces in high-dimensional complex projective spaces, and we determine their equations. Moreover, to give evidence of some practical implications of the existence of these critical loci, we perform a simulated experiment to test the instability phenomena in the reconstruction of a scene near a critical hypersurface.
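In coordinates, the projection model described above reads as follows. This is the standard formulation; the closing remark on how critical loci arise is a generic sketch, not the paper's specific equations.

```latex
\[
  \lambda\, x \;=\; P\, X, \qquad
  P \in \mathbb{C}^{(h+1)\times(k+1)} \ \text{of maximal rank}, \quad
  X \in \mathbb{P}^k,\ x \in \mathbb{P}^h, \quad k > h \ge 2.
\]
% Classical static case: k = 3, h = 2, i.e. a 3x4 camera matrix P.
% Reconstruction fails precisely on configurations of scene points and centres
% of projection where the constraints imposed by the correspondences drop rank;
% these configurations form the critical locus, an algebraic variety.
```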
{"title":"Critical hypersurfaces and instability for reconstruction of scenes in high dimensional projective spaces","authors":"M. Bertolini, L. Magri","doi":"10.22630/mgv.2020.29.1.1","DOIUrl":"https://doi.org/10.22630/mgv.2020.29.1.1","url":null,"abstract":"In the context of multiple view geometry, images of static scenes are modeled as linear projections from a projective space P^3 to a projective plane P^2 and, similarly, videos or images of suitable dynamic or segmented scenes can be modeled as linear projections from P^k to P^h, with k>h>=2. In those settings, the projective reconstruction of a scene consists in recovering the position of the projected objects and the projections themselves from their images, after identifying many enough correspondences between the images. A critical locus for the reconstruction problem is a configuration of points and of centers of projections, in the ambient space, where the reconstruction of a scene fails. Critical loci turn out to be suitable algebraic varieties. In this paper we investigate those critical loci which are hypersurfaces in high dimension complex projective spaces, and we determine their equations. Moreover, to give evidence of some practical implications of the existence of these critical loci, we perform a simulated experiment to test the instability phenomena for the reconstruction of a scene, near a critical hypersurface.","PeriodicalId":39750,"journal":{"name":"Machine Graphics and Vision","volume":"276 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2020-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79253916","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Extraction of image parking spaces in intelligent video surveillance systems
Pub Date: 2019-12-01. DOI: 10.22630/mgv.2018.27.1.3
R. Bohush, S. Ablameyko, T. Kalganova, P. Yarashevich
This paper discusses an algorithmic framework for parking lot localization and classification in images for an intelligent video parking system. Perspective transformation, adaptive Otsu binarization, mathematical morphology operations, representation of horizontal lines as vectors, creation and filtering of vertical lines, and determination of parking space coordinates are used to localize parking spaces in a video frame. The algorithm for classifying parking spaces is based on Histogram of Oriented Gradients (HOG) descriptors and a Support Vector Machine (SVM) classifier. Parking lot descriptors are extracted with HOG through the following steps: computing vertical and horizontal gradients of the parking lot image, computing gradient magnitude and orientation, accumulating gradient magnitudes according to cell orientations, grouping cells into blocks, computing the L2 norm, and normalizing cell orientations within blocks. The parameters of the descriptor have been optimized experimentally. The results demonstrate improved classification accuracy over similar algorithms, and the proposed framework performs best among the algorithms previously proposed for the parking recognition problem.
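A compact sketch of this HOG + SVM classification stage using scikit-image and scikit-learn; the descriptor parameters and the occupied/empty labelling below are common defaults assumed for illustration, not the experimentally optimized values from the paper.

```python
import numpy as np
from skimage.feature import hog
from sklearn.svm import SVC

def parking_descriptor(patch: np.ndarray) -> np.ndarray:
    """patch: grayscale crop of a single parking space."""
    return hog(patch, orientations=9, pixels_per_cell=(8, 8),
               cells_per_block=(2, 2), block_norm="L2")  # L2 block normalization

def train_classifier(patches, labels):
    """patches: list of grayscale crops; labels: 1 = occupied, 0 = empty (assumed)."""
    features = np.array([parking_descriptor(p) for p in patches])
    clf = SVC(kernel="linear")           # linear SVM over HOG features
    return clf.fit(features, labels)
```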
{"title":"Extraction of image parking spaces in intelligent video surveillance systems","authors":"R. Bohush, S. Ablameyko, T. Kalganova, P. Yarashevich","doi":"10.22630/mgv.2018.27.1.3","DOIUrl":"https://doi.org/10.22630/mgv.2018.27.1.3","url":null,"abstract":"This paper discusses the algorithmic framework for image parking lot localization and classification for the video intelligent parking system. Perspective transformation, adaptive Otsu's binarization, mathematical morphology operations, representation of horizontal lines as vectors, creating and filtering vertical lines, and parking space coordinates determination are used for the localization of parking spaces in a~video frame. The algorithm for classification of parking spaces is based on the Histogram of Oriented Descriptors (HOG) and the Support Vector Machine (SVM) classifier. Parking lot descriptors are extracted based on HOG. The overall algorithmic framework consists of the following steps: vertical and horizontal gradient calculation for the image of the parking lot, gradient module vector and orientation calculation, power gradient accumulation in accordance with cell orientations, blocking of cells, second norm calculations, and normalization of cell orientation in blocks. The parameters of the descriptor have been optimized experimentally. The results demonstrate the improved classification accuracy over the class of similar algorithms and the proposed framework performs the best among the algorithms proposed earlier to solve the parking recognition problem.","PeriodicalId":39750,"journal":{"name":"Machine Graphics and Vision","volume":"15 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83642840","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Interpreted Graphs and ETPR(k) Graph Grammar Parsing for Syntactic Pattern Recognition
Pub Date: 2019-12-01. DOI: 10.22630/mgv.2018.27.1.1
M. Flasiński
Further results of research into graph grammar parsing for syntactic pattern recognition (Pattern Recognit. 21:623-629, 1988; 23:765-774, 1990; 24:1223-1224, 1991; 26:1-16, 1993; 43:249-2264, 2010; Comput. Vision Graph. Image Process. 47:1-21, 1989; Fundam. Inform. 80:379-413, 2007; Theoret. Comp. Sci. 201:189-231, 1998) are presented in the paper. The notion of interpreted graphs based on Tarski's model theory is introduced. The bottom-up parsing algorithm for ETPR(k) graph grammars is defined.
{"title":"Interpreted Graphs and ETPR(k) Graph Grammar Parsing for Syntactic Pattern Recognition","authors":"M. Flasiński","doi":"10.22630/mgv.2018.27.1.1","DOIUrl":"https://doi.org/10.22630/mgv.2018.27.1.1","url":null,"abstract":"Further results of research into graph grammar parsing for syntactic pattern recognition (Pattern Recognit. 21:623-629, 1988; 23:765-774, 1990; 24:1223-1224, 1991; 26:1-16, 1993; 43:249-2264, 2010; Comput. Vision Graph. Image Process. 47:1-21, 1989; Fundam. Inform. 80:379-413, 2007; Theoret. Comp. Sci. 201:189-231, 1998) are presented in the paper. The notion of interpreted graphs based on Tarski's model theory is introduced. The bottom-up parsing algorithm for ETPR(k) graph grammars is defined.","PeriodicalId":39750,"journal":{"name":"Machine Graphics and Vision","volume":"119 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80395142","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}