Attention-based Deep Learning Model for Arabic Handwritten Text Recognition
Pub Date: 2022-12-15. DOI: 10.22630/mgv.2022.31.1.3
Takwa Ben Aïcha Gader, Afef Kacem Echi
This work proposes a segmentation-free approach to Arabic Handwritten Text Recognition (AHTR): an attention-based Convolutional Neural Network - Recurrent Neural Network - Connectionist Temporal Classification (CNN-RNN-CTC) deep learning architecture. The model receives an image as input and, through a CNN, produces a sequence of essential features, which are transferred to an attention-based Bidirectional Long Short-Term Memory network (BLSTM). The BLSTM outputs the feature sequence in order, and the attention mechanism selects the relevant information from that sequence. The selected information is then fed to the CTC, which computes the loss and predicts the transcription. The contribution lies in extending the CNN with dropout layers, batch normalization, and regularization parameters to prevent over-fitting. The output of the RNN block is passed through an attention mechanism that flexibly exploits the most relevant parts of the input sequence. This solution improves on previous methods by increasing the CNN's speed and performance and by controlling model over-fitting. The proposed system achieves an accuracy of 97.1% on the IFN-ENIT Arabic script database, which competes with the current state of the art. It was also tested on the modern English handwriting of the IAM database, attaining a Character Error Rate of 2.9%, which confirms the model's script independence.
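To make the pipeline concrete, here is a minimal sketch of such a CNN-BLSTM-attention model in Keras. The input size, layer widths, attention variant, and character-set size are assumptions for illustration, not the authors' exact configuration; CTC loss would be applied to the per-timestep outputs during training.

```python
import tensorflow as tf
from tensorflow.keras import layers

NUM_CLASSES = 100  # assumed character-set size, including the CTC blank label

inputs = layers.Input(shape=(64, 256, 1), name="image")        # H x W x 1 line image
x = inputs
for filters in (32, 64, 128):                                  # CNN feature extractor
    x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    x = layers.BatchNormalization()(x)                         # batch norm, as in the paper
    x = layers.MaxPooling2D((2, 2))(x)
    x = layers.Dropout(0.2)(x)                                 # dropout regularization
x = layers.Permute((2, 1, 3))(x)                               # (W, H, C): sequence runs along width
x = layers.Reshape((32, 8 * 128))(x)                           # 32 timesteps of 1024 features
x = layers.Bidirectional(layers.LSTM(128, return_sequences=True))(x)
x = layers.Attention()([x, x])                                 # select the most relevant timesteps
outputs = layers.Dense(NUM_CLASSES, activation="softmax")(x)   # per-timestep character probabilities
model = tf.keras.Model(inputs, outputs)
# During training, CTC loss (e.g. tf.nn.ctc_loss) aligns these per-timestep
# outputs with the ground-truth transcription without character segmentation.
```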
{"title":"Attention-based Deep Learning Model for Arabic Handwritten Text Recognition","authors":"Takwa Ben Aïcha Gader, Afef Kacem Echi","doi":"10.22630/mgv.2022.31.1.3","DOIUrl":"https://doi.org/10.22630/mgv.2022.31.1.3","url":null,"abstract":"This work proposes a segmentation-free approach to Arabic Handwritten Text Recognition (AHTR): an attention-based Convolutional Neural Network - Recurrent Neural Network - Connectionist Temporal Classification (CNN-RNN-CTC) deep learning architecture. The model receives as input an image and provides, through a CNN, a sequence of essential features, which are transferred to an Attention-based Bidirectional Long Short-Term Memory Network (BLSTM). The BLSTM gives features sequence in order, and the attention mechanism allows the selection of relevant information from the features sequences. The selected information is then fed to the CTC, enabling the loss calculation and the transcription prediction. The contribution lies in extending the CNN by dropout layers, batch normalization, and dropout regularization parameters to prevent over-fitting. The output of the RNN block is passed through an attention mechanism to utilize the most relevant parts of the input sequence in a flexible manner. This solution enhances previous methods by improving the CNN speed and performance and controlling over model over-fitting. The proposed system achieves the best accuracy of 97.1% for the IFN-ENIT Arabic script database, which competes with the current state-of-the-art. It was also tested for the modern English handwriting of the IAM database, and the Character Error Rate of 2.9% is attained, which confirms the model's script independence.","PeriodicalId":39750,"journal":{"name":"Machine Graphics and Vision","volume":"20 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-12-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84015511","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Chemical ripening and contaminations detection using neural networks-based image features and spectrometric signatures
Pub Date: 2021-12-01. DOI: 10.22630/mgv.2021.30.1.2
R. R
In this pandemic-prone era, health is of utmost concern for everyone, and eating good-quality fruit is essential for sound health. Unfortunately, naturally ripened fruits are now difficult to obtain, owing to the prevalence of fruits ripened artificially with hazardous chemicals such as calcium carbide. Most state-of-the-art techniques focus primarily on identifying chemically ripened fruits with computer vision-based approaches, which are less effective at quantifying the chemical contamination present in the sample fruits. To address these issues, a new framework for chemical ripening and contamination detection is presented, which employs both visual and IR spectrometric signatures in two stages. Experiments conducted on both the GUI tool and the hardware-based setup clearly demonstrate the efficiency of the proposed framework in terms of detection confidence levels and the percentage of chemicals present in the sample fruit.
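The two-stage idea can be sketched as follows. The screening model, the decision threshold, and the reference-spectrum comparison are invented for illustration only; the paper's actual models and spectrometric calibration are not reproduced here.

```python
import numpy as np

def stage1_visual_screen(image: np.ndarray, cnn_predict) -> float:
    """Stage 1: a neural network scores the fruit image for chemical ripening."""
    return float(cnn_predict(image[None, ...])[0])      # confidence in [0, 1]

def stage2_quantify(ir_spectrum: np.ndarray, reference: np.ndarray) -> float:
    """Stage 2: compare the IR signature against a clean-fruit reference and
    report the relative deviation as a contamination percentage (assumed proxy)."""
    deviation = np.abs(ir_spectrum - reference)
    return 100.0 * deviation.sum() / reference.sum()

def detect(image, ir_spectrum, reference, cnn_predict, threshold=0.5):
    confidence = stage1_visual_screen(image, cnn_predict)
    if confidence < threshold:                           # looks naturally ripened: stop early
        return {"chemically_ripened": False, "confidence": confidence}
    percent = stage2_quantify(ir_spectrum, reference)    # only suspects reach stage 2
    return {"chemically_ripened": True, "confidence": confidence,
            "contamination_percent": percent}
```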
{"title":"Chemical ripening and contaminations detection using neural networks-based image features and spectrometric signatures","authors":"R. R","doi":"10.22630/mgv.2021.30.1.2","DOIUrl":"https://doi.org/10.22630/mgv.2021.30.1.2","url":null,"abstract":"In this pandemic-prone era, health is of utmost concern for everyone and hence eating good quality fruits is very much essential for sound health. Unfortunately, nowadays it is quite very difficult to obtain naturally ripened fruits, due to existence of chemically ripened fruits being ripened using hazardous chemicals such as calcium carbide. However, most of the state-of-the art techniques are primarily focusing on identification of chemically ripened fruits with the help of computer vision-based approaches, which are less effective towards quantification of chemical contaminations present in the sample fruits. To solve these issues, a new framework for chemical ripening and contamination detection is presented, which employs both visual and IR spectrometric signatures in two different stages. The experiments conducted on both the GUI tool as well as hardware-based setups, clearly demonstrate the efficiency of the proposed framework in terms of detection confidence levels followed by the percentage of presence of chemicals in the sample fruit.","PeriodicalId":39750,"journal":{"name":"Machine Graphics and Vision","volume":"31 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82051345","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
On the use of CNNs with patterned stride for medical image analysis
Pub Date: 2021-06-26. DOI: 10.22630/mgv.2021.30.1.1
Oge Marques, Luiz Zaniolo
The use of deep learning techniques for early and accurate medical image diagnosis has grown significantly in recent years, with encouraging results across many medical specialties, pathologies, and image types. One of the most popular deep neural network architectures is the convolutional neural network (CNN), widely used for medical image classification and segmentation, among other tasks. One of the configuration parameters of a CNN, the stride, regulates how sparsely the image is sampled during convolution. This paper explores the idea of applying a patterned stride strategy: pixels close to the center are processed with a smaller stride, concentrating the information sampled there, while pixels away from the center are processed with larger strides and are consequently sampled more sparsely. We apply this method to different medical image classification tasks and demonstrate experimentally that the proposed patterned stride mechanism outperforms a baseline solution with the same computational cost (processing and memory). We also discuss the relevance and potential future extensions of the proposed method.
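A minimal sketch of the patterned-stride sampling idea, assuming a simple two-level schedule (dense sampling inside the central half of the image, sparser outside); the paper's actual stride pattern may differ.

```python
import numpy as np

def patterned_coords(size: int, inner_stride: int = 1, outer_stride: int = 2) -> np.ndarray:
    """Coordinates along one axis: small stride near the centre, large near the edges."""
    centre, quarter = size // 2, size // 4
    inner = np.arange(centre - quarter, centre + quarter, inner_stride)  # central half
    left = np.arange(0, centre - quarter, outer_stride)                  # sparse periphery
    right = np.arange(centre + quarter, size, outer_stride)
    return np.concatenate([left, inner, right])

def patterned_sample(image: np.ndarray) -> np.ndarray:
    rows = patterned_coords(image.shape[0])
    cols = patterned_coords(image.shape[1])
    return image[np.ix_(rows, cols)]        # dense centre, sparse borders

img = np.random.rand(224, 224)
print(patterned_sample(img).shape)          # smaller than 224x224, centre-weighted
```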
{"title":"On the use of CNNs with patterned stride for medical image analysis","authors":"Oge Marques, Luiz Zaniolo","doi":"10.22630/mgv.2021.30.1.1","DOIUrl":"https://doi.org/10.22630/mgv.2021.30.1.1","url":null,"abstract":"The use of deep learning techniques for early and accurate medical image diagnosis has grown significantly in recent years, with some encouraging results across many medical specialties, pathologies, and image types. One of the most popular deep neural network architectures is the convolutional neural network (CNN), widely used for medical image classification and segmentation, among other tasks. One of the configuration parameters of a CNN is called stride and it regulates how sparsely the image is sampled during the convolutional process. This paper explores the idea of applying a patterned stride strategy: pixels closer to the center are processed with a smaller stride concentrating the amount of information sampled, and pixels away from the center are processed with larger strides consequently making those areas to be sampled more sparsely. We apply this method to different medical image classification tasks and demonstrate experimentally how the proposed patterned stride mechanism outperforms a baseline solution with the same computational cost (processing and memory). We also discuss the relevance and potential future extensions of the proposed method.","PeriodicalId":39750,"journal":{"name":"Machine Graphics and Vision","volume":"71 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2021-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80820308","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Multi-View Attention-based Late Fusion (MVALF) CADx system for breast cancer using deep learning
Pub Date: 2020-12-01. DOI: 10.22630/mgv.2020.29.1.4
H. Iftikhar, H. Khan, B. Raza, Ahmad Shahir
Breast cancer is a leading cause of death among women. Early detection can significantly reduce mortality among women and improve their prognosis. Mammography is the first-line procedure for early diagnosis. Early conventional Computer-Aided Diagnosis (CADx) systems for breast lesion diagnosis were based on single-view information only. The last decade saw the use of two mammogram views in CADx systems: the Medio-Lateral Oblique (MLO) and Cranio-Caudal (CC) views. The most recent studies show the effectiveness of training CADx systems on four mammogram views with a feature-fusion strategy for the classification task. In this paper, we propose an end-to-end Multi-View Attention-based Late Fusion (MVALF) CADx system that fuses the predictions of four view models, each trained separately. These separate models have different predictive ability for each class, and an appropriate fusion of the multi-view models can achieve better diagnostic performance, so proper weights must be assigned to the multi-view classification models. To this end, an attention-based weighting mechanism assigns the proper weights to the trained models in the fusion strategy. The proposed methodology is used to classify mammograms into normal, mass, calcification, and malignant and benign masses. The publicly available CBIS-DDSM and mini-MIAS datasets are used for the experimentation. The results show that the proposed system achieves an AUC of 0.996 for normal vs. abnormal, 0.922 for mass vs. calcification, and 0.896 for malignant vs. benign masses. The proposed approach yields superior results for the classification of malignant vs. benign masses, higher than those of single-view, two-view, and four-view early-fusion systems. The overall results at each level show the potential of multi-view late fusion with transfer learning in the diagnosis of breast cancer.
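A compact sketch of attention-based late fusion over four view-specific models. Representing the learned weighting as a softmax over per-view relevance scores is an assumption made here for illustration, not necessarily the paper's exact attention formulation.

```python
import numpy as np

def attention_late_fusion(view_probs: np.ndarray, view_scores: np.ndarray) -> np.ndarray:
    """view_probs: (4, n_classes) predictions from the four view models.
    view_scores: (4,) learned relevance scores, one per view (assumed to come
    from a small attention network trained alongside the fusion)."""
    weights = np.exp(view_scores) / np.exp(view_scores).sum()  # softmax attention weights
    return weights @ view_probs                                # weighted fused prediction

# Toy example: four views voting on a two-class task.
probs = np.array([[0.9, 0.1], [0.7, 0.3], [0.6, 0.4], [0.8, 0.2]])
scores = np.array([2.0, 0.5, 0.1, 1.0])
print(attention_late_fusion(probs, scores))  # fused class probabilities
```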
{"title":"Multi-View Attention-based Late Fusion (MVALF) CADx system for breast cancer using deep learning","authors":"H. Iftikhar, H. Khan, B. Raza, Ahmad Shahir","doi":"10.22630/mgv.2020.29.1.4","DOIUrl":"https://doi.org/10.22630/mgv.2020.29.1.4","url":null,"abstract":"Breast cancer is a leading cause of death among women. Early detection can significantly reduce the mortality rate among women and improve their prognosis. Mammography is the first line procedure for early diagnosis. In the early era, conventional Computer-Aided Diagnosis (CADx) systems for breast lesion diagnosis were based on just single view information. The last decade evidence the use of two views mammogram: Medio-Lateral Oblique (MLO) and Cranio-Caudal (CC) view for the CADx systems. Most recent studies show the effectiveness of four views of mammogram to train CADx system with feature fusion strategy for classification task. In this paper, we proposed an end-to-end Multi-View Attention-based Late Fusion (MVALF) CADx system that fused the obtained predictions of four view models, which is trained for each view separately. These separate models have different predictive ability for each class. The appropriate fusion of multi-view models can achieve better diagnosis performance. So, it is necessary to assign the proper weights to the multi-view classification models. To resolve this issue, attention-based weighting mechanism is adopted to assign the proper weights to trained models for fusion strategy. The proposed methodology is used for the classification of mammogram into normal, mass, calcification, malignant masses and benign masses. The publicly available datasets CBIS-DDSM and mini-MIAS are used for the experimentation. The results show that our proposed system achieved 0.996 AUC for normal vs. abnormal, 0.922 for mass vs. calcification and 0.896 for malignant vs. benign masses. Superior results are seen for the classification of malignant vs benign masses with our proposed approach, which is higher than the results using single view, two views and four views early fusion-based systems. The overall results of each level show the potential of multi-view late fusion with transfer learning in the diagnosis of breast cancer.","PeriodicalId":39750,"journal":{"name":"Machine Graphics and Vision","volume":"19 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82548587","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Skull stripping using traditional and soft-computing approaches for Magnetic Resonance images: A semi-systematic meta-analysis
Pub Date: 2020-12-01. DOI: 10.22630/mgv.2020.29.1.3
H. Azam, Humera Tariq
An MRI scanner captures the skull along with the brain, and the skull needs to be removed for enhanced reliability and validity of medical diagnostic practices. Skull stripping from brain MR images is thus a core task in medical applications. Segmenting an image manually for skull stripping is complicated, and it is not only time-consuming but expensive as well, so an automated skull-stripping method with good efficiency and effectiveness is required. A number of skull-stripping methods are currently in practice. This review discusses many soft-computing segmentation techniques. The purpose of this study is to review the existing literature and compare the traditional and modern methods used for skull stripping from brain MR images, along with their merits and demerits. The semi-systematic review of the literature was carried out using the meta-synthesis approach. Broadly, the analyses are bifurcated into traditional and modern, i.e. soft-computing, methods proposed, experimented with, or applied in practice for effective skull stripping. Popular databases containing the required brain MR images are also identified, categorized, and discussed, as are the CPU- and GPU-based computer systems, with their specifications, used by different researchers for skull stripping. Finally, the research gap is identified, along with a proposed lead for future work.
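As one concrete example of the "traditional" family surveyed here, a thresholding-plus-morphology skull strip can be sketched in a few lines; this illustrates the class of methods, not any specific algorithm from the review, and the structuring-element sizes are assumptions.

```python
import numpy as np
from scipy import ndimage
from skimage import filters, morphology

def simple_skull_strip(slice2d: np.ndarray) -> np.ndarray:
    """Rough brain mask for one axial MR slice (intensity array)."""
    mask = slice2d > filters.threshold_otsu(slice2d)             # head foreground mask
    mask = morphology.binary_erosion(mask, morphology.disk(3))   # break skull-brain bridges
    labels, n = ndimage.label(mask)
    if n == 0:
        return np.zeros_like(mask)
    sizes = ndimage.sum(mask, labels, range(1, n + 1))
    brain = labels == (np.argmax(sizes) + 1)                     # keep largest component (brain)
    return morphology.binary_dilation(brain, morphology.disk(3)) # restore eroded margin
```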
{"title":"Skull stripping using traditional and soft-computing approaches for Magnetic Resonance images: A semi-systematic meta-analysis","authors":"H. Azam, Humera Tariq","doi":"10.22630/mgv.2020.29.1.3","DOIUrl":"https://doi.org/10.22630/mgv.2020.29.1.3","url":null,"abstract":"MRI scanner captures the skull along with the brain and the skull needs to be removed for enhanced reliability and validity of medical diagnostic practices. Skull Stripping from Brain MR Images is significantly a core area in medical applications. It is a complicated task to segment an image for skull stripping manually. It is not only time consuming but expensive as well. An automated skull stripping method with good efficiency and effectiveness is required. Currently, a number of skull stripping methods are used in practice. In this review paper, many soft-computing segmentation techniques have been discussed. The purpose of this research study is to review the existing literature to compare the existing traditional and modern methods used for skull stripping from Brain MR images along with their merits and demerits. The semi-systematic review of existing literature has been carried out using the meta-synthesis approach. Broadly, analyses are bifurcated into traditional and modern, i.e. soft-computing methods proposed, experimented with, or applied in practice for effective skull stripping. Popular databases with desired data of Brain MR Images have also been identified, categorized and discussed. Moreover, CPU and GPU based computer systems and their specifications used by different researchers for skull stripping have also been discussed. In the end, the research gap has been identified along with the proposed lead for future research work.","PeriodicalId":39750,"journal":{"name":"Machine Graphics and Vision","volume":"32 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84683692","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Text area detection in handwritten documents scanned for further processing
Pub Date: 2020-01-01. DOI: 10.22630/mgv.2020.29.1.2
J. Pach, Izabella Antoniuk, A. Krupa
In this paper we present an approach to text area detection using binary images, the Constrained Run Length Algorithm (CRLA), and other noise-reduction methods for removing artefacts. Text processing includes various activities, most of which are related to preparing the input data for further operations in the best possible way, so as not to hinder the OCR algorithms. This is especially the case for handwritten manuscripts, and even more so for very old documents. We present our methodology for the text area detection problem, which is capable of removing most irrelevant objects, including elements such as page edges, stains, and folds. At the same time, the presented method can handle multi-column texts and varying line thickness. The generated mask accurately marks the actual text area, so that the output image can easily be used in further text-processing steps.
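The horizontal pass of the Constrained Run Length Algorithm can be sketched as follows: background runs shorter than a constraint C that lie between ink pixels are filled, merging nearby strokes into solid text blocks (a vertical pass works the same way on columns). The value C = 20 is an assumed parameter.

```python
import numpy as np

def crla_horizontal(binary: np.ndarray, c: int = 20) -> np.ndarray:
    """binary: 2D array with 1 = text (ink), 0 = background."""
    out = binary.copy()
    for row in out:                       # rows are views, so edits persist
        run_start, seen_ink = None, False
        for j, v in enumerate(row):
            if v == 1:
                if seen_ink and run_start is not None and j - run_start <= c:
                    row[run_start:j] = 1  # fill a short gap between ink pixels
                run_start, seen_ink = None, True
            elif run_start is None:
                run_start = j             # a background run begins
    return out
```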
{"title":"Text area detection in handwritten documents scanned for further processing","authors":"J. Pach, Izabella Antoniuk, A. Krupa","doi":"10.22630/mgv.2020.29.1.2","DOIUrl":"https://doi.org/10.22630/mgv.2020.29.1.2","url":null,"abstract":"In this paper we present an approach to text area detection using binary images, Constrained Run Length Algorithm and other noise reduction methods of removing the artefacts. Text processing includes various activities, most of which are related to preparing input data for further operations in the best possible way, that will not hinder the OCR algorithms. This is especially the case when handwritten manuscripts are considered, and even more so with very old documents. We present our methodology for text area detection problem, which is capable of removing most of irrelevant objects, including elements such as page edges, stains, folds etc. At the same time the presented method can handle multi-column texts or varying line thickness. The generated mask can accurately mark the actual text area, so that the output image can be easily used in further text processing steps.","PeriodicalId":39750,"journal":{"name":"Machine Graphics and Vision","volume":"55 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2020-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84398100","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Normal Patch Retinex robust algorithm for white balancing in digital microscopy
Pub Date: 2020-01-01. DOI: 10.22630/mgv.2020.29.1.5
Izabella Antoniuk, A. Krupa, Radosław Roszczyk
The acquisition of accurately coloured, balanced images in an optical microscope can be a challenge even for experienced microscope operators. This article presents an entirely automatic white-balancing mechanism that adequately corrects microscopic colour images. The results of the algorithm were confirmed experimentally on a set of two hundred microscopic images containing scans of three microscopic specimens commonly used in pathomorphology. The results were also compared with white-balance algorithms commonly used in digital photography. For microscopic images stained with hematoxylin-phloxine-saffron and for immunohistochemical staining images, the algorithm applied in this work is more effective than the classical algorithms used in colour photography.
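For reference, a baseline white-patch Retinex correction of the kind used in digital photography is sketched below; the paper's Normal Patch Retinex algorithm itself is not reproduced here, and the percentile-based illuminant estimate is an assumed robustness tweak.

```python
import numpy as np

def white_patch_retinex(rgb: np.ndarray, percentile: float = 99.0) -> np.ndarray:
    """rgb: float image in [0, 1] with shape (H, W, 3)."""
    # Estimate the illuminant per channel from near-maximal pixels;
    # a high percentile is more robust to specular outliers than the raw max.
    illuminant = np.percentile(rgb.reshape(-1, 3), percentile, axis=0)
    corrected = rgb / np.maximum(illuminant, 1e-6)  # scale so the white patch maps to white
    return np.clip(corrected, 0.0, 1.0)
```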
{"title":"Normal Patch Retinex robust algorithm for white balancing in digital microscopy","authors":"Izabella Antoniuk, A. Krupa, Radosław Roszczyk","doi":"10.22630/mgv.2020.29.1.5","DOIUrl":"https://doi.org/10.22630/mgv.2020.29.1.5","url":null,"abstract":"The acquisition of accurately coloured, balanced images in an optical microscope can be a challenge even for experienced microscope operators. This article presents an entirely automatic mechanism for balancing the white level that allows the correction of the microscopic colour images adequately. The results of the algorithm have been confirmed experimentally on a set of two hundred microscopic images. The images contained scans of three microscopic specimens commonly used in pathomorphology. Also, the results achieved were compared with other commonly used white balance algorithms in digital photography. The algorithm applied in this work is more effective than the classical algorithms used in colour photography for microscopic images stained with hematoxylin-phloxine-saffron and for immunohistochemical staining images.","PeriodicalId":39750,"journal":{"name":"Machine Graphics and Vision","volume":"9 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2020-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88351578","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Critical hypersurfaces and instability for reconstruction of scenes in high dimensional projective spaces
Pub Date: 2020-01-01. DOI: 10.22630/mgv.2020.29.1.1
M. Bertolini, L. Magri
In the context of multiple view geometry, images of static scenes are modeled as linear projections from a projective space P^3 to a projective plane P^2; similarly, videos or images of suitable dynamic or segmented scenes can be modeled as linear projections from P^k to P^h, with k > h >= 2. In these settings, the projective reconstruction of a scene consists in recovering the positions of the projected objects and the projections themselves from their images, after identifying enough correspondences between the images. A critical locus for the reconstruction problem is a configuration of points and centers of projection in the ambient space for which the reconstruction of a scene fails. Critical loci turn out to be suitable algebraic varieties. In this paper we investigate those critical loci which are hypersurfaces in high-dimensional complex projective spaces, and we determine their equations. Moreover, to give evidence of some practical implications of the existence of these critical loci, we perform a simulated experiment to test the instability phenomena in the reconstruction of a scene near a critical hypersurface.
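In coordinates, the projection model described above reads as follows. This is the standard formulation; the closing remark on how critical loci arise is a generic sketch, not the paper's specific equations.

```latex
\[
  \lambda\, x \;=\; P\, X, \qquad
  P \in \mathbb{C}^{(h+1)\times(k+1)} \ \text{of maximal rank}, \quad
  X \in \mathbb{P}^k,\ x \in \mathbb{P}^h, \quad k > h \ge 2.
\]
% Classical static case: k = 3, h = 2, i.e. a 3x4 camera matrix P.
% Reconstruction fails precisely on configurations of scene points and centres
% of projection where the constraints imposed by the correspondences drop rank;
% these configurations form the critical locus, an algebraic variety.
```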
{"title":"Critical hypersurfaces and instability for reconstruction of scenes in high dimensional projective spaces","authors":"M. Bertolini, L. Magri","doi":"10.22630/mgv.2020.29.1.1","DOIUrl":"https://doi.org/10.22630/mgv.2020.29.1.1","url":null,"abstract":"In the context of multiple view geometry, images of static scenes are modeled as linear projections from a projective space P^3 to a projective plane P^2 and, similarly, videos or images of suitable dynamic or segmented scenes can be modeled as linear projections from P^k to P^h, with k>h>=2. In those settings, the projective reconstruction of a scene consists in recovering the position of the projected objects and the projections themselves from their images, after identifying many enough correspondences between the images. A critical locus for the reconstruction problem is a configuration of points and of centers of projections, in the ambient space, where the reconstruction of a scene fails. Critical loci turn out to be suitable algebraic varieties. In this paper we investigate those critical loci which are hypersurfaces in high dimension complex projective spaces, and we determine their equations. Moreover, to give evidence of some practical implications of the existence of these critical loci, we perform a simulated experiment to test the instability phenomena for the reconstruction of a scene, near a critical hypersurface.","PeriodicalId":39750,"journal":{"name":"Machine Graphics and Vision","volume":"276 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2020-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79253916","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Extraction of image parking spaces in intelligent video surveillance systems
Pub Date: 2019-12-01. DOI: 10.22630/mgv.2018.27.1.3
R. Bohush, S. Ablameyko, T. Kalganova, P. Yarashevich
This paper discusses an algorithmic framework for parking lot localization and classification in images for an intelligent video parking system. Perspective transformation, adaptive Otsu binarization, mathematical morphology operations, representation of horizontal lines as vectors, creation and filtering of vertical lines, and determination of parking space coordinates are used to localize parking spaces in a video frame. The algorithm for classifying parking spaces is based on Histogram of Oriented Gradients (HOG) descriptors and a Support Vector Machine (SVM) classifier. Parking lot descriptors are extracted with HOG through the following steps: computing vertical and horizontal gradients of the parking lot image, computing gradient magnitude and orientation, accumulating gradient magnitudes according to cell orientations, grouping cells into blocks, computing the L2 norm, and normalizing cell orientations within blocks. The parameters of the descriptor have been optimized experimentally. The results demonstrate improved classification accuracy over similar algorithms, and the proposed framework performs best among the algorithms previously proposed for the parking recognition problem.
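A compact sketch of this HOG + SVM classification stage using scikit-image and scikit-learn; the descriptor parameters and the occupied/empty labelling below are common defaults assumed for illustration, not the experimentally optimized values from the paper.

```python
import numpy as np
from skimage.feature import hog
from sklearn.svm import SVC

def parking_descriptor(patch: np.ndarray) -> np.ndarray:
    """patch: grayscale crop of a single parking space."""
    return hog(patch, orientations=9, pixels_per_cell=(8, 8),
               cells_per_block=(2, 2), block_norm="L2")  # L2 block normalization

def train_classifier(patches, labels):
    """patches: list of grayscale crops; labels: 1 = occupied, 0 = empty (assumed)."""
    features = np.array([parking_descriptor(p) for p in patches])
    clf = SVC(kernel="linear")           # linear SVM over HOG features
    return clf.fit(features, labels)
```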
{"title":"Extraction of image parking spaces in intelligent video surveillance systems","authors":"R. Bohush, S. Ablameyko, T. Kalganova, P. Yarashevich","doi":"10.22630/mgv.2018.27.1.3","DOIUrl":"https://doi.org/10.22630/mgv.2018.27.1.3","url":null,"abstract":"This paper discusses the algorithmic framework for image parking lot localization and classification for the video intelligent parking system. Perspective transformation, adaptive Otsu's binarization, mathematical morphology operations, representation of horizontal lines as vectors, creating and filtering vertical lines, and parking space coordinates determination are used for the localization of parking spaces in a~video frame. The algorithm for classification of parking spaces is based on the Histogram of Oriented Descriptors (HOG) and the Support Vector Machine (SVM) classifier. Parking lot descriptors are extracted based on HOG. The overall algorithmic framework consists of the following steps: vertical and horizontal gradient calculation for the image of the parking lot, gradient module vector and orientation calculation, power gradient accumulation in accordance with cell orientations, blocking of cells, second norm calculations, and normalization of cell orientation in blocks. The parameters of the descriptor have been optimized experimentally. The results demonstrate the improved classification accuracy over the class of similar algorithms and the proposed framework performs the best among the algorithms proposed earlier to solve the parking recognition problem.","PeriodicalId":39750,"journal":{"name":"Machine Graphics and Vision","volume":"15 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83642840","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Interpreted Graphs and ETPR(k) Graph Grammar Parsing for Syntactic Pattern Recognition
Pub Date: 2019-12-01. DOI: 10.22630/mgv.2018.27.1.1
M. Flasiński
Further results of research into graph grammar parsing for syntactic pattern recognition (Pattern Recognit. 21:623-629, 1988; 23:765-774, 1990; 24:1223-1224, 1991; 26:1-16, 1993; 43:249-2264, 2010; Comput. Vision Graph. Image Process. 47:1-21, 1989; Fundam. Inform. 80:379-413, 2007; Theoret. Comp. Sci. 201:189-231, 1998) are presented in the paper. The notion of interpreted graphs based on Tarski's model theory is introduced. The bottom-up parsing algorithm for ETPR(k) graph grammars is defined.
{"title":"Interpreted Graphs and ETPR(k) Graph Grammar Parsing for Syntactic Pattern Recognition","authors":"M. Flasiński","doi":"10.22630/mgv.2018.27.1.1","DOIUrl":"https://doi.org/10.22630/mgv.2018.27.1.1","url":null,"abstract":"Further results of research into graph grammar parsing for syntactic pattern recognition (Pattern Recognit. 21:623-629, 1988; 23:765-774, 1990; 24:1223-1224, 1991; 26:1-16, 1993; 43:249-2264, 2010; Comput. Vision Graph. Image Process. 47:1-21, 1989; Fundam. Inform. 80:379-413, 2007; Theoret. Comp. Sci. 201:189-231, 1998) are presented in the paper. The notion of interpreted graphs based on Tarski's model theory is introduced. The bottom-up parsing algorithm for ETPR(k) graph grammars is defined.","PeriodicalId":39750,"journal":{"name":"Machine Graphics and Vision","volume":"119 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80395142","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}