2022 International Conference on Machine Vision and Image Processing (MVIP): Latest Publications

A New Algorithm for Hand Gesture Recognition in Color Videos for Operating System Commands
Pub Date : 2022-02-23 DOI: 10.1109/MVIP53647.2022.9738775
Maziyar Grami
In recent years, human-computer interaction and machine vision have become two of the most popular research areas in computer science. This paper presents research on hand motion and hand gesture recognition using image processing techniques to control system commands. Different hand motion and gesture recognition methods have been considered by researchers for use in computer systems, video game consoles, and mobile devices. In such cases, the hand motion or gesture type is detected by tracking the hand's image or by matching it against gestures stored in a database, once the initial position of the hand is identified. In this paper, we provide an efficient way to recognize and track hands across sequences of image frames. The main objective is to provide an efficient method for hand recognition in crowded environments, without any restrictions. The frames can be received from a video file. In the first frame, the location of the hand is recognized using color analysis; this region is then compared against subsequent frames. The detected movements are mapped to control commands in an operating system. To date, no effective method has been proposed for this kind of problem. In this study, we propose an efficient algorithm that is less dependent on background and lighting conditions. We tested the proposed method on several videos captured in very crowded environments, using the Image Processing Toolbox of MATLAB. The presented results show that under different lighting conditions and various kinds of noise, the proposed method works almost without error.
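The abstract includes no code; as an illustration of the color-analysis-then-track pipeline it describes, here is a minimal Python/OpenCV sketch. The HSV skin thresholds, the video filename, and the centroid-based tracking are all assumptions, not the paper's exact procedure.

```python
import cv2
import numpy as np

# Illustrative HSV skin-color range; the paper's actual color analysis
# and thresholds are not specified, so these values are assumptions.
LOWER_SKIN = np.array([0, 40, 60], dtype=np.uint8)
UPPER_SKIN = np.array([25, 180, 255], dtype=np.uint8)

def locate_hand(frame):
    """Return the centroid of the largest skin-colored blob, or None."""
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    mask = cv2.inRange(hsv, LOWER_SKIN, UPPER_SKIN)
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, np.ones((5, 5), np.uint8))
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None
    m = cv2.moments(max(contours, key=cv2.contourArea))
    if m["m00"] == 0:
        return None
    return int(m["m10"] / m["m00"]), int(m["m01"] / m["m00"])

cap = cv2.VideoCapture("gestures.mp4")  # hypothetical input video
prev = None
while True:
    ok, frame = cap.read()
    if not ok:
        break
    pos = locate_hand(frame)
    if pos and prev:
        dx, dy = pos[0] - prev[0], pos[1] - prev[1]
        # (dx, dy) between frames is what a command mapper would consume,
        # e.g. a large +dx might trigger a "next window" command.
    prev = pos
cap.release()
```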
Citations: 0
Compressed Sensing MRI Reconstruction Using Improved U-net based on Deep Generative Adversarial Networks
Pub Date : 2022-02-23 DOI: 10.1109/MVIP53647.2022.9738554
Seyed Amir Mousavi, M. Ahmadzadeh, Ehsan Yazdian
Magnetic Resonance Imaging (MRI), as a non-invasive modality, can produce detailed anatomical images, but it is a time-consuming technique. Several approaches, such as parallel imaging, have been suggested to increase imaging speed. Compressive Sensing MRI exploits the sparsity of MR images to reconstruct them from under-sampled k-space data. It has already been shown that convolutional neural networks outperform sparsity-based approaches in both image quality and reconstruction speed. In this paper, a novel method based on a very deep CNN is proposed for the reconstruction of MR images using Generative Adversarial Networks. The generative and discriminative networks are designed with an improved ResNet architecture. The improved architecture allows deeper generative and discriminative networks, reduces aliasing artifacts, reconstructs edges more accurately, and recovers tissue structure better. We demonstrate that the proposed method outperforms conventional methods and deep learning-based approaches such as DLMRI and DAGAN. Assessment is performed on several datasets, including brain, heart, and prostate images. When reconstructing brain data with a 30% Cartesian mask, the proposed method improves SSIM to as high as 0.99. Image reconstruction also takes approximately 20 ms on a GPU, which is suitable for real-time applications.
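As a rough illustration of the improved-ResNet generator idea, the following PyTorch sketch stacks residual blocks to map a zero-filled reconstruction to a de-aliased image. Channel counts, depth, and normalization choices are assumptions; the paper's exact architecture and its discriminator are not reproduced here.

```python
import torch
import torch.nn as nn

class ResBlock(nn.Module):
    """Residual block of the style used to deepen GAN generators."""
    def __init__(self, channels=64):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
        )

    def forward(self, x):
        return x + self.body(x)  # identity skip keeps gradients flowing

# Toy generator: map a zero-filled reconstruction to a de-aliased image.
generator = nn.Sequential(
    nn.Conv2d(1, 64, 3, padding=1),
    *[ResBlock(64) for _ in range(8)],   # depth is an assumption
    nn.Conv2d(64, 1, 3, padding=1),
)

zero_filled = torch.randn(1, 1, 256, 256)  # stand-in under-sampled input
print(generator(zero_filled).shape)        # torch.Size([1, 1, 256, 256])
```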
Citations: 0
Image Registration Based on Redundant Keypoint Elimination SARSIFT Algorithm and MROGH Descriptor
Pub Date : 2022-02-23 DOI: 10.1109/MVIP53647.2022.9738737
Zahra Hossein-Nejad, M. Nasri
In this article, a new approach to remote-sensing image registration is suggested. In the suggested approach, feature extraction is first performed with the proposed redundant keypoint elimination method for synthetic aperture radar SIFT (RKEM-SARSIFT). Second, descriptors are created using the Multi-Support Region Order-Based Gradient Histogram (MROGH) algorithm. Finally, matching is performed based on the nearest neighbor distance ratio (NNDR), and the transformation model is estimated with an affine transform. Simulation results on several remote sensing image datasets confirm the advantage of the suggested approach over several basic registration methods in terms of matching precision, SITMMR, and SITMMC.
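RKEM-SARSIFT and MROGH have no off-the-shelf implementations, so the following Python/OpenCV sketch substitutes plain SIFT purely to illustrate the NNDR matching and affine estimation stages; the filenames and the 0.8 ratio are assumptions.

```python
import cv2
import numpy as np

# Plain SIFT stands in for RKEM-SARSIFT/MROGH, which have no stock
# implementation; the point is the NNDR matching + affine estimation.
img1 = cv2.imread("reference.tif", cv2.IMREAD_GRAYSCALE)  # hypothetical
img2 = cv2.imread("sensed.tif", cv2.IMREAD_GRAYSCALE)     # image pair
assert img1 is not None and img2 is not None, "provide an image pair"

sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(img1, None)
kp2, des2 = sift.detectAndCompute(img2, None)

# NNDR: accept a match only when the best distance is clearly smaller
# than the second best (0.8 is Lowe's classic ratio, assumed here).
matcher = cv2.BFMatcher(cv2.NORM_L2)
good = [m for m, n in matcher.knnMatch(des1, des2, k=2)
        if m.distance < 0.8 * n.distance]

src = np.float32([kp1[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
dst = np.float32([kp2[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)

# Robust affine transformation model from the surviving matches.
A, inliers = cv2.estimateAffine2D(src, dst, method=cv2.RANSAC)
print("2x3 affine matrix:\n", A)
```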
Citations: 0
A Machine Vision Based Method for Extracting Visual Features of Froth in Copper Floatation Process
Pub Date : 2022-02-23 DOI: 10.1109/MVIP53647.2022.9738765
Abbas Barhoun, A. M. Khiavi, Alireza Sokhandan Sorkhabi, H. S. Aghdasi, Behzad Kargari
Froth flotation is one of the most important and widespread methods for separating minerals from waste material, and at the same time one of the most accurate methods for refining low-grade metal ores. This paper presents a method for extracting visual features of froth bubbles, including size, color, shape, and mobility, based on machine vision and image processing techniques. The proposed method is capable of identifying bubble properties as well as estimating their velocity and direction of movement. Its performance is evaluated using real videos captured from the copper flotation process. The method description, as well as simulation results, are presented.
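The paper's exact bubble-analysis algorithm is not given in the abstract; as one plausible illustration of the mobility estimate, this Python/OpenCV sketch uses dense optical flow between consecutive frames to obtain a mean velocity and direction. The video filename and Farneback parameters are assumptions.

```python
import cv2
import numpy as np

# Dense optical flow between consecutive froth frames yields a velocity
# field; its mean approximates bubble speed and direction of movement.
cap = cv2.VideoCapture("froth.avi")  # hypothetical flotation-cell video
ok, prev = cap.read()
if not ok:
    raise SystemExit("could not read the froth video")
prev_gray = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    flow = cv2.calcOpticalFlowFarneback(prev_gray, gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    vx, vy = flow[..., 0].mean(), flow[..., 1].mean()
    speed = np.hypot(vx, vy)                    # pixels per frame
    heading = np.degrees(np.arctan2(vy, vx))    # degrees from +x axis
    print(f"speed={speed:.2f} px/frame, heading={heading:.1f} deg")
    prev_gray = gray
cap.release()
```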
Citations: 0
Transfer Learning on Semantic Segmentation for Sugar Crystal Analysis
Pub Date : 2022-02-23 DOI: 10.1109/MVIP53647.2022.9738778
Zohoor Hayali, G. Akbarizadeh
In sugar factories, crystal particle analysis plays an important role in the quality of sugar production. Analyses include measuring the crystals' dimensions and estimating their area and size distribution, which are of great help in setting up sugar kilns. To analyze sugar particles, we must first be able to segment the crystals correctly; segmentation is therefore the first and most important stage of the analysis. This paper introduces a method based on Transfer Learning (TL) in deep neural networks for the semantic segmentation of sugar crystals. In this method, semantic segmentation of sugar crystals is performed by modifying a pre-trained Convolutional Neural Network (CNN) called DeepLab, and the results clearly show that this method labels the crystals with high accuracy and removes extraneous regions.
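A minimal transfer-learning sketch in PyTorch/torchvision, standing in for the paper's modified DeepLab: load a pretrained DeepLabV3, freeze the backbone, and retrain a new two-class head (crystal vs. background). The specific DeepLab variant, the class count, and the hyperparameters are assumptions.

```python
import torch
import torch.nn as nn
from torchvision.models.segmentation import deeplabv3_resnet50

# Load a pretrained DeepLabV3 (torchvision >= 0.13 API), freeze the
# backbone, and swap in a new 2-class head: crystal vs. background.
model = deeplabv3_resnet50(weights="DEFAULT")
for p in model.parameters():
    p.requires_grad = False
model.classifier[4] = nn.Conv2d(256, 2, kernel_size=1)  # trainable head

optimizer = torch.optim.Adam(model.classifier[4].parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

images = torch.randn(2, 3, 256, 256)            # stand-in micrographs
masks = torch.randint(0, 2, (2, 256, 256))      # stand-in crystal masks

logits = model(images)["out"]                   # shape (2, 2, 256, 256)
loss = criterion(logits, masks)
loss.backward()
optimizer.step()
print(float(loss))
```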
Citations: 1
Improvement of Human Tracking Based on an Accurate Estimation of Feet or Head Position
Pub Date : 2022-02-23 DOI: 10.1109/MVIP53647.2022.9738750
Ali Dadgar, Y. Baleghi, M. Ezoji
In this paper, a method is presented to estimate the position of the feet/head of objects in various camera views. In this method, first, all objects in the scene are detected using background subtraction. Then, human and non-human objects are separated via a support vector machine (SVM) trained on local binary pattern (LBP) features. The basic idea of the next step is that the feet/head of an object form the group of pixels that is projected to a small region on the ground/top plane by the corresponding homography matrix. This idea is expressed as an optimization problem that avoids partitioning off small groups of pixels. Experimental results show that the proposed method improves the accuracy of object tracking.
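A small sketch of the human/non-human classification step, assuming scikit-image and scikit-learn: an LBP histogram is computed per detected blob and fed to an SVM. The patch size, LBP parameters, and the stand-in training data are assumptions.

```python
import numpy as np
from skimage.feature import local_binary_pattern
from sklearn.svm import SVC

P, R = 8, 1  # 8 neighbors at radius 1 (classic LBP configuration)

def lbp_histogram(patch):
    """Uniform-LBP histogram of a grayscale patch, normalized to sum 1."""
    codes = local_binary_pattern(patch, P, R, method="uniform")
    hist, _ = np.histogram(codes, bins=P + 2, range=(0, P + 2))
    return hist / hist.sum()

# Stand-in training set: 64x64 grayscale blobs with binary labels.
rng = np.random.default_rng(0)
patches = rng.integers(0, 256, size=(40, 64, 64)).astype(np.uint8)
labels = rng.integers(0, 2, size=40)  # 1 = human, 0 = non-human

X = np.array([lbp_histogram(p) for p in patches])
clf = SVC(kernel="linear").fit(X, labels)
print(clf.predict(X[:5]))
```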
Citations: 1
A Secure Hybrid Permissioned Blockchain and Deep Learning Platform for CT Image Classification
Pub Date : 2022-02-23 DOI: 10.1109/MVIP53647.2022.9738736
M. Noei, Mohammadreza Parvizimosaed, Aliakbar Saleh Bigdeli, Mohammadmostafa Yalpanian
Pneumonia is a prevalent, life-threatening disease that needs to be diagnosed within a short time because of fluid accumulation in the lungs; late detection may result in the patient's death. Thus, early diagnosis is a critical factor alongside tracking the disease's progress. Beyond diagnosis, the privacy of datasets is important for organizations: because of the great value of their datasets, hospitals do not want to share the data themselves, but they are willing to share their trained network weights. Therefore, in this paper, we combine deep learning and blockchain, implementing the blockchain as distributed storage. Using a permissioned blockchain, weights are broadcast securely among hospitals. Owing to this secure setup, the dataset is split equally among five hospitals. Each hospital trains its own network model and sends its weights to the blockchain. The goal is to broadcast the aggregated weights among hospitals securely and still achieve good results, even though no single network is trained on the whole dataset. The dataset contains 5,856 images, and each hospital trains a residual neural network with 28 layers. The results show that hospitals can increase the accuracy of their models by using the shared weights, compared to models trained without them.
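The weight-aggregation step can be illustrated independently of the blockchain layer. The following PyTorch sketch averages the state dicts of five locally trained models, FedAvg-style; the ResNet variant stands in for the paper's 28-layer residual network, and the averaging rule itself is an assumption, since the abstract does not specify how the weights are aggregated.

```python
import copy
import torch
import torchvision

NUM_HOSPITALS = 5
# resnet18 is a stand-in; the paper uses a 28-layer residual network.
local_models = [torchvision.models.resnet18(num_classes=2)
                for _ in range(NUM_HOSPITALS)]

def average_weights(models):
    """Element-wise mean of the models' parameters (FedAvg-style)."""
    avg = copy.deepcopy(models[0].state_dict())
    for key in avg:
        stacked = torch.stack([m.state_dict()[key].float() for m in models])
        avg[key] = stacked.mean(dim=0).to(avg[key].dtype)
    return avg

# In the paper, each state_dict would be published on the permissioned
# chain; here aggregation happens locally for illustration.
global_weights = average_weights(local_models)
for model in local_models:          # every hospital adopts the average
    model.load_state_dict(global_weights)
```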
Citations: 2
Spatial Quality Assessment of Pansharpened Images Based on Gray Level Co-Occurrence Matrix
Pub Date : 2022-02-23 DOI: 10.1109/MVIP53647.2022.9738763
S. Aghapour Maleki, H. Ghassemian
Assessing the quality of pansharpened images is a critical issue: a quantitative score is needed to represent quality and to compare the performance of different fusion methods. Most of the existing metrics for pansharpened image quality assessment evaluate the spectral content of the image, whereas in many remote sensing applications, such as the detection and identification of image objects, spatial quality plays an important role. In the current study, a new index for spatial quality assessment is introduced that extracts gray level co-occurrence matrix (GLCM) features from the distorted and reference images and compares their similarity. The Tampere Image Database 2013 (TID2013), which provides reference images and various types of distorted images along with subjective scores for each image, is used as the evaluation database. To address the high computational complexity of obtaining GLCM features, the fast GLCM method is employed; in this way, 16 different features are extracted. To select the features most consistent with the human visual system (HVS), the forward floating search method is used for feature selection, and five features are retained to form the final index. Experimental results show the efficiency of the proposed method in determining the spatial quality of fused images compared with available quality assessment metrics.
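A sketch of GLCM feature extraction and a naive similarity score, assuming scikit-image (graycomatrix/graycoprops; older releases spell them greycomatrix/greycoprops). The offsets, angles, chosen properties, and the distance-based score are assumptions and do not reproduce the paper's fast-GLCM variant or its five selected features.

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops

def glcm_features(image_u8):
    """Texture features from a grayscale uint8 image via its GLCM."""
    glcm = graycomatrix(image_u8,
                        distances=[1, 2],
                        angles=[0, np.pi / 4, np.pi / 2, 3 * np.pi / 4],
                        levels=256, symmetric=True, normed=True)
    props = ("contrast", "homogeneity", "energy", "correlation")
    return np.concatenate([graycoprops(glcm, p).ravel() for p in props])

# Stand-in reference and fused (pansharpened) images.
rng = np.random.default_rng(0)
ref = rng.integers(0, 256, (128, 128), dtype=np.uint8)
fused = rng.integers(0, 256, (128, 128), dtype=np.uint8)

# Similarity-style score: closer GLCM features -> higher spatial quality.
f_ref, f_fused = glcm_features(ref), glcm_features(fused)
score = 1.0 / (1.0 + np.linalg.norm(f_ref - f_fused))
print(score)
```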
Citations: 0
Fast Multi Focus Image Fusion Using Determinant
Pub Date : 2022-02-23 DOI: 10.1109/MVIP53647.2022.9738555
Mostafa Amin-Naji, A. Aghagolzadeh, Hami Mahdavinataj
This paper presents fast pixel-wise multi-focus image fusion in the spatial domain, without bells and whistles. The proposed method simply uses the determinant of sliding windows from the input images as its metric for creating a pixel-wise decision map: sliding windows of 15x15 pixels with a stride of 7 pixels are passed over the input images, and a pixel-wise decision map for fusing the multi-focus images is built from them. Some simple tricks, such as global thresholding with Otsu's method and removal of small objects by a morphological closing operation, are used to refine the decision map. The method is fast: it fuses a pair of 512x512 multi-focus images in around 0.05 seconds (50 milliseconds) on our hardware. We compared it with 22 prominent transform-domain, spatial-domain, and deep learning-based methods whose source code is available, and our method is faster than all of them. We conducted objective and subjective experiments on the Lytro dataset, and our method is competitive with their results. The proposed method may not have the best fusion quality among state-of-the-art methods, but to the best of our knowledge it is the fastest pixel-wise method and very suitable for real-time image processing. All material and source code will be available at https://github.com/mostafaaminnaji/FastDetFuse and http://imagefusion.ir.
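Following the abstract's description, here is a NumPy/OpenCV sketch of the determinant focus measure with 15x15 windows at stride 7, refined with Otsu thresholding and morphological closing. The mapping of window scores back to pixels and the synthetic test pair are simplifications and assumptions.

```python
import cv2
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

WIN, STRIDE = 15, 7  # window size and stride from the abstract

def focus_map(gray):
    """|determinant| of each 15x15 window, sampled on a stride-7 grid."""
    win = sliding_window_view(gray.astype(np.float64), (WIN, WIN))
    return np.abs(np.linalg.det(win[::STRIDE, ::STRIDE]))

# Synthetic multi-focus pair: each image has one half defocused.
rng = np.random.default_rng(0)
scene = rng.integers(0, 256, (256, 256)).astype(np.uint8)
blurred = cv2.GaussianBlur(scene, (9, 9), 3)
a, b = scene.copy(), scene.copy()
a[:, 128:] = blurred[:, 128:]   # right half out of focus in a
b[:, :128] = blurred[:, :128]   # left half out of focus in b

# Decision map: where image a scores higher, take a; refine with Otsu
# thresholding and morphological closing, as the abstract describes.
diff = focus_map(a) - focus_map(b)
diff_u8 = cv2.normalize(diff, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
_, decision = cv2.threshold(diff_u8, 0, 255,
                            cv2.THRESH_BINARY + cv2.THRESH_OTSU)
decision = cv2.morphologyEx(decision, cv2.MORPH_CLOSE,
                            np.ones((9, 9), np.uint8))
decision = cv2.resize(decision, (256, 256), interpolation=cv2.INTER_NEAREST)
fused = np.where(decision == 255, a, b)
print(fused.shape)
```

A sharp random window has nearly independent rows, so its determinant is large in magnitude, while blurring correlates neighboring rows and drives the determinant toward zero; this is why the determinant works as a focus measure.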
Citations: 2
JPEG Steganalysis Using the Relations Between DCT Coefficients
Pub Date : 2022-02-23 DOI: 10.1109/MVIP53647.2022.9738785
Seyedeh Maryam Seyed Khalilollahi, Azadeh Mansouri
The increasing attention to steganalysis and steganography, driven by the need for secure information transfer, makes them among the most important concerns in communication. Among image formats, JPEG is the most widely used compression method today; as a result, various steganographic systems based on hiding messages in the JPEG format have been presented. Consequently, steganalysis of JPEG images is essential. Recently, the use of neural networks and deep learning has grown greatly in both spatial-domain and JPEG steganalysis. However, in the field of JPEG steganalysis, most existing networks still rely on hand-designed components. In the proposed JPEG steganalysis method, we investigate the relations between quantized Discrete Cosine Transform (DCT) coefficients and, using the relations among mid-frequency coefficients, extract binary vectors as the input to the neural network. The experimental results illustrate the acceptable detection rate of this simple approach.
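As an illustration of the coefficient-relation idea, this Python/OpenCV sketch recomputes an 8x8 block DCT and records one bit per comparison of mid-frequency coefficient pairs; reading the truly quantized coefficients from a JPEG file would require a JPEG library, so the recomputation is a simplification. The chosen coefficient pairs are assumptions, not the paper's exact set.

```python
import cv2
import numpy as np

# Mid-frequency coefficient pairs compared per 8x8 block; each comparison
# contributes one bit. These particular pairs are assumptions.
MID_PAIRS = [((1, 2), (2, 1)), ((0, 3), (3, 0)),
             ((2, 2), (1, 3)), ((3, 1), (2, 3))]

def binary_vector(gray_u8):
    """One bit per (block, coefficient-pair) comparison, concatenated."""
    h, w = (d - d % 8 for d in gray_u8.shape)
    img = gray_u8[:h, :w].astype(np.float32) - 128.0  # JPEG-style shift
    bits = []
    for y in range(0, h, 8):
        for x in range(0, w, 8):
            c = cv2.dct(img[y:y + 8, x:x + 8])  # 8x8 block DCT
            bits.extend(int(c[p] > c[q]) for p, q in MID_PAIRS)
    return np.array(bits, dtype=np.uint8)

rng = np.random.default_rng(0)
gray = rng.integers(0, 256, (64, 64), dtype=np.uint8)  # stand-in image
features = binary_vector(gray)  # would feed the classifier network
print(features.shape)           # (256,) = 64 blocks x 4 bits
```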
Citations: 0