
2020 International Conference on Machine Vision and Image Processing (MVIP): Latest Publications

Scale Equivariant CNNs with Scale Steerable Filters
Pub Date : 2020-02-01 DOI: 10.1109/MVIP49855.2020.9116889
Hanieh Naderi, Leili Goli, S. Kasaei
Convolutional Neural Networks (CNNs), despite being one of the most successful image classification methods, are not robust to most geometric transformations (rotation, isotropic scaling) because of their structural constraints. Recently, scale steerable filters have been proposed to allow scale invariance in CNNs. Although these filters enhance network performance in scaled image classification tasks, they cannot maintain scale information across the network. In this paper, this problem is addressed. First, a CNN is built using scale steerable filters. Then, a scale equivariant network is obtained by adding a feature map to each layer so that scale-related features are retained across the network. Finally, by defining the cost function as the cross entropy, the solution is evaluated and the model parameters are updated. The results show that it improves performance by about 2% over other comparable methods of scale equivariance and scale invariance when run on the FMNIST-scale dataset.
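The training objective mentioned above — cross entropy over the class scores — can be sketched in a few lines of NumPy. This is an illustrative computation only, not the authors' code; the logits and labels below are made up:

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the last axis.
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def cross_entropy(logits, labels):
    # Mean negative log-likelihood of the true classes.
    probs = softmax(logits)
    n = len(labels)
    return -np.mean(np.log(probs[np.arange(n), labels]))

# Hypothetical scores for two samples over three classes.
logits = np.array([[2.0, 0.5, -1.0], [0.1, 3.0, 0.2]])
labels = np.array([0, 1])
loss = cross_entropy(logits, labels)
```

Minimizing this loss over the steerable-filter parameters is what "updating the model parameters" amounts to in the abstract.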
Citations: 7
Extracting Iso-Disparity Strip Width using a Statistical Model in a Stereo Vision System
Pub Date : 2020-02-01 DOI: 10.1109/MVIP49855.2020.9116926
Benyamin Kheradvar, A. Mousavinia, A. M. Sodagar
Disparity map images, as outputs of a stereo vision system, are known as an effective approach in applications that require depth information. One example of such applications is extracting planes with arbitrary attributes from a scene using the concept of iso-disparity strips. The width and direction of the strips depend on the plane's direction and position in 3D space. In this paper, a statistical analysis is performed to model the behavior of these strips. This statistical analysis, as well as a frequency analysis, reveals that for each group of iso-disparity strips corresponding to a single plane in 3D, the strip width can be represented by an average value superposed by Additive Gaussian Noise (AGN). This means that a simple averaging technique can significantly reduce measurement noise in applications such as ground detection using these strips. Results show that the width of iso-disparity strips can be measured with an average precision of 96% using the presented noise model.
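The AGN model above implies that averaging repeated width measurements suppresses the noise. A minimal NumPy simulation (with a hypothetical true width and noise level, not the paper's data) illustrates the effect:

```python
import numpy as np

rng = np.random.default_rng(0)
true_width = 12.0   # hypothetical iso-disparity strip width (pixels)
sigma = 2.0         # std of the additive Gaussian measurement noise

# 500 noisy width measurements drawn from one strip group.
samples = true_width + rng.normal(0.0, sigma, size=500)

# A single measurement errs on the order of sigma; the mean of
# n measurements errs on the order of sigma / sqrt(n).
mean_error = abs(samples.mean() - true_width)
```

Under the AGN assumption, the standard error of the mean shrinks as 1/sqrt(n), which is why simple averaging recovers the strip width so precisely.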
Citations: 0
MVIP 2020 Table of Authors
Pub Date : 2020-02-01 DOI: 10.1109/mvip49855.2020.9116925
Citations: 0
Using Siamese Networks with Transfer Learning for Face Recognition on Small-Samples Datasets
Pub Date : 2020-02-01 DOI: 10.1109/MVIP49855.2020.9116915
Mohsen Heidari, Kazim Fouladi-Ghaleh
Nowadays, computer-based face recognition is a mature and reliable mechanism that is widely used in many access-control scenarios alongside other biometric methods. Face recognition consists of two subtasks: Face Verification and Face Identification. By comparing a pair of images, Face Verification determines whether they belong to the same person; Face Identification has to identify a specific face within the set of faces available in the database. There are many challenges in face recognition, such as angle, illumination, pose, facial expression, noise, resolution, occlusion, and the small number of samples available per class. In this paper, we carry out face recognition by utilizing transfer learning in a siamese network consisting of two similar CNNs. In the siamese network, a pair of face images is given to the network as input; the network extracts the features of this pair of images and finally determines, using a similarity criterion, whether the pair belongs to one person. The results show that the proposed model is comparable with advanced models trained on datasets containing large numbers of samples. Furthermore, it improves face recognition accuracy compared with methods trained on datasets with few samples, reaching 95.62% on the LFW dataset.
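The verification step — comparing a pair of extracted feature vectors with a similarity criterion — can be sketched as follows. This is an illustrative cosine-similarity check with made-up embedding vectors and threshold, not the paper's trained network:

```python
import numpy as np

def cosine_similarity(a, b):
    # Cosine of the angle between two embedding vectors.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def same_person(emb_a, emb_b, threshold=0.8):
    # Verification decision: sufficiently similar embeddings -> same identity.
    return cosine_similarity(emb_a, emb_b) >= threshold

# Hypothetical embeddings from the two branches of a siamese network.
emb1 = np.array([0.9, 0.1, 0.4])
emb2 = np.array([0.88, 0.12, 0.38])   # near-duplicate embedding (same person)
emb3 = np.array([-0.2, 0.9, -0.1])    # unrelated embedding (different person)
```

The choice of similarity measure and threshold is one design axis of siamese verification systems; contrastive or triplet losses shape the embedding space so that such a simple decision rule works.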
Citations: 24
DeepFaceAR: Deep Face Recognition and Displaying Personal Information via Augmented Reality
Pub Date : 2020-02-01 DOI: 10.1109/MVIP49855.2020.9116873
Amin Golnari, H. Khosravi, S. Sanei
Biometric recognition is a popular topic in machine vision. Deep Neural Networks have recently been used in several applications, especially in biometric recognition. In this paper, we combine a Deep Neural Network with Augmented Reality to produce a system capable of recognizing the faces of individuals and displaying information about them as an augmented-reality overlay. We used a dataset containing 1200 face images of 100 faculty members of the Shahrood University of Technology. After training, the proposed deep network reached a recognition accuracy of 99.45%. We also provided a graphical target for each person containing their information. When a person is identified by the deep network, the target image provided for augmented reality is aligned with the angle and dimensions of the detected face and displayed on top of it.
Citations: 6
Modeling of Pruning Techniques for Simplifying Deep Neural Networks
Pub Date : 2020-02-01 DOI: 10.1109/MVIP49855.2020.9116891
Morteza Mousa Pasandi, M. Hajabdollahi, N. Karimi, S. Samavi
Convolutional Neural Networks (CNNs) suffer from issues such as computational complexity and the number of parameters. In recent years, pruning techniques have been employed to reduce the number of operations and the model size of CNNs. Different pruning methods have been proposed, based on pruning connections, channels, and filters. Various techniques and tricks accompany these pruning methods, and there is no unifying framework that models all of them. In this paper, pruning methods are investigated, and a general model that encompasses the majority of pruning techniques is proposed. The advantages and disadvantages of the pruning methods can be identified, and all of them can be summarized under this model. The final goal of this model is to provide a specific method for all the pruning methods with different structures and applications.
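As one concrete instance of the connection-level pruning the paper surveys, magnitude-based weight pruning can be sketched in NumPy. This is a common technique used for illustration, not the paper's proposed general model:

```python
import numpy as np

def magnitude_prune(weights, sparsity):
    """Zero out the fraction `sparsity` of weights with smallest magnitude.

    Note: ties at the threshold are all pruned, so slightly more than
    `sparsity` of the weights may be removed.
    """
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)
    if k == 0:
        return weights.copy()
    threshold = np.partition(flat, k - 1)[k - 1]  # k-th smallest magnitude
    mask = np.abs(weights) > threshold
    return weights * mask

# Toy 2x3 weight matrix; prune the 50% smallest-magnitude entries.
w = np.array([[0.05, -1.2, 0.3], [0.9, -0.01, 0.6]])
pruned = magnitude_prune(w, 0.5)
```

Channel- and filter-level pruning follow the same pattern but score and zero whole slices of the weight tensor instead of individual entries.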
Citations: 1
Image Watermarking by Q Learning and Matrix Factorization
Pub Date : 2020-02-01 DOI: 10.1109/MVIP49855.2020.9116871
M. Alizadeh, H. Sajedi, B. BabaAli
Today, with the advancement of technology and the widespread use of the internet, watermarking techniques are being developed to protect copyright and data security. The proposed watermarking methods can be divided into two main categories: spatial-domain watermarking and frequency-domain watermarking. Matrix transformation methods are often merged with another method to select the right place to hide the watermark. In this paper, a non-blind watermarking method is presented. To embed the watermark, Least Significant Bit (LSB) replacement and QR matrix factorization are exploited. Q learning is used to select the appropriate host blocks. The Peak Signal-to-Noise Ratio (PSNR) of the watermarked image and the extracted watermark image is used as the reward function. The proposed method improves on the algorithms mentioned above that use no learning, achieving mean PSNR values of 56.61 dB and 55.77 dB for the QR matrix factorization and LSB replacement embedding methods, respectively.
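The LSB replacement component of the embedding can be illustrated in pure Python. This is a minimal sketch on a toy pixel list; the Q-learning block selection and the QR factorization path are omitted:

```python
def embed_lsb(pixels, bits):
    # Replace each pixel's least significant bit with one watermark bit.
    # Changing only the LSB perturbs each 8-bit pixel by at most 1,
    # which is why LSB embedding keeps the PSNR high.
    return [(p & ~1) | b for p, b in zip(pixels, bits)]

def extract_lsb(pixels, n):
    # Non-blind or blind extraction of the first n embedded bits.
    return [p & 1 for p in pixels[:n]]

host = [200, 13, 54, 255, 128, 7]   # toy host pixel values
mark = [1, 0, 1, 1, 0, 0]           # toy watermark bits
stego = embed_lsb(host, mark)
```

Because the distortion per pixel is at most one intensity level, the learning problem reduces to choosing *which* blocks to modify, which is where the Q-learning reward (the PSNR) comes in.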
Citations: 2
Occluded Visual Object Recognition Using Deep Conditional Generative Adversarial Nets and Feedforward Convolutional Neural Networks
Pub Date : 2020-02-01 DOI: 10.1109/MVIP49855.2020.9116887
Vahid Reza Khazaie, Alireza Akhavanpour, R. Ebrahimpour
Core object recognition is the task of recognizing objects regardless of variations in conditions such as pose, illumination, or other structural modifications. This task is solved through the feedforward processing of information in the human visual system. Deep neural networks can perform like humans in this task. However, we do not know how object recognition under more challenging conditions, such as occlusion, is solved. Some computational models imply that recurrent processing might be a solution to the beyond-core object recognition task. Another potential mechanism for handling occlusion is to reconstruct the occluded part of the object by taking advantage of generative models. Here we used Conditional Generative Adversarial Networks for reconstruction. For reasonably sized occlusions, we were able to remove the effect of occlusion and recover the performance of the base model. We showed that using GANs for reconstruction and adding information through generative models can yield better performance in the object recognition task under occlusion.
Citations: 1
A Classified and Comparative Study of 2-D Convolvers
Pub Date : 2020-02-01 DOI: 10.1109/MVIP49855.2020.9116874
Mahdi Kalbasi, Hooman Nikmehr
Two-dimensional (2-D) convolution is a common operation in a wide range of signal and image processing applications, such as edge detection, sharpening, and blurring. In the hardware implementation of these applications, 2-D convolution is one of the most challenging parts because it is a compute-intensive and memory-intensive operation. To address these challenges, several design techniques, such as pipelining, constant multiplication, and time-sharing, have been applied in the literature, leading to convolvers with different implementation features. In this paper, based on design techniques, we classify these convolvers into four classes: Non-Pipelined Convolver, Reduced-Bandwidth Pipelined Convolver, Multiplier-Less Pipelined Convolver, and Time-Shared Convolver. Then, implementation features of these classes, such as critical path delay, memory bandwidth, and resource utilization, are analytically discussed for different convolution kernel sizes. Finally, an instance of each class is captured in Verilog, and their features are evaluated by implementing them on a Virtex-7 FPGA, confirming the analytical discussions.
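A straightforward software baseline for the hardware convolvers discussed here is a naive "valid" 2-D convolution. This is an illustrative reference implementation for checking functional behavior, not one of the paper's four hardware classes:

```python
import numpy as np

def conv2d(image, kernel):
    """Naive 'valid' 2-D convolution (kernel flipped, as in true convolution)."""
    kh, kw = kernel.shape
    ih, iw = image.shape
    flipped = kernel[::-1, ::-1]  # 180-degree flip distinguishes convolution
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # Each output pixel is a kh*kw multiply-accumulate window --
            # the compute-intensive inner loop that hardware convolvers
            # pipeline or time-share.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * flipped)
    return out

img = np.arange(16, dtype=float).reshape(4, 4)
# Symmetric 3x3 sharpening kernel (unchanged by the 180-degree flip).
sharpen = np.array([[0, -1, 0], [-1, 5, -1], [0, -1, 0]], dtype=float)
```

For a K×K kernel, every output pixel costs K² multiply-accumulates and K² pixel reads, which is exactly the arithmetic and memory-bandwidth pressure the four convolver classes trade off in different ways.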
Citations: 1
MVIP 2020 Cover Page
Pub Date : 2020-02-01 DOI: 10.1109/mvip49855.2020.9116892
Citations: 0