
Latest publications in Proceedings. Indian Conference on Computer Vision, Graphics & Image Processing

Don't see me, just filter me: towards secure cloud based filtering using Shamir's secret sharing and POB number system
Pub Date : 2016-12-18 DOI: 10.1145/3009977.3010036
Priyanka Singh, Nishant Agarwal, B. Raman
The cloud computing paradigm is attracting individuals and organizations around the globe with its powerful resources, storage hubs, computational power and cost-effective solutions. However, distributed cloud data centers pose a serious threat of security breaches to the privacy of the data they hold. A secure image-data-sharing scheme over the cloud, based on Shamir's secret sharing and the permutation ordered binary (POB) number system, is proposed here. It distributes the image information into multiple random shares that individually reveal no information and can therefore be stored securely across distributed cloud data centers. Different image operations can be applied in the encrypted domain on the cloud servers themselves. This approach reduces the security threat that is a major hindrance to adopting cloud-based architectures. Only the authentic owner possessing the secret keys can restore the original image information from the random shares. Comparative results for both the plain and encrypted domains are presented to validate the efficiency of various image filtering operations, such as motion blur, unsharp masking, Wiener and Gaussian filtering, in the encrypted domain over the cloud.
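The (t, n) threshold idea behind the scheme can be illustrated with plain Shamir secret sharing over a prime field. This sketch shares a single pixel value and omits the paper's POB encoding of the shares; all names and parameters are illustrative:

```python
import random

PRIME = 257  # smallest prime > 255, so any 8-bit pixel value fits in the field

def make_shares(secret, t, n, prime=PRIME):
    """Split `secret` into n shares; any t of them reconstruct it."""
    coeffs = [secret] + [random.randrange(prime) for _ in range(t - 1)]
    shares = []
    for x in range(1, n + 1):
        y = 0
        for c in reversed(coeffs):  # Horner's rule: evaluate the polynomial at x
            y = (y * x + c) % prime
        shares.append((x, y))
    return shares

def reconstruct(shares, prime=PRIME):
    """Lagrange interpolation at x = 0 recovers the secret."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % prime
                den = (den * (xi - xj)) % prime
        # pow(den, prime - 2, prime) is the modular inverse of den
        secret = (secret + yi * num * pow(den, prime - 2, prime)) % prime
    return secret

pixel = 173
shares = make_shares(pixel, t=3, n=5)
assert reconstruct(shares[:3]) == pixel   # any 3 shares suffice
```

Fewer than t shares reveal nothing about the pixel, which is what lets the random shares sit on untrusted cloud data centers.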
Citations: 18
MUSA: a banana database for ripening level determination
Pub Date : 2016-12-18 DOI: 10.1145/3009977.3009996
Senthilarasi M, Md Mansoor Roomi S, Sheik Naveedh A
Ripening treatment of bananas is accomplished worldwide with controlled ethylene gas, temperature, airflow, humidity and time. During ripening, the peel colour of a banana changes from green to yellow with brown spots. Over its shelf life, a banana undergoes significant changes in quality indices and colour, which affect characteristics such as softness, sweetness and taste. An automatic control system can therefore monitor the ripening level of bananas to maintain peel colour, firm pulp and texture. Appropriate datasets are required for the experimentation and evaluation of ripening-level determination algorithms. This paper presents a database for Musa species (yellow bananas) at different ripening levels: unripe, ripe and overripe. Rasthali (Musa AAB) and Monthan (Musa ABB) hands were chosen as samples to create the database. The MUSA database comprises 3108 banana images acquired at 7 view angles and 12 rotations under constant illumination. The usefulness of the MUSA database is demonstrated with state-of-the-art ripening-level determination algorithms.
Citations: 1
CRF based method for curb detection using semantic cues and stereo depth
Pub Date : 2016-12-18 DOI: 10.1145/3009977.3010058
Danish Sodhi, Sarthak Upadhyay, Dhaivat Bhatt, K. Krishna, S. Swarup
Curb detection is a critical component of driver-assistance and autonomous driving systems. In this paper, we present a discriminative approach to the problem of curb detection under diverse road conditions. We define curbs as the intersection of drivable and non-drivable areas, which are classified using dense conditional random fields (CRFs). In our method, we fuse the output of a neural network used for pixel-wise semantic segmentation with depth and color information from stereo cameras. The CRF fuses the output of the deep model with the height information available in the stereo data and provides improved segmentation. Further, we introduce temporal smoothness by using a weighted average of the SegNet output and the output of a probabilistic voxel grid as our unary potential. Finally, we show improvements over current state-of-the-art neural networks. Our proposed method gives accurate results over a large range of variations in curb curvature and appearance, without the need to retrain the model for a specific dataset.
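The unary potential described above, a weighted average of the SegNet probability and the voxel-grid probability, can be sketched as follows. The weight value and the negative-log conversion are illustrative assumptions, not the paper's exact formulation:

```python
import math

def fused_unary(p_segnet, p_voxel, w=0.7):
    """Unary potential per pixel: weighted average of two drivable-area
    probability maps, converted to an energy that CRF inference minimizes."""
    fused = [w * a + (1 - w) * b for a, b in zip(p_segnet, p_voxel)]
    return [-math.log(max(p, 1e-9)) for p in fused]  # clamp avoids log(0)

# A pixel both sources agree is drivable gets a low (cheap) potential
confident = fused_unary([0.9], [0.9])[0]
uncertain = fused_unary([0.1], [0.1])[0]
```

Feeding the voxel grid's temporally accumulated probability into the unary term is what gives the per-frame segmentation its temporal smoothness.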
Citations: 10
Exposing splicing forgeries in digital images through dichromatic plane histogram discrepancies
Pub Date : 2016-12-18 DOI: 10.1145/3009977.3010032
A. Mazumdar, P. Bora
One common image forgery technique is splicing, where parts of different images are copied and pasted into a single image. This paper proposes a new forensic method for detecting splicing forgeries in images containing human faces. Our approach is based on extracting an illumination signature from the faces of the people present in an image using the dichromatic reflection model (DRM). The dichromatic plane histogram (DPH), computed by applying the 2D Hough transform to the face images, is used as the illumination signature. A correlation measure computes the similarity between the DPHs obtained from the different faces present in an image. Finally, a simple threshold on this similarity measure exposes splicing forgeries in the image. Experimental results show the efficacy of the proposed method.
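The final decision step can be sketched as a standard Pearson correlation between two DPHs followed by a threshold. The bin values and the threshold below are illustrative, not the paper's calibrated settings:

```python
def correlation(h1, h2):
    """Pearson correlation between two equal-length histograms."""
    n = len(h1)
    m1, m2 = sum(h1) / n, sum(h2) / n
    num = sum((a - m1) * (b - m2) for a, b in zip(h1, h2))
    d1 = sum((a - m1) ** 2 for a in h1) ** 0.5
    d2 = sum((b - m2) ** 2 for b in h2) ** 0.5
    return num / (d1 * d2)

def is_spliced(dph_a, dph_b, threshold=0.5):
    """Flag a splice when two faces' illumination signatures disagree:
    faces lit by the same source should have highly correlated DPHs."""
    return correlation(dph_a, dph_b) < threshold
```

For example, `correlation([1, 2, 3], [2, 4, 6])` is 1.0 (identical shape), while two DPHs peaking in opposite bins correlate negatively and get flagged.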
Citations: 10
Learning to hash-tag videos with Tag2Vec
Pub Date : 2016-12-13 DOI: 10.1145/3009977.3010035
A. Singh, Saurabh Saini, R. Shah, P J Narayanan
User-given tags or labels are valuable resources for semantic understanding of visual media such as images and videos. Recently, a new type of labeling mechanism known as hash-tags has become increasingly popular on social media sites. In this paper, we study the problem of generating relevant and useful hash-tags for short video clips. Traditional data-driven approaches to tag enrichment and recommendation use direct visual similarity for label transfer and propagation. We instead learn a direct, low-cost mapping from videos to hash-tags using a two-step training process. We first employ a natural language processing (NLP) technique, skip-gram models trained with a neural network, to learn a low-dimensional vector representation of hash-tags (Tag2Vec) from a corpus of ∼10 million hash-tags. We then train an embedding function to map video features into the low-dimensional Tag2Vec space. We learn this embedding for 29 categories of short video clips with hash-tags. A query video without any tag information can then be mapped directly into the tag vector space using the learned embedding, and relevant tags can be found by a simple nearest-neighbor retrieval in the Tag2Vec space. We validate the relevance of the tags suggested by our system qualitatively and quantitatively with a user study.
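The retrieval step can be sketched as brute-force cosine nearest neighbors in the Tag2Vec space. The toy tags and 2-D vectors below stand in for the learned high-dimensional embeddings:

```python
def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = sum(a * a for a in u) ** 0.5
    nv = sum(b * b for b in v) ** 0.5
    return dot / (nu * nv)

def suggest_tags(video_vec, tag_vecs, k=2):
    """Rank hash-tags by cosine similarity of their Tag2Vec vectors
    to the embedded video signature; return the top k."""
    ranked = sorted(tag_vecs, key=lambda t: cosine(video_vec, tag_vecs[t]),
                    reverse=True)
    return ranked[:k]

# Hypothetical 2-D tag space; real Tag2Vec vectors are much higher-dimensional
tag_space = {"#skate": [0.9, 0.1], "#surf": [0.8, 0.3], "#cook": [0.0, 1.0]}
```

A query video embedded near the sports tags, e.g. `suggest_tags([1.0, 0.2], tag_space)`, retrieves `["#skate", "#surf"]`.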
Citations: 1
Deep fusion of visual signatures for client-server facial analysis
Pub Date : 2016-11-01 DOI: 10.1145/3009977.3010062
Binod Bhattarai, Gaurav Sharma, F. Jurie
Facial analysis is a key technology for enabling human-machine interaction. In this context, we present a client-server framework in which a client transmits the signature of a face to be analyzed to the server, and the server sends back various information describing the face, e.g. whether the person is male or female, bald, has a mustache, etc. We assume that a client can compute one (or a combination) of several visual features, from very simple and efficient ones, like Local Binary Patterns, to more complex and computationally heavy ones, like Fisher Vectors and CNN-based features, depending on the computing resources available. The challenge addressed in this paper is to design a common universal representation such that a single merged signature is transmitted to the server, whatever the type and number of features computed by the client, while still ensuring optimal performance. Our solution is based on learning a common optimal subspace for aligning the different face features and merging them into a universal signature. We validate the proposed method on the challenging CelebA dataset, on which it outperforms existing state-of-the-art methods when a rich representation is available at test time, while giving competitive performance when only simple signatures (like LBP) are available at test time due to resource constraints on the client.
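One way to picture the universal signature is as learned linear projections of each feature type into a shared subspace, averaged into a single fixed-length vector. The tiny matrices below are toy stand-ins for the learned subspace; the paper's actual learning objective is not reproduced here:

```python
def project(feat, W):
    """Apply a linear projection; each row of W produces one output dimension."""
    return [sum(w * x for w, x in zip(row, feat)) for row in W]

def universal_signature(features, projections):
    """Merge whatever feature types the client computed into one
    fixed-length signature by averaging their subspace projections."""
    projected = [project(f, projections[name]) for name, f in features.items()]
    dim = len(projected[0])
    return [sum(p[i] for p in projected) / len(projected) for i in range(dim)]

# Hypothetical projections of a 3-D "lbp" and a 4-D "cnn" feature
# into a shared 2-D subspace (learned matrices in practice)
W = {"lbp": [[1, 0, 0], [0, 1, 0]],
     "cnn": [[1, 0, 0, 0], [0, 1, 0, 0]]}
```

The server only ever sees the merged 2-D (in practice, d-dimensional) signature, regardless of which features the client could afford to compute.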
Citations: 3
Realtime motion detection based on the spatio-temporal median filter using GPU integral histograms
Pub Date : 2012-12-16 DOI: 10.1145/2425333.2425352
M. Poostchi, K. Palaniappan, F. Bunyak, G. Seetharaman
Motion detection using background modeling is a widely used technique in object tracking. To meet the demands of real-time multi-target tracking in large and/or high-resolution imagery, fast parallel algorithms for motion detection are desirable. One common method for background modeling is an adaptive 3D median filter that is updated appropriately based on the video sequence. We describe a parallel 3D spatio-temporal median filter algorithm implemented in CUDA for many-core Graphics Processing Unit (GPU) architectures, using the integral histogram as a building block to support adaptive window sizes. Both 2D and 3D median filters are also widely used in many other computer vision tasks such as denoising, segmentation, and recognition. Although fast sequential median algorithms exist, parallelization is attractive for reducing the time needed for motion detection, in order to support more complex processing in multi-target tracking systems, large high-resolution aerial video imagery and 3D volumetric processing. Results show that the frame rate of the GPU implementation was 60 times faster than the CPU version for a 1K x 1K image, reaching 49 frames/sec, and 21 times faster for 512 x 512 frames, reaching 194 frames/sec. We characterize the performance of the parallel 3D median filter for different image sizes and varying numbers of histogram bins, and show selected results for motion detection.
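The reason integral histograms help is that the median of any window can be read off its histogram with a single cumulative scan, so no per-window sort is needed. A minimal sketch, independent of the CUDA implementation:

```python
def median_from_histogram(hist, total):
    """Median of the pixels summarized by `hist` (hist[v] = count of value v).
    With integral histograms, the histogram of any rectangular window is
    obtained in O(bins), so this scan replaces sorting the window."""
    half = (total + 1) // 2
    running = 0
    for value, count in enumerate(hist):
        running += count
        if running >= half:
            return value
    raise ValueError("histogram counts do not sum to total")

# A 3x3 window containing the values 0..8 once each has median 4
print(median_from_histogram([1] * 9, 9))  # prints 4
```

Per-thread, the GPU kernel does exactly this kind of O(bins) scan over a window histogram differenced from the integral histogram, which is what makes adaptive window sizes cheap.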
Citations: 6
Objective evaluation of noisy multimodal medical image fusion using Daubechies complex wavelet transform
Pub Date : 2012-12-16 DOI: 10.1145/2425333.2425405
Rajiv Singh, A. Khare
Medical image fusion needs proper attention, as images obtained from medical instruments have poor contrast and are corrupted by blur and noise due to imperfections of image-capture devices. Objective evaluation of medical image fusion techniques in the noisy domain has thus become an important task. In the present work, we propose maximum-selection and energy-based fusion rules for the evaluation of noisy multimodal medical image fusion using the Daubechies complex wavelet transform (DCxWT). Unlike traditional real-valued wavelet transforms, which suffer from shift sensitivity and do not provide any phase information, the DCxWT is shift invariant and provides phase information through its imaginary coefficients. The shift invariance and phase information of the DCxWT have been found useful for the fusion of multimodal medical images. Experiments were performed on several sets of noisy medical images at multiple noise levels, testing the proposed fusion scheme up to the maximum level of Gaussian, salt & pepper and speckle noise. Objective evaluation of the proposed scheme uses fusion factor, fusion symmetry, entropy, standard deviation and edge-information metrics. Results are shown for two sets of multimodal medical images with the maximum-selection and energy-based fusion rules, and comparison is made with lifting wavelet transform (LWT)- and stationary wavelet transform (SWT)-based fusion methods. Comparative analysis at different noise levels shows the superiority of the proposed scheme, and plots of the fusion metrics against the maximum level of Gaussian, salt & pepper and speckle noise show its robustness to noise.
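The maximum-selection rule reduces to keeping, at each position, the source coefficient of larger magnitude, which extends naturally to the complex coefficients of the DCxWT via their modulus. A minimal sketch over flat coefficient lists (the per-subband bookkeeping of a real transform is omitted):

```python
def fuse_max(coeffs_a, coeffs_b):
    """Maximum-selection fusion: at each position keep the coefficient with
    the larger magnitude. abs() works for both real and complex values."""
    return [a if abs(a) >= abs(b) else b for a, b in zip(coeffs_a, coeffs_b)]

# Complex DCxWT-style coefficients: |3+4j| = 5 beats 2; |-5| beats |1|
fused = fuse_max([3 + 4j, 1], [2, -5])  # -> [(3+4j), -5]
```

The energy-based rule mentioned in the abstract would instead compare local energies (sums of squared magnitudes over a neighbourhood) before selecting, trading some detail for noise robustness.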
Citations: 3
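The maximum-selection fusion rule described in the abstract above — keep, at each position, the wavelet coefficient with the larger magnitude — can be sketched in a few lines. The sketch below is a minimal NumPy illustration that substitutes a plain one-level 2D Haar transform for the Daubechies complex wavelet transform, so it demonstrates the fusion rule only, not the authors' DCxWT pipeline; the function names are illustrative.

```python
import numpy as np

def haar2(x):
    """One-level 2D Haar analysis of an even-sized array -> (LL, LH, HL, HH)."""
    r = 1.0 / np.sqrt(2.0)
    lo = (x[:, 0::2] + x[:, 1::2]) * r          # column-pair averages
    hi = (x[:, 0::2] - x[:, 1::2]) * r          # column-pair differences
    ll = (lo[0::2] + lo[1::2]) * r
    lh = (lo[0::2] - lo[1::2]) * r
    hl = (hi[0::2] + hi[1::2]) * r
    hh = (hi[0::2] - hi[1::2]) * r
    return ll, lh, hl, hh

def ihaar2(ll, lh, hl, hh):
    """Exact inverse of haar2 (perfect reconstruction)."""
    r = 1.0 / np.sqrt(2.0)
    lo = np.empty((ll.shape[0] * 2, ll.shape[1]))
    lo[0::2] = (ll + lh) * r
    lo[1::2] = (ll - lh) * r
    hi = np.empty_like(lo)
    hi[0::2] = (hl + hh) * r
    hi[1::2] = (hl - hh) * r
    x = np.empty((lo.shape[0], lo.shape[1] * 2))
    x[:, 0::2] = (lo + hi) * r
    x[:, 1::2] = (lo - hi) * r
    return x

def fuse_max(img_a, img_b):
    """Maximum-selection fusion: pick, per coefficient, the larger-magnitude one."""
    fused = [np.where(np.abs(a) >= np.abs(b), a, b)
             for a, b in zip(haar2(img_a), haar2(img_b))]
    return ihaar2(*fused)
```

The energy-based rule mentioned in the abstract would differ only in the selection step, comparing local coefficient energy over a small window instead of pointwise magnitude.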
On the use of regions for semantic image segmentation
Pub Date : 2012-12-16 DOI: 10.1145/2425333.2425384
Rui Hu, Diane Larlus, G. Csurka
There is a general trend in recent methods to use image regions (i.e. super-pixels) obtained in an unsupervised way to enhance the semantic image segmentation task. This paper proposes a detailed study on the role and the benefit of using these regions, at different steps of the segmentation process. For the purpose of this benchmark, we propose a simple system for semantic segmentation that uses a hierarchy of regions. A patch based system with similar settings is compared, which allows us to evaluate the contribution of each component of the system. Both systems are evaluated on the standard MSRC-21 dataset and obtain competitive results. We show that the proposed region based system can achieve good results without any complex regularization, while its patch based counterpart becomes competitive when using image prior and regularization methods. The latter benefit more from a CRF based regularization, yielding to state-of-the-art results with simple constraints based only on the leaf regions exploited in the pairwise potential.
{"title":"On the use of regions for semantic image segmentation","authors":"Rui Hu, Diane Larlus, G. Csurka","doi":"10.1145/2425333.2425384","DOIUrl":"https://doi.org/10.1145/2425333.2425384","url":null,"abstract":"There is a general trend in recent methods to use image regions (i.e. super-pixels) obtained in an unsupervised way to enhance the semantic image segmentation task. This paper proposes a detailed study on the role and the benefit of using these regions, at different steps of the segmentation process. For the purpose of this benchmark, we propose a simple system for semantic segmentation that uses a hierarchy of regions. A patch based system with similar settings is compared, which allows us to evaluate the contribution of each component of the system. Both systems are evaluated on the standard MSRC-21 dataset and obtain competitive results. We show that the proposed region based system can achieve good results without any complex regularization, while its patch based counterpart becomes competitive when using image prior and regularization methods. The latter benefit more from a CRF based regularization, yielding to state-of-the-art results with simple constraints based only on the leaf regions exploited in the pairwise potential.","PeriodicalId":93806,"journal":{"name":"Proceedings. Indian Conference on Computer Vision, Graphics & Image Processing","volume":"183 1","pages":"51"},"PeriodicalIF":0.0,"publicationDate":"2012-12-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77103893","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 8
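The region-based aggregation this paper studies — classify pixels, then let each superpixel take a single label — can be sketched independently of any particular superpixel algorithm. The minimal NumPy sketch below assumes per-pixel class scores and a precomputed integer region map (e.g. from an unsupervised over-segmentation); `region_vote` is an illustrative name, not from the paper.

```python
import numpy as np

def region_vote(pixel_scores, regions):
    """Aggregate per-pixel class scores over superpixel regions.

    pixel_scores: (H, W, C) class scores per pixel
    regions:      (H, W) integer superpixel labels
    Returns an (H, W) label map where every pixel of a region gets the
    argmax of that region's mean class score.
    """
    h, w, c = pixel_scores.shape
    flat_r = regions.ravel()
    flat_s = pixel_scores.reshape(-1, c)
    labels = np.empty(h * w, dtype=np.int64)
    for rid in np.unique(flat_r):
        mask = flat_r == rid
        labels[mask] = flat_s[mask].mean(axis=0).argmax()
    return labels.reshape(h, w)
```

Averaging scores inside a region is what makes the output spatially homogeneous; a CRF-based regularization, as the paper discusses, would instead couple neighbouring regions through pairwise potentials.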
Matte based generation of land cover maps
Pub Date : 2012-12-16 DOI: 10.1145/2425333.2425373
K. Bahirat, S. Chaudhuri
A novel supervised technique for the generation of spatially consistent land cover maps based on class-matting is presented in this paper. This method takes advantage of both standard supervised classification technique and natural image matting. It adaptively exploits the spatial contextual information contained in the neighborhood of each pixel through the use of image matting to reduce the incongruence inherent in pixel-wise, radiometric classification of multi-spectral remote sensing data, providing a more spatially homogeneous land-cover map besides yielding a better accuracy. In order to make image matting possible for N-class land cover map generation, we extend the basic alpha matting problem into N independent matting problems, each conforming to one particular class. The user input required for the alpha matting algorithm in terms of initially identifying a few sample regions belonging to a particular class (known as the foreground object in matting) is obtained automatically using the supervised ML classifier. Experimental results obtained on multispectral data sets confirm the effectiveness of the proposed system.
{"title":"Matte based generation of land cover maps","authors":"K. Bahirat, S. Chaudhuri","doi":"10.1145/2425333.2425373","DOIUrl":"https://doi.org/10.1145/2425333.2425373","url":null,"abstract":"A novel supervised technique for the generation of spatially consistent land cover maps based on class-matting is presented in this paper. This method takes advantage of both standard supervised classification technique and natural image matting. It adaptively exploits the spatial contextual information contained in the neighborhood of each pixel through the use of image matting to reduce the incongruence inherent in pixel-wise, radiometric classification of multi-spectral remote sensing data, providing a more spatially homogeneous land-cover map besides yielding a better accuracy. In order to make image matting possible for N-class land cover map generation, we extend the basic alpha matting problem into N independent matting problems, each conforming to one particular class. The user input required for the alpha matting algorithm in terms of initially identifying a few sample regions belonging to a particular class (known as the foreground object in matting) is obtained automatically using the supervised ML classifier. Experimental results obtained on multispectral data sets confirm the effectiveness of the proposed system.","PeriodicalId":93806,"journal":{"name":"Proceedings. Indian Conference on Computer Vision, Graphics & Image Processing","volume":"266 1","pages":"40"},"PeriodicalIF":0.0,"publicationDate":"2012-12-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77165103","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
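The paper's extension of binary alpha matting to N independent one-vs-rest matting problems can be illustrated with a toy per-class alpha solver. In the sketch below the real matting step is replaced, purely for illustration, by a softmax over negative distances to class means; the seed pixels stand in for the ML-classifier output that the paper uses to initialize each class. Everything here is an assumption, not the authors' solver.

```python
import numpy as np

def n_class_mattes(image, seeds):
    """Decompose an N-class soft labelling into N one-vs-rest alpha maps.

    image: (H, W, B) multispectral pixel values
    seeds: dict class_id -> (M, B) sample pixels for that class
    Returns (alphas, label_map): alphas has shape (N, H, W) and sums to 1
    per pixel; label_map is the per-pixel argmax over classes.
    """
    means = {c: s.mean(axis=0) for c, s in seeds.items()}
    classes = sorted(means)
    # Toy stand-in for a matting solver: distance of each pixel to each class mean
    d = np.stack([np.linalg.norm(image - means[c], axis=-1) for c in classes])
    alphas = np.exp(-d)
    alphas /= alphas.sum(axis=0, keepdims=True)   # normalize so alphas sum to 1
    label_map = np.asarray(classes)[alphas.argmax(axis=0)]
    return alphas, label_map
```

The structural point survives the simplification: each class gets its own soft (alpha) map, the maps are normalized to sum to one per pixel, and the hard land-cover label is the per-pixel argmax.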
Journal
Proceedings. Indian Conference on Computer Vision, Graphics & Image Processing