Latest publications from 2016 IEEE International Conference on Image Processing (ICIP)
Quality assessment of monocular 3D inference
Pub Date : 2016-09-01 DOI: 10.1109/ICIP.2016.7532508
Jorge Hernández
The recent proliferation of 3D inference methods offers an important alternative for perceiving the 3D structure of the real world from single images. The quality of 3D estimates produced by inference methods has so far been evaluated on datasets with 3D ground-truth data; in real scenarios, however, the 3D inference quality is completely unknown. In this work, we present a new quality assessment for monocular 3D inference. First, we define the notion of a quality index for 3D inference data. Then, we present a weighted linear model of similarity metrics to estimate the quality index. The method is based on hand-crafted similarity measures between image representations of the RGB image and the 3D inferred data. We demonstrate the effectiveness of the proposed method using public datasets and state-of-the-art 3D inference methods.
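The weighted linear model described in the abstract can be sketched in a few lines; the similarity values and weights below are purely illustrative placeholders, not the paper's hand-crafted metrics.

```python
# Hypothetical sketch of a weighted linear quality model: the quality
# index is a weighted sum of similarity measures computed between
# representations of the RGB image and the inferred 3D data.

def quality_index(similarities, weights):
    """Weighted linear combination of similarity scores in [0, 1]."""
    assert len(similarities) == len(weights)
    return sum(w * s for w, s in zip(weights, similarities))

# Example: three similarity measures (e.g. edge, gradient, and segment
# agreement between the RGB image and the inferred depth map).
sims = [0.8, 0.6, 0.9]       # hand-crafted similarity scores (made up)
w = [0.5, 0.3, 0.2]          # weights fitted by linear regression (made up)
q = quality_index(sims, w)   # 0.5*0.8 + 0.3*0.6 + 0.2*0.9 = 0.76
```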
Pages: 1002-1006
Citations: 0
Redundant frame structure using M-frame for interactive light field streaming
Pub Date : 2016-09-01 DOI: 10.1109/ICIP.2016.7532582
B. Motz, Gene Cheung, Antonio Ortega
A light field (LF) is a 2D array of closely spaced viewpoint images of a static 3D scene. In an interactive LF streaming (ILFS) scenario, a user successively requests desired neighboring viewpoints for observation, and in response the server must transmit pre-encoded data for correct decoding of the requested viewpoint images. Designing frame structures for ILFS is challenging: at encoding time it is not known which navigation path a user will take, making differential coding difficult to employ. In this paper, building on recent work on the merge operator (a new distributed source coding technique that efficiently merges differences among a set of side information (SI) frames into an identical reconstruction), we design redundant frame structures that facilitate ILFS, trading off expected transmission cost against total storage size. Specifically, we first propose a new view interaction model that captures the view navigation tendencies of typical users. Assuming a flexible one-frame buffer at the decoder, we then derive a set of recursive equations that compute the expected transmission cost over a navigation lifetime of T views, given the proposed interaction model and a pre-encoded frame structure. Finally, we propose an algorithm that greedily builds a redundant frame structure, minimizing a weighted sum of expected transmission cost and total storage size. Experimental results show that our proposed algorithm generates frame structures with better transmission/storage tradeoffs than competing schemes.
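The recursive expected-cost computation can be illustrated with a deliberately simplified model: a 1D array of views, a uniform left/right navigation model, and no decoder buffer state. All of these are assumptions for illustration, not the paper's setup.

```python
from functools import lru_cache

# Toy recursion for expected transmission cost: from view v the user
# requests v-1 or v+1 with equal probability (clamped at the borders),
# and serving view u costs cost[u] bits.

def expected_cost(cost, T, v0):
    """Expected cost of a T-step navigation starting at view v0."""
    V = len(cost)

    @lru_cache(maxsize=None)
    def E(v, t):
        if t == 0:
            return 0.0
        nbrs = [u for u in (v - 1, v + 1) if 0 <= u < V]
        p = 1.0 / len(nbrs)  # uniform over feasible neighbors
        return sum(p * (cost[u] + E(u, t - 1)) for u in nbrs)

    return E(v0, T)
```

Memoizing on (view, remaining steps) keeps the recursion polynomial rather than exponential in T, which is the point of deriving recursive equations.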
Pages: 1369-1373
Citations: 6
Spectral slopes for automated classification of land cover in landsat images
Pub Date : 2016-09-01 DOI: 10.1109/ICIP.2016.7533182
S. M. Aswatha, J. Mukhopadhyay, P. Biswas
In the literature, various techniques for supervised/semi-supervised classification of satellite imagery require manual selection of samples for each class. In this paper, we propose a spectral-slope-based classification technique that automates the initial labeling of a set of sample points. These points are subsequently used as training samples in a supervised classifier, which then classifies all the pixels in the image. We demonstrate the effectiveness of the proposed classification technique in summarizing changes across temporal image sets. For selecting the training samples from the satellite imagery, a set of rules is proposed using spectral-slope properties. We classify the land cover into three classes, namely water, vegetation, and vegetation-void, and validate the classification results using very high resolution satellite imagery. The approach has also been used in the analysis of images acquired by different sensors operating in similar wavelength ranges.
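A toy version of rule-based labeling from spectral slopes might look as follows; the band centers and thresholds are invented for illustration and are not the paper's calibrated rules.

```python
# A spectral slope is the reflectance change between two bands divided
# by the wavelength gap. Vegetation rises sharply from red to NIR,
# water falls (it absorbs NIR); everything else lands in between.
# Wavelengths in micrometers; thresholds are hypothetical.

def spectral_slope(r1, r2, wl1, wl2):
    return (r2 - r1) / (wl2 - wl1)

def label_pixel(nir, red, wl_red=0.66, wl_nir=0.84):
    s = spectral_slope(red, nir, wl_red, wl_nir)
    if s > 1.0:        # strong red-to-NIR rise: vegetation
        return "vegetation"
    elif s < -0.1:     # falling slope: water
        return "water"
    return "vegetation-void"
```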
Pages: 4354-4358
Citations: 9
Learning a perceptual manifold for image set classification
Pub Date : 2016-09-01 DOI: 10.1109/ICIP.2016.7533198
Sriram Kumar, A. Savakis
We present a biologically motivated manifold learning framework for image set classification, inspired by Independent Component Analysis for Grassmann manifolds. A Grassmann manifold is a collection of linear subspaces, such that each subspace is mapped to a single point on the manifold. We propose constructing Grassmann subspaces using Independent Component Analysis for robustness and improved class separation. The independent components capture spatially local information, similar to Gabor-like filters, within each subspace, resulting in better classification accuracy. We further apply linear discriminant analysis or sparse representation classification on the Grassmann manifold to achieve robust classification performance. We demonstrate the efficacy of our approach for image set classification on face and object recognition datasets.
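The Grassmann machinery underlying the abstract can be sketched as follows. For brevity this sketch builds the subspace basis with an SVD rather than the paper's Independent Component Analysis, and uses the arc-length metric, one of several common Grassmann distances.

```python
import numpy as np

# Each image set (feature matrix X, one column per image) is mapped to
# a point on the Grassmann manifold via an orthonormal k-dimensional
# basis; two sets are compared through the principal angles between
# their subspaces.

def subspace(X, k):
    """Orthonormal basis (d x k) spanning the top-k directions of X (d x n)."""
    U, _, _ = np.linalg.svd(X, full_matrices=False)
    return U[:, :k]

def grassmann_distance(A, B):
    """Arc-length metric: sqrt of the sum of squared principal angles."""
    s = np.clip(np.linalg.svd(A.T @ B, compute_uv=False), -1.0, 1.0)
    theta = np.arccos(s)  # principal angles
    return float(np.sqrt(np.sum(theta ** 2)))
```

Identical subspaces give distance 0; fully orthogonal k-dimensional subspaces give sqrt(k) * pi/2.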
Pages: 4433-4437
Citations: 2
Laplacian-guided image decolorization
Pub Date : 2016-09-01 DOI: 10.1109/ICIP.2016.7533132
Cosmin Ancuti, C. Ancuti
In this paper we introduce a novel decolorization strategy built on image fusion principles. Decolorization (color-to-grayscale) is an important transformation used in many monochrome image processing applications. We demonstrate that, aside from color spatial distribution, local information plays an important role in maintaining the discriminability of the image conversion. Our strategy blends the three color channels R, G, B guided by two weight maps that filter the local transitions and measure the dominant values of the regions using Laplacian information. To minimize artifacts introduced by the weight maps, our fusion approach is designed in a multi-scale fashion, using a Laplacian pyramid decomposition. Additionally, compared with most existing techniques, our straightforward technique has the advantage of being computationally efficient. We demonstrate that our technique is temporally coherent, making it suitable for decolorizing videos. A comprehensive qualitative and quantitative evaluation based on an objective visual descriptor demonstrates the utility of our decolorization technique.
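The channel-blending step can be sketched independently of how the weight maps are obtained; here the Laplacian-derived weight maps and the multi-scale pyramid are assumed away, and the per-pixel weights are simply taken as given.

```python
import numpy as np

# Fusion-based decolorization core: blend the R, G, B channels with
# per-pixel weight maps that are normalized to sum to one at every
# pixel, producing a single grayscale plane.

def fuse_channels(img, weights):
    """img: H x W x 3 color image, weights: H x W x 3 non-negative maps."""
    w = weights / np.clip(weights.sum(axis=2, keepdims=True), 1e-8, None)
    return (img * w).sum(axis=2)
```

With uniform weights this degenerates to the plain channel average; non-uniform, Laplacian-driven weights are what let the fusion preserve local contrast.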
Pages: 4107-4111
Citations: 10
Computer-aided diagnostic tool for early detection of prostate cancer
Pub Date : 2016-09-01 DOI: 10.1109/ICIP.2016.7532843
Islam Reda, A. Shalaby, F. Khalifa, M. Elmogy, A. Aboulfotouh, M. El-Ghar, Ehsan Hosseini-Asl, N. Werghi, R. Keynton, A. El-Baz
In this paper, we propose a novel non-invasive framework for the early diagnosis of prostate cancer from diffusion-weighted magnetic resonance imaging (DW-MRI). The proposed approach consists of three main steps. In the first step, the prostate is localized and segmented based on a new level-set model. In the second step, the apparent diffusion coefficient (ADC) of the segmented prostate volume is mathematically calculated for different b-values. To preserve continuity, the calculated ADC values are normalized and refined using a Generalized Gauss-Markov Random Field (GGMRF) image model. The cumulative distribution function (CDF) of refined ADC for the prostate tissues at different b-values are then constructed. These CDFs are considered as global features describing water diffusion which can be used to distinguish between benign and malignant tumors. Finally, a deep learning auto-encoder network, trained by a stacked non-negativity constraint algorithm (SNCAE), is used to classify the prostate tumor as benign or malignant based on the CDFs extracted from the previous step. Preliminary experiments on 53 clinical DW-MRI data sets resulted in 100% correct classification, indicating the high accuracy of the proposed framework and holding promise of the proposed CAD system as a reliable non-invasive diagnostic tool.
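The ADC computation in the second step follows the standard mono-exponential DW-MRI signal model, S_b = S_0 * exp(-b * ADC), so ADC = ln(S_0 / S_b) / b. The numbers below are a toy example; the GGMRF refinement and CDF construction from the paper are not reproduced.

```python
import math

# Per-voxel apparent diffusion coefficient from two acquisitions:
# s0 is the signal without diffusion weighting, sb the signal at
# b-value b (s/mm^2); the result is in mm^2/s.

def adc(s0, sb, b):
    return math.log(s0 / sb) / b

# Example: signal halves between b=0 and b=700 s/mm^2.
value = adc(1000.0, 500.0, 700.0)  # ln(2)/700, roughly 9.9e-4 mm^2/s
```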
Pages: 2668-2672
Citations: 21
Automatic character labeling for camera captured document images
Pub Date : 2016-09-01 DOI: 10.1109/ICIP.2016.7532967
Wei-liang Fan, K. Kise, M. Iwamura
Character ground truth for camera-captured documents is crucial for training and evaluating advanced OCR algorithms. Manually generating character-level ground truth is a time-consuming and costly process. This paper proposes a robust ground-truth generation method based on document retrieval and image registration for camera-captured documents. We use an elastic non-rigid alignment method to fit the captured document image, which relaxes the flat-paper assumption made by conventional solutions. The proposed method allows building very large-scale labeled datasets of camera-captured documents without any human intervention. We construct a large labeled dataset consisting of 1 million camera-captured Chinese character images. Evaluation of samples generated by our approach showed that 99.99% of the images were correctly labeled, even under camera-specific distortions such as blur, specularity and perspective distortion.
Pages: 3284-3288
Citations: 0
Where do emotions come from? Predicting the Emotion Stimuli Map
Pub Date : 2016-09-01 DOI: 10.1109/ICIP.2016.7532430
Kuan-Chuan Peng, Amir Sadovnik, Andrew C. Gallagher, Tsuhan Chen
Which parts of an image evoke emotions in an observer? To answer this question, we introduce a novel problem in computer vision - predicting an Emotion Stimuli Map (ESM), which describes pixel-wise contribution to evoked emotions. Building a new image database, EmotionROI, as a benchmark for predicting the ESM, we find that the regions selected by saliency and objectness detection do not correctly predict the image regions which evoke emotion. Although objects represent important regions for evoking emotion, parts of the background are also important. Based on this fact, we propose using fully convolutional networks for predicting the ESM. Both qualitative and quantitative experimental results confirm that our method can predict the regions which evoke emotion better than both saliency and objectness detection.
Pages: 614-618
Citations: 63
Video system for human attribute analysis using compact convolutional neural network
Pub Date : 2016-09-01 DOI: 10.1109/ICIP.2016.7532424
Yi Yang, F. Chen, Xiaoming Chen, Yan Dai, Zhenyang Chen, Jiang Ji, Tong Zhao
Convolutional neural networks show their advantage in human attribute analysis (e.g. age, gender and ethnicity). However, they face issues (e.g. robustness and responsiveness) when deployed in an intelligent video system. We propose a compact CNN model and apply it in our video system, motivated by full consideration of performance and usability. With the proposed web image mining and labelling strategy, we construct a large training set that covers various image conditions. The proposed CNN model achieves a mean absolute error (MAE) of 3.23 years on the Morph 2 dataset, using the same test policy as our counterparts. To our knowledge, this is the state-of-the-art score for CNN-based age estimation. The proposed video analysis system employs this compact CNN model and demonstrates good performance in both dataset tests and deployment in real-world environments.
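The reported 3.23 years on Morph 2 is a mean absolute error; as a reminder of the metric, with made-up ages:

```python
# MAE for age estimation: average absolute difference between the
# predicted and the true age over the test set, in years.

def mae(pred, true):
    return sum(abs(p - t) for p, t in zip(pred, true)) / len(pred)

ages_true = [25, 40, 33]
ages_pred = [28, 37, 33]
err = mae(ages_pred, ages_true)  # (3 + 3 + 0) / 3 = 2.0 years
```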
Pages: 584-588
Citations: 9
A weighted total variation approach for the atlas-based reconstruction of brain MR data
Pub Date : 2016-09-01 DOI: 10.1109/ICIP.2016.7533177
Mingli Zhang, Kuldeep Kumar, Christian Desrosiers
Compressed sensing is a powerful approach to reconstruct high-quality images using a small number of samples. This paper presents a novel compressed sensing method that uses a probabilistic atlas to impose spatial constraints on the reconstruction of brain magnetic resonance imaging (MRI) data. A weighted total variation (TV) model is proposed to characterize the spatial distribution of gradients in the brain, and incorporate this information in the reconstruction process. Experiments on T1-weighted MR images from the ABIDE dataset show our proposed method to outperform the standard uniform TV model, as well as state-of-the-art approaches, for low sampling rates and high noise levels.
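A discrete, anisotropic weighted TV term (one plausible reading of the model; the paper's exact discretization and atlas-derived weights may differ) can be written as:

```python
import numpy as np

# Weighted total variation: each pixel's forward differences are
# scaled by a per-pixel weight (here a placeholder for the
# atlas-derived weight map) before summation.

def weighted_tv(x, w):
    """x: H x W image, w: H x W non-negative weight map."""
    dx = np.abs(np.diff(x, axis=1))  # horizontal forward differences
    dy = np.abs(np.diff(x, axis=0))  # vertical forward differences
    return float((w[:, :-1] * dx).sum() + (w[:-1, :] * dy).sum())
```

In the reconstruction objective this term is minimized alongside a data-fidelity term on the sampled k-space measurements; low weights let the atlas permit sharper gradients where the brain is expected to have them.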
Pages: 4329-4333
Citations: 7