
Latest publications: 2017 Seventh International Conference on Image Processing Theory, Tools and Applications (IPTA)

Visual place recognition with CNNs: From global to partial
Zhe Xin, Xiaoguang Cui, Jixiang Zhang, Yiping Yang, Yanqing Wang
Visual place recognition is one of the most challenging problems in computer vision, due to the large diversity that real-world places can exhibit. Recently, visual place recognition has become a key part of loop-closure detection and topological localization in long-term mobile robot autonomy. In this work, we build a novel visual place recognition pipeline composed of a filtering stage followed by a partial reranking process. In the filtering stage, image-wise features are used to find a small set of potential places. Afterwards, stable region-wise landmarks are extracted for more accurate matching in the partial reranking process. All global and partial image representations are derived from pre-trained Convolutional Neural Networks (CNNs), and the landmarks are extracted by object-proposal techniques. Moreover, a new similarity measurement is introduced that considers both the spatial and the scale distribution of landmarks. Compared with current methods that consider only scale distribution, the presented similarity measurement effectively improves recognition precision and robustness. Experiments with varied viewpoints and environmental conditions demonstrate that the proposed method achieves superior performance against state-of-the-art methods.
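The two-stage pipeline described above (global-descriptor filtering followed by landmark-based reranking) can be sketched as follows. The descriptor shapes, the cosine-similarity filter, and the mean-of-best-matches reranking score are illustrative stand-ins, not the paper's exact spatial/scale-aware similarity measurement:

```python
import numpy as np

def cosine_sim(a, b):
    """Cosine similarity between one query vector and each row of a matrix."""
    a = a / np.linalg.norm(a)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return b @ a

def filter_then_rerank(query_global, db_global, query_landmarks, db_landmarks, top_k=3):
    """Stage 1: rank database places by global-descriptor similarity and keep
    the top_k candidates.  Stage 2: re-rank those candidates by the mean of
    the best landmark-to-landmark matches (a simplified stand-in for the
    paper's similarity that also weighs spatial and scale distribution)."""
    cand = np.argsort(-cosine_sim(query_global, db_global))[:top_k]
    scores = []
    for idx in cand:
        sims = query_landmarks @ db_landmarks[idx].T   # pairwise landmark similarity
        scores.append(sims.max(axis=1).mean())         # best match per query landmark
    return cand[np.argsort(-np.asarray(scores))]
```

The cheap global stage prunes the database so the expensive landmark matching only runs on a handful of candidates.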
DOI: 10.1109/IPTA.2017.8310121
Citations: 3
A closed-form expression for thin lens image irradiance
Robert D. Friedlander, A. Yezzi
Computer vision tasks often have the goal of inferring geometric and radiometric information about a 3D environment given limited sensing resources. It is helpful to develop relationships between these real-world properties and the actual measurements that are taken. To this end we propose a new relationship between object radiance and image irradiance based on power conservation and a thin lens imaging model. The relationship has a closed-form solution for in-focus points and can be solved via numerical integration for points that are not focused. It can be thought of as a generalization of Horn's irradiance equation. Through both numerical simulations and comparison with the intensity values of actual images, our equation is shown to provide better accuracy than Horn's equation. Improvement is most notable for near-focused images where the pinhole imaging model implicit in Horn's derivation breaks down. Outside of this regime, our model validates the use of Horn's approximation.
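For reference, Horn's classical image-irradiance equation, which the paper generalizes, relates scene radiance L to image irradiance E through the aperture diameter d, focal length f, and off-axis angle α:

```python
import math

def horn_irradiance(L, d, f, alpha):
    """Horn's image-irradiance equation under the pinhole approximation:
    E = L * (pi/4) * (d/f)**2 * cos(alpha)**4,
    where L is scene radiance, d the aperture diameter, f the focal
    length, and alpha the angle between the ray and the optical axis."""
    return L * (math.pi / 4.0) * (d / f) ** 2 * math.cos(alpha) ** 4
```

The paper's closed-form expression reduces to this approximation away from the near-focus regime, which is where the two models diverge.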
DOI: 10.1109/IPTA.2017.8310133
Citations: 1
Sample-based regularization for support vector machine classification
D. Tran, Muhammad-Adeel Waris, M. Gabbouj, Alexandros Iosifidis
In this paper, we propose a new regularization scheme for the well-known Support Vector Machine (SVM) classifier that operates at the training-sample level. The proposed approach is motivated by the fact that maximum-margin classification defines decision functions as a linear combination of the selected training data and, thus, variations in training-sample selection directly affect generalization performance. We show that the proposed regularization scheme is well motivated and intuitive. Experimental results show that it outperforms standard SVM in human action recognition tasks as well as classical recognition problems.
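The observation that a maximum-margin decision function is a linear combination over the selected training samples can be made concrete. The RBF kernel and the toy coefficients below are illustrative, not the paper's setup:

```python
import numpy as np

def rbf_kernel(x, y, gamma=1.0):
    """Gaussian RBF kernel K(x, y) = exp(-gamma * ||x - y||^2)."""
    return np.exp(-gamma * np.sum((x - y) ** 2))

def svm_decision(x, support_vectors, dual_coefs, bias, gamma=1.0):
    """Kernel SVM decision function f(x) = sum_i alpha_i*y_i*K(x_i, x) + b:
    a linear combination over the selected (support) training samples,
    which is why perturbing the sample selection perturbs f directly."""
    return sum(c * rbf_kernel(sv, x, gamma)
               for sv, c in zip(support_vectors, dual_coefs)) + bias
```

Because only the support vectors enter the sum, any regularizer acting on which samples are selected reshapes the decision boundary itself.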
DOI: 10.1109/IPTA.2017.8310103
Citations: 0
Lossless light-field compression using reversible colour transformations
João M. Santos, P. Assunção, L. Cruz, Luis M. N. Tavora, R. Fonseca-Pinto, S. Faria
Recent advances in Light Field acquisition and rendering are pushing research efforts towards increasingly efficient methods to encode this particular type of data. Light Field image compression is of the utmost importance, not only because of the large amount of data required for its representation but also because of the quality requirements of many applications and computational photography methods. This paper presents a study of the impact of reversible colour transformations and alternative data arrangements on Light Field lossless coding. The experimental results indicate that the RCT reversible transform consistently achieves the highest compression performance across all data arrangements and lossless encoders. In particular, the best results are obtained with MRP when encoding the stack of sub-aperture images using a spiral scan order, achieving 6.41 bpp on average.
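The RCT evaluated in the paper is the reversible colour transform of JPEG 2000, which maps RGB to a luma/chroma space using only integer arithmetic so the inverse is exact. A minimal sketch:

```python
import numpy as np

def rct_forward(r, g, b):
    """JPEG 2000 reversible colour transform (integer, lossless):
    Y = floor((R + 2G + B) / 4), Cb = B - G, Cr = R - G."""
    y = (r + 2 * g + b) // 4
    cb = b - g
    cr = r - g
    return y, cb, cr

def rct_inverse(y, cb, cr):
    """Exact integer inverse: G = Y - floor((Cb + Cr) / 4), then
    R = Cr + G and B = Cb + G recover the original channels bit-for-bit."""
    g = y - (cb + cr) // 4
    r = cr + g
    b = cb + g
    return r, g, b
```

Because the floor divisions cancel exactly in the inverse, no rounding error is introduced, which is what makes the transform usable in a lossless pipeline.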
DOI: 10.1109/IPTA.2017.8310154
Citations: 10
A multiresolution DCT-based blind blur quality measure
F. Kerouh, D. Ziou, A. Serir
The paper deals with assessing the amount of blur in images. Blur is a common artefact that attenuates the high-frequency components of an image. The main idea turns on analysing the frequency response at transitions across resolutions. To achieve this, the histogram of the multiresolution DCT coefficients is modelled with an exponential probability density function (pdf). The steepness of the pdf is used as a cue to characterize the blur effect. Reliable scores are obtained when testing the proposed approach on five image collections. The proposed measure is validated on the JPEG2000 lossy compression algorithm and the Lucy-Richardson iterative deblurring approach.
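A minimal sketch of the idea, assuming a plain orthonormal 2-D DCT in place of the paper's multiresolution decomposition: the maximum-likelihood estimate of the exponential rate on the coefficient magnitudes is λ = 1/mean, and a blurrier image (weaker high frequencies) yields a steeper, i.e. larger, λ:

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix: row k is sqrt(2/n)*cos(pi*(2i+1)*k/(2n)),
    with the first row scaled to sqrt(1/n)."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)
    return m

def blur_steepness(img):
    """Fit an exponential pdf to the magnitudes of the non-DC DCT
    coefficients; the MLE is lambda = 1 / mean(|c|).  Blurring attenuates
    high frequencies, shrinks the mean magnitude, and raises lambda."""
    n, m = img.shape
    coeffs = dct_matrix(n) @ img @ dct_matrix(m).T   # separable 2-D DCT
    mags = np.abs(coeffs).ravel()[1:]                # drop the DC coefficient
    return 1.0 / mags.mean()
```

On real images the paper applies this per resolution level and pools the steepness values into a single blind quality score.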
DOI: 10.1109/IPTA.2017.8310086
Citations: 0
No-reference image quality assessment using Gabor-based smoothness and latent noise estimation
Vineet Kumar, R. Chouhan
No-reference image quality assessment is a challenging task due to the absence of a reference image in practical situations to quantify image quality. This paper proposes a new no-reference image quality metric for natural images using latent noise estimation, Gabor response, and contrast deviation. The algorithm employs an extension of gradient-based SSIM into the no-reference application using SVD-based AWGN estimation, and defines attributes such as Gabor-based smoothness and contrast deviation. The proposed metric arrives at an overall quality score by computing a linear weighted summation of the three image attributes. The proposed algorithm has been tested on several public databases (i.e. LIVE, TID 2013 and CSIQ), and the overall results display a noteworthy correlation of nearly 80% with the human visual system.
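The final pooling step is a plain linear weighted sum of the three attributes; the weights below are illustrative placeholders, not the paper's fitted values:

```python
def quality_score(smoothness, noise, contrast_dev, w=(0.4, 0.3, 0.3)):
    """Overall no-reference quality score as a linear weighted sum of the
    Gabor-based smoothness, latent-noise, and contrast-deviation attributes.
    The weights here are placeholders and would be tuned on a training set."""
    return w[0] * smoothness + w[1] * noise + w[2] * contrast_dev
```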
DOI: 10.1109/IPTA.2017.8310104
Citations: 2
Combining left and right wrist vein images for personal verification
Mohamed Cheniti, Z. Akhtar, N. Boukezzoula, T. Falk
Multibiometric systems that fuse information from different sources are able to alleviate the limitations of unimodal biometric systems. In this paper, we propose a multibiometric framework to identify people using their left and right wrist vein patterns. The framework uses a fast and robust preprocessing and feature extraction method. A generic score-level fusion approach is proposed to integrate the scores from left and right wrist vein patterns using the Dubois and Prade triangular norm (t-norm). Experiments on the publicly available PUT wrist vein dataset show that the proposed multibiometric framework outperforms the unimodal systems, their fusion using other t-norm techniques, and existing wrist vein recognition methods.
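The fusion rule is the Dubois-Prade parametric t-norm, T(a, b) = ab / max(a, b, α) with α in [0, 1]. A small sketch, assuming both matcher scores are already normalised to [0, 1]; the default α is an illustrative choice, not the paper's tuned value:

```python
def dubois_prade_tnorm(a, b, alpha=0.5):
    """Dubois-Prade parametric t-norm T(a, b) = a*b / max(a, b, alpha).
    At alpha = 0 it reduces to min(a, b); at alpha = 1 to the product."""
    if not (0.0 <= a <= 1.0 and 0.0 <= b <= 1.0):
        raise ValueError("match scores must be normalised to [0, 1]")
    return a * b / max(a, b, alpha)

def fuse_wrist_scores(left_score, right_score, alpha=0.5):
    """Score-level fusion of the left- and right-wrist matcher outputs."""
    return dubois_prade_tnorm(left_score, right_score, alpha)
```

The t-norm keeps the fused score conservative: a weak score from either wrist pulls the combined score down, which suits a verification setting.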
DOI: 10.1109/IPTA.2017.8310109
Citations: 15
Handwriting gender recognition system based on the one-class support vector machines
Y. Guerbai, Y. Chibani, Bilal Hadjadji
Handwriting gender recognition has become a matter of considerable interest for the document analysis community, owing to its effective use in practical applications. This paper addresses the problem of classifying handwriting data with respect to gender. To date, only a few studies have been carried out in this field. Thus, we propose a new framework for classifying gender from a handwriting document using the curvelet transform and a classification method based on the One-Class Support Vector Machine (OC-SVM). In order to improve the robustness of the proposed system, multiple OC-SVM classifiers are combined according to the type of distance used in the kernel. Experimental results conducted on the IAM dataset show the effective use of the OC-SVM for handwriting gender recognition compared to the state of the art.
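A minimal sketch of the per-class OC-SVM scheme using scikit-learn. The feature vectors, kernel choice, and the max-score decision rule are illustrative assumptions; the paper additionally combines multiple OC-SVMs per kernel distance type, which is not reproduced here:

```python
import numpy as np
from sklearn.svm import OneClassSVM

def train_gender_models(male_feats, female_feats, kernel="rbf", nu=0.1):
    """Train one OC-SVM per writer gender, each on only that gender's
    feature vectors (placeholders for curvelet-based descriptors)."""
    models = {}
    for label, feats in (("male", male_feats), ("female", female_feats)):
        models[label] = OneClassSVM(kernel=kernel, nu=nu).fit(feats)
    return models

def predict_gender(models, x):
    """Assign the gender whose one-class model scores the sample highest."""
    x = np.atleast_2d(np.asarray(x, dtype=float))
    return max(models, key=lambda k: models[k].decision_function(x)[0])
```

Training one model per class lets each gender's writing style be described independently, which is the appeal of the one-class formulation here.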
DOI: 10.1109/IPTA.2017.8310136
Citations: 6
Multispectral single-sensor RGB-NIR imaging: New challenges and opportunities
Xavier Soria Poma, A. Sappa, A. Akbarinia
Multispectral images captured with a single-sensor camera have become an attractive alternative for numerous computer vision applications. However, in order to fully exploit their potential, the color restoration problem (RGB representation) should be addressed. This problem is more evident in outdoor scenarios containing vegetation, living beings, or specular materials. Color distortion emerges from the sensors' sensitivity to the overlapping visible and near-infrared spectral bands. This paper empirically evaluates the variability of the near-infrared (NIR) information with respect to changes of light throughout the day. A tiny neural network is proposed to restore the RGB color representation from given RGBN (Red, Green, Blue, NIR) images. In order to evaluate the proposed algorithm, different experiments on an RGBN outdoor dataset are conducted, including various challenging cases. The obtained results show the challenge and the importance of addressing color restoration in single-sensor multispectral images.
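The restoration network can be pictured as a per-pixel mapping from the 4 RGBN channels to 3 RGB channels. The layer sizes and activations below are illustrative assumptions, not the paper's architecture:

```python
import numpy as np

def tiny_rgbn_to_rgb(rgbn, w1, b1, w2, b2):
    """Per-pixel MLP sketch (4 -> hidden -> 3): maps an HxWx4 RGBN image to
    an HxWx3 RGB estimate.  The ReLU hidden layer and sigmoid output
    (keeping RGB in [0, 1]) are placeholder choices."""
    h, w, _ = rgbn.shape
    x = rgbn.reshape(-1, 4)
    hidden = np.maximum(0.0, x @ w1 + b1)              # ReLU hidden layer
    out = 1.0 / (1.0 + np.exp(-(hidden @ w2 + b2)))    # sigmoid output
    return out.reshape(h, w, 3)
```

In practice the weights would be fit by regressing against ground-truth RGB images captured without the NIR contamination.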
DOI: 10.1109/IPTA.2017.8310105
Citations: 15
Pixelwise classification for music document analysis
Jorge Calvo-Zaragoza, Gabriel Vigliensoni, Ichiro Fujinaga
Content within musical documents not only contains music symbols but also includes other elements such as staff lines, text, or frontispieces. Before attempting to automatically recognize components in these layers, it is necessary to perform an analysis of the musical documents in order to detect and classify each of these constituent parts. The obstacle to this analysis is the high heterogeneity among music collections, especially with ancient documents, which makes it difficult to devise methods that generalize to a broader range of sources. In this paper we propose a data-driven document analysis framework based on machine learning that focuses on classifying regions of interest at the pixel level. For that, we make use of Convolutional Neural Networks trained to infer the category of each pixel. The main advantage of this approach is that it can be applied regardless of the type of document provided, as long as training data is available. Since this work represents a first effort in that direction, our experimentation focuses on reporting a baseline classification using our framework. The experiments show promising performance, achieving an accuracy around 90% on two corpora of old music documents.
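Pixel-level classification with a CNN is commonly set up by classifying a small window centred on each pixel; a sketch of that patch-extraction step (the window size and reflect padding are assumptions, not details from the paper):

```python
import numpy as np

def extract_patches(img, patch=5):
    """For pixel-level labelling, crop a (patch x patch) window centred on
    every pixel, using reflect padding at the borders.  Each window would
    then be fed to a CNN that predicts that centre pixel's layer
    (staff line, music symbol, text, background, ...)."""
    pad = patch // 2
    padded = np.pad(img, pad, mode="reflect")
    h, w = img.shape
    out = np.empty((h * w, patch, patch), dtype=img.dtype)
    k = 0
    for i in range(h):
        for j in range(w):
            out[k] = padded[i:i + patch, j:j + patch]
            k += 1
    return out
```

Classifying windows rather than whole pages is what makes the approach applicable to any document type for which labelled pixels are available.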
DOI: 10.1109/IPTA.2017.8310134
Citations: 3