
2017 Seventh International Conference on Image Processing Theory, Tools and Applications (IPTA): Latest Publications

Image retrieval based on saliency for urban image contents
Kamel Guissous, V. Gouet-Brunet
With the increase in image dataset size and descriptor complexity in Content-Based Image Retrieval (CBIR) and Computer Vision, it is essential to find a way to limit the amount of manipulated data while keeping its quality. Instead of treating the entire image, selecting the regions that hold the essential information is a relevant way to reach this goal. As visual saliency aims at highlighting the areas of the image that are most important for a given task, in this paper we propose to exploit visual saliency maps to prune image features, keeping only the most salient ones. A novel visual saliency approach based on the local distribution analysis of edge orientations, particularly dedicated to structured contents such as street-view images of urban environments, is proposed. It is evaluated for CBIR according to three criteria: quality of retrieval, volume of manipulated features and computation time. The proposal can be exploited in various applications involving large sets of local visual features; here it is evaluated within two applications: cross-domain image retrieval and image-based vehicle localisation.
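To make the pruning idea concrete, here is a minimal Python sketch (not the authors' exact method): block-wise saliency is approximated by the coherence of local edge orientations, and only keypoints falling in the most salient blocks are kept. The functions `edge_orientation_saliency` and `prune_features`, the block size `win` and the `keep_ratio` threshold are all illustrative assumptions.

```python
import numpy as np

def edge_orientation_saliency(gray, win=16):
    # Per-pixel gradient orientation; gy differentiates along rows (y).
    gy, gx = np.gradient(gray.astype(float))
    mag = np.hypot(gx, gy)
    ori = np.arctan2(gy, gx)
    h, w = gray.shape
    sal = np.zeros((h // win, w // win))
    for i in range(sal.shape[0]):
        for j in range(sal.shape[1]):
            m = mag[i * win:(i + 1) * win, j * win:(j + 1) * win]
            o = ori[i * win:(i + 1) * win, j * win:(j + 1) * win]
            if m.sum() > 0:
                # Doubled-angle vector mean: near 1 for strong, coherently
                # oriented edges, as in structured (urban) content.
                sal[i, j] = np.abs((m * np.exp(2j * o)).sum()) / m.sum()
    return sal

def prune_features(keypoints, sal, win=16, keep_ratio=0.5):
    # Keep only the keypoints landing in the most salient blocks.
    scores = np.array([sal[min(int(y) // win, sal.shape[0] - 1),
                           min(int(x) // win, sal.shape[1] - 1)]
                       for (x, y) in keypoints])
    thresh = np.quantile(scores, 1.0 - keep_ratio)
    return [kp for kp, s in zip(keypoints, scores) if s >= thresh]
```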
{"title":"Image retrieval based on saliency for urban image contents","authors":"Kamel Guissous, V. Gouet-Brunet","doi":"10.1109/IPTA.2017.8310131","DOIUrl":"https://doi.org/10.1109/IPTA.2017.8310131","url":null,"abstract":"With the increase of image datasets size and of descriptors complexity in Content-Based Image Retrieval (CBIR) and Computer Vision, it is essential to find a way to limit the amount of manipulated data, while keeping its quality. Instead of treating the entire image, the selection of regions which hold the essence of information is a relevant option to reach this goal. As the visual saliency aims at highlighting the areas of the image which are the most important for a given task, in this paper we propose to exploit visual saliency maps to prune the most salient image features. A novel visual saliency approach based on the local distribution analysis of the edges orientation, particularly dedicated to structured contents, such as street view images of urban environments, is proposed. It is evaluated for CBIR according to three criteria: quality of retrieval, volume of manipulated features and computation time. The proposal can be exploited into various applications involving large sets of local visual features; here it is experimented within two applications: cross-domain image retrieval and image-based vehicle localisation.","PeriodicalId":316356,"journal":{"name":"2017 Seventh International Conference on Image Processing Theory, Tools and Applications (IPTA)","volume":"48 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127542734","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
Weighted hybrid features for person re-identification
Saba Mumtaz, Naima Mubariz, S. Saleem, M. Fraz
In video surveillance, person re-identification is the task of recognizing distinct individuals over a network of cameras. It is an extremely challenging task, since the visual appearance of a person can change significantly when viewed from different cameras. Existing person re-identification methods offer distinct advantages over each other in terms of robustness to lighting, scale and pose variations. With this in mind, this paper proposes an effective new person re-identification model which incorporates several recent state-of-the-art feature extraction methodologies, such as GOG, WHOS and LOMO features, into a single framework. The effectiveness of each feature type is estimated, and optimal weights for the similarity measurements are assigned through a multiple metric learning method. The proposed approach is then tested on multiple benchmark person re-identification datasets, where it outperforms many other state-of-the-art methodologies.
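A hedged sketch of the fusion step, assuming (in place of the paper's learned metrics) plain Euclidean distances per feature type; the `fused_distance` helper and the weight values are hypothetical:

```python
import numpy as np

def fused_distance(query, gallery, weights):
    # Weighted sum of per-feature-type distances; plain Euclidean stands in
    # for the learned similarity metrics of the paper.
    return sum(w * np.linalg.norm(query[name] - gallery[name])
               for name, w in weights.items())

# Hypothetical weights, as if estimated on a validation set.
weights = {"GOG": 0.5, "LOMO": 0.3, "WHOS": 0.2}
q = {k: np.random.rand(64) for k in weights}
g = {k: np.random.rand(64) for k in weights}
print(fused_distance(q, g, weights))
```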
{"title":"Weighted hybrid features for person re-identification","authors":"Saba Mumtaz, Naima Mubariz, S. Saleem, M. Fraz","doi":"10.1109/IPTA.2017.8310107","DOIUrl":"https://doi.org/10.1109/IPTA.2017.8310107","url":null,"abstract":"In video-surveillance, person re-identification is described as the task of recognizing distinct individuals over a network of cameras. It is an extremely challenging task since visual appearances of people can change significantly when viewed in different cameras. Many person re-identification methods offer distinct advantages over each other in terms of robustness to lighting, scale and pose variations. Keeping this consideration in mind, this paper proposes an effective new person reidentification model which incorporates several recent state-of-the-art feature extraction methodologies such as GOG, WHOS and LOMO features into a single framework. Effectiveness of each feature type is estimated and optimal weights for the similarity measurements are assigned through a multiple metric learning method. The proposed re-identification approach is then tested on multiple benchmark person re-identification datasets where it outperforms many other state-of-the-art methodologies.","PeriodicalId":316356,"journal":{"name":"2017 Seventh International Conference on Image Processing Theory, Tools and Applications (IPTA)","volume":"10 5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129128262","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 17
Detecting exercise-induced fatigue using thermal imaging and deep learning
Miguel Bordallo López, Carlos R. del-Blanco, N. García
Fatigue has adverse effects on both physical and cognitive abilities. Hence, automatically detecting exercise-induced fatigue is important, especially in order to assist in planning effort and rest during exercise sessions. Thermal imaging and facial analysis provide a means to detect changes in the human body unobtrusively and under varying conditions of pose and illumination. In this context, this paper proposes the automatic detection of exercise-induced fatigue from facial images captured by thermal cameras, analyzed using deep convolutional neural networks. Our results indicate that classification of fatigued individuals is possible, reaching an accuracy of over 80% when utilizing single thermal images.
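As an illustration of the classification stage, here is a minimal PyTorch sketch of a small CNN operating on single-channel thermal face crops; the architecture, the 64x64 input size and the `FatigueNet` name are assumptions, not the authors' network:

```python
import torch
import torch.nn as nn

class FatigueNet(nn.Module):
    # Tiny CNN for binary fatigued / not-fatigued classification.
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.head = nn.Sequential(
            nn.Flatten(),
            nn.Linear(32 * 16 * 16, 2),  # assumes 64x64 input crops
        )

    def forward(self, x):
        return self.head(self.features(x))

logits = FatigueNet()(torch.randn(1, 1, 64, 64))  # one thermal face crop
```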
{"title":"Detecting exercise-induced fatigue using thermal imaging and deep learning","authors":"Miguel Bordallo López, Carlos R. del-Blanco, N. García","doi":"10.1109/IPTA.2017.8310151","DOIUrl":"https://doi.org/10.1109/IPTA.2017.8310151","url":null,"abstract":"Fatigue has adverse effects in both physical and cognitive abilities. Hence, automatically detecting exercise-induced fatigue is of importance, especially in order to assist in the planning of effort and resting during exercise sessions. Thermal imaging and facial analysis provide a mean to detect changes in the human body unobtrusively and in variant conditions of pose and illumination. In this context, this paper proposes the automatic detection of exercise-induced fatigue using thermal cameras and facial images, analyzing them using deep convolutional neural networks. Our results indicate that classification of fatigued individuals is possible, obtaining an accuracy that reaches over 80% when utilizing single thermal images.","PeriodicalId":316356,"journal":{"name":"2017 Seventh International Conference on Image Processing Theory, Tools and Applications (IPTA)","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125728046","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 24
Dynamic ensemble selection VS K-NN: Why and when dynamic selection obtains higher classification performance?
Rafael M. O. Cruz, Hiba H. Zakane, R. Sabourin, George D. C. Cavalcanti
Multiple classifier systems (MCS) focus on the combination of classifiers to obtain better performance than a single robust one. These systems comprise three major phases: pool generation, selection and integration. One of the most promising MCS approaches is Dynamic Selection (DS), which relies on finding the most competent classifier or ensemble of classifiers to predict each test sample. The majority of DS techniques are based on the K-Nearest Neighbors (K-NN) definition, and the quality of the neighborhood has a huge impact on the performance of DS methods. In this paper, we compare the classification results of DS techniques and the K-NN classifier under different conditions. Experiments are performed with 18 state-of-the-art DS techniques over 30 classification datasets, and the results show that DS methods deliver a significant boost in classification accuracy even though they use the same neighborhood as the K-NN. DS techniques outperform the K-NN classifier because they can handle samples with a high degree of instance hardness (samples located close to the decision border), whereas the K-NN cannot. In this paper, we explain not only why DS techniques achieve higher classification performance than the K-NN, but also when DS should be used.
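For concreteness, below is a sketch of one classic DS rule, Overall Local Accuracy (OLA): the classifier most accurate on the K-NN neighbourhood of the test sample (taken from a validation set) makes the prediction. It is a representative example of the DS family, not necessarily one of the 18 techniques evaluated; `pool` is assumed to be a list of fitted scikit-learn classifiers from the pool-generation phase.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def ola_predict(x, pool, X_val, y_val, k=7):
    # Find the k validation samples nearest to the query x.
    nn = NearestNeighbors(n_neighbors=k).fit(X_val)
    _, idx = nn.kneighbors(x.reshape(1, -1))
    neigh_X, neigh_y = X_val[idx[0]], y_val[idx[0]]
    # Pick the pool member with the highest local accuracy on that region.
    accs = [clf.score(neigh_X, neigh_y) for clf in pool]
    best = pool[int(np.argmax(accs))]
    return best.predict(x.reshape(1, -1))[0]
```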
{"title":"Dynamic ensemble selection VS K-NN: Why and when dynamic selection obtains higher classification performance?","authors":"Rafael M. O. Cruz, Hiba H. Zakane, R. Sabourin, George D. C. Cavalcanti","doi":"10.1109/IPTA.2017.8310100","DOIUrl":"https://doi.org/10.1109/IPTA.2017.8310100","url":null,"abstract":"Multiple classifier systems focus on the combination of classifiers to obtain better performance than a single robust one. These systems unfold three major phases: pool generation, selection and integration. One of the most promising MCS approaches is Dynamic Selection (DS), which relies on finding the most competent classifier or ensemble of classifiers to predict each test sample. The majority of the DS techniques are based on the K-Nearest Neighbors (K-NN) definition, and the quality of the neighborhood has a huge impact on the performance of DS methods. In this paper, we perform an analysis comparing the classification results of DS techniques and the K-NN classifier under different conditions. Experiments are performed on 18 state-of-the-art DS techniques over 30 classification datasets and results show that DS methods present a significant boost in classification accuracy even though they use the same neighborhood as the K-NN. The reasons behind the outperformance of DS techniques over the K-NN classifier reside in the fact that DS techniques can deal with samples with a high degree of instance hardness (samples that are located close to the decision border) as opposed to the K-NN. In this paper, not only we explain why DS techniques achieve higher classification performance than the K-NN but also when DS should be used.","PeriodicalId":316356,"journal":{"name":"2017 Seventh International Conference on Image Processing Theory, Tools and Applications (IPTA)","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134620546","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 21
Deriving high-level scene descriptions from deep scene CNN features
Akram Bayat, M. Pomplun
In this paper, we build two computational models to estimate two dominant global properties (naturalness and openness) for representing a scene based on its global spatial structure. Naturalness and openness are two dominant perceptual properties within a multidimensional space in which semantically similar scenes (e.g., corridor and hallway) are assigned to nearby points. In this model space, the representation of a real-world scene is based on the overall shape of the scene, not on local object information. We introduce the use of a deep convolutional neural network for generating features that are well suited to estimating the two global properties of a visual scene. The extracted features are integrated in an efficient way and fed into a linear support vector machine (SVM) to classify naturalness versus man-madeness and openness versus closedness. These two global properties of an input image can be predicted from activations in the lowest layer of a convolutional neural network trained for a scene recognition task. The consistent results of the computational models in full and restricted spatial frequency ranges suggest that the representation of an image in the lowest layer of the deep scene CNN contains holistic information about the image, as it yields the highest accuracy in modelling the global shape of the scene.
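A minimal sketch of the final classification step, assuming precomputed low-layer CNN activations; the synthetic arrays below stand in for real features and labels:

```python
import numpy as np
from sklearn.svm import LinearSVC

# Synthetic stand-ins: rows mimic low-layer CNN activations; labels mark
# natural (1) vs man-made (0) scenes. One SVM would be trained per property.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 512))
y = rng.integers(0, 2, size=200)

svm_naturalness = LinearSVC(C=1.0).fit(X, y)
score = svm_naturalness.decision_function(X[:1])  # signed "naturalness" score
```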
{"title":"Deriving high-level scene descriptions from deep scene CNN features","authors":"Akram Bayat, M. Pomplun","doi":"10.1109/IPTA.2017.8310111","DOIUrl":"https://doi.org/10.1109/IPTA.2017.8310111","url":null,"abstract":"In this paper, we generate two computational models in order to estimate two dominant global properties (naturalness and openness) for representing a scene based on its global spatial structure. Naturalness and openness are two dominant perceptual properties within a multidimensional space in which semantically similar scenes (e.g., corridor and hallway) are assigned to nearby points. In this model space, the representation of a real-world scene is based on the overall shape of a scene but not on local object information. We introduce the use of a deep convolutional neural network for generating features that are well-suited for estimating the two global properties of a visual scene. The extracted features are integrated in an efficient way and fed into a linear support vector machine (SVM) to classify naturalness versus man-madeness and openness versus closedness. These two global properties (naturalness and openness) of an input image can be predicted from activations in the lowest layer of the convolutional neural network which has been trained for a scene recognition task. The consistent results of computational models in full and restricted spatial frequency ranges suggest that the representation of an image in the lowest layer of the deep scene CNN contains holistic information of the images as it leads to highest accuracy in modelling the global shape of the scene.","PeriodicalId":316356,"journal":{"name":"2017 Seventh International Conference on Image Processing Theory, Tools and Applications (IPTA)","volume":"40 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116969890","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 6
Multi-modal data fusion for pain intensity assessment and classification
Patrick Thiam, F. Schwenker
In this work, several fusion architectures are assessed within the scope of developing a pain intensity classification system. The assessment is based on the recently recorded SenseEmotion Database [1], which consists of recordings of several individuals subjected to three gradually increasing levels of pain intensity, induced through temperature elevation (heat stimulation) under controlled conditions. Several modalities, including audio, video, respiration, electrocardiography, electromyography and electrodermal activity, were synchronously recorded during the experiments. A broad spectrum of descriptors is extracted from each of the involved modalities, followed by an assessment of combinations of the extracted descriptors through several fusion architectures. Experimental validation suggests that the choice of an appropriate fusion architecture, one able to improve significantly over the performance of the best single modality, depends mainly on the amount of data available for training the classification architecture.
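As one example of the kind of architecture being compared, here is a sketch of simple decision-level (late) fusion, a weighted average of per-modality class posteriors; the modality names and probability values are made up for illustration:

```python
import numpy as np

def late_fusion(prob_per_modality, weights=None):
    # Weighted average of per-modality class probabilities.
    P = np.stack(list(prob_per_modality.values()))  # (modalities, n_classes)
    w = np.full(len(P), 1.0 / len(P)) if weights is None else np.asarray(weights)
    return (w[:, None] * P).sum(axis=0)

# Hypothetical per-modality posteriors for three pain levels.
probs = {"emg": np.array([0.2, 0.5, 0.3]),
         "eda": np.array([0.1, 0.6, 0.3]),
         "ecg": np.array([0.3, 0.4, 0.3])}
print(late_fusion(probs).argmax())  # fused pain-level prediction
```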
{"title":"Multi-modal data fusion for pain intensity assessment and classification","authors":"Patrick Thiam, F. Schwenker","doi":"10.1109/IPTA.2017.8310115","DOIUrl":"https://doi.org/10.1109/IPTA.2017.8310115","url":null,"abstract":"In this work, an assessment of several fusion architectures is undertaken within the scope of the development of a pain intensity classification system. The assessment is based on the recently recorded SenseEmotion Database [1], which consists of several individuals subjected to three gradually increasing levels of pain intensity, induced through temperature elevation (heat stimulation) under controlled conditions. Several modalities, including audio, video, respiration, electrocardiography, electromyography and electrodermal activity, were synchronously recorded during the experiments. A broad spectrum of descriptors is extracted from each of the involved modalities, followed by an assessment of the combination of the extracted descriptors through several fusion architectures. Experimental validation suggests that the choice of an appropriate fusion architecture, which is able to significantly improve over the performance of the best single modality, mainly depends on the amount of data available for the training of the classification architecture.","PeriodicalId":316356,"journal":{"name":"2017 Seventh International Conference on Image Processing Theory, Tools and Applications (IPTA)","volume":"8 1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123201315","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 21
Coarse-to-fine texture analysis for inner cell mass identification in human blastocyst microscopic images
Reza Moradi Rad, Parvaneh Saeedi, J. Au, J. Havelock
Accurate identification of the different components of a developing human embryo plays a crucial role in assessing the embryo's quality. One of the most important components of a day-5 human embryo is the Inner Cell Mass (ICM), the part of the embryo that will eventually develop into a fetus. In this paper, an automatic coarse-to-fine texture-based approach is presented to identify the regions of an embryo corresponding to the ICM. First, the blastocyst area corresponding to textured regions is recognized using Gabor and DCT features. Next, two ICM localization approaches are introduced to produce a rough estimate of the ICM location. Finally, the boundaries of the ICM region are refined using a region-based level set. Experimental results on a dataset of 220 day-5 human embryo images confirm that the proposed method is capable of identifying the ICM with an average Precision, Recall, and Jaccard Index of 78.7%, 86.8%, and 70.3%, respectively.
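To illustrate the coarse stage, here is a hedged OpenCV sketch that scores texture by Gabor response energy; the filter-bank parameters are illustrative assumptions, and the paper's DCT features and the fine level-set stage are omitted:

```python
import cv2
import numpy as np

def texture_energy(gray, thetas=(0, np.pi / 4, np.pi / 2, 3 * np.pi / 4)):
    # Sum of squared Gabor responses over a small orientation bank.
    acc = np.zeros(gray.shape, dtype=np.float32)
    for t in thetas:
        k = cv2.getGaborKernel((21, 21), sigma=4.0, theta=t,
                               lambd=10.0, gamma=0.5)
        acc += cv2.filter2D(gray.astype(np.float32), cv2.CV_32F, k) ** 2
    return acc

# Coarse textured-region mask: keep the top-quartile energy pixels.
img = np.random.rand(128, 128).astype(np.float32)  # stand-in for an embryo image
e = texture_energy(img)
mask = e > np.quantile(e, 0.75)
```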
{"title":"Coarse-to-fine texture analysis for inner cell mass identification in human blastocyst microscopic images","authors":"Reza Moradi Rad, Parvaneh Saeedi, J. Au, J. Havelock","doi":"10.1109/IPTA.2017.8310152","DOIUrl":"https://doi.org/10.1109/IPTA.2017.8310152","url":null,"abstract":"Accurate identification of different components of a developing human embryo play crucial roles in assessing the quality of such embryo. One of the most important components of a day-5 human embryo is Inner Cell Mass (ICM). ICM is a part of an embryo that will eventually develop into a fetus. In this paper, an automatic coarse-to-fine texture based approach presented to identify regions of an embryo corresponding to the ICM. First, blastocyst area corresponding to the textured regions is recognized using Gabor and DCT features. Next, two ICM localization approaches are introduced to identify a rough estimate of the ICM location. Finally, the boundaries of the ICM region is finalized using a region based level-set. Experimental results on a data set of 220 day-5 human embryo images confirm that the proposed method is capable of identifying ICM with average Precision, Recall, and Jaccard Index of 78.7%, 86.8%, and 70.3%, respectively.","PeriodicalId":316356,"journal":{"name":"2017 Seventh International Conference on Image Processing Theory, Tools and Applications (IPTA)","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121639748","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 9
Comparing keyframe summaries of egocentric videos: Closest-to-centroid baseline
L. Kuncheva, Paria Yousefi, J. Almeida
Evaluation of keyframe video summaries is a notoriously difficult problem. So far, there is no consensus on guidelines, protocols, benchmarks or baseline models. This study contributes in three ways: (1) We propose a new baseline model for creating a keyframe summary, called Closest-to-Centroid (CC), and show that it is a stronger contender than the two most popular baselines: uniform sampling and choosing the mid-event frame. (2) We also propose a method for matching the visual appearance of keyframes, suitable for comparing summaries of egocentric videos and lifelogging photostreams. (3) We examine 24 image feature spaces (different descriptors) including colour, texture, shape, motion and a feature space extracted by a pre-trained convolutional neural network (CNN). Our results on the four egocentric videos in the UTE database favour low-level shape and colour feature spaces for use with CC.
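One plausible reading of the Closest-to-Centroid baseline, sketched with scikit-learn k-means: cluster the frame features and return, per cluster, the frame nearest its centroid. The function name and parameters are assumptions, not the paper's exact procedure:

```python
import numpy as np
from sklearn.cluster import KMeans

def closest_to_centroid(features, n_keyframes, seed=0):
    # Cluster frame features; per cluster, keep the frame nearest its centroid.
    km = KMeans(n_clusters=n_keyframes, n_init=10, random_state=seed)
    labels = km.fit_predict(features)
    keyframes = []
    for c in range(n_keyframes):
        idx = np.flatnonzero(labels == c)
        d = np.linalg.norm(features[idx] - km.cluster_centers_[c], axis=1)
        keyframes.append(int(idx[d.argmin()]))
    return sorted(keyframes)

# features: one row per video frame, e.g. a colour histogram or CNN vector.
frames = np.random.rand(500, 64)
print(closest_to_centroid(frames, n_keyframes=8))
```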
{"title":"Comparing keyframe summaries of egocentric videos: Closest-to-centroid baseline","authors":"L. Kuncheva, Paria Yousefi, J. Almeida","doi":"10.1109/IPTA.2017.8310123","DOIUrl":"https://doi.org/10.1109/IPTA.2017.8310123","url":null,"abstract":"Evaluation of keyframe video summaries is a notoriously difficult problem. So far, there is no consensus on guidelines, protocols, benchmarks and baseline models. This study contributes in three ways: (1) We propose a new baseline model for creating a keyframe summary, called Closest-to-Centroid, and show that it is a better contestant compared to the two most popular baselines: uniform sampling and choosing the mid-event frame. (2) We also propose a method for matching the visual appearance of keyframes, suitable for comparing summaries of egocentric videos and lifelogging photostreams. (3) We examine 24 image feature spaces (different descriptors) including colour, texture, shape, motion and a feature space extracted by a pre-trained convolutional neural network (CNN). Our results using the four egocentric videos in the UTE database favour low-level shape and colour feature spaces for use with CC.","PeriodicalId":316356,"journal":{"name":"2017 Seventh International Conference on Image Processing Theory, Tools and Applications (IPTA)","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114164676","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
Recursive 3D scene estimation with multiple camera pairs
Torsten Engler, Hans-Joachim Wünsche
In this paper we present the recursive estimation of static scenes with multiple stereo camera pairs. The estimation is based on a point cloud created from the disparities of the cameras. The focus lies on reducing erroneous measurements while obtaining a comparatively dense measurement in real time. While recursive scene estimation via stereo cameras has been presented several times before, the estimate has never been fed back into the measurement algorithm. We propose using the current scene estimate in the disparity measurement to increase robustness, denseness and outlier rejection. A scene prior is created for each measurement using OpenGL, taking occlusions, camera positions and existence probabilities into account. Additionally, multiple stereo pairs with different alignments provide distinct information: each disparity measurement benefits from the complete scene knowledge the other stereo camera pairs provide. The creation of new points for the point cloud is based on a scaled version of the current scene and allows a simple trade-off between computational effort and point cloud density.
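The per-point existence probability mentioned above can be maintained with a standard binary Bayes (log-odds) filter; the sketch below is a generic stand-in, not the authors' update rule, and `p_hit`/`p_miss` are assumed sensor characteristics:

```python
import numpy as np

def update_existence(logodds, observed, p_hit=0.7, p_miss=0.4):
    # Add log-odds evidence: points re-observed gain confidence, missed
    # points lose it; hit/miss rates are assumed, not calibrated values.
    l_hit = np.log(p_hit / (1.0 - p_hit))
    l_miss = np.log(p_miss / (1.0 - p_miss))
    return logodds + np.where(observed, l_hit, l_miss)

lo = update_existence(np.zeros(5), np.array([1, 1, 0, 1, 0], dtype=bool))
prob = 1.0 / (1.0 + np.exp(-lo))  # back to existence probabilities
```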
{"title":"Recursive 3D scene estimation with multiple camera pairs","authors":"Torsten Engler, Hans-Joachim Wünsche","doi":"10.1109/IPTA.2017.8310129","DOIUrl":"https://doi.org/10.1109/IPTA.2017.8310129","url":null,"abstract":"In this paper we present the recursive estimation of static scenes with multiple stereo camera pairs. The estimation is based on a point cloud created from the disparities of the cameras. The focus lies on reducing erroneous measurements while obtaining a comparatively dense measurement in real time. While recursive scene estimation via stereo cameras has been presented several times before, the estimation has never been exploited in the measurement algorithm. We propose the usage of the current scene estimation in the disparity measurement to increase robustness, denseness and outlier rejection. A scene prior is created for each measurement using OpenGL taking occlusions, camera positions and existence probability into account. Additionally, multiple stereo pairs with different alignment provide distinct information. Each disparity measurement benefits from the complete scene knowledge the other stereo camera pairs provide. The creation of new points for the point cloud is based on a scaled version of the current scene and allows for simple trade-off between computational effort and point cloud denseness.","PeriodicalId":316356,"journal":{"name":"2017 Seventh International Conference on Image Processing Theory, Tools and Applications (IPTA)","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114717798","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
EEG source imaging based on spatial and temporal graph structures
Jing Qin, Feng Liu, Shouyi Wang, J. Rosenberger
EEG serves as an essential tool for brain source localization due to its high temporal resolution. However, inferring brain activity from EEG data is, in general, a challenging ill-posed inverse problem. To better retrieve task-related discriminative source patches from strong spontaneous background signals, we propose a novel EEG source imaging model based on spatial and temporal graph structures. In particular, graph fractional-order total variation (gFOTV) is used to enhance spatial smoothness, and the label information of the brain state is encoded in a temporal graph regularization term to guarantee intra-class consistency of the estimated sources. The proposed model is solved efficiently by the alternating direction method of multipliers (ADMM). A two-stage algorithm is also proposed to further improve the result. Numerical experiments show that our method localizes source extents more effectively than the benchmark methods.
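To show the solver skeleton, here is ADMM for an l1-regularized least-squares problem; the l1 penalty is a simplified stand-in for the paper's gFOTV and temporal-graph terms, but the alternating x/z/u updates follow the same pattern:

```python
import numpy as np

def admm_l1(L, y, lam=0.1, rho=1.0, n_iter=100):
    # min_x 0.5*||L x - y||^2 + lam*||x||_1, solved by ADMM.
    m, n = L.shape
    z = np.zeros(n)
    u = np.zeros(n)
    Q = np.linalg.inv(L.T @ L + rho * np.eye(n))   # cache the x-update system
    for _ in range(n_iter):
        x = Q @ (L.T @ y + rho * (z - u))          # quadratic subproblem
        z = np.sign(x + u) * np.maximum(np.abs(x + u) - lam / rho, 0.0)  # shrinkage
        u = u + x - z                              # dual variable update
    return z

# Toy leadfield and measurements (random stand-ins for L and the EEG data y).
rng = np.random.default_rng(0)
L_toy, y_toy = rng.normal(size=(32, 128)), rng.normal(size=32)
x_hat = admm_l1(L_toy, y_toy)
```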
{"title":"EEG source imaging based on spatial and temporal graph structures","authors":"Jing Qin, Feng Liu, Shouyi Wang, J. Rosenberger","doi":"10.1109/IPTA.2017.8310089","DOIUrl":"https://doi.org/10.1109/IPTA.2017.8310089","url":null,"abstract":"EEG serves as an essential tool for brain source localization due to its high temporal resolution. However, the inference of brain activities from the EEG data is, in general, a challenging ill-posed inverse problem. To better retrieve task related discriminative source patches from strong spontaneous background signals, we propose a novel EEG source imaging model based on spatial and temporal graph structures. In particular, graph fractional-order total variation (gFOTV) is used to enhance spatial smoothness, and the label information of brain state is enclosed in a temporal graph regularization term to guarantee intra-class consistency of estimated sources. The proposed model is efficiently solved by the alternating direction method of multipliers (ADMM). A two-stage algorithm is proposed as well to further improve the result. Numerical experiments have shown that our method localizes source extents more effectively than the benchmark methods.","PeriodicalId":316356,"journal":{"name":"2017 Seventh International Conference on Image Processing Theory, Tools and Applications (IPTA)","volume":"97 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124828747","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 8