
Latest publications from the 2017 IEEE International Conference on Computer Vision Workshops (ICCVW)

Visual Tracking of Small Animals in Cluttered Natural Environments Using a Freely Moving Camera
Pub Date : 2017-10-22 DOI: 10.1109/ICCVW.2017.335
B. Risse, M. Mangan, B. Webb, Luca Del Pero
Image-based tracking of animals in their natural habitats can provide rich behavioural data, but is very challenging due to complex and dynamic background and target appearances. We present an effective method to recover the positions of terrestrial animals in cluttered environments from video sequences filmed using a freely moving monocular camera. The method uses residual motion cues to detect the targets and is thus robust to different lighting conditions and requires no a priori appearance model of the animal or environment. The detection is globally optimised based on an inference problem formulation using factor graphs. This handles ambiguities such as occlusions and intersections and provides automatic initialisation. Furthermore, this formulation allows a seamless integration of occasional user input for the most difficult situations, so that the effect of a few manual position estimates is smoothly distributed over long sequences. Testing our system against a benchmark dataset featuring small targets in natural scenes, we obtain 96% accuracy for fully automated tracking. We also demonstrate reliable tracking in a new dataset that includes different targets (insects, vertebrates or artificial objects) in a variety of environments (desert, jungle, meadows, urban) using different imaging devices (day/night vision cameras, smart phones) and modalities (stationary, hand-held, drone operated).
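The abstract includes no code; purely as an illustration of the residual-motion idea (and not of the authors' factor-graph formulation), the hedged OpenCV/NumPy sketch below compensates camera motion between two frames with a RANSAC homography and thresholds what remains, so that whatever still moves after compensation becomes a candidate target. The function name, parameters and threshold are assumptions.

```python
# Hedged sketch: residual motion between two frames of a freely moving camera.
# This is NOT the authors' method; it only illustrates removing the global
# (camera-induced) motion so that small moving targets stand out.
import cv2
import numpy as np

def residual_motion_mask(prev_gray, curr_gray, diff_thresh=25):
    # Track sparse corners to estimate the dominant (background) motion.
    pts_prev = cv2.goodFeaturesToTrack(prev_gray, maxCorners=500,
                                       qualityLevel=0.01, minDistance=7)
    pts_curr, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, curr_gray,
                                                   pts_prev, None)
    good_prev = pts_prev[status.flatten() == 1]
    good_curr = pts_curr[status.flatten() == 1]

    # A homography approximates the camera-induced motion of the background.
    H, _ = cv2.findHomography(good_prev, good_curr, cv2.RANSAC, 3.0)

    # Warp the previous frame into the current one and keep the residual difference.
    h, w = curr_gray.shape
    warped_prev = cv2.warpPerspective(prev_gray, H, (w, h))
    residual = cv2.absdiff(curr_gray, warped_prev)

    # What still moves after compensation is a candidate target.
    _, mask = cv2.threshold(residual, diff_thresh, 255, cv2.THRESH_BINARY)
    return mask
```

In the paper such per-frame evidence is then resolved globally over the whole sequence via factor-graph inference, which this sketch does not cover.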
Citations: 33
SnapNet-R: Consistent 3D Multi-view Semantic Labeling for Robotics
Pub Date : 2017-10-22 DOI: 10.1109/ICCVW.2017.85
J. Guerry, Alexandre Boulch, B. L. Saux, J. Moras, A. Plyer, David Filliat
In this paper we present a new approach for semantic recognition in the context of robotics. When a robot moves through its environment, it obtains 3D information either from its sensors or from its own motion via 3D reconstruction. Our approach (i) uses 3D-coherent synthesis of scene observations and (ii) mixes them in a multi-view framework for 3D labeling, (iii) which is efficient both locally (for 2D semantic segmentation) and globally (for 3D structure labeling). This allows semantics to be added to the observed scene that go beyond simple image classification, as shown on challenging datasets such as SUNRGBD or the 3DRMS Reconstruction Challenge.
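As a rough sketch of the multi-view fusion idea only (not the SnapNet-R implementation), the NumPy snippet below back-projects per-view 2D label maps onto shared 3D points using assumed pinhole cameras and takes a majority vote per point; the camera model, names and voting rule are illustrative assumptions.

```python
# Hedged sketch: fuse per-view 2D semantic predictions onto shared 3D points
# by projecting each point into every view and voting. Not the SnapNet-R code.
import numpy as np

def fuse_labels(points_xyz, cameras, label_maps, num_classes):
    """points_xyz: (N, 3); cameras: list of (K, R, t); label_maps: list of (H, W) int arrays."""
    votes = np.zeros((points_xyz.shape[0], num_classes), dtype=np.int32)
    for (K, R, t), labels in zip(cameras, label_maps):
        cam = R @ points_xyz.T + t[:, None]        # world -> camera coordinates
        in_front = cam[2] > 0
        uv = (K @ cam)[:2] / cam[2]                # pinhole projection to pixels
        u, v = np.round(uv).astype(int)
        H, W = labels.shape
        valid = in_front & (u >= 0) & (u < W) & (v >= 0) & (v < H)
        idx = np.where(valid)[0]
        votes[idx, labels[v[idx], u[idx]]] += 1    # one vote per visible view
    return votes.argmax(axis=1)                    # majority label per 3D point
```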
Citations: 62
Fully Convolutional Network and Region Proposal for Instance Identification with Egocentric Vision
Pub Date : 2017-10-22 DOI: 10.1109/ICCVW.2017.281
Maxime Portaz, Matthias Kohl, G. Quénot, J. Chevallet
This paper presents a novel approach for egocentric image retrieval and object detection. The approach uses fully convolutional networks (FCN) to obtain region proposals without requiring an additional network component or additional training. It is particularly suited to small datasets with low object variability. The proposed network can be trained end-to-end and produces an effective global descriptor as an image representation. Additionally, it can be built upon any type of CNN pre-trained for classification. Through multiple experiments on two egocentric image datasets taken from museum visits, we show that the descriptor obtained using our proposed network outperforms those from previous state-of-the-art approaches. It is also just as memory-efficient, making it well suited to mobile devices such as an augmented museum audio-guide.
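The paper's exact descriptor is not reproduced here; as a hedged illustration of turning FCN activations into a global image representation for retrieval, the sketch below sum-pools a convolutional feature map per channel, L2-normalises it, and ranks a database by cosine similarity. The pooling choice is an assumption.

```python
# Hedged sketch: a global image descriptor pooled from convolutional activations,
# in the spirit of descriptors derived from an FCN backbone. Illustrative only.
import numpy as np

def global_descriptor(feature_map):
    """feature_map: (C, H, W) activations from a fully convolutional backbone."""
    desc = feature_map.reshape(feature_map.shape[0], -1).sum(axis=1)  # sum-pool per channel
    return desc / (np.linalg.norm(desc) + 1e-12)                      # L2-normalise

def retrieve(query_desc, database_descs):
    """Rank database images by cosine similarity (descriptors are L2-normalised)."""
    sims = database_descs @ query_desc
    return np.argsort(-sims)
```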
Citations: 13
Results and Analysis of ChaLearn LAP Multi-modal Isolated and Continuous Gesture Recognition, and Real Versus Fake Expressed Emotions Challenges
Pub Date : 2017-10-22 DOI: 10.1109/ICCVW.2017.377
Jun Wan, Sergio Escalera, G. Anbarjafari, H. Escalante, Xavier Baró, Isabelle M Guyon, Meysam Madadi, J. Allik, Jelena Gorbova, Chi Lin, Yiliang Xie
We analyze the results of the 2017 ChaLearn Looking at People Challenge at ICCV. The challenge comprised three tracks: (1) large-scale isolated gesture recognition, (2) continuous gesture recognition, and (3) real versus fake expressed emotions. It is the second round for both gesture recognition challenges, which were first held in the context of the ICPR 2016 workshop on "multimedia challenges beyond visual analysis". In this second round, more participants joined the competitions, and performance improved considerably compared to the first round. In particular, the best recognition accuracy for isolated gesture recognition improved from 56.90% to 67.71% on the IsoGD test set, and the Mean Jaccard Index (MJI) for continuous gesture recognition improved from 0.2869 to 0.6103 on the ConGD test set. The third track is the first challenge on real versus fake expressed emotion classification, covering six emotion categories, for which a novel database was introduced. First place was shared between two teams, who achieved a 67.70% average recognition rate on the test set. The data for the three tracks, the participants' code and the method descriptions are publicly available to allow researchers to keep making progress in the field.
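As a small, hedged illustration of the Mean Jaccard Index reported for the continuous track (the official ConGD evaluation protocol may differ in its details), the snippet below computes a frame-level Jaccard index between one predicted and one ground-truth gesture segment.

```python
# Hedged sketch: frame-level Jaccard index between a predicted and a ground-truth
# gesture segment, the quantity averaged into the MJI quoted above.
def jaccard(pred_frames, gt_frames):
    pred, gt = set(pred_frames), set(gt_frames)
    union = pred | gt
    return len(pred & gt) / len(union) if union else 0.0

# Toy example: ground truth spans frames 10..29, the prediction spans 15..34.
gt = range(10, 30)
pred = range(15, 35)
print(jaccard(pred, gt))   # 15 overlapping frames / 25 frames in the union = 0.6
```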
Citations: 72
Visualizing Apparent Personality Analysis with Deep Residual Networks
Pub Date : 2017-10-22 DOI: 10.1109/ICCVW.2017.367
Yağmur Güçlütürk, Umut Güçlü, Marc Pérez, H. Escalante, Xavier Baró, C. Andújar, Isabelle M Guyon, Julio C. S. Jacques Junior, Meysam Madadi, Sergio Escalera, M. Gerven, R. Lier
Automatic prediction of personality traits is a subjective task that has recently received much attention. Specifically, automatic apparent personality trait prediction from multimodal data has emerged as a hot topic within the field of computer vision and, more particularly, the so-called "looking at people" sub-field. Considering "apparent" personality traits as opposed to real ones considerably reduces the subjectivity of the task. Real-world applications are found in a wide range of domains, including entertainment, health, human-computer interaction, recruitment and security. Predictive models of personality traits are useful for individuals in many scenarios (e.g., preparing for job interviews or public speaking). However, these predictions in and of themselves might be deemed untrustworthy without human-understandable supporting evidence. Through a series of experiments on a recently released benchmark dataset for automatic apparent personality trait prediction, this paper characterizes the audio and visual information used by a state-of-the-art model while making its predictions, so as to provide such supporting evidence by explaining the predictions made. Additionally, the paper describes a new web application that gives feedback on the apparent personality traits of its users by combining model predictions with their explanations.
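The paper's own visualisation techniques are not reproduced here; purely as a generic, hedged illustration of probing which image regions drive a trait prediction, the sketch below implements plain occlusion sensitivity around a user-supplied prediction callable. The patch size, stride and baseline value are assumptions.

```python
# Hedged sketch: occlusion sensitivity, a generic way to see which regions drive
# a model's trait score. Shown only as an illustration of "explaining predictions";
# the paper's visualisation methods may differ.
import numpy as np

def occlusion_map(predict, image, patch=16, stride=16, baseline=0.0):
    """predict: callable (H, W, 3) -> scalar trait score; returns a per-patch score-drop map."""
    H, W, _ = image.shape
    ref = predict(image)
    heat = np.zeros(((H - patch) // stride + 1, (W - patch) // stride + 1))
    for i, y in enumerate(range(0, H - patch + 1, stride)):
        for j, x in enumerate(range(0, W - patch + 1, stride)):
            occluded = image.copy()
            occluded[y:y + patch, x:x + patch] = baseline
            heat[i, j] = ref - predict(occluded)   # large drop => region was important
    return heat
```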
Citations: 18
Adaptive Pooling in Multi-instance Learning for Web Video Annotation
Pub Date : 2017-10-22 DOI: 10.1109/ICCVW.2017.46
Dong Liu, Y. Zhou, Xiaoyan Sun, Zhengjun Zha, Wenjun Zeng
Web videos are usually weakly annotated, i.e., a tag is associated with a video once the corresponding concept appears in a frame of the video, without indicating when or where it occurs. These weakly annotated tags pose significant problems for many Web video applications, e.g. search and recommendation. In this paper, we present a new Web video annotation approach based on multi-instance learning (MIL) with a learnable pooling function. By formulating Web video annotation as a MIL problem, we present an end-to-end deep network framework in which the frame (instance) level annotation is estimated from tags given at the video (bag of instances) level via a convolutional neural network (CNN). A learnable pooling function is proposed to adaptively fuse the outputs of the CNN to determine tags at the video level. We further propose a new loss function that consists of both bag-level and instance-level losses, which makes the penalty term aware of the internal state of the network rather than only the overall loss, so that the pooling function is learned better and faster. Experimental results demonstrate that our proposed framework not only enhances the accuracy of Web video annotation, outperforming state-of-the-art Web video annotation methods on the large-scale video dataset FCVID, but also helps to infer the most relevant frames in Web videos.
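As a minimal, hedged sketch of the idea (not the architecture or loss proposed in the paper), the PyTorch snippet below pools per-frame tag scores with a learnable softmax temperature and combines a bag-level loss on the pooled scores with a simple instance-level term; the specific instance term and weighting are assumptions.

```python
# Hedged sketch: learnable pooling over frame-level (instance) tag scores plus a
# loss combining a bag-level and an instance-level term. Illustration only.
import torch
import torch.nn as nn

class AdaptivePooling(nn.Module):
    def __init__(self):
        super().__init__()
        self.temperature = nn.Parameter(torch.ones(1))  # learnable sharpness of the pooling

    def forward(self, instance_scores):
        """instance_scores: (T, C) per-frame tag logits -> (C,) video-level logits."""
        weights = torch.softmax(instance_scores * self.temperature, dim=0)
        return (weights * instance_scores).sum(dim=0)

def mil_loss(instance_scores, video_scores, video_labels, alpha=0.5):
    """Bag-level BCE on pooled logits plus an instance-level term on the max frame."""
    bce = nn.BCEWithLogitsLoss()
    bag_loss = bce(video_scores, video_labels)
    inst_loss = bce(instance_scores.max(dim=0).values, video_labels)
    return bag_loss + alpha * inst_loss
```

In this sketch, video_scores would be the output of AdaptivePooling applied to the CNN's frame-level logits, so the pooling parameter is trained jointly with the backbone.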
Citations: 38
HyKo: A Spectral Dataset for Scene Understanding
Pub Date : 2017-10-01 DOI: 10.1109/ICCVW.2017.39
Christian Winkens, Florian Sattler, Veronika Adams, D. Paulus
We present datasets containing urban traffic and rural road scenes recorded using hyperspectral snapshot sensors mounted on a moving car. The novel hyperspectral cameras used can capture whole spectral cubes at up to 15 Hz. This emerging sensor modality enables hyperspectral scene analysis for autonomous driving tasks. To the best of the authors' knowledge, no such dataset has been published so far. The datasets contain synchronized 3-D laser, spectrometer and hyperspectral data. Dense ground truth annotations are provided as semantic labels, material and traversability. The hyperspectral data range from visible to near-infrared wavelengths. We explain our recording platform and method, the associated data format, and a code library for easy data consumption. The datasets are publicly available for download.
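The HyKo file format and code library are described by the authors and not reproduced here; purely as a hypothetical illustration of consuming a hyperspectral cube stored as a (height, width, bands) array, the sketch below reads an invented .npz archive and slices out one pixel's spectrum and one band image. The file name, array keys and indices are made up for illustration.

```python
# Purely hypothetical sketch of consuming a hyperspectral cube; not the HyKo
# loader. File name and keys below are invented for illustration only.
import numpy as np

data = np.load("hyko_sample.npz")          # hypothetical archive
cube = data["cube"]                        # (H, W, B): spatial grid with B spectral bands
labels = data["semantic"]                  # (H, W): dense semantic ground truth

spectrum = cube[120, 200, :]               # reflectance across all bands at one pixel
band_image = cube[:, :, cube.shape[2] // 2]  # a single mid-range band as a grayscale image
print(spectrum.shape, band_image.shape, labels.max())
```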
Citations: 3
Computer Vision Problems in Plant Phenotyping, CVPPP 2017: Introduction to the CVPPP 2017 Workshop Papers
Pub Date : 2017-10-01 DOI: 10.1109/ICCVW.2017.236
H. Scharr, T. Pridmore, S. Tsaftaris
Plant phenotyping is the identification of effects on the phenotype (i.e., the plant's appearance and behavior) resulting from genotype differences (i.e., differences in the genetic code) and the environment. Previously, the process of taking phenotypic measurements has been laborious, costly, and time consuming. In recent years, non-invasive, image-based methods have become more common. These images are recorded by a range of capture devices, from small embedded camera systems to multi-million Euro smart greenhouses, at scales ranging from microscopic images of cells to entire fields captured by UAV imaging. These images need to be analyzed in a high-throughput, robust, and accurate manner.
Citations: 17
Lightweight Monocular Obstacle Avoidance by Salient Feature Fusion
Pub Date : 2017-10-01 DOI: 10.1109/ICCVW.2017.92
Andrea Manno-Kovács, Levente Kovács
We present a monocular obstacle avoidance method based on a novel image feature map built by fusing robust saliency features, intended for embedded systems on lightweight autonomous vehicles. The fused salient features are a textural-directional Harris-based feature map and a relative focus feature map. We present the generation of the fused saliency map, along with its application to obstacle avoidance. Evaluations are performed from a saliency point of view and to assess the method's applicability for obstacle avoidance in simulated environments. The presented results support the usability of the method in embedded systems on lightweight unmanned vehicles.
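As a hedged sketch of fusing two saliency cues (not the paper's exact feature maps or fusion rule), the snippet below normalises a Harris corner response and a crude Laplacian-based focus measure and blends them with assumed weights.

```python
# Hedged sketch: fuse a Harris-response map with a simple focus (sharpness) map
# into one saliency map. Weights, kernel sizes and the focus measure are
# illustrative assumptions, not the paper's feature maps.
import cv2
import numpy as np

def normalise(m):
    m = m.astype(np.float32)
    return (m - m.min()) / (m.max() - m.min() + 1e-12)

def fused_saliency(gray, w_harris=0.5, w_focus=0.5):
    harris = cv2.cornerHarris(np.float32(gray), blockSize=3, ksize=3, k=0.04)
    # Smoothed squared Laplacian as a crude relative-focus measure.
    lap = cv2.Laplacian(np.float32(gray), cv2.CV_32F, ksize=3)
    focus = cv2.GaussianBlur(lap * lap, (15, 15), 0)
    return w_harris * normalise(harris) + w_focus * normalise(focus)
```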
Citations: 0
Towards Implicit Correspondence in Signed Distance Field Evolution
Pub Date : 2017-10-01 DOI: 10.1109/ICCVW.2017.103
Miroslava Slavcheva, Maximilian Baust, Slobodan Ilic
The level set framework is widely used in geometry processing due to its ability to handle topological changes and the readily accessible shape properties it provides, such as normals and curvature. However, its major drawback is the lack of correspondence preservation throughout the level set evolution. Therefore, data associated with the surface, such as colour, is lost. The objective of this paper is a variational approach for signed distance field evolution which implicitly preserves correspondences. We propose an energy functional based on a novel data term, which aligns the lowest-frequency Laplacian eigenfunction representations of the input and target shapes. As these encode information about natural deformations that the shape can undergo, our strategy manages to prevent data diffusion into the volume. We demonstrate that our system is able to preserve texture throughout articulated motion sequences, and evaluate its geometric accuracy on public data.
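As a hedged illustration of the spectral representation being aligned (the paper works directly with signed distance fields, which is not reproduced here), the sketch below computes the lowest-frequency Laplacian eigenfunctions of a point-sampled shape from a k-NN graph Laplacian; the graph construction and mode count are assumptions.

```python
# Hedged sketch: lowest-frequency Laplacian eigenfunctions of a point-sampled
# surface via a k-NN graph Laplacian. These smooth modes encode the kind of
# natural deformations referred to in the abstract; illustration only.
import numpy as np
from scipy.sparse import csgraph
from sklearn.neighbors import kneighbors_graph

def laplacian_eigenfunctions(points, k_neighbors=8, num_modes=6):
    """points: (N, 3) surface samples -> (N, num_modes) low-frequency eigenfunctions."""
    adj = kneighbors_graph(points, k_neighbors, mode="connectivity")
    adj = adj.maximum(adj.T)                          # symmetrise the graph
    lap = csgraph.laplacian(adj, normed=True).toarray()
    vals, vecs = np.linalg.eigh(lap)                  # eigenvalues in ascending order
    return vecs[:, 1:num_modes + 1]                   # skip the constant mode
```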
Citations: 10