Deep Spatio-Temporal Random Fields for Efficient Video Segmentation

2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8915-8924 | Pub Date: 2018-06-01 | DOI: 10.1109/CVPR.2018.00929

Siddhartha Chandra, C. Couprie, Iasonas Kokkinos


Citations: 49



Deep Spatio-Temporal Random Fields for Efficient Video Segmentation

In this work we introduce a time- and memory-efficient method for structured prediction that couples neuron decisions across both space and time. We show that we are able to perform exact and efficient inference on a densely-connected spatio-temporal graph by capitalizing on recent advances in deep Gaussian Conditional Random Fields (GCRFs). Our method, called VideoGCRF, (a) is efficient, (b) has a unique global minimum, and (c) can be trained end-to-end alongside contemporary deep networks for video understanding. We experiment with multiple connectivity patterns in the temporal domain, and present empirical improvements over strong baselines on the tasks of both semantic and instance segmentation of videos. Our implementation is based on the Caffe2 framework and will be available at https://github.com/siddharthachandra/gcrf-v3.0.
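
To make the "exact inference" and "unique global minimum" claims concrete, here is a minimal NumPy sketch of inference in a generic Gaussian CRF. It is an illustration under stated assumptions, not the authors' Caffe2 implementation. It assumes a quadratic energy E(x) = 0.5 x^T (A + lam I) x - b^T x, with the pairwise matrix A built as inner products of per-node embeddings so that it is positive semi-definite. All names (`build_pairwise`, `gcrf_inference`, `lam`) and the toy sizes are hypothetical.

```python
import numpy as np


def build_pairwise(embeddings):
    """Assumed pairwise terms: A = F F^T, which is positive semi-definite."""
    return embeddings @ embeddings.T


def gcrf_energy(x, unary, pairwise, lam):
    """Quadratic energy E(x) = 0.5 * x^T (A + lam*I) x - b^T x."""
    system = pairwise + lam * np.eye(len(unary))
    return 0.5 * x @ system @ x - unary @ x


def gcrf_inference(unary, pairwise, lam=1.0):
    """Exact inference: the minimiser solves the linear system (A + lam*I) x = b.

    With A positive semi-definite and lam > 0 the system matrix is positive
    definite, so the energy is strictly convex and the minimum is unique.
    """
    system = pairwise + lam * np.eye(len(unary))
    return np.linalg.solve(system, unary)


# Toy "video": 3 frames of 4 pixels each, flattened into one spatio-temporal graph.
rng = np.random.default_rng(0)
num_frames, pixels_per_frame, embed_dim = 3, 4, 2
num_nodes = num_frames * pixels_per_frame

embeddings = rng.normal(size=(num_nodes, embed_dim))  # stand-in for network outputs
unary = rng.normal(size=num_nodes)                    # stand-in for per-pixel scores
pairwise = build_pairwise(embeddings)                 # dense: couples all pixels in all frames

x_star = gcrf_inference(unary, pairwise, lam=1.0)
scores_per_frame = x_star.reshape(num_frames, pixels_per_frame)
```

A direct dense solve is only practical at toy sizes. For real video resolutions an iterative solver such as conjugate gradient, which needs only matrix-vector products with the system matrix, would be the more plausible choice.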
