
2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition — Latest Publications

Fast and Accurate Online Video Object Segmentation via Tracking Parts
Pub Date : 2018-06-01 DOI: 10.1109/CVPR.2018.00774
Jingchun Cheng, Yi-Hsuan Tsai, Wei-Chih Hung, Shengjin Wang, Ming-Hsuan Yang
Online video object segmentation is a challenging task, as it entails processing the image sequence in a timely and accurate manner. To segment a target object through the video, numerous CNN-based methods have been developed by heavily fine-tuning on the object mask in the first frame, which is time-consuming for online applications. In this paper, we propose a fast and accurate video object segmentation algorithm that can immediately start the segmentation process once it receives the images. We first utilize a part-based tracking method to deal with challenging factors such as large deformation, occlusion, and cluttered background. Based on the tracked bounding boxes of parts, we construct a region-of-interest segmentation network to generate part masks. Finally, a similarity-based scoring function is adopted to refine these object parts by comparing them to the visual information in the first frame. Our method performs favorably against state-of-the-art algorithms in accuracy on the DAVIS benchmark dataset, while achieving much faster runtime performance.
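The similarity-based refinement step can be sketched as follows — a minimal illustration that scores hypothetical part feature vectors against a first-frame template by cosine similarity and merges the surviving part masks (the paper's actual scoring function and features may differ):

```python
import numpy as np

def score_parts(part_feats, template_feat):
    """Score tracked part proposals by cosine similarity to the
    first-frame template feature (hypothetical feature vectors)."""
    t = template_feat / np.linalg.norm(template_feat)
    return [float(np.dot(f / np.linalg.norm(f), t)) for f in part_feats]

def refine(part_masks, scores, thresh=0.5):
    """Keep only part masks whose similarity score passes the threshold
    (threshold value is an assumption), then merge them into one object mask."""
    kept = [m for m, s in zip(part_masks, scores) if s >= thresh]
    if not kept:
        return np.zeros_like(part_masks[0])
    return np.clip(np.sum(kept, axis=0), 0, 1)
```

A part whose appearance drifts far from the first-frame target (e.g. a background distractor picked up by the tracker) gets a low score and is dropped before the masks are merged.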
Pages: 7415-7424
Citations: 215
Toward Driving Scene Understanding: A Dataset for Learning Driver Behavior and Causal Reasoning
Pub Date : 2018-06-01 DOI: 10.1109/CVPR.2018.00803
Vasili Ramanishka, Yi-Ting Chen, Teruhisa Misu, Kate Saenko
Driving scene understanding is a key ingredient for intelligent transportation systems. To operate in a complex physical and social environment, such systems need to understand and learn how humans drive and interact with traffic scenes. We present the Honda Research Institute Driving Dataset (HDD), a challenging dataset that enables research on learning driver behavior in real-life environments. The dataset includes 104 hours of real human driving in the San Francisco Bay Area, collected using an instrumented vehicle equipped with different sensors. We provide a detailed analysis of HDD with a comparison to other driving datasets. A novel annotation methodology is introduced to enable research on driver behavior understanding from untrimmed data sequences. As a first step, baseline algorithms for driver behavior detection are trained and tested to demonstrate the feasibility of the proposed task.
Pages: 7699-7707
Citations: 210
Through-Wall Human Pose Estimation Using Radio Signals
Pub Date : 2018-06-01 DOI: 10.1109/CVPR.2018.00768
Mingmin Zhao, Tianhong Li, Mohammad Abu Alsheikh, Yonglong Tian, Hang Zhao, A. Torralba, D. Katabi
This paper demonstrates accurate human pose estimation through walls and occlusions. We leverage the fact that wireless signals in the WiFi frequencies traverse walls and reflect off the human body. We introduce a deep neural network approach that parses such radio signals to estimate 2D poses. Since humans cannot annotate radio signals, we use a state-of-the-art vision model to provide cross-modal supervision. Specifically, during training the system uses synchronized wireless and visual inputs, extracts pose information from the visual stream, and uses it to guide the training process. Once trained, the network uses only the wireless signal for pose estimation. We show that, when tested on visible scenes, the radio-based system is almost as accurate as the vision-based system used to train it. Yet, unlike vision-based pose estimation, the radio-based system can estimate 2D poses through walls despite never having been trained on such scenarios. Demo videos are available on our website.
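The cross-modal supervision loop can be sketched roughly as below, with a hypothetical linear "student" standing in for the radio network and a stubbed vision "teacher" providing pseudo-labels from the synchronized RGB frame (the real system uses deep networks and 2D keypoint confidence maps, not a linear regressor):

```python
import numpy as np

class LinearStudent:
    """Toy stand-in for the radio-signal pose network."""
    def __init__(self, d_in, d_out):
        self.W = np.zeros((d_out, d_in))
    def forward(self, x):
        return self.W @ x

def train_step(rf_input, rgb_input, student, teacher, lr=0.1):
    """One cross-modal distillation step: the vision teacher produces
    pseudo-label keypoints from the RGB frame; the student regresses
    toward them using only the radio input."""
    target = teacher(rgb_input)        # pseudo ground-truth 2D pose
    pred = student.forward(rf_input)   # pose predicted from radio alone
    grad = 2 * (pred - target)         # gradient of the squared error
    student.W -= lr * np.outer(grad, rf_input)
    return float(np.mean((pred - target) ** 2))
```

At test time only `student.forward` is called, mirroring the paper's setting where the RGB stream is discarded after training.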
Pages: 7356-7365
Citations: 435
Discovering Point Lights with Intensity Distance Fields
Pub Date : 2018-06-01 DOI: 10.1109/CVPR.2018.00694
Edward Zhang, Michael F. Cohen, B. Curless
We introduce the light localization problem. A scene is illuminated by a set of unobserved isotropic point lights. Given the geometry, materials, and illuminated appearance of the scene, the light localization problem is to completely recover the number, positions, and intensities of the lights. We first present a scene transform that identifies likely light positions. Based on this transform, we develop an iterative algorithm to locate remaining lights and determine all light intensities. We demonstrate the success of this method on a large set of 2D synthetic scenes, and show that it extends to 3D, in both synthetic scenes and real-world scenes.
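An isotropic point light's contribution falls off with the inverse square of distance, so a candidate light position can be scored by how consistently it explains the observed intensities — a simplified stand-in for the paper's transform (the actual intensity distance field formulation differs):

```python
import numpy as np

def candidate_score(candidate, sample_pts, observed):
    """Inverse-square consistency check: if `candidate` is the true
    light position, observed_i * r_i**2 implies the same source
    intensity at every surface sample, so a low spread of the implied
    intensities indicates a likely light position (0 = perfect fit)."""
    r2 = np.sum((sample_pts - candidate) ** 2, axis=1)
    implied = observed * r2           # implied source intensity per sample
    return float(np.std(implied))
```

Scanning this score over a grid of candidate positions and taking local minima gives a crude version of "identifying likely light positions" in the abstract.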
Pages: 6635-6643
Citations: 8
Reflection Removal for Large-Scale 3D Point Clouds
Pub Date : 2018-06-01 DOI: 10.1109/CVPR.2018.00483
J. Yun, Jae-Young Sim
Large-scale 3D point clouds (LS3DPCs) captured by terrestrial LiDAR scanners often exhibit reflection artifacts caused by glass, which degrade the performance of related computer vision techniques. In this paper, we propose an efficient reflection removal algorithm for LS3DPCs. We first partition the unit sphere into local surface patches, which are then classified into ordinary patches and glass patches according to the number of echo pulses returned from emitted laser pulses. Then we estimate the glass region of dominant reflection artifacts by measuring its reliability. We also detect and remove the virtual points using the conditions of reflection symmetry and geometric similarity. We test the performance of the proposed algorithm on LS3DPCs capturing real-world outdoor scenes, and show that the proposed algorithm estimates valid glass regions faithfully and successfully removes the virtual points caused by reflection artifacts.
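The glass/ordinary patch classification relies on the fact that a laser pulse hitting glass typically produces multiple echoes (one reflected, one transmitted). A minimal sketch of that test — the fraction threshold here is an assumption, not the paper's value:

```python
def classify_patches(echo_counts, multi_echo_ratio=0.3):
    """Label each local patch 'glass' when a large fraction of its laser
    returns produced more than one echo pulse; otherwise 'ordinary'.
    `echo_counts` holds, per patch, the echo count of each pulse."""
    labels = []
    for counts in echo_counts:
        frac_multi = sum(1 for c in counts if c > 1) / len(counts)
        labels.append('glass' if frac_multi >= multi_echo_ratio else 'ordinary')
    return labels
```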
Pages: 4597-4605
Citations: 10
Self-Calibrating Polarising Radiometric Calibration
Pub Date : 2018-06-01 DOI: 10.1109/CVPR.2018.00299
Daniel Teo, Boxin Shi, Yinqiang Zheng, Sai-Kit Yeung
We present a self-calibrating polarising radiometric calibration method. From a set of images taken from a single viewpoint under different unknown polarising angles, we recover the inverse camera response function and the polarising angles relative to the first angle. The problem is solved in an integrated manner, recovering both of the unknowns simultaneously. The method exploits the fact that the intensity of polarised light should vary sinusoidally as the polarising filter is rotated, provided that the response is linear. It offers the first solution to demonstrate the possibility of radiometric calibration through polarisation. We evaluate the accuracy of our proposed method using synthetic data and real-world objects captured using different cameras. The self-calibrated results were found to be comparable with those from multiple-exposure sequences.
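The sinusoidal constraint the method exploits can be written as I(θ) = a + b·cos 2θ + c·sin 2θ, which is linear in (a, b, c) once the response is linearised. A least-squares fit of that model — a sketch of the constraint only; the paper solves jointly for the response function and the unknown angles, which this does not:

```python
import numpy as np

def fit_polariser_curve(angles, intensities):
    """Fit I(theta) = a + b*cos(2*theta) + c*sin(2*theta) by linear
    least squares. Only valid once intensities are in linear units,
    i.e. after the inverse camera response has been applied."""
    A = np.column_stack([np.ones_like(angles),
                         np.cos(2 * angles),
                         np.sin(2 * angles)])
    coeffs, *_ = np.linalg.lstsq(A, intensities, rcond=None)
    return coeffs  # (a, b, c)
```

A nonlinear response would leave systematic residuals in this fit, which is what makes the sinusoid a usable calibration cue.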
Pages: 2831-2839
Citations: 7
Document Enhancement Using Visibility Detection
Pub Date : 2018-06-01 DOI: 10.1109/CVPR.2018.00252
Netanel Kligler, S. Katz, A. Tal
This paper revisits classical problems in document enhancement. Rather than proposing a new algorithm for a specific problem, we introduce a novel general approach. The key idea is to modify any state-of-the-art algorithm by providing it with new information (input), thereby improving its results. Interestingly, this information is based on a solution to a seemingly unrelated problem of visibility detection in R3. We show that a simple representation of an image as a 3D point cloud gives visibility detection on this cloud a new interpretation. What does it mean for a point to be visible? Although this question has been widely studied within computer vision, it has always been assumed that the point set is a sampling of a real scene. We show that the answer to this question in our context reveals unique and useful information about the image. We demonstrate the benefit of this idea for document binarization and for unshadowing.
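One simple way to lift an image into the 3D point set the abstract describes is to use pixel coordinates for two axes and intensity for the third — a sketch of the general idea, not necessarily the exact mapping used in the paper:

```python
import numpy as np

def image_to_point_cloud(img):
    """Lift a grayscale image (h, w) to an (h*w, 3) point cloud:
    (x, y, intensity). Visibility detection can then be run on this
    cloud even though it does not sample a real scene."""
    h, w = img.shape
    ys, xs = np.mgrid[0:h, 0:w]
    return np.column_stack([xs.ravel(), ys.ravel(), img.ravel()])
```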
Pages: 2374-2382
Citations: 44
Learning Globally Optimized Object Detector via Policy Gradient
Pub Date : 2018-06-01 DOI: 10.1109/CVPR.2018.00648
Yongming Rao, Dahua Lin, Jiwen Lu, Jie Zhou
In this paper, we propose a simple yet effective method to learn a globally optimized detector for object detection, which is a simple modification to the standard cross-entropy gradient inspired by the REINFORCE algorithm. In our approach, the cross-entropy gradient is adaptively adjusted according to the overall mean Average Precision (mAP) of the current state for each detection candidate, which leads to more effective gradients and global optimization of detection results, and brings no computational overhead. Benefiting from more precise gradients produced by the global optimization method, our framework significantly improves state-of-the-art object detectors. Furthermore, since our method is based on scores and bounding boxes without modification of the object detector's architecture, it can be easily applied to off-the-shelf modern object detection frameworks.
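The core idea — reweighting the per-candidate cross-entropy gradient by a global, set-level reward — can be sketched as below. This is an illustration of the REINFORCE-style scaling, not the paper's exact formula:

```python
import numpy as np

def globally_scaled_ce_grad(probs, labels, reward):
    """Scale the standard softmax cross-entropy gradient (probs - onehot)
    by a global scalar reward, e.g. the current mAP, so that candidate
    updates reflect the set-level detection metric rather than only the
    per-candidate classification loss."""
    grad = probs.copy()
    grad[np.arange(len(labels)), labels] -= 1.0   # softmax CE gradient w.r.t. logits
    return reward * grad                          # global reward reweighting
```

Because only the gradient magnitude changes, the modification adds no forward-pass cost, matching the abstract's "no computational overhead" claim.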
Pages: 6190-6198
Citations: 21
PairedCycleGAN: Asymmetric Style Transfer for Applying and Removing Makeup
Pub Date : 2018-06-01 DOI: 10.1109/CVPR.2018.00012
Huiwen Chang, Jingwan Lu, F. Yu, Adam Finkelstein
This paper introduces an automatic method for editing a portrait photo so that the subject appears to be wearing makeup in the style of another person in a reference photo. Our unsupervised learning approach relies on a new framework of cycle-consistent generative adversarial networks. Different from the image domain transfer problem, our style transfer problem involves two asymmetric functions: a forward function encodes example-based style transfer, whereas a backward function removes the style. We construct two coupled networks to implement these functions - one that transfers makeup style and a second that can remove makeup - such that the output of their successive application to an input photo will match the input. The learned style network can then quickly apply an arbitrary makeup style to an arbitrary photo. We demonstrate its effectiveness on a broad range of portraits and styles.
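The asymmetry between the two functions can be seen in the cycle losses: the forward function F takes a reference in addition to the input, while the backward function G does not. A loss sketch under that assumption, with F, G, and the distance as hypothetical stand-ins for the paper's networks and full objective:

```python
import numpy as np

def asymmetric_cycle_losses(x, ref, F, G):
    """Two cycle-consistency terms, one per direction:
    apply makeup then remove it should recover x; remove the
    reference's makeup then re-apply it should recover ref."""
    styled = F(x, ref)                                # forward: example-based transfer
    cycle_fwd = np.mean((G(styled) - x) ** 2)         # G(F(x, ref)) ~ x
    cycle_bwd = np.mean((F(G(ref), ref) - ref) ** 2)  # F(G(ref), ref) ~ ref
    return cycle_fwd + cycle_bwd
```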
Pages: 40-48
Citations: 233
Deep Spatio-Temporal Random Fields for Efficient Video Segmentation
Pub Date : 2018-06-01 DOI: 10.1109/CVPR.2018.00929
Siddhartha Chandra, C. Couprie, Iasonas Kokkinos
In this work we introduce a time- and memory-efficient method for structured prediction that couples neuron decisions across both space and time. We show that we are able to perform exact and efficient inference on a densely-connected spatio-temporal graph by capitalizing on recent advances in deep Gaussian Conditional Random Fields (GCRFs). Our method, called VideoGCRF, is (a) efficient, (b) has a unique global minimum, and (c) can be trained end-to-end alongside contemporary deep networks for video understanding. We experiment with multiple connectivity patterns in the temporal domain, and present empirical improvements over strong baselines on the tasks of both semantic and instance segmentation of videos. Our implementation is based on the Caffe2 framework and will be available at https://github.com/siddharthachandra/gcrf-v3.0.
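The "unique global minimum" property comes from the Gaussian CRF's quadratic energy: E(x) = ½ xᵀAx − bᵀx, whose exact MAP estimate is the solution of a linear system when A is positive definite. A minimal sketch of that inference step (the paper's A and b are built from deep-network features, which this omits):

```python
import numpy as np

def gcrf_map_inference(A, b):
    """Exact MAP inference for a Gaussian CRF with energy
    E(x) = 0.5 * x^T A x - b^T x: setting the gradient A x - b to
    zero gives the linear system A x = b, whose solution is the
    unique global minimum when A is positive definite."""
    return np.linalg.solve(A, b)
```

This closed-form solve is also differentiable in A and b, which is what allows end-to-end training alongside the surrounding network.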
Pages: 8915-8924
Citations: 49