Real-Time Segmentation with Appearance, Motion and Geometry

Mennatullah Siam, Sara Elkerdawy, M. Gamal, Moemen Abdel-Razek, Martin Jägersand, Hong Zhang
Published in: 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 5793-5800
Publication date: 2018-10-01
DOI: 10.1109/IROS.2018.8594088 (https://doi.org/10.1109/IROS.2018.8594088)
Citation count: 12

Abstract

Real-time segmentation is of crucial importance to robotics-related applications such as autonomous driving, driver assistance systems, and traffic monitoring from unmanned aerial vehicle (UAV) imagery. We propose a novel two-stream convolutional network for motion segmentation that exploits flow and geometric cues to balance the trade-off between accuracy and computational efficiency. The geometric cues take advantage of domain knowledge of the application. For mostly planar scenes captured by high-altitude UAVs, homography-compensated flow is used. For urban scenes in autonomous driving, where GPS/IMU sensor data are available, sparse projected depth estimates and odometry information are used. The network provides a 4.7× speedup over state-of-the-art motion segmentation networks, from 153 ms to 36 ms, at the expense of reduced segmentation accuracy in terms of pixel boundaries. This enables the network to run in real time on a Jetson TX2. To recover some of the accuracy loss, geometric priors are used while still achieving much better computational efficiency than the state of the art. The use of geometric priors improved segmentation in UAV imagery by 5.2% in IoU over the baseline network, while on KITTI-MoSeg the sparse depth estimates improved segmentation by 12.5% over the baseline. Our proposed motion segmentation solution is verified on the popular KITTI and VIVID datasets, with additional labels we have produced. The code for our work is publicly available at https://github.com/MSiam/RTMotSeg_Geom
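
As a rough illustration of two ideas named in the abstract, the sketch below shows one way homography-compensated flow and the IoU metric could be computed. It is not taken from the RTMotSeg_Geom repository. The function names, the nested-list representation of the homography and masks, and the convention that two empty masks score 1.0 are all assumptions made for this example.

```python
# Illustrative sketch only; names and data layout are hypothetical.

def apply_homography(H, x, y):
    """Map pixel (x, y) through a 3x3 homography H (nested lists)."""
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    xp = (H[0][0] * x + H[0][1] * y + H[0][2]) / w
    yp = (H[1][0] * x + H[1][1] * y + H[1][2]) / w
    return xp, yp


def compensated_flow(H, x, y, u, v):
    """Subtract the flow predicted by H at (x, y) from the observed flow (u, v).

    For a mostly planar scene, H approximates the image motion caused by the
    camera, so the residual is close to zero for static pixels and non-zero
    for independently moving ones.
    """
    xp, yp = apply_homography(H, x, y)
    return u - (xp - x), v - (yp - y)


def iou(pred, target):
    """Intersection over union of two binary masks (lists of rows of 0/1)."""
    inter = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                inter += 1
            if p or t:
                union += 1
    # Assumed convention: two empty masks count as a perfect match.
    return inter / union if union else 1.0


# Camera motion modelled as a pure 2-pixel shift to the right.
H = [[1, 0, 2], [0, 1, 0], [0, 0, 1]]
static_residual = compensated_flow(H, 10, 5, 2.0, 0.0)   # background pixel
moving_residual = compensated_flow(H, 10, 5, 5.0, 1.0)   # moving object pixel
score = iou([[1, 1], [0, 0]], [[1, 0], [1, 0]])
```

In this toy case the background pixel's residual flow vanishes while the moving pixel keeps a residual of (3, 1). A network fed the residual therefore sees mostly object motion rather than camera motion.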