基于场景几何的语义图像分割

Sotirios Papadopoulos, Ioannis Mademlis, I. Pitas
{"title":"基于场景几何的语义图像分割","authors":"Sotirios Papadopoulos, Ioannis Mademlis, I. Pitas","doi":"10.1109/ICAS49788.2021.9551117","DOIUrl":null,"url":null,"abstract":"Semantic image segmentation is an important functionality in various applications, such as robotic vision for autonomous cars, drones, etc. Modern Convolutional Neural Networks (CNNs) process input RGB images and predict per-pixel semantic classes. Depth maps have been successfully utilized to increase accuracy over RGB-only input. They can be used as an additional input channel complementing the RGB image, or they may be estimated by an extra neural branch under a multitask training setting. Contrary to these approaches, in this paper we explore a novel regularizer that penalizes differences between semantic and self-supervised depth predictions on presumed object boundaries during CNN training. The proposed method does not resort to multitask training (which may require a more complex CNN backbone to avoid underfitting), does not rely on RGB-D or stereoscopic 3D training data and does not require known or estimated depth maps during inference. Quantitative evaluation on a public scene parsing video dataset for autonomous driving indicates enhanced semantic segmentation accuracy with zero inference runtime overhead.","PeriodicalId":287105,"journal":{"name":"2021 IEEE International Conference on Autonomous Systems (ICAS)","volume":"7 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"5","resultStr":"{\"title\":\"Semantic Image Segmentation Guided By Scene Geometry\",\"authors\":\"Sotirios Papadopoulos, Ioannis Mademlis, I. Pitas\",\"doi\":\"10.1109/ICAS49788.2021.9551117\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Semantic image segmentation is an important functionality in various applications, such as robotic vision for autonomous cars, drones, etc. Modern Convolutional Neural Networks (CNNs) process input RGB images and predict per-pixel semantic classes. Depth maps have been successfully utilized to increase accuracy over RGB-only input. They can be used as an additional input channel complementing the RGB image, or they may be estimated by an extra neural branch under a multitask training setting. Contrary to these approaches, in this paper we explore a novel regularizer that penalizes differences between semantic and self-supervised depth predictions on presumed object boundaries during CNN training. The proposed method does not resort to multitask training (which may require a more complex CNN backbone to avoid underfitting), does not rely on RGB-D or stereoscopic 3D training data and does not require known or estimated depth maps during inference. Quantitative evaluation on a public scene parsing video dataset for autonomous driving indicates enhanced semantic segmentation accuracy with zero inference runtime overhead.\",\"PeriodicalId\":287105,\"journal\":{\"name\":\"2021 IEEE International Conference on Autonomous Systems (ICAS)\",\"volume\":\"7 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2021-08-11\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"5\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2021 IEEE International Conference on Autonomous Systems (ICAS)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICAS49788.2021.9551117\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 IEEE International Conference on Autonomous Systems (ICAS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICAS49788.2021.9551117","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 5

摘要

语义图像分割是各种应用中的重要功能,例如自动驾驶汽车的机器人视觉,无人机等。现代卷积神经网络(cnn)处理输入的RGB图像并预测每个像素的语义类。深度图已被成功地用于提高仅rgb输入的准确性。它们可以用作补充RGB图像的额外输入通道,也可以在多任务训练设置下由额外的神经分支进行估计。与这些方法相反,在本文中,我们探索了一种新的正则化器,该正则化器在CNN训练期间对假定对象边界进行语义和自监督深度预测之间的差异进行惩罚。该方法不需要多任务训练(这可能需要更复杂的CNN主干来避免欠拟合),不依赖于RGB-D或立体3D训练数据,在推理过程中不需要已知或估计的深度图。对自动驾驶公共场景解析视频数据集的定量评价表明,在零推理运行时开销的情况下,语义分割精度得到了提高。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
Semantic Image Segmentation Guided By Scene Geometry
Semantic image segmentation is an important functionality in various applications, such as robotic vision for autonomous cars, drones, etc. Modern Convolutional Neural Networks (CNNs) process input RGB images and predict per-pixel semantic classes. Depth maps have been successfully utilized to increase accuracy over RGB-only input. They can be used as an additional input channel complementing the RGB image, or they may be estimated by an extra neural branch under a multitask training setting. Contrary to these approaches, in this paper we explore a novel regularizer that penalizes differences between semantic and self-supervised depth predictions on presumed object boundaries during CNN training. The proposed method does not resort to multitask training (which may require a more complex CNN backbone to avoid underfitting), does not rely on RGB-D or stereoscopic 3D training data and does not require known or estimated depth maps during inference. Quantitative evaluation on a public scene parsing video dataset for autonomous driving indicates enhanced semantic segmentation accuracy with zero inference runtime overhead.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Improving Automated Search for Underwater Threats Using Multistatic Sensor Fields by Incorporating Unconfirmed Track Information Matching Models for Crowd-Shipping Considering Shipper’s Acceptance Uncertainty Observational Learning: Imitation Through an Adaptive Probabilistic Approach Simultaneous Calibration of Positions, Orientations, and Time Offsets, Among Multiple Microphone Arrays Modified crop health monitoring and pesticide spraying system using NDVI and Semantic Segmentation: An AGROCOPTER based approach
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1