LiteSeg: A Novel Lightweight ConvNet for Semantic Segmentation

2019 Digital Image Computing: Techniques and Applications (DICTA). Pub Date: 2019-12-01. DOI: 10.1109/DICTA47822.2019.8945975

Taha Emara, H. A. E. Munim, Hazem M. Abbas

Citations: 55

Abstract

Semantic image segmentation plays a pivotal role in many vision applications, including autonomous driving and medical image analysis. Most earlier approaches focus on improving accuracy, with little attention to computational efficiency. In this paper, we introduce LiteSeg, a lightweight architecture for semantic image segmentation. We explore a new, deeper version of the Atrous Spatial Pyramid Pooling (ASPP) module and apply short and long residual connections together with depthwise separable convolution, resulting in a faster and more efficient model. The LiteSeg architecture is tested with multiple backbone networks, such as Darknet19, MobileNet, and ShuffleNet, to provide multiple trade-offs between accuracy and computational cost. With MobileNetV2 as the backbone network, LiteSeg achieves 67.81% mean intersection over union (mIoU) at 161 frames per second at a resolution of 640 × 360 on the Cityscapes dataset.
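For readers unfamiliar with the reported metric, the following is a minimal sketch of how mean intersection over union can be computed from a confusion matrix. It is not the authors' evaluation code: the function name `mean_iou`, its arguments, and the toy labels are all hypothetical, and the official Cityscapes evaluation additionally handles ignored labels, which this sketch does not.

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection over union for two equal-length label sequences.

    Illustrative only: `pred` and `target` are flat lists of integer class
    ids in [0, num_classes); no "ignore" label is supported.
    """
    # Confusion matrix: rows are ground-truth classes, columns are predictions.
    conf = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(target, pred):
        conf[t][p] += 1

    ious = []
    for c in range(num_classes):
        tp = conf[c][c]                                           # true positives
        fn = sum(conf[c]) - tp                                    # missed pixels of class c
        fp = sum(conf[r][c] for r in range(num_classes)) - tp     # wrongly labelled as c
        union = tp + fp + fn
        if union > 0:  # skip classes absent from both prediction and target
            ious.append(tp / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example with three classes and six "pixels".
target = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]
print(mean_iou(pred, target, 3))
```

In the toy example the per-class IoU values are 1/3, 2/3, and 1/2, so the mean is one half; a figure such as 67.81% is the same quantity averaged over the dataset's evaluation classes.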


