DeepLab-Rail: semantic segmentation network for railway scenes based on encoder-decoder structure
Authors: Qingsong Zeng, Linxuan Zhang, Yuan Wang, Xiaolong Luo, Yannan Chen
Journal: Journal of Electronic Imaging (impact factor 1.0; JCR Q4, Engineering, Electrical & Electronic)
Publication date: 2024-08-01
DOI: https://doi.org/10.1117/1.jei.33.4.043038
Citations: 0
Abstract
Understanding perimeter objects and environmental changes in railway scenes is crucial for ensuring the safety of train operation. Semantic segmentation is the basis of intelligent perception and scene understanding, but railway scene categories are complex and effective features are challenging to extract. This work proposes DeepLab-Rail, a semantic segmentation network based on the classic yet effective encoder-decoder structure. It contains a lightweight feature extraction backbone embedded with a channel attention (CA) mechanism to keep computational complexity low. To enrich the receptive fields of the convolutional modules, we design a parallel and cascaded convolution module called compound-atrous spatial pyramid pooling, and a combination of dilation rates is selected through experiments to obtain multi-scale features. To make full use of both shallow and high-level features, an efficient CA mechanism is introduced, and a mixed loss function is designed to address the unbalanced label categories of the dataset. Experimental results on the RailSem19 railway dataset show that the mean intersection over union (mIoU) reaches 65.52% and the pixel accuracy (PA) reaches 88.48%. The segmentation performance on easily confused railway facilities, such as signal lights and catenary pillars, is significantly improved and, to the best of our knowledge, surpasses other advanced methods.
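The two reported metrics can be illustrated with a minimal sketch, assuming the standard confusion-matrix definitions of mean intersection over union and pixel accuracy (PA). The abstract does not state the paper's exact evaluation protocol (for example, how ignored or absent classes are handled), so the function names, the toy labels, and the choice to skip classes that never occur are assumptions made only for illustration.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Count pixels: rows are ground-truth classes, columns are predictions."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def pixel_accuracy(cm):
    """Fraction of all pixels whose predicted class equals the label."""
    correct = sum(cm[i][i] for i in range(len(cm)))
    total = sum(sum(row) for row in cm)
    return correct / total


def mean_iou(cm):
    """Average of per-class IoU = TP / (TP + FP + FN)."""
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                       # labelled c, predicted otherwise
        fp = sum(cm[r][c] for r in range(n)) - tp  # predicted c, labelled otherwise
        denom = tp + fp + fn
        if denom > 0:  # assumption: skip classes absent from labels and predictions
            ious.append(tp / denom)
    return sum(ious) / len(ious)


# Toy flattened label maps with three hypothetical classes (not RailSem19 data).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(y_true, y_pred, 3)
print(f"PA = {pixel_accuracy(cm):.4f}, mIoU = {mean_iou(cm):.4f}")
```

Because mIoU averages over classes while PA averages over pixels, rare classes such as signal lights weigh far more in mIoU than in PA, which is one reason the two figures can differ substantially on a dataset with unbalanced label categories.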
About the journal:
The Journal of Electronic Imaging publishes peer-reviewed papers in all technology areas that make up the field of electronic imaging and are normally considered in the design, engineering, and applications of electronic imaging systems.