DeepLab-Rail: semantic segmentation network for railway scenes based on encoder-decoder structure
Authors: Qingsong Zeng, Linxuan Zhang, Yuan Wang, Xiaolong Luo, Yannan Chen
Journal: Journal of Electronic Imaging (impact factor 1.0; JCR Q4, Engineering, Electrical & Electronic)
Publication date: 2024-08-01
DOI: 10.1117/1.jei.33.4.043038 (https://doi.org/10.1117/1.jei.33.4.043038)
Citations: 0
Abstract
Understanding perimeter objects and environmental changes in railway scenes is crucial for ensuring the safety of train operation. Semantic segmentation is the basis of intelligent perception and scene understanding. Railway scene categories are complex, and effective features are challenging to extract. This work proposes DeepLab-Rail, a semantic segmentation network based on the classic yet effective encoder-decoder structure. It contains a lightweight feature extraction backbone embedded with a channel attention (CA) mechanism to keep computational complexity low. To enrich the receptive fields of the convolutional modules, we design a parallel and cascaded convolution module called compound-atrous spatial pyramid pooling, and select a combination of dilation rates through experiments to obtain multi-scale features. To make full use of both shallow and high-level features, an efficient CA mechanism is introduced, and a mixed loss function is designed to address the unbalanced label categories in the dataset. Experimental results on the RailSem19 railway dataset show that the mean intersection over union (mIoU) reaches 65.52% and the pixel accuracy (PA) reaches 88.48%. Segmentation of easily confused railway facilities, such as signal lights and catenary pillars, is significantly improved and, to the best of our knowledge, surpasses other advanced methods.
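The two metrics reported above are commonly derived from a per-class confusion matrix. The following is a minimal sketch of that standard computation, not the authors' evaluation code: it assumes PA denotes pixel accuracy, and the function names, the three-class toy labels, and the choice to skip classes absent from both labels and predictions are illustrative assumptions.

```python
from typing import List, Sequence


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int],
                     num_classes: int) -> List[List[int]]:
    """Count pixels; rows index the ground-truth class, columns the prediction."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def pixel_accuracy(cm: List[List[int]]) -> float:
    """Fraction of all pixels whose predicted class matches the label."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total if total else 0.0


def mean_iou(cm: List[List[int]]) -> float:
    """Average of per-class IoU = TP / (TP + FP + FN)."""
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                       # labeled c, predicted otherwise
        fp = sum(cm[r][c] for r in range(n)) - tp  # predicted c, labeled otherwise
        union = tp + fp + fn
        if union:  # assumption: ignore classes absent from labels and predictions
            ious.append(tp / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example: six flattened pixels and three hypothetical classes.
labels = [0, 0, 1, 1, 2, 2]
preds = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(labels, preds, num_classes=3)
pa = pixel_accuracy(cm)
miou = mean_iou(cm)
print(f"PA = {pa:.4f}, mIoU = {miou:.4f}")
```

Because mIoU averages over classes rather than pixels, rare classes such as signal lights weigh as much as large background regions, which is one reason a model can score much higher on PA than on mIoU when label categories are unbalanced.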
About the journal:
The Journal of Electronic Imaging publishes peer-reviewed papers in all technology areas that make up the field of electronic imaging and are normally considered in the design, engineering, and applications of electronic imaging systems.