Precise road information extraction is crucial for transportation and intelligent sensing. Recently, the fusion of CNN and Transformer architectures for remote sensing-based road extraction, together with U-shaped semantic segmentation networks, has attracted significant attention. However, existing methods rely heavily on global features while overlooking local details, which limits their accuracy in complex road scenes. To address this, we propose the Trinityformer-Mamba Network (TriM-Net) to enhance local feature extraction. TriM-Net adopts Trinityformer, a modified Transformer architecture that improves local feature perception and reduces computational overhead by replacing the traditional softmax-based self-attention with an improved attention mechanism and a novel normalization method. Its feedforward network employs a Kolmogorov-Arnold Network (KAN), which reduces the neuron count while sharpening local detail capture through learnable edge activation functions and the Arnold transform. In addition, the normalization layer combines the benefits of BatchNorm and LayerNorm for better performance. TriM-Net further incorporates an MT_block built from stacked Mamba networks; by leveraging their internal CausalConv1D and SSM modules, this block strengthens sequence modeling and local perception while effectively fusing Transformer and CNN information for improved image reconstruction. Experimental results demonstrate TriM-Net's clear superiority over existing state-of-the-art models. On the LSRV dataset, it outperforms the second-best model by 2.17% in Precision, 0.34% in Recall, 1.72% in IoU, and 2.09% in F1-score. Similarly, on the Massachusetts Road Dataset, it surpasses its closest competitor in Recall (0.45%), IoU (1.41%), and F1-score (1.07%). These improvements highlight TriM-Net's strong performance in road information extraction.
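To illustrate the idea of a normalization layer that combines BatchNorm and LayerNorm statistics, the following is a minimal NumPy sketch. The convex-combination formulation and the mixing weight `alpha` are assumptions for illustration (in a real network `alpha` would typically be learnable), not the paper's exact design.

```python
import numpy as np

def batch_norm(x, eps=1e-5):
    # Normalize each feature over the batch dimension (axis 0),
    # as plain BatchNorm does at training time (no affine params).
    mean = x.mean(axis=0, keepdims=True)
    var = x.var(axis=0, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def layer_norm(x, eps=1e-5):
    # Normalize each sample over the feature dimension (axis -1),
    # as plain LayerNorm does (no affine params).
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def hybrid_norm(x, alpha=0.5, eps=1e-5):
    # Hypothetical blend of the two normalizations: batch statistics
    # stabilize per-feature scale, layer statistics are batch-size
    # independent; alpha trades off the two.
    return alpha * batch_norm(x, eps) + (1.0 - alpha) * layer_norm(x, eps)

# Example: a batch of 8 samples with 16 features each.
x = np.random.default_rng(0).normal(loc=3.0, scale=2.0, size=(8, 16))
y = hybrid_norm(x)
```

The blended output keeps the input shape and is approximately zero-mean, since each component normalization removes its own mean along its respective axis.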
