FFCANet: a frequency channel fusion coordinate attention mechanism network for lane detection

Shijie Li, Shanhua Yao, Zhonggen Wang, Juan Wu
{"title":"FFCANet: a frequency channel fusion coordinate attention mechanism network for lane detection","authors":"Shijie Li, Shanhua Yao, Zhonggen Wang, Juan Wu","doi":"10.1007/s00371-024-03626-6","DOIUrl":null,"url":null,"abstract":"<p>Lane line detection becomes a challenging task in complex and dynamic driving scenarios. Addressing the limitations of existing lane line detection algorithms, which struggle to balance accuracy and efficiency in complex and changing traffic scenarios, a frequency channel fusion coordinate attention mechanism network (FFCANet) for lane detection is proposed. A residual neural network (ResNet) is used as a feature extraction backbone network. We propose a feature enhancement method with a frequency channel fusion coordinate attention mechanism (FFCA) that captures feature information from different spatial orientations and then uses multiple frequency components to extract detail and texture features of lane lines. A row-anchor-based prediction and classification method treats lane line detection as a problem of selecting lane marking anchors within row-oriented cells predefined by global features, which greatly improves the detection speed and can handle visionless driving scenarios. Additionally, an efficient channel attention (ECA) module is integrated into the auxiliary segmentation branch to capture dynamic dependencies between channels, further enhancing feature extraction capabilities. The performance of the model is evaluated on two publicly available datasets, TuSimple and CULane. Simulation results demonstrate that the average processing time per image frame is 5.0 ms, with an accuracy of 96.09% on the TuSimple dataset and an F1 score of 72.8% on the CULane dataset. The model exhibits excellent robustness in detecting complex scenes while effectively balancing detection accuracy and speed. The source code is available at https://github.com/lsj1012/FFCANet/tree/master</p>","PeriodicalId":501186,"journal":{"name":"The Visual Computer","volume":"11 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"The Visual Computer","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1007/s00371-024-03626-6","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

Lane line detection is a challenging task in complex and dynamic driving scenarios. To address the limitations of existing lane detection algorithms, which struggle to balance accuracy and efficiency in complex, changing traffic scenes, we propose a frequency channel fusion coordinate attention mechanism network (FFCANet) for lane detection. A residual neural network (ResNet) serves as the feature extraction backbone. We propose a feature enhancement method built on a frequency channel fusion coordinate attention mechanism (FFCA), which captures feature information from different spatial orientations and then uses multiple frequency components to extract the detail and texture features of lane lines. A row-anchor-based prediction and classification method treats lane detection as selecting lane-marking anchors within row-oriented cells predefined on global features, which greatly improves detection speed and can handle driving scenes with no visual clue. Additionally, an efficient channel attention (ECA) module is integrated into the auxiliary segmentation branch to capture dynamic dependencies between channels, further strengthening feature extraction. The model is evaluated on two public datasets, TuSimple and CULane. Experimental results show an average processing time of 5.0 ms per frame, with 96.09% accuracy on TuSimple and an F1 score of 72.8% on CULane. The model is robust in complex scenes while effectively balancing detection accuracy and speed. The source code is available at https://github.com/lsj1012/FFCANet/tree/master
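The abstract names three mechanisms: the FFCA feature-enhancement block, a row-anchor prediction head, and an ECA module in the auxiliary segmentation branch. The sketches below illustrate how such components are commonly built; they are minimal PyTorch approximations assembled from the ideas the abstract references (coordinate attention in the style of Hou et al., multi-spectral DCT pooling in the style of FcaNet, row anchors in the style of Ultra-Fast Lane Detection, and ECA-Net channel attention), not the authors' released code. All module names, tensor shapes, and hyperparameters are illustrative assumptions.

```python
# Hypothetical FFCA block: coordinate attention (directional pooling
# along H and W) fused with a multi-frequency DCT channel descriptor.
# A sketch of the idea, not the paper's implementation.
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

def dct_filters(channels, h, w, freq_pairs):
    """2D DCT basis filters; each channel group gets one (u, v) frequency.
    Assumes channels is divisible by the number of frequency pairs."""
    filt = torch.zeros(channels, h, w)
    group = channels // len(freq_pairs)
    for i, (u, v) in enumerate(freq_pairs):
        for x in range(h):
            for y in range(w):
                filt[i * group:(i + 1) * group, x, y] = (
                    math.cos(math.pi * (x + 0.5) * u / h)
                    * math.cos(math.pi * (y + 0.5) * v / w))
    return filt

class FFCABlock(nn.Module):
    def __init__(self, channels, dct_size=7, reduction=16,
                 freq_pairs=((0, 0), (0, 1), (1, 0), (1, 1))):
        super().__init__()
        self.dct_size = dct_size
        self.register_buffer(
            "dct", dct_filters(channels, dct_size, dct_size, freq_pairs))
        mid = max(8, channels // reduction)
        self.conv1 = nn.Conv2d(channels, mid, 1)
        self.bn = nn.BatchNorm2d(mid)
        self.conv_h = nn.Conv2d(mid, channels, 1)
        self.conv_w = nn.Conv2d(mid, channels, 1)
        self.fc = nn.Sequential(
            nn.Linear(channels, mid), nn.ReLU(inplace=True),
            nn.Linear(mid, channels), nn.Sigmoid())

    def forward(self, x):
        b, c, h, w = x.shape
        # Frequency branch: pool to a fixed grid, project onto DCT bases,
        # and reduce to one scalar per channel (multi-spectral pooling).
        pooled = F.adaptive_avg_pool2d(x, self.dct_size)
        freq_att = self.fc((pooled * self.dct).sum(dim=(2, 3))).view(b, c, 1, 1)
        # Coordinate branch: encode position by pooling along each axis.
        x_h = x.mean(dim=3, keepdim=True)                      # (b, c, h, 1)
        x_w = x.mean(dim=2, keepdim=True).permute(0, 1, 3, 2)  # (b, c, w, 1)
        y = F.relu(self.bn(self.conv1(torch.cat([x_h, x_w], dim=2))))
        y_h, y_w = torch.split(y, [h, w], dim=2)
        a_h = torch.sigmoid(self.conv_h(y_h))                      # (b, c, h, 1)
        a_w = torch.sigmoid(self.conv_w(y_w.permute(0, 1, 3, 2)))  # (b, c, 1, w)
        # Fusion: directional attention modulated by the frequency weights.
        return x * a_h * a_w * freq_att
```

The row-anchor head turns detection into classification: for each of a fixed set of image rows and each lane, the network picks one of a small number of horizontal grid cells (or a "no lane" class) from a single global feature vector, which is what makes inference fast. A minimal sketch, assuming UFLD-style dimensions:

```python
# Hypothetical row-anchor classification head. Grid sizes and lane
# count are illustrative, not taken from the paper.
class RowAnchorHead(nn.Module):
    def __init__(self, feat_dim, num_cells=100, num_rows=56, num_lanes=4):
        super().__init__()
        self.shape = (num_cells + 1, num_rows, num_lanes)
        self.fc = nn.Sequential(
            nn.Linear(feat_dim, 2048), nn.ReLU(inplace=True),
            nn.Linear(2048, (num_cells + 1) * num_rows * num_lanes))

    def forward(self, global_feat):  # global_feat: (b, feat_dim)
        out = self.fc(global_feat)
        # Softmax over dim=1 selects a cell per (row, lane); the extra
        # class index num_cells means "no lane marking in this row".
        return out.view(-1, *self.shape)
```

Finally, ECA in the auxiliary segmentation branch is a very light form of channel attention: global average pooling followed by a 1D convolution across channels, avoiding the dimensionality reduction of SE blocks:

```python
# ECA-Net style channel attention (Wang et al., CVPR 2020).
class ECA(nn.Module):
    def __init__(self, k=3):
        super().__init__()
        self.conv = nn.Conv1d(1, 1, kernel_size=k, padding=k // 2, bias=False)

    def forward(self, x):
        y = x.mean(dim=(2, 3))                    # (b, c) global pooling
        y = self.conv(y.unsqueeze(1)).squeeze(1)  # cross-channel 1D conv
        return x * torch.sigmoid(y)[:, :, None, None]
```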
