Application of Split Coordinate Channel Attention Embedding U2Net in Salient Object Detection

Algorithms (IF 1.8; Q3, Computer Science, Artificial Intelligence) | Pub Date: 2024-03-06 | DOI: 10.3390/a17030109

Yuhuan Wu, Yonghong Wu


Citations: 0

Abstract

Salient object detection (SOD) aims to identify the most visually striking objects in a scene, simulating the function of the biological visual attention system. In deep learning, the attention mechanism is commonly used as an enhancement strategy that enables a neural network to concentrate on the relevant parts of its input, improving the model’s learning and prediction abilities. Existing RGB-based deep learning methods for salient object detection typically treat all regions of the extracted features equally, overlooking the fact that different regions contribute differently to the final prediction. Building on the U2Net algorithm, this paper incorporates the split coordinate channel attention (SCCA) mechanism into the feature extraction stage. SCCA performs spatial transformations along the width and height dimensions to efficiently extract the location information of the target to be detected. Although annotation-based pixel-level semantic segmentation has been successful, it assigns the same weight to every pixel, which leads to poor performance in detecting object boundaries. In this paper, a Canny edge detection loss is incorporated into the loss calculation stage to improve the model’s ability to detect object edges. Experiments on the DUTS and HKU-IS datasets confirm that the proposed strategies enhance the model’s detection performance, increasing the F1-score of U2Net by 0.8% and 0.7%, respectively. This paper also compares traditional attention modules with the newly proposed one; the SCCA module ranks in the top three for prediction time, mean absolute error (MAE), F1-score, and model size on both experimental datasets.
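
The abstract reports results as mean absolute error (MAE) and F1-score. As a rough illustration of how these two metrics are commonly computed for a single saliency map, here is a minimal NumPy sketch. It is not the authors' evaluation code: the function names, the fixed binarization threshold of 0.5, and the toy arrays are all assumptions made for this example.

```python
import numpy as np


def mean_absolute_error(pred, gt):
    """Average per-pixel absolute difference between a saliency map
    with values in [0, 1] and a binary ground-truth mask."""
    pred = np.asarray(pred, dtype=np.float64)
    gt = np.asarray(gt, dtype=np.float64)
    return float(np.mean(np.abs(pred - gt)))


def f1_score(pred, gt, threshold=0.5, eps=1e-8):
    """F1-score after binarizing the prediction.

    The threshold of 0.5 is an assumption for this sketch; the paper's
    thresholding protocol is not stated in the abstract.
    """
    pred_bin = np.asarray(pred) >= threshold
    gt_bin = np.asarray(gt) >= 0.5
    true_pos = np.logical_and(pred_bin, gt_bin).sum()
    precision = true_pos / (pred_bin.sum() + eps)
    recall = true_pos / (gt_bin.sum() + eps)
    # Harmonic mean of precision and recall; eps avoids division by zero.
    return float(2 * precision * recall / (precision + recall + eps))


# Toy 2x2 example (hypothetical values, not data from the paper).
pred = np.array([[0.9, 0.2], [0.6, 0.1]])
gt = np.array([[1, 0], [0, 0]])

print("MAE:", mean_absolute_error(pred, gt))
print("F1 :", f1_score(pred, gt))
```

Lower MAE and higher F1 indicate better agreement with the ground truth. In practice both metrics would be averaged over every image in a dataset such as DUTS or HKU-IS.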


