Prompt Learning for Light Field Semantic Segmentation in the Consumer-Centric Internet of Intelligent Computing Things

IEEE Transactions on Consumer Electronics | Impact Factor: 10.9 | Journal ranking: Zone 2 (Computer Science), JCR Q1 (Engineering, Electrical & Electronic) | Pub Date: 2024-07-31 | DOI: 10.1109/TCE.2024.3436010

Chen Jia;Fan Shi;Xiufeng Liu;Xu Cheng;Zixuan Zhang;Meng Zhao;Shengyong Chen

Published in IEEE Transactions on Consumer Electronics, vol. 70, no. 3, pp. 5493-5505 (journal article, not open access). Full text: https://ieeexplore.ieee.org/document/10616212/

Citations: 0

Abstract

Light field semantic segmentation accurately identifies the semantic information of the scene, providing solutions for various intelligent computing tasks in consumer electronics and CIoT, such as portrait segmentation, image editing and environmental perception. However, the high dimensionality, redundancy and computational cost of light field 4D data limit its application. To address this challenge, we decode highly interleaved data into multi-scale macro-pixel images and propose a prompt-based light field semantic segmentation network. This network incorporates an efficient transformer architecture to capture and learn global and long-range dependencies. Unlike the previous implicit embedding method, we introduce a visual prompt component called Explicit Angle Prompting (EAP) in the model. The key insight is to adaptively generate crucial angle-based visual prompts explicitly during the training stage, enhancing the understanding of geometric information such as object shape and structure. Furthermore, the self-attention design in the encoder and the multi-scale feature fusion mechanism ensure that the network comprehends the global and contextual information of the image from the perspective of the light field spatial dimensions. Extensive experiments showcase the potential of light field prompt learning for semantic segmentation, demonstrating the model’s capability to efficiently and accurately segment objects. The code will be released.
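The abstract does not specify how the interleaved 4D data are decoded, so the following is only a minimal sketch of the commonly used macro-pixel arrangement, not the authors' implementation. It assumes a light field stored as a NumPy array of shape (U, V, H, W, C), with U x V angular views of H x W pixels each. The function names, the array layout, and the strided subsampling used to obtain several scales are all hypothetical choices made for illustration.

```python
import numpy as np


def to_macro_pixel(lf):
    """Rearrange a light field (U, V, H, W, C) into a macro-pixel image.

    Each spatial location (h, w) becomes a U x V block holding all angular
    samples of that location, so the output has shape (H*U, W*V, C).
    """
    U, V, H, W, C = lf.shape
    # (U, V, H, W, C) -> (H, U, W, V, C), then merge (H, U) and (W, V).
    return lf.transpose(2, 0, 3, 1, 4).reshape(H * U, W * V, C)


def from_macro_pixel(mpi, U, V):
    """Inverse of to_macro_pixel: recover the (U, V, H, W, C) array."""
    HU, WV, C = mpi.shape
    H, W = HU // U, WV // V
    return mpi.reshape(H, U, W, V, C).transpose(1, 3, 0, 2, 4)


def multi_scale_macro_pixels(lf, strides=(1, 2, 4)):
    """Build macro-pixel images at several spatial scales.

    Assumption: scales are produced by plain strided subsampling of the
    spatial axes; the paper may use a different scheme.
    """
    return [to_macro_pixel(lf[:, :, ::s, ::s]) for s in strides]


if __name__ == "__main__":
    # Synthetic 3x3-view light field with 8x12 RGB views.
    lf = np.random.rand(3, 3, 8, 12, 3)
    for mpi in multi_scale_macro_pixels(lf):
        print(mpi.shape)
```

In this layout the angular samples of one scene point sit next to each other in the image, which is what lets a 2D network see angular structure without an explicit 4D input.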

Source journal

IEEE Transactions on Consumer Electronics (Engineering & Technology: Telecommunications)

CiteScore: 7.70
Self-citation rate: 9.30%
Articles published: (not listed)
Review time: 3.3 months

Journal description: The main focus of the IEEE Transactions on Consumer Electronics is the engineering and research aspects of the theory, design, construction, manufacture or end use of mass market electronics, systems, software and services for consumers.