Prompt Learning for Light Field Semantic Segmentation in the Consumer-Centric Internet of Intelligent Computing Things

IF 10.9 | CAS Zone 2, Computer Science | JCR Q1, ENGINEERING, ELECTRICAL & ELECTRONIC | IEEE Transactions on Consumer Electronics | Pub Date: 2024-07-31 | DOI: 10.1109/TCE.2024.3436010
Chen Jia;Fan Shi;Xiufeng Liu;Xu Cheng;Zixuan Zhang;Meng Zhao;Shengyong Chen
IEEE Transactions on Consumer Electronics, vol. 70, no. 3, pp. 5493-5505. Journal Article. Not open access. https://ieeexplore.ieee.org/document/10616212/
Citations: 0

Abstract

Light field semantic segmentation accurately identifies the semantic information of the scene, providing solutions for various intelligent computing tasks in consumer electronics and CIoT, such as portrait segmentation, image editing and environmental perception. However, the high dimensionality, redundancy and computational cost of light field 4D data limit its application. To address this challenge, we decode highly interleaved data into multi-scale macro-pixel images and propose a prompt-based light field semantic segmentation network. This network incorporates an efficient transformer architecture to capture and learn global and long-range dependencies. Unlike the previous implicit embedding method, we introduce a visual prompt component called Explicit Angle Prompting (EAP) in the model. The key insight is to adaptively generate crucial angle-based visual prompts explicitly during the training stage, enhancing the understanding of geometric information such as object shape and structure. Furthermore, the self-attention design in the encoder and the multi-scale feature fusion mechanism ensure that the network comprehends the global and contextual information of the image from the perspective of the light field spatial dimensions. Extensive experiments showcase the potential of light field prompt learning for semantic segmentation, demonstrating the model’s capability to efficiently and accurately segment objects. The code will be released.
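The abstract mentions decoding the interleaved 4D light field data into macro-pixel images. As a rough illustration only, the sketch below shows the standard rearrangement between a stack of sub-aperture views and a single macro-pixel image using NumPy. The array layout `(U, V, H, W, C)` and the function names `sai_to_macpi` and `macpi_to_sai` are assumptions made for this example; they are not taken from the paper, and the authors' multi-scale decoding step is not reproduced here.

```python
import numpy as np


def sai_to_macpi(lf: np.ndarray) -> np.ndarray:
    """Rearrange sub-aperture images into one macro-pixel image.

    Assumed layout (hypothetical, not from the paper):
    lf has shape (U, V, H, W, C), with U x V angular views of size H x W.
    The result has shape (H * U, W * V, C): each spatial location (h, w)
    becomes a U x V block holding all angular samples of that location.
    """
    U, V, H, W, C = lf.shape
    # Interleave angular axes with spatial axes: (H, U, W, V, C).
    interleaved = lf.transpose(2, 0, 3, 1, 4)
    return interleaved.reshape(H * U, W * V, C)


def macpi_to_sai(macpi: np.ndarray, U: int, V: int) -> np.ndarray:
    """Inverse of sai_to_macpi: recover the (U, V, H, W, C) view stack."""
    HU, WV, C = macpi.shape
    H, W = HU // U, WV // V
    blocks = macpi.reshape(H, U, W, V, C)
    return blocks.transpose(1, 3, 0, 2, 4)


if __name__ == "__main__":
    # Toy light field: 3 x 3 views, each 4 x 5 pixels with 3 channels.
    lf = np.random.rand(3, 3, 4, 5, 3)
    macpi = sai_to_macpi(lf)
    print(macpi.shape)
    # The angular sample (u, v) of pixel (h, w) sits at (h * U + u, w * V + v).
    print(np.allclose(macpi[2 * 3 + 1, 4 * 3 + 2], lf[1, 2, 2, 4]))
```

In this layout, neighbouring pixels inside one block differ only in viewing angle, which is one reason a macro-pixel image is a convenient single-image input for a network that needs angular cues.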
Source journal
CiteScore: 7.70
Self-citation rate: 9.30%
Articles published: 59
Review time: 3.3 months
About the journal: The main focus of the IEEE Transactions on Consumer Electronics is the engineering and research aspects of the theory, design, construction, manufacture or end use of mass market electronics, systems, software and services for consumers.
Latest articles from this journal
Context-Preserving and Sparsity-Aware Temporal Graph Network for Unified Face Forgery Detection
Adaptive Edge Intelligence Framework for Resource-Constrained IoT in Consumer Electronics
PAGM: Partially Aligned Global and Marginal Multi-View Contrastive Clustering for Facial Recognition in Consumer Electronics
IEEE Consumer Technology Society Officers and Committee Chairs
IoT Applications in Energy-Efficient Consumer Electronics for Smart Cities