Prompt Learning for Light Field Semantic Segmentation in the Consumer-Centric Internet of Intelligent Computing Things
Authors: Chen Jia; Fan Shi; Xiufeng Liu; Xu Cheng; Zixuan Zhang; Meng Zhao; Shengyong Chen
Journal: IEEE Transactions on Consumer Electronics, vol. 70, no. 3, pp. 5493-5505
Published: 2024-07-31
Publication type: Journal Article
DOI: 10.1109/TCE.2024.3436010
URL: https://ieeexplore.ieee.org/document/10616212/
Impact factor: 10.9; JCR: Q1 (Engineering, Electrical & Electronic); Category: Computer Science (region 2)
Open access: no
Citations: 0
Abstract
Light field semantic segmentation accurately identifies the semantic information of the scene, providing solutions for various intelligent computing tasks in consumer electronics and CIoT, such as portrait segmentation, image editing and environmental perception. However, the high dimensionality, redundancy and computational cost of light field 4D data limit its application. To address this challenge, we decode highly interleaved data into multi-scale macro-pixel images and propose a prompt-based light field semantic segmentation network. This network incorporates an efficient transformer architecture to capture and learn global and long-range dependencies. Unlike the previous implicit embedding method, we introduce a visual prompt component called Explicit Angle Prompting (EAP) in the model. The key insight is to adaptively generate crucial angle-based visual prompts explicitly during the training stage, enhancing the understanding of geometric information such as object shape and structure. Furthermore, the self-attention design in the encoder and the multi-scale feature fusion mechanism ensure that the network comprehends the global and contextual information of the image from the perspective of the light field spatial dimensions. Extensive experiments showcase the potential of light field prompt learning for semantic segmentation, demonstrating the model’s capability to efficiently and accurately segment objects. The code will be released.
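The abstract mentions decoding 4D light field data into macro-pixel images but does not spell out the layout. The sketch below shows the conventional macro-pixel arrangement, in which all angular samples of one spatial position are grouped into a single U x V block. It is an illustration under stated assumptions, not the authors' released code: the function names `to_macro_pixel` and `from_macro_pixel` and the `(U, V, H, W, C)` array layout are hypothetical, and the paper's multi-scale decoding and the EAP module are not reproduced here.

```python
import numpy as np


def to_macro_pixel(lf: np.ndarray) -> np.ndarray:
    """Rearrange sub-aperture views into one macro-pixel image.

    lf is assumed to have shape (U, V, H, W, C): U x V angular views,
    each an H x W image with C channels. The result has shape
    (H * U, W * V, C), where pixel [h * U + u, w * V + v] holds lf[u, v, h, w].
    """
    U, V, H, W, C = lf.shape
    # Move each angular axis next to its spatial axis, then merge the pairs.
    return lf.transpose(2, 0, 3, 1, 4).reshape(H * U, W * V, C)


def from_macro_pixel(mp: np.ndarray, U: int, V: int) -> np.ndarray:
    """Inverse of to_macro_pixel: recover the (U, V, H, W, C) view stack."""
    HU, WV, C = mp.shape
    H, W = HU // U, WV // V
    return mp.reshape(H, U, W, V, C).transpose(1, 3, 0, 2, 4)


# Toy light field: 3 x 3 views of a 4 x 5 image with 2 channels.
lf = np.arange(3 * 3 * 4 * 5 * 2).reshape(3, 3, 4, 5, 2)
mp = to_macro_pixel(lf)
restored = from_macro_pixel(mp, U=3, V=3)
```

In this layout each U x V block of the output is one macro-pixel, so the angular variation at a scene point sits in a local neighbourhood of a single 2D image, which an image encoder can consume directly.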
About the journal:
IEEE Transactions on Consumer Electronics focuses on the engineering and research aspects of the theory, design, construction, manufacture or end use of mass-market electronics, systems, software and services for consumers.